BRIDGING THE GAP BETWEEN IMAGE CODING FOR MACHINES AND HUMANS

Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed R. Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu
Nokia Technologies, Tampere University
IEEE International Conference on Image Processing (ICIP) 2022

Below is a table of output samples from the different codecs:
  1. Base model: The end-to-end trained model without finetuning
  2. Gaussian post-processing filter: The decoded image is processed with a Gaussian filter (requires an extra component)
  3. Bilateral post-processing filter: The decoded image is processed with a bilateral filter (requires an extra component)
  4. Finetuned model: The main proposal of this paper, which is the base codec finetuned using a GAN-based approach
  5. Limitedly finetuned model: Finetuned with the same approach as the finetuned model, but with attenuated dynamics from the adversarial component
The input images are from the Open Images dataset.
[Image table: 12 sample images, each shown as decoded by the five codecs. Columns: Base model (base) | Gaussian post-processing filter (gaussian) | Bilateral post-processing filter (bilateral) | Finetuned model (gan) | Limitedly finetuned model (gan_LI)]
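
The Gaussian post-processing baseline listed above can be illustrated with a minimal sketch. This is not the code used in the paper: the function names, the default `sigma` and `radius` values, the replicated-border handling, and the single-channel list-of-lists image format are all assumptions made for illustration. It only shows the general idea of smoothing a decoded image with a separable Gaussian kernel as an extra step after the decoder.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image, kernel):
    """Convolve every row of a 2-D image with a 1-D kernel."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                # Clamp the index so border pixels are replicated.
                xi = min(max(x + k - radius, 0), n - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def _transpose(image):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*image)]


def gaussian_postprocess(decoded, sigma=1.0, radius=2):
    """Smooth a decoded single-channel image (hypothetical helper).

    sigma and radius are assumed values, not the paper's settings.
    The 2-D Gaussian is applied separably: rows first, then columns.
    """
    kernel = gaussian_kernel(sigma, radius)
    blurred = _blur_rows(decoded, kernel)
    blurred = _transpose(_blur_rows(_transpose(blurred), kernel))
    return blurred
```

For a colour image, the same function could be applied to each channel separately. In practice an optimized library routine would normally replace these pure-Python loops; the sketch favours readability over speed.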

BibTeX:
@INPROCEEDINGS{le2022_bridging,
  author={Le, Nam and Zhang, Honglei and Cricri, Francesco and Youvalari, Ramin G. and Rezazadegan Tavakoli, Hamed and Aksu, Emre and Hannuksela, Miska M. and Rahtu, Esa},
  booktitle={2022 IEEE International Conference on Image Processing (ICIP)},
  title={Bridging the Gap Between Image Coding for Machines and Humans},
  year={2022},
  pages={3411-3415},
  doi={10.1109/ICIP46576.2022.9897916}
}