Eigen-distortions of hierarchical representations

Alexander Berardino, Johannes Ballé, Valero Laparra, Eero Simoncelli

Research output: Contribution to journal › Conference article

Abstract

We develop a method for comparing hierarchical image representations in terms of their ability to explain perceptual sensitivity in humans. Specifically, we utilize Fisher information to establish a model-derived prediction of sensitivity to local perturbations of an image. For a given image, we compute the eigenvectors of the Fisher information matrix with largest and smallest eigenvalues, corresponding to the model-predicted most- and least-noticeable image distortions, respectively. For human subjects, we then measure the amount of each distortion that can be reliably detected when added to the image. We use this method to test the ability of a variety of representations to mimic human perceptual sensitivity. We find that the early layers of VGG16, a deep neural network optimized for object recognition, provide a better match to human perception than later layers, and a better match than a 4-stage convolutional neural network (CNN) trained on a database of human ratings of distorted image quality. On the other hand, we find that simple models of early visual processing, incorporating one or more stages of local gain control, trained on the same database of distortion ratings, provide substantially better predictions of human sensitivity than either the CNN, or any combination of layers of VGG16.
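The core computation described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the model response is taken to be corrupted by additive white Gaussian noise, in which case the Fisher information matrix reduces to J^T J, where J is the Jacobian of the response with respect to the image. The model `toy_model` (linear filters followed by a divisive gain-control stage), the weights `W`, and the 4-pixel "image" `x` are hypothetical stand-ins, and the finite-difference Jacobian and dense eigendecomposition are chosen for clarity at small scale, not because they are necessarily what the authors used.

```python
import numpy as np

def toy_model(x, W):
    """Hypothetical two-stage representation: linear filters, then divisive gain control."""
    z = W @ x
    return z / np.sqrt(1.0 + np.sum(z ** 2))

def jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x, shape (n_outputs, n_inputs)."""
    cols = []
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        cols.append((f(x + dx) - f(x - dx)) / (2.0 * eps))
    return np.stack(cols, axis=1)

def eigen_distortions(f, x):
    """Return (lam_max, e_max, lam_min, e_min) for the Fisher matrix at image x."""
    J = jacobian(f, x)
    # Assumption: additive white Gaussian response noise, so Fisher = J^T J.
    fisher = J.T @ J
    evals, evecs = np.linalg.eigh(fisher)  # eigenvalues in ascending order
    return evals[-1], evecs[:, -1], evals[0], evecs[:, 0]

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))   # hypothetical filter weights
x = rng.standard_normal(4)        # hypothetical 4-pixel image

f = lambda v: toy_model(v, W)
lam_max, e_max, lam_min, e_min = eigen_distortions(f, x)

# Under this model, a small step along e_max should change the response
# more than an equally sized step along e_min.
delta = 1e-3
change_max = np.linalg.norm(f(x + delta * e_max) - f(x))
change_min = np.linalg.norm(f(x + delta * e_min) - f(x))
print("predicted most-noticeable distortion:", e_max)
print("predicted least-noticeable distortion:", e_min)
print("response change (max, min):", change_max, change_min)
```

In this sketch, `e_max` and `e_min` play the role of the model-predicted most- and least-noticeable distortions; the human detection thresholds that the abstract compares them against are experimental measurements and are not simulated here. For a real image and a deep network, the Fisher matrix would be far too large to form explicitly, so an iterative eigen-solver combined with automatic differentiation would likely be needed instead.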

Original language: English (US)
Pages (from-to): 3531-3540
Number of pages: 10
Journal: Advances in Neural Information Processing Systems
Volume: 2017-December
State: Published - Jan 1 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017 to Dec 9 2017

Fingerprint

Fisher information matrix
Neural networks
Gain control
Object recognition
Eigenvalues and eigenfunctions
Image quality
Processing
Deep neural networks

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Eigen-distortions of hierarchical representations. / Berardino, Alexander; Ballé, Johannes; Laparra, Valero; Simoncelli, Eero.

In: Advances in Neural Information Processing Systems, Vol. 2017-December, 01.01.2017, p. 3531-3540.

Berardino, A, Ballé, J, Laparra, V & Simoncelli, E 2017, 'Eigen-distortions of hierarchical representations', Advances in Neural Information Processing Systems, vol. 2017-December, pp. 3531-3540.
Berardino, Alexander ; Ballé, Johannes ; Laparra, Valero ; Simoncelli, Eero. / Eigen-distortions of hierarchical representations. In: Advances in Neural Information Processing Systems. 2017 ; Vol. 2017-December. pp. 3531-3540.
@article{356fb666d6d540a9b9657d62f031b616,
title = "Eigen-distortions of hierarchical representations",
author = "Alexander Berardino and Johannes Ball{\'e} and Valero Laparra and Eero Simoncelli",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
volume = "2017-December",
pages = "3531--3540",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",

}

TY - JOUR
T1 - Eigen-distortions of hierarchical representations
AU - Berardino, Alexander
AU - Ballé, Johannes
AU - Laparra, Valero
AU - Simoncelli, Eero
PY - 2017/1/1
Y1 - 2017/1/1
UR - http://www.scopus.com/inward/record.url?scp=85047000082&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047000082&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85047000082
VL - 2017-December
SP - 3531
EP - 3540
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
SN - 1049-5258
ER -