### Abstract

We examine properties of perceptual image distortion models, computed as the mean squared error in the response of a 2-stage cascaded image transformation. Each stage in the cascade is composed of a linear transformation followed by a local nonlinear normalization operation. We consider two such models. For the first, the structure of the linear transformations is chosen according to perceptual criteria: a center-surround filter that extracts local contrast, and a filter designed to select visually relevant contrast according to the Standard Spatial Observer. For the second, the linear transformations are chosen according to a statistical criterion, so as to eliminate correlations estimated from responses to a set of natural images. For both models, the parameters that govern the scale of the linear filters and the properties of the nonlinear normalization operation are chosen to achieve minimal/maximal subjective discriminability of pairs of images that have been optimized to minimize/maximize the model's distortion measure, respectively (we refer to this as MAximum Differentiation, or "MAD", optimization). We find that both representations substantially reduce redundancy (mutual information), with a larger reduction occurring in the second (statistically optimized) model. We also find that both models are highly correlated with subjective scores from the TID2008 database, with slightly better performance seen in the first (perceptually chosen) model. Finally, we use a foveated version of the perceptual model to synthesize visual metamers. Specifically, we generate an example of a distorted image that is optimized to minimize the perceptual error over receptive fields that scale with eccentricity, demonstrating that the errors are barely visible despite a substantial MSE relative to the original image.
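To make the structure described in the abstract concrete, the following is a minimal sketch of a two-stage cascade in which each stage applies a linear filter and then a local divisive normalization, with distortion measured as the mean squared error between the responses to two images. It is an illustration under stated assumptions, not the authors' implementation: the difference-of-Gaussians filter, the Gaussian pooling kernel, the periodic boundary handling, the reuse of the same filter in both stages, and the values of `b` and `g` are all hypothetical placeholders, and the names `stage`, `response`, and `perceptual_distance` are invented here.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel of shape (size, size); size should be odd."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def filter2d(img, kernel):
    """Circular 2-D convolution via the FFT (periodic boundaries are an assumption)."""
    padded = np.zeros(img.shape, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Shift the kernel so that its center sits at the origin of the array.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(padded)))

def stage(x, linear_kernel, pool_kernel, b, g):
    """One stage: linear filtering followed by local divisive normalization."""
    y = filter2d(x, linear_kernel)
    energy = filter2d(np.abs(y) ** g, pool_kernel)  # locally pooled activity
    return np.sign(y) * np.abs(y) ** g / (b + energy)

def response(img):
    """Two-stage cascade with hypothetical (illustrative) filters and parameters."""
    dog = gaussian_kernel(9, 1.0) - gaussian_kernel(9, 3.0)  # center-surround stand-in
    pool = gaussian_kernel(9, 2.0)
    r1 = stage(np.asarray(img, dtype=float), dog, pool, b=0.1, g=2.0)
    return stage(r1, dog, pool, b=0.1, g=2.0)

def perceptual_distance(ref, dist):
    """Mean squared error between the cascade responses to two images."""
    return float(np.mean((response(ref) - response(dist)) ** 2))

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
noisy = ref + 0.05 * rng.standard_normal((64, 64))
print(perceptual_distance(ref, ref))    # identical inputs, so the distance is zero
print(perceptual_distance(ref, noisy))  # positive for a distorted image
```

In the setting the abstract describes, the filter scales and normalization parameters would be fitted rather than fixed by hand as they are here.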

Original language | English (US) |
---|---|
Title of host publication | Proceedings of SPIE-IS and T Electronic Imaging - Human Vision and Electronic Imaging XX |
Publisher | SPIE |
Volume | 9394 |
ISBN (Print) | 9781628414844 |
DOIs | https://doi.org/10.1117/12.2085653 |
State | Published - 2015 |
Event | Human Vision and Electronic Imaging XX, San Francisco, United States (Feb 9 2015 → Feb 12 2015) |

### Other

Other | Human Vision and Electronic Imaging XX |
---|---|
Country | United States |
City | San Francisco |
Period | 2/9/15 → 2/12/15 |

### Keywords

- Image quality metrics
- MAximum Differentiation
- Multi-layer networks
- Redundancy reduction
- Vision models
- Visual metamers

### ASJC Scopus subject areas

- Applied Mathematics
- Computer Science Applications
- Electrical and Electronic Engineering
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics

### Cite this

**Geometrical and statistical properties of vision models obtained via MAximum Differentiation.** / Malo, Jesús; Simoncelli, Eero.

*Proceedings of SPIE-IS and T Electronic Imaging - Human Vision and Electronic Imaging XX*. Vol. 9394, 93940L, SPIE, 2015. https://doi.org/10.1117/12.2085653

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
