Intriguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna Estrach, Dumitru Erhan, Ian Goodfellow, Robert Fergus

Research output: Contribution to conference › Paper

Abstract

Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
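Both reported properties can be illustrated compactly. For the first, the unit-analysis comparison amounts to ranking inputs by their projection onto either a natural-basis direction (a single unit) or a random direction in the same activation space. A minimal sketch of that comparison, assuming a precomputed activation matrix; the array names and shapes here are illustrative, not from the paper:

```python
# Sketch of the unit-analysis comparison (illustrative names and shapes).
# "features" stands in for the activations phi(x) of a high layer,
# one row per input image.
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(10_000, 512))  # placeholder for real activations

def top_inputs(features, direction, k=8):
    """Indices of the k inputs whose activations project most onto direction."""
    scores = features @ direction
    return np.argsort(scores)[-k:][::-1]

unit_direction = np.eye(512)[42]            # a single high-level unit
random_direction = rng.normal(size=512)
random_direction /= np.linalg.norm(random_direction)

# The paper's observation: the two rankings look equally "semantic".
print(top_inputs(features, unit_direction))
print(top_inputs(features, random_direction))
```

For the second property, the paper finds the smallest distortion r such that x + r is misclassified by solving a box-constrained optimization with L-BFGS. The sketch below substitutes a simpler iterative gradient-sign ascent on the loss, which conveys the same idea of maximizing the network's prediction error while keeping the perturbation small; `model`, `x`, and `label` are assumed PyTorch objects, not part of the original record:

```python
# Simplified stand-in for the paper's box-constrained L-BFGS search:
# iterative gradient-sign ascent on the classification loss.
import torch
import torch.nn.functional as F

def find_perturbation(model, x, label, step=1e-2, eps=0.1, iters=50):
    """Return a small perturbation delta that increases the loss on (x, label)."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss = F.cross_entropy(model(x + delta), label)  # prediction error
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # ascend the loss
            delta.clamp_(-eps, eps)            # keep the change hardly perceptible
        delta.grad.zero_()
    return delta.detach()
```

Adding the returned delta to x typically flips the prediction while the image looks essentially unchanged; per the abstract, the same perturbation can also cause a network trained on a different subset of the data to misclassify the same input.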

Original language: English (US)
State: Published - Jan 1 2014
Event: 2nd International Conference on Learning Representations, ICLR 2014 - Banff, Canada
Duration: Apr 14 2014 - Apr 16 2014

Conference

Conference: 2nd International Conference on Learning Representations, ICLR 2014
Country: Canada
City: Banff
Period: 4/14/14 - 4/16/14

Fingerprint

  • Deep neural networks
  • Semantic information
  • Learning
  • Prediction
  • Performance
  • Expressiveness

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Education
  • Computer Science Applications

Cite this

Szegedy, C., Zaremba, W., Sutskever, I., Bruna Estrach, J., Erhan, D., Goodfellow, I., & Fergus, R. (2014). Intriguing properties of neural networks. Paper presented at 2nd International Conference on Learning Representations, ICLR 2014, Banff, Canada.

Intriguing properties of neural networks. / Szegedy, Christian; Zaremba, Wojciech; Sutskever, Ilya; Bruna Estrach, Joan; Erhan, Dumitru; Goodfellow, Ian; Fergus, Robert.

2014. Paper presented at 2nd International Conference on Learning Representations, ICLR 2014, Banff, Canada.

Research output: Contribution to conference › Paper

Szegedy, C, Zaremba, W, Sutskever, I, Bruna Estrach, J, Erhan, D, Goodfellow, I & Fergus, R 2014, 'Intriguing properties of neural networks', Paper presented at 2nd International Conference on Learning Representations, ICLR 2014, Banff, Canada, 4/14/14 - 4/16/14.
Szegedy C, Zaremba W, Sutskever I, Bruna Estrach J, Erhan D, Goodfellow I et al. Intriguing properties of neural networks. 2014. Paper presented at 2nd International Conference on Learning Representations, ICLR 2014, Banff, Canada.
Szegedy, Christian ; Zaremba, Wojciech ; Sutskever, Ilya ; Bruna Estrach, Joan ; Erhan, Dumitru ; Goodfellow, Ian ; Fergus, Robert. / Intriguing properties of neural networks. Paper presented at 2nd International Conference on Learning Representations, ICLR 2014, Banff, Canada.
@conference{759851e20d2e47aaad2a560211f6a126,
title = "Intriguing properties of neural networks",
abstract = "Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.",
author = "Christian Szegedy and Wojciech Zaremba and Ilya Sutskever and {Bruna Estrach}, Joan and Dumitru Erhan and Ian Goodfellow and Robert Fergus",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
note = "2nd International Conference on Learning Representations, ICLR 2014 ; Conference date: 14-04-2014 Through 16-04-2014",

}

TY - CONF
T1 - Intriguing properties of neural networks
AU - Szegedy, Christian
AU - Zaremba, Wojciech
AU - Sutskever, Ilya
AU - Bruna Estrach, Joan
AU - Erhan, Dumitru
AU - Goodfellow, Ian
AU - Fergus, Robert
PY - 2014/1/1
Y1 - 2014/1/1
N2 - Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
AB - Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
UR - http://www.scopus.com/inward/record.url?scp=85070854365&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85070854365&partnerID=8YFLogxK
M3 - Paper
ER -