Unsupervised learning of spatiotemporally coherent metrics

Ross Goroshin, Joan Bruna Estrach, Jonathan Tompson, David Eigen, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Current state-of-the-art classification and detection algorithms train deep convolutional networks using labeled data. In this work we study unsupervised feature learning with convolutional networks in the context of temporally coherent unlabeled data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity priors. We establish a connection between slow feature learning and metric learning. Using this connection we define "temporal coherence" - a criterion which can be used to set hyper-parameters in a principled and automated manner. In a transfer learning experiment, we show that the resulting encoder can be used to define a more semantically coherent metric without the use of labels.

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4086-4093
Number of pages: 8
Volume: 11-18-December-2015
ISBN (Electronic): 9781467383912
DOIs: https://doi.org/10.1109/ICCV.2015.465
State: Published - Feb 17 2016
Event: 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile
Duration: Dec 11 2015 to Dec 18 2015

Other

Other: 15th IEEE International Conference on Computer Vision, ICCV 2015
Country: Chile
City: Santiago
Period: 12/11/15 to 12/18/15

Fingerprint

Unsupervised learning
Labels
Experiments

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Goroshin, R., Bruna Estrach, J., Tompson, J., Eigen, D., & LeCun, Y. (2016). Unsupervised learning of spatiotemporally coherent metrics. In Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015 (Vol. 11-18-December-2015, pp. 4086-4093). [7410822] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV.2015.465

Unsupervised learning of spatiotemporally coherent metrics. / Goroshin, Ross; Bruna Estrach, Joan; Tompson, Jonathan; Eigen, David; LeCun, Yann.

Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015 Institute of Electrical and Electronics Engineers Inc., 2016. p. 4086-4093 7410822.

Goroshin, R, Bruna Estrach, J, Tompson, J, Eigen, D & LeCun, Y 2016, Unsupervised learning of spatiotemporally coherent metrics. in Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. vol. 11-18-December-2015, 7410822, Institute of Electrical and Electronics Engineers Inc., pp. 4086-4093, 15th IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, 12/11/15. https://doi.org/10.1109/ICCV.2015.465

Goroshin R, Bruna Estrach J, Tompson J, Eigen D, LeCun Y. Unsupervised learning of spatiotemporally coherent metrics. In Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015. Institute of Electrical and Electronics Engineers Inc. 2016. p. 4086-4093. 7410822 https://doi.org/10.1109/ICCV.2015.465

Goroshin, Ross ; Bruna Estrach, Joan ; Tompson, Jonathan ; Eigen, David ; LeCun, Yann. / Unsupervised learning of spatiotemporally coherent metrics. Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015. Vol. 11-18-December-2015 Institute of Electrical and Electronics Engineers Inc., 2016. pp. 4086-4093

@inproceedings{055b5a0fa7174cc8a20469547e70378d,
title = "Unsupervised learning of spatiotemporally coherent metrics",
abstract = "Current state-of-the-art classification and detection algorithms train deep convolutional networks using labeled data. In this work we study unsupervised feature learning with convolutional networks in the context of temporally coherent unlabeled data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity priors. We establish a connection between slow feature learning and metric learning. Using this connection we define {"}temporal coherence{"} - a criterion which can be used to set hyper-parameters in a principled and automated manner. In a transfer learning experiment, we show that the resulting encoder can be used to define a more semantically coherent metric without the use of labels.",
author = "Ross Goroshin and {Bruna Estrach}, Joan and Jonathan Tompson and David Eigen and Yann LeCun",
year = "2016",
month = "2",
day = "17",
doi = "10.1109/ICCV.2015.465",
language = "English (US)",
volume = "11-18-December-2015",
pages = "4086--4093",
booktitle = "Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
}

TY - GEN
T1 - Unsupervised learning of spatiotemporally coherent metrics
AU - Goroshin, Ross
AU - Bruna Estrach, Joan
AU - Tompson, Jonathan
AU - Eigen, David
AU - LeCun, Yann
PY - 2016/2/17
Y1 - 2016/2/17
N2 - Current state-of-the-art classification and detection algorithms train deep convolutional networks using labeled data. In this work we study unsupervised feature learning with convolutional networks in the context of temporally coherent unlabeled data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pooling auto-encoder regularized by slowness and sparsity priors. We establish a connection between slow feature learning and metric learning. Using this connection we define "temporal coherence" - a criterion which can be used to set hyper-parameters in a principled and automated manner. In a transfer learning experiment, we show that the resulting encoder can be used to define a more semantically coherent metric without the use of labels.
UR - http://www.scopus.com/inward/record.url?scp=84973902378&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973902378&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2015.465
DO - 10.1109/ICCV.2015.465
M3 - Conference contribution
AN - SCOPUS:84973902378
VL - 11-18-December-2015
SP - 4086
EP - 4093
BT - Proceedings - 2015 IEEE International Conference on Computer Vision, ICCV 2015
PB - Institute of Electrical and Electronics Engineers Inc.
ER -