Predicting Deeper into the Future of Semantic Segmentation

Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow.

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 648-657
Number of pages: 10
Volume: 2017-October
ISBN (Electronic): 9781538610329
DOI: https://doi.org/10.1109/ICCV.2017.77
State: Published - Dec 22 2017
Event: 16th IEEE International Conference on Computer Vision, ICCV 2017 - Venice, Italy
Duration: Oct 22 2017 to Oct 29 2017

Other

Other: 16th IEEE International Conference on Computer Vision, ICCV 2017
Country: Italy
City: Venice
Period: 10/22/17 to 10/29/17

Fingerprint

Semantics
Optical flows
Real time systems
Robotics
Decision making
Pixels
Neural networks

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Luc, P., Neverova, N., Couprie, C., Verbeek, J., & LeCun, Y. (2017). Predicting Deeper into the Future of Semantic Segmentation. In Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017 (Vol. 2017-October, pp. 648-657). [8237339] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV.2017.77

Predicting Deeper into the Future of Semantic Segmentation. / Luc, Pauline; Neverova, Natalia; Couprie, Camille; Verbeek, Jakob; LeCun, Yann.

Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Vol. 2017-October. Institute of Electrical and Electronics Engineers Inc., 2017. p. 648-657, 8237339.

Luc, P, Neverova, N, Couprie, C, Verbeek, J & LeCun, Y 2017, Predicting Deeper into the Future of Semantic Segmentation. in Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. vol. 2017-October, 8237339, Institute of Electrical and Electronics Engineers Inc., pp. 648-657, 16th IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, 10/22/17. https://doi.org/10.1109/ICCV.2017.77
Luc P, Neverova N, Couprie C, Verbeek J, LeCun Y. Predicting Deeper into the Future of Semantic Segmentation. In Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Vol. 2017-October. Institute of Electrical and Electronics Engineers Inc. 2017. p. 648-657. 8237339 https://doi.org/10.1109/ICCV.2017.77
Luc, Pauline ; Neverova, Natalia ; Couprie, Camille ; Verbeek, Jakob ; LeCun, Yann. / Predicting Deeper into the Future of Semantic Segmentation. Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017. Vol. 2017-October. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 648-657
@inproceedings{bd274d2e97314edfa3a01a450d06ed96,
title = "Predicting Deeper into the Future of Semantic Segmentation",
abstract = "The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow.",
author = "Pauline Luc and Natalia Neverova and Camille Couprie and Jakob Verbeek and Yann LeCun",
year = "2017",
month = "12",
day = "22",
doi = "10.1109/ICCV.2017.77",
language = "English (US)",
volume = "2017-October",
pages = "648--657",
booktitle = "Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States"
}

TY - GEN

T1 - Predicting Deeper into the Future of Semantic Segmentation

AU - Luc, Pauline

AU - Neverova, Natalia

AU - Couprie, Camille

AU - Verbeek, Jakob

AU - LeCun, Yann

PY - 2017/12/22

Y1 - 2017/12/22

AB - The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow.

UR - http://www.scopus.com/inward/record.url?scp=85041902695&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041902695&partnerID=8YFLogxK

U2 - 10.1109/ICCV.2017.77

DO - 10.1109/ICCV.2017.77

M3 - Conference contribution

AN - SCOPUS:85041902695

VL - 2017-October

SP - 648

EP - 657

BT - Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017

PB - Institute of Electrical and Electronics Engineers Inc.

ER -