Predicting future instance segmentation by forecasting convolutional features

Pauline Luc, Camille Couprie, Yann LeCun, Jakob Verbeek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Anticipating future events is an important prerequisite for intelligent behavior. Video forecasting has been studied as a proxy task toward this goal. Recent work has shown that, to predict the semantic segmentation of future frames, forecasting at the semantic level is more effective than forecasting RGB frames and then segmenting them. In this paper we consider the more challenging problem of future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of fixed-size convolutional features of the Mask R-CNN instance segmentation model. We apply the “detection head” of Mask R-CNN to the predicted features to produce the instance segmentation of future frames. Experiments show that this approach significantly improves over strong baselines based on optical flow and repurposed instance segmentation architectures.
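
The two-stage idea in the abstract (forecast fixed-size features, then decode a variable number of instances from them) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: `forecast_features` uses plain linear extrapolation in place of the learned predictive model, and `toy_head` uses thresholded connected components in place of the Mask R-CNN detection head. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Grid = List[List[float]]  # toy stand-in for one fixed-size feature map
Instance = Set[Tuple[int, int]]  # one instance = set of (row, col) cells


def forecast_features(past: List[Grid]) -> Grid:
    """Hypothetical forecaster: linear extrapolation from the last two maps.

    Assumption: the paper trains a predictive model; this is a placeholder.
    """
    prev, last = past[-2], past[-1]
    return [[2 * b - a for a, b in zip(row_prev, row_last)]
            for row_prev, row_last in zip(prev, last)]


def toy_head(features: Grid, threshold: float = 0.5) -> List[Instance]:
    """Hypothetical stand-in for a detection head.

    Groups cells above the threshold into 4-connected components and
    returns one component per instance, so the output length varies.
    """
    h, w = len(features), len(features[0])
    seen: Set[Tuple[int, int]] = set()
    instances: List[Instance] = []
    for r in range(h):
        for c in range(w):
            if features[r][c] <= threshold or (r, c) in seen:
                continue
            stack, component = [(r, c)], set()
            seen.add((r, c))
            while stack:  # depth-first flood fill
                y, x = stack.pop()
                component.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and (ny, nx) not in seen and features[ny][nx] > threshold:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            instances.append(component)
    return instances


def predict_future_instances(past: List[Grid]) -> List[Instance]:
    """Forecast the next feature map, then decode instances from it."""
    return toy_head(forecast_features(past))
```

For example, with `past = [[[0.25, 0.0, 0.5]], [[0.5, 0.0, 0.75]]]` the head finds one instance in the last observed map but two in the forecast map, even though the feature maps themselves never change size.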

Original language: English (US)
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Martial Hebert, Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss
Publisher: Springer-Verlag
Pages: 593-608
Number of pages: 16
ISBN (Print): 9783030012397
DOIs: https://doi.org/10.1007/978-3-030-01240-3_36
State: Published - Jan 1 2018
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: Sep 8 2018 to Sep 14 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11213 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 9/8/18 to 9/14/18

Keywords

  • Convolutional neural networks
  • Deep learning
  • Instance segmentation
  • Video prediction

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Luc, P., Couprie, C., LeCun, Y., & Verbeek, J. (2018). Predicting future instance segmentation by forecasting convolutional features. In M. Hebert, V. Ferrari, C. Sminchisescu, & Y. Weiss (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 593-608). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11213 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-030-01240-3_36

@inproceedings{08036afe701243e597dbd07f75103089,
title = "Predicting future instance segmentation by forecasting convolutional features",
abstract = "Anticipating future events is an important prerequisite towards intelligent behavior. Video forecasting has been studied as a proxy task towards this goal. Recent work has shown that to predict semantic segmentation of future frames, forecasting at the semantic level is more effective than forecasting RGB frames and then segmenting these. In this paper we consider the more challenging problem of future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of fixed-sized convolutional features of the Mask R-CNN instance segmentation model. We apply the “detection head” of Mask R-CNN on the predicted features to produce the instance segmentation of future frames. Experiments show that this approach significantly improves over strong baselines based on optical flow and repurposed instance segmentation architectures.",
keywords = "Convolutional neural networks, Deep learning, Instance segmentation, Video prediction",
author = "Pauline Luc and Camille Couprie and Yann LeCun and Jakob Verbeek",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-01240-3_36",
language = "English (US)",
isbn = "9783030012397",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "593--608",
editor = "Martial Hebert and Vittorio Ferrari and Cristian Sminchisescu and Yair Weiss",
booktitle = "Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings",

}

TY - GEN
T1 - Predicting future instance segmentation by forecasting convolutional features
AU - Luc, Pauline
AU - Couprie, Camille
AU - LeCun, Yann
AU - Verbeek, Jakob
PY - 2018/1/1
Y1 - 2018/1/1
AB - Anticipating future events is an important prerequisite towards intelligent behavior. Video forecasting has been studied as a proxy task towards this goal. Recent work has shown that to predict semantic segmentation of future frames, forecasting at the semantic level is more effective than forecasting RGB frames and then segmenting these. In this paper we consider the more challenging problem of future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of fixed-sized convolutional features of the Mask R-CNN instance segmentation model. We apply the “detection head” of Mask R-CNN on the predicted features to produce the instance segmentation of future frames. Experiments show that this approach significantly improves over strong baselines based on optical flow and repurposed instance segmentation architectures.
KW - Convolutional neural networks
KW - Deep learning
KW - Instance segmentation
KW - Video prediction
UR - http://www.scopus.com/inward/record.url?scp=85055085511&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055085511&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01240-3_36
DO - 10.1007/978-3-030-01240-3_36
M3 - Conference contribution
AN - SCOPUS:85055085511
SN - 9783030012397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 593
EP - 608
BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
A2 - Hebert, Martial
A2 - Ferrari, Vittorio
A2 - Sminchisescu, Cristian
A2 - Weiss, Yair
PB - Springer-Verlag
ER -