Semantic road segmentation via multi-scale ensembles of learned features

Jose M. Alvarez, Yann LeCun, Theo Gevers, Antonio M. Lopez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Semantic segmentation is the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem that models the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HoG. These models are trained to maximize the likelihood of the correct classification given a training set. However, these approaches rely on hand-designed features (e.g., texture, SIFT or HoG) and incur a high computational cost during inference. Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions. The diversity among these features is then exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method for semantic road scene segmentation in still images. The algorithm outperforms appearance-based methods and performs comparably to state-of-the-art methods that use other sources of information such as depth, motion or stereo.
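
The combination step described in the abstract can be illustrated with a minimal sketch. It assumes each scale has already produced a per-pixel class-probability map; the abstract does not say how the weights are obtained or which form the unary potential takes, so the weight values, the negative log-probability form, and the function names (combine_scales, unary_potentials, label_map) are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math

def combine_scales(prob_maps, weights):
    """Weighted linear combination of per-scale class-probability maps.

    prob_maps: list of K maps; map[row][col] is a list of C class probabilities.
    weights:   list of K non-negative weights (assumed values, normalized here).
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    n_classes = len(prob_maps[0][0][0])
    return [
        [
            [
                sum(norm[k] * prob_maps[k][r][c][j] for k in range(len(prob_maps)))
                for j in range(n_classes)
            ]
            for c in range(width)
        ]
        for r in range(height)
    ]

def unary_potentials(combined, eps=1e-12):
    """Negative log-probability per pixel and class (an assumed unary form)."""
    return [[[-math.log(max(p, eps)) for p in pixel] for pixel in row]
            for row in combined]

def label_map(combined):
    """Most probable class per pixel, ignoring any pairwise term."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in combined]

# Toy 1x2 image, two classes (0 = road, 1 = other), two scales.
coarse = [[[0.9, 0.1], [0.4, 0.6]]]
fine = [[[0.7, 0.3], [0.2, 0.8]]]
combined = combine_scales([coarse, fine], weights=[1.0, 3.0])
labels = label_map(combined)
unary = unary_potentials(combined)
```

In this toy case the fine scale carries three times the weight of the coarse one, so the second pixel is labeled as class 1 even though the two scales disagree about how confident to be. In a full conditional random field these unary terms would be combined with pairwise terms, which the sketch leaves out.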

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings
Pages: 586-595
Number of pages: 10
Volume: 7584 LNCS
Edition: PART 2
DOIs: https://doi.org/10.1007/978-3-642-33868-7_58
State: Published - 2012
Event: 12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy
Duration: Oct 7 2012 to Oct 13 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7584 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th European Conference on Computer Vision, ECCV 2012
Country: Italy
City: Florence
Period: 10/7/12 to 10/13/12

Fingerprint

Labels
Ensemble
Segmentation
Scale Invariant Feature Transform
Semantics
Labeling
Conditional Random Fields
Railroad cars
Texture Feature
Textures
Pixels
Diversification
Local Features
Unary
Color
Neural networks
Random Field
Linear Combination
Likelihood
Pixel

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Alvarez, J. M., LeCun, Y., Gevers, T., & Lopez, A. M. (2012). Semantic road segmentation via multi-scale ensembles of learned features. In Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings (PART 2 ed., Vol. 7584 LNCS, pp. 586-595). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7584 LNCS, No. PART 2). https://doi.org/10.1007/978-3-642-33868-7_58

Semantic road segmentation via multi-scale ensembles of learned features. / Alvarez, Jose M.; LeCun, Yann; Gevers, Theo; Lopez, Antonio M.

Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings. Vol. 7584 LNCS PART 2. ed. 2012. p. 586-595 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7584 LNCS, No. PART 2).

Alvarez, JM, LeCun, Y, Gevers, T & Lopez, AM 2012, Semantic road segmentation via multi-scale ensembles of learned features. in Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings. PART 2 edn, vol. 7584 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 2, vol. 7584 LNCS, pp. 586-595, 12th European Conference on Computer Vision, ECCV 2012, Florence, Italy, 10/7/12. https://doi.org/10.1007/978-3-642-33868-7_58
Alvarez JM, LeCun Y, Gevers T, Lopez AM. Semantic road segmentation via multi-scale ensembles of learned features. In Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings. PART 2 ed. Vol. 7584 LNCS. 2012. p. 586-595. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 2). https://doi.org/10.1007/978-3-642-33868-7_58
Alvarez, Jose M. ; LeCun, Yann ; Gevers, Theo ; Lopez, Antonio M. / Semantic road segmentation via multi-scale ensembles of learned features. Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings. Vol. 7584 LNCS PART 2. ed. 2012. pp. 586-595 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 2).
@inproceedings{285bf3b429ce4a6389dc381a2b5bc0b4,
title = "Semantic road segmentation via multi-scale ensembles of learned features",
abstract = "Semantic segmentation refers to the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem modeling the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HoG. These models are trained to maximize the likelihood of the correct classification given a training set. However, these approaches rely on hand-designed features (e.g., texture, SIFT or HoG) and a higher computational time required in the inference process. Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions. Then, diversification between these features is exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method to perform semantic road scene segmentation in still images. The algorithm outperforms appearance based methods and its performance is similar compared to state-of-the-art methods using other sources of information such as depth, motion or stereo.",
author = "Alvarez, {Jose M.} and Yann LeCun and Theo Gevers and Lopez, {Antonio M.}",
year = "2012",
doi = "10.1007/978-3-642-33868-7_58",
language = "English (US)",
isbn = "9783642338670",
volume = "7584 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 2",
pages = "586--595",
booktitle = "Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings",
edition = "PART 2",

}

TY - GEN

T1 - Semantic road segmentation via multi-scale ensembles of learned features

AU - Alvarez, Jose M.

AU - LeCun, Yann

AU - Gevers, Theo

AU - Lopez, Antonio M.

PY - 2012

Y1 - 2012

N2 - Semantic segmentation refers to the process of assigning an object label (e.g., building, road, sidewalk, car, pedestrian) to every pixel in an image. Common approaches formulate the task as a random field labeling problem modeling the interactions between labels by combining local and contextual features such as color, depth, edges, SIFT or HoG. These models are trained to maximize the likelihood of the correct classification given a training set. However, these approaches rely on hand-designed features (e.g., texture, SIFT or HoG) and a higher computational time required in the inference process. Therefore, in this paper, we focus on estimating the unary potentials of a conditional random field via ensembles of learned features. We propose an algorithm based on convolutional neural networks to learn local features from training data at different scales and resolutions. Then, diversification between these features is exploited using a weighted linear combination. Experiments on a publicly available database show the effectiveness of the proposed method to perform semantic road scene segmentation in still images. The algorithm outperforms appearance based methods and its performance is similar compared to state-of-the-art methods using other sources of information such as depth, motion or stereo.

UR - http://www.scopus.com/inward/record.url?scp=84867692186&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867692186&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-33868-7_58

DO - 10.1007/978-3-642-33868-7_58

M3 - Conference contribution

AN - SCOPUS:84867692186

SN - 9783642338670

VL - 7584 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 586

EP - 595

BT - Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings

ER -