Abstract
Road scene segmentation is important in computer vision for applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding. In this paper, we use a convolutional neural network-based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm lies in generating training labels by applying an algorithm trained on a general image dataset to classify on-board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off-line) and current (on-line) information are combined to detect road areas in single images. From quantitative and qualitative experiments conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels and provide a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity and yields a relative improvement of 8% compared to the baseline. Finally, the improvement is even greater when acquired and current information from a single image are combined.
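
The idea of fusing color planes so that road areas look as uniform as possible can be illustrated with a small sketch. This is not the authors' method: it assumes a plain weighted sum of three color planes, weights that are non-negative and sum to one, and a brute-force grid search that minimizes the variance of the fused value over road samples. The function names (`fuse`, `search_fusion_weights`) and the pixel values are hypothetical and made up for illustration.

```python
from statistics import pvariance


def fuse(pixel, weights):
    """Weighted sum of one pixel's color-plane values (assumed fusion rule)."""
    return sum(w * c for w, c in zip(weights, pixel))


def search_fusion_weights(road_pixels, steps=20):
    """Grid-search non-negative weights summing to 1 that minimize the
    variance of the fused value over the given road samples."""
    best_w, best_var = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            var = pvariance([fuse(p, w) for p in road_pixels])
            if var < best_var:
                best_w, best_var = w, var
    return best_w, best_var


# Hypothetical RGB samples from a road region (made-up values, not paper data).
road_pixels = [
    (0.42, 0.40, 0.35),
    (0.50, 0.47, 0.36),
    (0.38, 0.37, 0.34),
    (0.55, 0.52, 0.37),
]

weights, variance = search_fusion_weights(road_pixels)
print("weights:", weights)
print("fused variance over road samples:", variance)
```

Because the grid includes the three single-plane weightings, the fused plane found this way is never less uniform over the samples than the best individual plane; a learned fusion as described in the abstract would replace the grid search with a training procedure.
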
Original language | English (US) |
---|---|
Title of host publication | Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings |
Pages | 376-389 |
Number of pages | 14 |
Volume | 7578 LNCS |
Edition | PART 7 |
DOIs | 10.1007/978-3-642-33786-4_28 |
State | Published - 2012 |
Event | 12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy. Duration: Oct 7 2012 → Oct 13 2012 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Number | PART 7 |
Volume | 7578 LNCS |
ISSN (Print) | 03029743 |
ISSN (Electronic) | 16113349 |
Other
Other | 12th European Conference on Computer Vision, ECCV 2012 |
---|---|
Country | Italy |
City | Florence |
Period | 10/7/12 → 10/13/12 |
ASJC Scopus subject areas
- Computer Science(all)
- Theoretical Computer Science
Cite this
Road scene segmentation from a single image. / Alvarez, Jose M.; Gevers, Theo; LeCun, Yann; Lopez, Antonio M.
Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. Vol. 7578 LNCS PART 7. ed. 2012. p. 376-389 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7578 LNCS, No. PART 7).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - Road scene segmentation from a single image
AU - Alvarez, Jose M.
AU - Gevers, Theo
AU - LeCun, Yann
AU - Lopez, Antonio M.
PY - 2012
Y1 - 2012
UR - http://www.scopus.com/inward/record.url?scp=84867871457&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867871457&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33786-4_28
DO - 10.1007/978-3-642-33786-4_28
M3 - Conference contribution
AN - SCOPUS:84867871457
SN - 9783642337857
VL - 7578 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 376
EP - 389
BT - Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings
ER -