End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression

Li Wan, David Eigen, Robert Fergus

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Deformable Parts Models and Convolutional Networks each have achieved notable performance in object detection. Yet these two approaches find their strengths in complementary areas: DPMs are well-versed in object composition, modeling fine-grained spatial relationships between parts; likewise, ConvNets are adept at producing powerful image features, having been discriminatively trained directly on the pixels. In this paper, we propose a new model that combines these two approaches, obtaining the advantages of each. We train this model using a new structured loss function that considers all bounding boxes within an image, rather than isolated object instances. This enables the non-maximal suppression (NMS) operation, previously treated as a separate post-processing stage, to be integrated into the model. This allows for discriminative training of our combined Convnet + DPM + NMS model in end-to-end fashion. We evaluate our system on PASCAL VOC 2007 and 2011 datasets, achieving competitive results on both benchmarks.

Original language: English (US)
Title of host publication: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Publisher: IEEE Computer Society
Pages: 851-859
Number of pages: 9
Volume: 07-12-June-2015
ISBN (Print): 9781467369640
DOI: https://doi.org/10.1109/CVPR.2015.7298686
State: Published - Oct 14 2015
Event: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
Duration: Jun 7 2015 to Jun 12 2015

Other

Other: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Country: United States
City: Boston
Period: 6/7/15 to 6/12/15

Fingerprint

Pixels
Processing
Object detection

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Wan, L., Eigen, D., & Fergus, R. (2015). End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 (Vol. 07-12-June-2015, pp. 851-859). [7298686] IEEE Computer Society. https://doi.org/10.1109/CVPR.2015.7298686

End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression. / Wan, Li; Eigen, David; Fergus, Robert.

IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015. Vol. 07-12-June-2015 IEEE Computer Society, 2015. p. 851-859 7298686.

Wan, L, Eigen, D & Fergus, R 2015, End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression. in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015. vol. 07-12-June-2015, 7298686, IEEE Computer Society, pp. 851-859, IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, United States, 6/7/15. https://doi.org/10.1109/CVPR.2015.7298686
Wan L, Eigen D, Fergus R. End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015. Vol. 07-12-June-2015. IEEE Computer Society. 2015. p. 851-859. 7298686 https://doi.org/10.1109/CVPR.2015.7298686
Wan, Li ; Eigen, David ; Fergus, Robert. / End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression. IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015. Vol. 07-12-June-2015 IEEE Computer Society, 2015. pp. 851-859
@inproceedings{a8f6f0d745454882820d55f8ce332f1a,
title = "End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression",
abstract = "Deformable Parts Models and Convolutional Networks each have achieved notable performance in object detection. Yet these two approaches find their strengths in complementary areas: DPMs are well-versed in object composition, modeling fine-grained spatial relationships between parts; likewise, ConvNets are adept at producing powerful image features, having been discriminatively trained directly on the pixels. In this paper, we propose a new model that combines these two approaches, obtaining the advantages of each. We train this model using a new structured loss function that considers all bounding boxes within an image, rather than isolated object instances. This enables the non-maximal suppression (NMS) operation, previously treated as a separate post-processing stage, to be integrated into the model. This allows for discriminative training of our combined Convnet + DPM + NMS model in end-to-end fashion. We evaluate our system on PASCAL VOC 2007 and 2011 datasets, achieving competitive results on both benchmarks.",
author = "Li Wan and David Eigen and Robert Fergus",
year = "2015",
month = "10",
day = "14",
doi = "10.1109/CVPR.2015.7298686",
language = "English (US)",
isbn = "9781467369640",
volume = "07-12-June-2015",
pages = "851--859",
booktitle = "IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015",
publisher = "IEEE Computer Society",
}

TY  - GEN
T1  - End-to-end integration of a Convolutional Network, Deformable Parts Model and non-maximum suppression
AU  - Wan, Li
AU  - Eigen, David
AU  - Fergus, Robert
PY  - 2015/10/14
Y1  - 2015/10/14
AB  - Deformable Parts Models and Convolutional Networks each have achieved notable performance in object detection. Yet these two approaches find their strengths in complementary areas: DPMs are well-versed in object composition, modeling fine-grained spatial relationships between parts; likewise, ConvNets are adept at producing powerful image features, having been discriminatively trained directly on the pixels. In this paper, we propose a new model that combines these two approaches, obtaining the advantages of each. We train this model using a new structured loss function that considers all bounding boxes within an image, rather than isolated object instances. This enables the non-maximal suppression (NMS) operation, previously treated as a separate post-processing stage, to be integrated into the model. This allows for discriminative training of our combined Convnet + DPM + NMS model in end-to-end fashion. We evaluate our system on PASCAL VOC 2007 and 2011 datasets, achieving competitive results on both benchmarks.
UR  - http://www.scopus.com/inward/record.url?scp=84959203164&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84959203164&partnerID=8YFLogxK
U2  - 10.1109/CVPR.2015.7298686
DO  - 10.1109/CVPR.2015.7298686
M3  - Conference contribution
AN  - SCOPUS:84959203164
SN  - 9781467369640
VL  - 07-12-June-2015
SP  - 851
EP  - 859
BT  - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
PB  - IEEE Computer Society
ER  -