Weakly supervised scale-invariant learning of models for visual recognition

R. Fergus, P. Perona, A. Zisserman

Research output: Contribution to journal › Article

Abstract

We investigate a method for learning object categories in a weakly supervised manner. Given a set of images known to contain the target category from a similar viewpoint, learning is translation- and scale-invariant, does not require alignment or correspondence between the training images, and is robust to clutter and occlusion. Category models are probabilistic constellations of parts, and their parameters are estimated by maximizing the likelihood of the training data. The appearance of the parts, as well as their mutual position, relative scale, and probability of detection, are explicitly described in the model. Recognition takes place in two stages. First, a feature-finder identifies promising locations for the model's parts. Second, the category model is used to compare the likelihood that the observed features are generated by the category model with the likelihood that they are generated by background clutter. The flexible nature of the model is demonstrated by results over six diverse object categories, including geometrically constrained categories (e.g. faces, cars) and flexible objects (such as animals).
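The second recognition stage described in the abstract, comparing a category likelihood against a background likelihood, can be illustrated with a toy log-likelihood-ratio test. The sketch below is only an illustration under strong assumptions: it treats each feature as an independent 1-D Gaussian, which is far simpler than the constellation model in the paper, and every name in it (`gaussian_log_pdf`, `log_likelihood_ratio`, `contains_category`, the parameter values) is hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_likelihood_ratio(features, obj_params, bg_params):
    """Sum of log p(x | category) - log p(x | background) over all features.

    Assumption for this sketch: features are independent 1-D Gaussians.
    Each entry of obj_params / bg_params is a (mean, variance) pair.
    """
    total = 0.0
    for x, (mu_o, var_o), (mu_b, var_b) in zip(features, obj_params, bg_params):
        total += gaussian_log_pdf(x, mu_o, var_o) - gaussian_log_pdf(x, mu_b, var_b)
    return total


def contains_category(features, obj_params, bg_params, threshold=0.0):
    """Decide 'category present' when the log ratio exceeds the threshold."""
    return log_likelihood_ratio(features, obj_params, bg_params) > threshold


# Hypothetical parameters: two features, unit variance everywhere.
obj_params = [(1.0, 1.0), (2.0, 1.0)]
bg_params = [(0.0, 1.0), (0.0, 1.0)]

llr_object_like = log_likelihood_ratio([1.0, 2.0], obj_params, bg_params)
llr_clutter_like = log_likelihood_ratio([0.0, 0.0], obj_params, bg_params)
```

Features that sit near the category means give a positive log ratio, and features that sit near the background means give a negative one. The threshold trades off false positives against missed detections.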

Original language: English (US)
Pages (from-to): 273-303
Number of pages: 31
Journal: International Journal of Computer Vision
Volume: 71
Issue number: 3
DOI: 10.1007/s11263-006-8707-x
State: Published - Mar 2007

Keywords

  • Constellation model
  • Object recognition
  • Parts and structure model
  • Semi-supervised learning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition

Cite this

Weakly supervised scale-invariant learning of models for visual recognition. / Fergus, R.; Perona, P.; Zisserman, A.

In: International Journal of Computer Vision, Vol. 71, No. 3, 03.2007, p. 273-303.

@article{0afa225e409e4d449a7d58d4c7aa6216,
title = "Weakly supervised scale-invariant learning of models for visual recognition",
abstract = "We investigate a method for learning object categories in a weakly supervised manner. Given a set of images known to contain the target category from a similar viewpoint, learning is translation- and scale-invariant, does not require alignment or correspondence between the training images, and is robust to clutter and occlusion. Category models are probabilistic constellations of parts, and their parameters are estimated by maximizing the likelihood of the training data. The appearance of the parts, as well as their mutual position, relative scale, and probability of detection, are explicitly described in the model. Recognition takes place in two stages. First, a feature-finder identifies promising locations for the model's parts. Second, the category model is used to compare the likelihood that the observed features are generated by the category model with the likelihood that they are generated by background clutter. The flexible nature of the model is demonstrated by results over six diverse object categories, including geometrically constrained categories (e.g. faces, cars) and flexible objects (such as animals).",
keywords = "Constellation model, Object recognition, Parts and structure model, Semi-supervised learning",
author = "R. Fergus and P. Perona and A. Zisserman",
year = "2007",
month = "3",
doi = "10.1007/s11263-006-8707-x",
language = "English (US)",
volume = "71",
pages = "273--303",
journal = "International Journal of Computer Vision",
issn = "0920-5691",
publisher = "Springer Netherlands",
number = "3",

}

TY - JOUR

T1 - Weakly supervised scale-invariant learning of models for visual recognition

AU - Fergus, R.

AU - Perona, P.

AU - Zisserman, A.

PY - 2007/3

Y1 - 2007/3

N2 - We investigate a method for learning object categories in a weakly supervised manner. Given a set of images known to contain the target category from a similar viewpoint, learning is translation- and scale-invariant, does not require alignment or correspondence between the training images, and is robust to clutter and occlusion. Category models are probabilistic constellations of parts, and their parameters are estimated by maximizing the likelihood of the training data. The appearance of the parts, as well as their mutual position, relative scale, and probability of detection, are explicitly described in the model. Recognition takes place in two stages. First, a feature-finder identifies promising locations for the model's parts. Second, the category model is used to compare the likelihood that the observed features are generated by the category model with the likelihood that they are generated by background clutter. The flexible nature of the model is demonstrated by results over six diverse object categories, including geometrically constrained categories (e.g. faces, cars) and flexible objects (such as animals).

KW - Constellation model

KW - Object recognition

KW - Parts and structure model

KW - Semi-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=33750397657&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33750397657&partnerID=8YFLogxK

U2 - 10.1007/s11263-006-8707-x

DO - 10.1007/s11263-006-8707-x

M3 - Article

VL - 71

SP - 273

EP - 303

JO - International Journal of Computer Vision

JF - International Journal of Computer Vision

SN - 0920-5691

IS - 3

ER -