Discovering the hidden structure of house prices with a non-parametric latent manifold model

Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In many regression problems, the variable to be predicted depends not only on a sample-specific feature vector, but also on an unknown (latent) manifold that must satisfy known constraints. An example is house prices, which depend on the characteristics of the house, and on the desirability of the neighborhood, which is not directly measurable. The proposed method comprises two trainable components. The first one is a parametric model that predicts the "intrinsic" price of the house from its description. The second one is a smooth, non-parametric model of the latent "desirability" manifold. The predicted price of a house is the product of its intrinsic price and desirability. The two components are trained simultaneously using a deterministic form of the EM algorithm. The model was trained on a large dataset of houses from Los Angeles county. It produces better predictions than pure parametric and non-parametric models. It also produces useful estimates of the desirability surface at each location.
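The alternating training described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give functional forms, so the sketch assumes a log-linear intrinsic-price model, a Gaussian kernel smoother over house locations as the non-parametric desirability model, and a simple alternating fit standing in for the deterministic form of EM. The function names (`kernel_smooth`, `fit_alternating`), the bandwidth, and the synthetic data are all hypothetical.

```python
import numpy as np


def kernel_smooth(coords, values, bandwidth):
    """Gaussian-kernel (Nadaraya-Watson) average of `values` at every location."""
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return (weights @ values) / weights.sum(axis=1)


def fit_alternating(features, coords, log_price, bandwidth=0.1, n_iters=20):
    """Alternate between the parametric and the non-parametric component."""
    design = np.column_stack([np.ones(len(log_price)), features])
    log_desirability = np.zeros(len(log_price))
    for _ in range(n_iters):
        # Step 1: with desirability held fixed, refit the parametric model
        # (assumed log-linear here) to the price with desirability removed.
        coef, *_ = np.linalg.lstsq(design, log_price - log_desirability, rcond=None)
        # Step 2: with the parametric model held fixed, smooth what it leaves
        # unexplained over space to update the desirability estimates.
        log_desirability = kernel_smooth(coords, log_price - design @ coef, bandwidth)
    return coef, log_desirability


# Synthetic stand-in data (assumption: not the Los Angeles County dataset).
rng = np.random.default_rng(0)
n = 300
features = rng.normal(size=(n, 2))   # house characteristics
coords = rng.uniform(size=(n, 2))    # house locations in the unit square
true_log_desirability = (
    0.5 * np.sin(2 * np.pi * coords[:, 0]) * np.cos(2 * np.pi * coords[:, 1])
)
log_price = (
    1.0
    + features @ np.array([0.5, -0.3])
    + true_log_desirability
    + rng.normal(scale=0.05, size=n)
)

coef, log_desirability = fit_alternating(features, coords, log_price)
design = np.column_stack([np.ones(n), features])

# Predicted price is the product of intrinsic price and desirability.
predicted_price = np.exp(design @ coef) * np.exp(log_desirability)

# In-sample comparison against a purely parametric (linear) fit.
ols_coef, *_ = np.linalg.lstsq(design, log_price, rcond=None)
rmse_linear = np.sqrt(np.mean((log_price - design @ ols_coef) ** 2))
rmse_combined = np.sqrt(np.mean((log_price - np.log(predicted_price)) ** 2))
print(rmse_linear, rmse_combined)
```

Working in log space turns the product of intrinsic price and desirability into a sum, which is why each step reduces to an ordinary least-squares fit or a kernel average. The comparison is in-sample on synthetic data and says nothing about the results reported in the paper.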

Original language: English (US)
Title of host publication: KDD-2007: Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 173-182
Number of pages: 10
DOIs: https://doi.org/10.1145/1281192.1281214
State: Published - 2007
Event: KDD-2007: 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - San Jose, CA, United States
Duration: Aug 12 2007 to Aug 15 2007

Other

Other: KDD-2007: 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: San Jose, CA
Period: 8/12/07 to 8/15/07

Keywords

  • Energy-based models
  • Expectation maximization
  • Latent manifold models
  • Structured prediction

ASJC Scopus subject areas

  • Information Systems

Cite this

Chopra, S., Thampy, T., Leahy, J., Caplin, A., & LeCun, Y. (2007). Discovering the hidden structure of house prices with a non-parametric latent manifold model. In KDD-2007: Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 173-182) https://doi.org/10.1145/1281192.1281214

@inproceedings{1b967c0192f34dcab5ff4b00346c4181,
title = "Discovering the hidden structure of house prices with a non-parametric latent manifold model",
abstract = "In many regression problems, the variable to be predicted depends not only on a sample-specific feature vector, but also on an unknown (latent) manifold that must satisfy known constraints. An example is house prices, which depend on the characteristics of the house, and on the desirability of the neighborhood, which is not directly measurable. The proposed method comprises two trainable components. The first one is a parametric model that predicts the {"}intrinsic{"} price of the house from its description. The second one is a smooth, non-parametric model of the latent {"}desirability{"} manifold. The predicted price of a house is the product of its intrinsic price and desirability. The two components are trained simultaneously using a deterministic form of the EM algorithm. The model was trained on a large dataset of houses from Los Angeles county. It produces better predictions than pure parametric and non-parametric models. It also produces useful estimates of the desirability surface at each location.",
keywords = "Energy-based models, Expectation maximization, Latent manifold models, Structured prediction",
author = "Sumit Chopra and Trivikraman Thampy and John Leahy and Andrew Caplin and Yann LeCun",
year = "2007",
doi = "10.1145/1281192.1281214",
language = "English (US)",
isbn = "1595936092",
pages = "173--182",
booktitle = "KDD-2007: Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",

}

TY - GEN

T1 - Discovering the hidden structure of house prices with a non-parametric latent manifold model

AU - Chopra, Sumit

AU - Thampy, Trivikraman

AU - Leahy, John

AU - Caplin, Andrew

AU - LeCun, Yann

PY - 2007

Y1 - 2007

N2 - In many regression problems, the variable to be predicted depends not only on a sample-specific feature vector, but also on an unknown (latent) manifold that must satisfy known constraints. An example is house prices, which depend on the characteristics of the house, and on the desirability of the neighborhood, which is not directly measurable. The proposed method comprises two trainable components. The first one is a parametric model that predicts the "intrinsic" price of the house from its description. The second one is a smooth, non-parametric model of the latent "desirability" manifold. The predicted price of a house is the product of its intrinsic price and desirability. The two components are trained simultaneously using a deterministic form of the EM algorithm. The model was trained on a large dataset of houses from Los Angeles county. It produces better predictions than pure parametric and non-parametric models. It also produces useful estimates of the desirability surface at each location.

KW - Energy-based models

KW - Expectation maximization

KW - Latent manifold models

KW - Structured prediction

UR - http://www.scopus.com/inward/record.url?scp=36849089102&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36849089102&partnerID=8YFLogxK

U2 - 10.1145/1281192.1281214

DO - 10.1145/1281192.1281214

M3 - Conference contribution

AN - SCOPUS:36849089102

SN - 1595936092

SN - 9781595936097

SP - 173

EP - 182

BT - KDD-2007: Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

ER -