A practical algorithm for topic modeling with provable guarantees

Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Original language: English (US)
Title of host publication: 30th International Conference on Machine Learning, ICML 2013
Publisher: International Machine Learning Society (IMLS)
Pages: 939-947
Number of pages: 9
Edition: PART 2
State: Published - 2013
Event: 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States
Duration: Jun 16 2013 to Jun 21 2013

Other

Other: 30th International Conference on Machine Learning, ICML 2013
Country: United States
City: Atlanta, GA
Period: 6/16/13 to 6/21/13

Fingerprint

guarantee
model learning
Maximum likelihood
data analysis
learning

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Sociology and Political Science

Cite this

Arora, S., Ge, R., Halpern, Y., Mimno, D., Moitra, A., Sontag, D., ... Zhu, M. (2013). A practical algorithm for topic modeling with provable guarantees. In 30th International Conference on Machine Learning, ICML 2013 (PART 2 ed., pp. 939-947). International Machine Learning Society (IMLS).

@inproceedings{4f52f92be9ec43d29d9194abc049380c,
title = "A practical algorithm for topic modeling with provable guarantees",
abstract = "Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.",
author = "Sanjeev Arora and Rong Ge and Yoni Halpern and David Mimno and Ankur Moitra and David Sontag and Yichen Wu and Michael Zhu",
year = "2013",
language = "English (US)",
pages = "939--947",
booktitle = "30th International Conference on Machine Learning, ICML 2013",
publisher = "International Machine Learning Society (IMLS)",
edition = "PART 2",
}

TY  - GEN
T1  - A practical algorithm for topic modeling with provable guarantees
AU  - Arora, Sanjeev
AU  - Ge, Rong
AU  - Halpern, Yoni
AU  - Mimno, David
AU  - Moitra, Ankur
AU  - Sontag, David
AU  - Wu, Yichen
AU  - Zhu, Michael
PY  - 2013
Y1  - 2013
N2  - Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.
UR  - http://www.scopus.com/inward/record.url?scp=84897550363&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84897550363&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84897550363
SP  - 939
EP  - 947
BT  - 30th International Conference on Machine Learning, ICML 2013
PB  - International Machine Learning Society (IMLS)
ER  -