Complexity of inference in Latent Dirichlet Allocation

David Sontag, Daniel M. Roy

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

We consider the computational complexity of probabilistic inference in Latent Dirichlet Allocation (LDA). First, we study the problem of finding the maximum a posteriori (MAP) assignment of topics to words, where the document's topic distribution is integrated out. We show that, when the effective number of topics per document is small, exact inference takes polynomial time. In contrast, we show that, when a document has a large number of topics, finding the MAP assignment of topics to words in LDA is NP-hard. Next, we consider the problem of finding the MAP topic distribution for a document, where the topic-word assignments are integrated out. We show that this problem is also NP-hard. Finally, we briefly discuss the problem of sampling from the posterior, showing that this is NP-hard in one restricted setting, but leaving open the general question.

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
State: Published - 2011
Event: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 - Granada, Spain
Duration: Dec 12 2011 to Dec 14 2011

Other

Other: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
Country: Spain
City: Granada
Period: 12/12/11 to 12/14/11

Fingerprint

Computational complexity
Polynomials
Sampling

ASJC Scopus subject areas

  • Information Systems

Cite this

Sontag, D., & Roy, D. M. (2011). Complexity of inference in Latent Dirichlet Allocation. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

@inproceedings{6cf1da2cff524515bedea678c6dada43,
title = "Complexity of inference in Latent Dirichlet Allocation",
abstract = "We consider the computational complexity of probabilistic inference in Latent Dirichlet Allocation (LDA). First, we study the problem of finding the maximum a posteriori (MAP) assignment of topics to words, where the document's topic distribution is integrated out. We show that, when the effective number of topics per document is small, exact inference takes polynomial time. In contrast, we show that, when a document has a large number of topics, finding the MAP assignment of topics to words in LDA is NP-hard. Next, we consider the problem of finding the MAP topic distribution for a document, where the topic-word assignments are integrated out. We show that this problem is also NP-hard. Finally, we briefly discuss the problem of sampling from the posterior, showing that this is NP-hard in one restricted setting, but leaving open the general question.",
author = "Sontag, David and Roy, {Daniel M.}",
year = "2011",
language = "English (US)",
isbn = "9781618395993",
booktitle = "Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011",
}

TY - GEN
T1 - Complexity of inference in Latent Dirichlet Allocation
AU - Sontag, David
AU - Roy, Daniel M.
PY - 2011
Y1 - 2011
AB - We consider the computational complexity of probabilistic inference in Latent Dirichlet Allocation (LDA). First, we study the problem of finding the maximum a posteriori (MAP) assignment of topics to words, where the document's topic distribution is integrated out. We show that, when the effective number of topics per document is small, exact inference takes polynomial time. In contrast, we show that, when a document has a large number of topics, finding the MAP assignment of topics to words in LDA is NP-hard. Next, we consider the problem of finding the MAP topic distribution for a document, where the topic-word assignments are integrated out. We show that this problem is also NP-hard. Finally, we briefly discuss the problem of sampling from the posterior, showing that this is NP-hard in one restricted setting, but leaving open the general question.

UR - http://www.scopus.com/inward/record.url?scp=84860610265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84860610265&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781618395993
BT - Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
ER -