Variational inference via χ upper bound minimization

Adji B. Dieng, Dustin Tran, Rajesh Ranganath, John Paisley, David M. Blei

Research output: Contribution to journal › Conference article

Abstract

Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions q and finds the closest member to the exact posterior p. Closeness is usually measured via a divergence D(q ‖ p) from q to p. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior variance. In this paper we propose CHIVI, a black-box variational inference algorithm that minimizes Dχ(p ‖ q), the χ-divergence from p to q. CHIVI minimizes an upper bound of the model evidence, which we term the χ upper bound (CUBO). Minimizing the CUBO leads to improved posterior uncertainty, and it can also be used with the classical VI lower bound (ELBO) to provide a sandwich estimate of the model evidence. We study CHIVI on three models: probit regression, Gaussian process classification, and a Cox process model of basketball plays. When compared to expectation propagation and classical VI, CHIVI produces better error rates and more accurate estimates of posterior variance.
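The sandwich estimate mentioned in the abstract can be illustrated with a minimal sketch. Below, a toy conjugate Gaussian model (not one of the paper's experiments; the model and the variational family are assumptions chosen for illustration) is used to form Monte Carlo estimates of the ELBO and of the n = 2 χ upper bound, CUBO₂ = ½ log E_q[w²] with importance weight w = p(x, z)/q(z), so that ELBO ≤ log p(x) ≤ CUBO₂.

```python
import math
import random

# Toy conjugate model (illustrative only):
# prior z ~ N(0, 1), likelihood x | z ~ N(z, 1), observed x = 1.
# Exact evidence: p(x) = N(x; 0, 2); exact posterior: N(0.5, 0.5).
random.seed(0)
X_OBS = 1.0

def log_joint(z):
    """log p(x, z) = log N(z; 0, 1) + log N(x; z, 1)."""
    return (-0.5 * math.log(2 * math.pi) - 0.5 * z * z
            - 0.5 * math.log(2 * math.pi) - 0.5 * (X_OBS - z) ** 2)

def log_q(z, mu, sigma):
    """Log density of the variational Gaussian q = N(mu, sigma^2)."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (z - mu) ** 2 / (2 * sigma ** 2))

def elbo_and_cubo(mu, sigma, n_samples=100_000):
    """Monte Carlo estimates of
       ELBO   = E_q[log w]            (lower bound on log p(x))
       CUBO_2 = (1/2) log E_q[w^2]    (upper bound on log p(x))
    where w = p(x, z) / q(z) and z ~ q."""
    log_w = []
    for _ in range(n_samples):
        z = random.gauss(mu, sigma)
        log_w.append(log_joint(z) - log_q(z, mu, sigma))
    elbo = sum(log_w) / n_samples
    # Log-sum-exp for a numerically stable estimate of log E_q[w^2].
    m = max(log_w)
    log_mean_w2 = 2 * m + math.log(
        sum(math.exp(2 * (lw - m)) for lw in log_w) / n_samples)
    return elbo, 0.5 * log_mean_w2

# Exact log evidence: log N(1; 0, 2).
log_evidence = -0.5 * math.log(2 * math.pi * 2.0) - X_OBS ** 2 / 4.0

# A deliberately overdispersed q: the CUBO requires E_q[w^2] to be finite,
# so q should not be lighter-tailed than the posterior.
elbo, cubo = elbo_and_cubo(mu=0.5, sigma=1.0)
print(f"ELBO {elbo:.3f} <= log p(x) {log_evidence:.3f} <= CUBO {cubo:.3f}")
```

In practice CHIVI optimizes a reparameterized Monte Carlo estimate of the CUBO with stochastic gradients; the sketch above only evaluates the two bounds for a fixed q to show the sandwich property.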

Original language: English (US)
Pages (from-to): 2733-2742
Number of pages: 10
Journal: Advances in Neural Information Processing Systems
Volume: 2017-December
State: Published - Jan 1 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017 - Dec 9 2017

Fingerprint

  • Markov processes
  • Uncertainty

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Dieng, A. B., Tran, D., Ranganath, R., Paisley, J., & Blei, D. M. (2017). Variational inference via χ upper bound minimization. Advances in Neural Information Processing Systems, 2017-December, 2733-2742.

Variational inference via χ upper bound minimization. / Dieng, Adji B.; Tran, Dustin; Ranganath, Rajesh; Paisley, John; Blei, David M.

In: Advances in Neural Information Processing Systems, Vol. 2017-December, 01.01.2017, p. 2733-2742.

Research output: Contribution to journal › Conference article

Dieng, AB, Tran, D, Ranganath, R, Paisley, J & Blei, DM 2017, 'Variational inference via χ upper bound minimization', Advances in Neural Information Processing Systems, vol. 2017-December, pp. 2733-2742.
Dieng AB, Tran D, Ranganath R, Paisley J, Blei DM. Variational inference via χ upper bound minimization. Advances in Neural Information Processing Systems. 2017 Jan 1;2017-December:2733-2742.
Dieng, Adji B. ; Tran, Dustin ; Ranganath, Rajesh ; Paisley, John ; Blei, David M. / Variational inference via χ upper bound minimization. In: Advances in Neural Information Processing Systems. 2017 ; Vol. 2017-December. pp. 2733-2742.
@article{69a5277c1bec42559fedfdd548fac2ff,
title = "Variational inference via χ upper bound minimization",
abstract = "Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions q and finds the closest member to the exact posterior p. Closeness is usually measured via a divergence D(q ‖ p) from q to p. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior variance. In this paper we propose CHIVI, a black-box variational inference algorithm that minimizes Dχ(p ‖ q), the χ-divergence from p to q. CHIVI minimizes an upper bound of the model evidence, which we term the χ upper bound (CUBO). Minimizing the CUBO leads to improved posterior uncertainty, and it can also be used with the classical VI lower bound (ELBO) to provide a sandwich estimate of the model evidence. We study CHIVI on three models: probit regression, Gaussian process classification, and a Cox process model of basketball plays. When compared to expectation propagation and classical VI, CHIVI produces better error rates and more accurate estimates of posterior variance.",
author = "Dieng, {Adji B.} and Dustin Tran and Rajesh Ranganath and John Paisley and Blei, {David M.}",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
volume = "2017-December",
pages = "2733--2742",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",

}

TY - JOUR

T1 - Variational inference via χ upper bound minimization

AU - Dieng, Adji B.

AU - Tran, Dustin

AU - Ranganath, Rajesh

AU - Paisley, John

AU - Blei, David M.

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions q and finds the closest member to the exact posterior p. Closeness is usually measured via a divergence D(q ‖ p) from q to p. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior variance. In this paper we propose CHIVI, a black-box variational inference algorithm that minimizes Dχ(p ‖ q), the χ-divergence from p to q. CHIVI minimizes an upper bound of the model evidence, which we term the χ upper bound (CUBO). Minimizing the CUBO leads to improved posterior uncertainty, and it can also be used with the classical VI lower bound (ELBO) to provide a sandwich estimate of the model evidence. We study CHIVI on three models: probit regression, Gaussian process classification, and a Cox process model of basketball plays. When compared to expectation propagation and classical VI, CHIVI produces better error rates and more accurate estimates of posterior variance.

AB - Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions q and finds the closest member to the exact posterior p. Closeness is usually measured via a divergence D(q ‖ p) from q to p. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior variance. In this paper we propose CHIVI, a black-box variational inference algorithm that minimizes Dχ(p ‖ q), the χ-divergence from p to q. CHIVI minimizes an upper bound of the model evidence, which we term the χ upper bound (CUBO). Minimizing the CUBO leads to improved posterior uncertainty, and it can also be used with the classical VI lower bound (ELBO) to provide a sandwich estimate of the model evidence. We study CHIVI on three models: probit regression, Gaussian process classification, and a Cox process model of basketball plays. When compared to expectation propagation and classical VI, CHIVI produces better error rates and more accurate estimates of posterior variance.

UR - http://www.scopus.com/inward/record.url?scp=85047018952&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85047018952&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85047018952

VL - 2017-December

SP - 2733

EP - 2742

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

SN - 1049-5258

ER -