Fast Kronecker inference in Gaussian processes with non-Gaussian likelihoods

Seth Flaxman, Andrew Gordon Wilson, Daniel Neill, Hannes Nickisch, Alexander J. Smola

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Gaussian processes (GPs) are a flexible class of methods with state-of-the-art performance on spatial statistics applications. However, GPs require O(n^3) computations and O(n^2) storage, and popular GP kernels are typically limited to smoothing and interpolation. To address these difficulties, Kronecker methods have been used to exploit structure in the GP covariance matrix for scalability, while allowing for expressive kernel learning (Wilson et al., 2014). However, fast Kronecker methods have been confined to Gaussian likelihoods. We propose new scalable Kronecker methods for Gaussian processes with non-Gaussian likelihoods, using a Laplace approximation which involves linear conjugate gradients for inference, and a lower bound on the GP marginal likelihood for kernel learning. Our approach has near-linear scaling, requiring O(Dn^((D+1)/D)) operations and O(Dn^(2/D)) storage, for n training data points on a dense D > 1 dimensional grid. Moreover, we introduce a log Gaussian Cox process, with highly expressive kernels, for modelling spatiotemporal count processes, and apply it to a point pattern (n = 233,088) of a decade of crime events in Chicago. Using our model, we discover spatially varying multiscale seasonal trends and produce highly accurate long-range local area forecasts.
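
The abstract's computational claims rest on two ingredients: fast matrix-vector products with a Kronecker-structured covariance K = K1 ⊗ ... ⊗ KD, and a Laplace approximation whose Newton steps are carried out with linear conjugate gradients instead of a Cholesky factorization. The sketch below is not the authors' implementation; it is a minimal NumPy/SciPy illustration of those two ideas for a Poisson (log Gaussian Cox process style) likelihood on a toy 2-D grid. The helper names (rbf_kernel, kron_mvp), the 20 x 20 grid, and the single Newton step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg


def rbf_kernel(x, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for 1-D inputs x."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def kron_mvp(K1, K2, v):
    """Compute (K1 kron K2) @ v without forming the full n x n matrix.

    For n = n1 * n2 grid points this costs O(n1 * n2 * (n1 + n2)),
    which is the source of the near-linear scaling in the abstract.
    """
    n1, n2 = K1.shape[0], K2.shape[0]
    V = v.reshape(n1, n2)          # row-major reshape matches the kron ordering
    return (K1 @ V @ K2.T).ravel()


# Toy 20 x 20 spatial grid (n = 400) with Poisson-distributed counts y.
rng = np.random.default_rng(0)
x1 = np.linspace(0, 1, 20)
x2 = np.linspace(0, 1, 20)
K1 = rbf_kernel(x1, lengthscale=0.2)
K2 = rbf_kernel(x2, lengthscale=0.2)
n = K1.shape[0] * K2.shape[0]
y = rng.poisson(lam=1.0, size=n)

# One Newton step of the Laplace approximation to p(f | y) for the Poisson
# likelihood log p(y_i | f_i) = y_i f_i - exp(f_i) - log(y_i!), where exp(f_i)
# is the intensity of the log Gaussian Cox process.
f = np.zeros(n)              # current estimate of the latent field
mu = np.exp(f)               # intensity at the current estimate
grad = y - mu                # gradient of log p(y | f)
W = mu                       # negative Hessian of log p(y | f) (diagonal)
sqrtW = np.sqrt(W)

# Matrix-free symmetric positive definite operator B = I + W^{1/2} K W^{1/2},
# applied using only Kronecker matrix-vector products.
B = LinearOperator((n, n), matvec=lambda v: v + sqrtW * kron_mvp(K1, K2, sqrtW * v))

# Newton update in the parametrization f = K a (standard GP Laplace scheme),
# with the usual O(n^3) Cholesky solve replaced by conjugate gradients.
b = W * f + grad
u, info = cg(B, sqrtW * kron_mvp(K1, K2, b))
a = b - sqrtW * u
f = kron_mvp(K1, K2, a)      # updated approximation to the posterior mode

print("CG converged:", info == 0, "| mean fitted intensity:", np.exp(f).mean())
```

The point of the sketch is that kron_mvp only ever touches the small per-dimension factors K1 and K2, so each conjugate-gradient iteration costs roughly O(Dn^((D+1)/D)) rather than O(n^2), which is the scaling quoted in the abstract.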

Original language: English (US)
Title of host publication: 32nd International Conference on Machine Learning, ICML 2015
Editors: Francis Bach, David Blei
Publisher: International Machine Learning Society (IMLS)
Pages: 607-616
Number of pages: 10
Volume: 1
ISBN (Electronic): 9781510810587
State: Published - Jan 1 2015
Event: 32nd International Conference on Machine Learning, ICML 2015 - Lille, France
Duration: Jul 6 2015 - Jul 11 2015

Other

Other: 32nd International Conference on Machine Learning, ICML 2015
Country: France
City: Lille
Period: 7/6/15 - 7/11/15

Fingerprint

  • Crime
  • Covariance matrix
  • Scalability
  • Interpolation
  • Statistics

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Science Applications

Cite this

Flaxman, S., Wilson, A. G., Neill, D., Nickisch, H., & Smola, A. J. (2015). Fast Kronecker inference in Gaussian processes with non-Gaussian likelihoods. In F. Bach & D. Blei (Eds.), 32nd International Conference on Machine Learning, ICML 2015 (Vol. 1, pp. 607-616). International Machine Learning Society (IMLS).
