Gaussian processes for independence tests with non-iid data in causal inference

Seth R. Flaxman, Daniel Neill, Alexander J. Smola

Research output: Contribution to journal › Article

Abstract

In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, "nearby" observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.
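
The two-step procedure described in the abstract (regress each variable on the conditioning or spatial/temporal variable with a Gaussian process, then apply HSIC to the residuals) can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn's GaussianProcessRegressor as the GP stand-in and a hand-rolled HSIC permutation test; the kernels, bandwidth heuristic, toy spatial data, and variable names are all illustrative choices.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel


def hsic_statistic(x, y, sigma_x=None, sigma_y=None):
    """Biased empirical HSIC between two 1-D samples, using Gaussian kernels."""
    x, y = x.reshape(-1, 1), y.reshape(-1, 1)
    n = x.shape[0]

    def gram(a, sigma):
        d2 = (a - a.T) ** 2                    # squared pairwise distances
        if sigma is None:                      # median-distance bandwidth heuristic (an assumption)
            sigma = np.sqrt(0.5 * np.median(d2[d2 > 0]))
        return np.exp(-d2 / (2 * sigma ** 2))

    K, L = gram(x, sigma_x), gram(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def hsic_permutation_test(x, y, n_perm=500, seed=0):
    """Approximate p-value for H0: x independent of y, by permuting y."""
    rng = np.random.default_rng(seed)
    stat = hsic_statistic(x, y)
    null = [hsic_statistic(x, rng.permutation(y)) for _ in range(n_perm)]
    return stat, (1 + sum(s >= stat for s in null)) / (1 + n_perm)


# Toy spatial example: x and y are independent given location s, but share a spatial trend,
# so a naive (iid) dependence test between them is misleading.
rng = np.random.default_rng(1)
s = np.sort(rng.uniform(0, 10, 200)).reshape(-1, 1)   # 1-D "locations"
trend = np.sin(s).ravel()
x = trend + 0.3 * rng.standard_normal(200)
y = trend + 0.3 * rng.standard_normal(200)

# Step 1: regress each variable on the spatial coordinate with a GP; keep the residuals.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
res_x = x - GaussianProcessRegressor(kernel=kernel).fit(s, x).predict(s)
res_y = y - GaussianProcessRegressor(kernel=kernel).fit(s, y).predict(s)

# Step 2: test the residuals for independence with HSIC.
print("raw x, y:      HSIC = %.4f, p = %.3f" % hsic_permutation_test(x, y))
print("GP residuals:  HSIC = %.4f, p = %.3f" % hsic_permutation_test(res_x, res_y))

On this toy data, the raw variables appear strongly dependent because of the shared spatial trend, while the GP residuals do not, which is the kind of correction the paper's framework is designed to provide.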

Original language: English (US)
Article number: 22
Journal: ACM Transactions on Intelligent Systems and Technology
Volume: 7
Issue number: 2
DOI: 10.1145/2806892
State: Published - Dec 1 2015

Keywords

  • Causal inference
  • causal structure learning
  • Gaussian process
  • Reproducing kernel Hilbert space

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Artificial Intelligence

Cite this

Gaussian processes for independence tests with non-iid data in causal inference. / Flaxman, Seth R.; Neill, Daniel; Smola, Alexander J.

In: ACM Transactions on Intelligent Systems and Technology, Vol. 7, No. 2, 22, 01.12.2015.

Research output: Contribution to journal › Article

@article{80af3b21d7fd400997f6fab0f9a5314f,
title = "Gaussian processes for independence tests with non-iid data in causal inference",
abstract = "In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, {"}nearby{"} observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.",
keywords = "Causal inference, causal structure learning, Gaussian process, Reproducing kernel Hilbert space",
author = "Flaxman, {Seth R.} and Daniel Neill and Smola, {Alexander J.}",
year = "2015",
month = "12",
day = "1",
doi = "10.1145/2806892",
language = "English (US)",
volume = "7",
journal = "ACM Transactions on Intelligent Systems and Technology",
issn = "2157-6904",
publisher = "Association for Computing Machinery (ACM)",
number = "2",

}

TY - JOUR

T1 - Gaussian processes for independence tests with non-iid data in causal inference

AU - Flaxman, Seth R.

AU - Neill, Daniel

AU - Smola, Alexander J.

PY - 2015/12/1

Y1 - 2015/12/1

N2 - In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, "nearby" observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.

AB - In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, "nearby" observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.

KW - Causal inference

KW - causal structure learning

KW - Gaussian process

KW - Reproducing kernel Hilbert space

UR - http://www.scopus.com/inward/record.url?scp=84952932167&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84952932167&partnerID=8YFLogxK

U2 - 10.1145/2806892

DO - 10.1145/2806892

M3 - Article

AN - SCOPUS:84952932167

VL - 7

JO - ACM Transactions on Intelligent Systems and Technology

JF - ACM Transactions on Intelligent Systems and Technology

SN - 2157-6904

IS - 2

M1 - 22

ER -