Multiple testing of causal hypotheses

Samantha Kleinberg, Bud Mishra

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

A primary problem in causal inference is the following: From a set of time course data, such as that generated by gene expression microarrays, is it possible to infer all significant causal relationships between the elements described by this data? In prior work (Kleinberg and Mishra, 2009), a framework was proposed that combines notions of causality from philosophy with algorithmic approaches built on model checking and statistical techniques for significance testing. The causal relationships can then be described in terms of temporal logic formulæ, thus reframing the problem in terms of model checking. The logic used, PCTL, allows description of both the time between cause and effect and the probability of this relationship being observed. Borrowing from philosophy, prima facie causes are defined in terms of probability raising, and whether a causal relationship is significant is then determined by computing the average difference a prima facie cause makes to the occurrence of its effect, given each of the other prima facie causes of that effect. However, this method faces many of the issues confronted in statistical theories of hypothesis testing: given these causal formulæ with their associated probabilities and our average computed differences, how do we decide which are 'significant' without choosing an arbitrary threshold? To address this problem rigorously, the chapter uses the concepts of multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control. In particular, the chapter applies the empirical Bayesian formulation proposed by Efron (2004). This method uses an empirical rather than theoretical null, which has been shown to be better equipped for cases where the test statistics are dependent, as may be true in the case of complex causal structures. The general approach may be used with many of the traditional philosophical theories where thresholds for significance must be identified.
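The two computational steps named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation and rests on several simplifying assumptions: observations are plain binary records with no PCTL time windows, probabilities are estimated by counting, the empirical null is approximated by a normal distribution fitted with the median and the median absolute deviation (a simple stand-in for Efron's fitting procedure), and the null proportion is taken to be 1. The names `cond_prob`, `eps_avg`, and `local_fdr` are hypothetical.

```python
from statistics import NormalDist, median


def cond_prob(rows, effect, present, absent=()):
    """Estimate P(effect | all `present` true, all `absent` false) by counting."""
    matching = [r for r in rows
                if all(r[v] for v in present) and not any(r[v] for v in absent)]
    if not matching:
        return None  # the conditioning event was never observed
    return sum(1 for r in matching if r[effect]) / len(matching)


def eps_avg(rows, cause, effect, other_causes):
    """Average of P(e | c, x) - P(e | not c, x) over the other prima facie causes x."""
    diffs = []
    for x in other_causes:
        with_c = cond_prob(rows, effect, (cause, x))
        without_c = cond_prob(rows, effect, (x,), (cause,))
        if with_c is not None and without_c is not None:
            diffs.append(with_c - without_c)
    return sum(diffs) / len(diffs) if diffs else None


def local_fdr(z_values, bandwidth=0.5):
    """Rough local false discovery rate f0(z) / f(z), capped at 1.

    f0 is a normal null centred on the median with scale 1.4826 * MAD
    (assumes the MAD is non-zero); f is a Gaussian kernel density estimate.
    """
    z_values = list(z_values)
    center = median(z_values)
    mad = median([abs(z - center) for z in z_values])
    null = NormalDist(center, 1.4826 * mad)
    kernel = NormalDist(0.0, bandwidth)

    def mixture(z):
        return sum(kernel.pdf(z - zi) for zi in z_values) / len(z_values)

    return [min(1.0, null.pdf(z) / mixture(z)) for z in z_values]


# Toy data: each record says whether c, x, and e were observed.
rows = [
    {"c": 1, "x": 1, "e": 1},
    {"c": 1, "x": 1, "e": 1},
    {"c": 0, "x": 1, "e": 1},
    {"c": 0, "x": 1, "e": 0},
    {"c": 0, "x": 0, "e": 0},
]
print(eps_avg(rows, "c", "e", ["x"]))  # 0.5 on this toy data

# Hypothetical standardized scores, one per causal hypothesis.
scores = [-1.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.6, 1.0, 6.0]
print(local_fdr(scores))
```

Hypotheses whose local false discovery rate falls below a chosen cutoff would be reported as significant; the cutoff and the kernel bandwidth here are illustrative choices, not values taken from the chapter.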

Original language: English (US)
Title of host publication: Causality in the Sciences
Publisher: Oxford University Press
ISBN (Print): 9780191728921, 9780199574131
DOIs: https://doi.org/10.1093/acprof:oso/9780199574131.003.0031
State: Published - Sep 22 2011

Fingerprint

Multiple Testing
Model Checking
Multiple Hypothesis Testing
Causal Inference
Temporal Logic
Hypothesis Testing
Causality
Microarray
Gene Expression
Test Statistic
Null
Relationships
Logic
Testing
Formulation
Dependent
Computing
Arbitrary

Keywords

  • Causal significance
  • False discovery rate
  • Multiple testing
  • Temporal logic

ASJC Scopus subject areas

  • Mathematics(all)

Cite this

Kleinberg, S., & Mishra, B. (2011). Multiple testing of causal hypotheses. In Causality in the Sciences. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199574131.003.0031

Kleinberg, S & Mishra, B 2011, Multiple testing of causal hypotheses. in Causality in the Sciences. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199574131.003.0031
Kleinberg S, Mishra B. Multiple testing of causal hypotheses. In Causality in the Sciences. Oxford University Press. 2011 https://doi.org/10.1093/acprof:oso/9780199574131.003.0031
Kleinberg, Samantha ; Mishra, Bud. / Multiple testing of causal hypotheses. Causality in the Sciences. Oxford University Press, 2011.
@inbook{2b29aa3a711b4ede96634890e1cecbfd,
title = "Multiple testing of causal hypotheses",
abstract = "A primary problem in causal inference is the following: From a set of time course data, such as that generated by gene expression microarrays, is it possible to infer all significant causal relationships between the elements described by this data? In prior work (Kleinberg and Mishra, 2009), a framework was proposed that combines notions of causality from philosophy with algorithmic approaches built on model checking and statistical techniques for significance testing. The causal relationships can then be described in terms of temporal logic formul{\ae}, thus reframing the problem in terms of model checking. The logic used, PCTL, allows description of both the time between cause and effect and the probability of this relationship being observed. Borrowing from philosophy, prima facie causes are defined in terms of probability raising, and whether a causal relationship is significant is then determined by computing the average difference a prima facie cause makes to the occurrence of its effect, given each of the other prima facie causes of that effect. However, this method faces many of the issues confronted in statistical theories of hypothesis testing: given these causal formul{\ae} with their associated probabilities and our average computed differences, how do we decide which are 'significant' without choosing an arbitrary threshold? To address this problem rigorously, the chapter uses the concepts of multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control. In particular, the chapter applies the empirical Bayesian formulation proposed by Efron (2004). This method uses an empirical rather than theoretical null, which has been shown to be better equipped for cases where the test statistics are dependent, as may be true in the case of complex causal structures. The general approach may be used with many of the traditional philosophical theories where thresholds for significance must be identified.",
keywords = "Causal significance, False discovery rate, Multiple testing, Temporal logic",
author = "Samantha Kleinberg and Bud Mishra",
year = "2011",
month = "9",
day = "22",
doi = "10.1093/acprof:oso/9780199574131.003.0031",
language = "English (US)",
isbn = "9780191728921",
booktitle = "Causality in the Sciences",
publisher = "Oxford University Press",

}

TY - CHAP

T1 - Multiple testing of causal hypotheses

AU - Kleinberg, Samantha

AU - Mishra, Bud

PY - 2011/9/22

Y1 - 2011/9/22

N2 - A primary problem in causal inference is the following: From a set of time course data, such as that generated by gene expression microarrays, is it possible to infer all significant causal relationships between the elements described by this data? In prior work (Kleinberg and Mishra, 2009), a framework was proposed that combines notions of causality from philosophy with algorithmic approaches built on model checking and statistical techniques for significance testing. The causal relationships can then be described in terms of temporal logic formulæ, thus reframing the problem in terms of model checking. The logic used, PCTL, allows description of both the time between cause and effect and the probability of this relationship being observed. Borrowing from philosophy, prima facie causes are defined in terms of probability raising, and whether a causal relationship is significant is then determined by computing the average difference a prima facie cause makes to the occurrence of its effect, given each of the other prima facie causes of that effect. However, this method faces many of the issues confronted in statistical theories of hypothesis testing: given these causal formulæ with their associated probabilities and our average computed differences, how do we decide which are 'significant' without choosing an arbitrary threshold? To address this problem rigorously, the chapter uses the concepts of multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control. In particular, the chapter applies the empirical Bayesian formulation proposed by Efron (2004). This method uses an empirical rather than theoretical null, which has been shown to be better equipped for cases where the test statistics are dependent, as may be true in the case of complex causal structures. The general approach may be used with many of the traditional philosophical theories where thresholds for significance must be identified.

KW - Causal significance

KW - False discovery rate

KW - Multiple testing

KW - Temporal logic

UR - http://www.scopus.com/inward/record.url?scp=84855929161&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84855929161&partnerID=8YFLogxK

U2 - 10.1093/acprof:oso/9780199574131.003.0031

DO - 10.1093/acprof:oso/9780199574131.003.0031

M3 - Chapter

AN - SCOPUS:84855929161

SN - 9780191728921

SN - 9780199574131

BT - Causality in the Sciences

PB - Oxford University Press

ER -