### Abstract

A primary problem in causal inference is the following: from a set of time course data, such as that generated by gene expression microarrays, is it possible to infer all significant causal relationships between the elements described by this data? In prior work (Kleinberg and Mishra, 2009), a framework was proposed that combines notions of causality from philosophy with algorithmic approaches built on model checking and statistical techniques for significance testing. The causal relationships can then be described in terms of temporal logic formulæ, thus reframing the problem in terms of model checking. The logic used, PCTL (probabilistic computation tree logic), allows description of both the time between cause and effect and the probability of this relationship being observed. Borrowing from philosophy, prima facie causes are defined in terms of probability raising, and the significance of a causal relationship is then determined by computing the average difference a prima facie cause makes to the occurrence of its effect, given each of the other prima facie causes of that effect. However, this method faces many interesting issues confronted in statistical theories of hypothesis testing: given these causal formulæ with their associated probabilities and the computed average differences, how do we decide which are 'significant' without choosing an arbitrary threshold? To address this problem rigorously, the chapter uses the concepts of multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control. In particular, the chapter applies the empirical Bayesian formulation proposed by Efron (2004). This method uses an empirical rather than a theoretical null, which has been shown to be better equipped for cases where the test statistics are dependent, as may be true in the case of complex causal structures. The general approach may be used with many of the traditional philosophical theories where thresholds for significance must be identified.
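The probability-raising test and the average-difference computation described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it ignores the time windows and PCTL formulæ entirely, estimates probabilities by simple counting over pooled observations, and stops before the empirical Bayes step. The function names (`cond_prob`, `prima_facie_causes`, `epsilon_avg`), the variable names `a`, `b`, `e`, and the toy counts are all assumptions made for illustration.

```python
def cond_prob(rows, event, given):
    """Estimate P(event | given) by counting; None if the condition never holds."""
    matching = [r for r in rows if given(r)]
    if not matching:
        return None
    return sum(1 for r in matching if event(r)) / len(matching)


def prima_facie_causes(rows, effect, candidates):
    """Keep candidates c that raise the probability of the effect: P(e | c) > P(e)."""
    p_e = sum(1 for r in rows if r[effect]) / len(rows)
    causes = []
    for c in candidates:
        p = cond_prob(rows, lambda r: r[effect], lambda r: r[c])
        if p is not None and p > p_e:
            causes.append(c)
    return causes


def epsilon_avg(rows, cause, effect, others):
    """Average over x in `others` of P(e | c and x) - P(e | not c and x)."""
    diffs = []
    for x in others:
        p_with = cond_prob(rows, lambda r: r[effect],
                           lambda r: r[cause] and r[x])
        p_without = cond_prob(rows, lambda r: r[effect],
                              lambda r: (not r[cause]) and r[x])
        # Skip x when either conditioning event was never observed.
        if p_with is not None and p_without is not None:
            diffs.append(p_with - p_without)
    return sum(diffs) / len(diffs) if diffs else None


# Hypothetical toy data: (a, b, e) -> number of observations with that pattern.
counts = {
    (1, 1, 1): 4, (1, 1, 0): 1,
    (0, 1, 1): 2, (0, 1, 0): 2,
    (1, 0, 1): 3, (1, 0, 0): 1,
    (0, 0, 1): 1, (0, 0, 0): 6,
}
rows = [{"a": a, "b": b, "e": e}
        for (a, b, e), n in counts.items() for _ in range(n)]

causes = prima_facie_causes(rows, "e", ["a", "b"])
for c in causes:
    others = [x for x in causes if x != c]
    print(c, round(epsilon_avg(rows, c, "e", others), 3))
```

In this toy data both `a` and `b` raise the probability of `e`, but `a` makes a much larger average difference than `b`. Deciding which of these scores counts as significant is the multiple testing question the chapter addresses.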

Original language | English (US) |
---|---|
Title of host publication | Causality in the Sciences |
Publisher | Oxford University Press |
ISBN (Print) | 9780191728921, 9780199574131 |
DOIs | https://doi.org/10.1093/acprof:oso/9780199574131.003.0031 |
State | Published - Sep 22 2011 |

### Keywords

- Causal significance
- False discovery rate
- Multiple testing
- Temporal logic

### ASJC Scopus subject areas

- Mathematics(all)

### Cite this

*Causality in the Sciences.* Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199574131.003.0031

**Multiple testing of causal hypotheses.** / Kleinberg, Samantha; Mishra, Bud.

Research output: Chapter in Book/Report/Conference proceeding › Chapter

TY - CHAP

T1 - Multiple testing of causal hypotheses

AU - Kleinberg, Samantha

AU - Mishra, Bud

PY - 2011/9/22

Y1 - 2011/9/22

KW - Causal significance

KW - False discovery rate

KW - Multiple testing

KW - Temporal logic

UR - http://www.scopus.com/inward/record.url?scp=84855929161&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84855929161&partnerID=8YFLogxK

U2 - 10.1093/acprof:oso/9780199574131.003.0031

DO - 10.1093/acprof:oso/9780199574131.003.0031

M3 - Chapter

AN - SCOPUS:84855929161

SN - 9780191728921

SN - 9780199574131

BT - Causality in the Sciences

PB - Oxford University Press

ER -