Analysis of familial aggregation in the presence of varying family sizes

Abigail G. Matthews, Dianne M. Finkelstein, Rebecca Betensky

Research output: Contribution to journal (Article)

Abstract

Family studies are frequently undertaken as the first step in the search for genetic and/or environmental determinants of disease. Significant familial aggregation of disease is suggestive of a genetic aetiology for the disease and may lead to more focused genetic analysis. Of course, it may also be due to shared environmental factors. Many methods have been proposed in the literature for the analysis of family studies. One model that is appealing for the simplicity of its computation and the conditional interpretation of its parameters is the quadratic exponential model. However, a limiting factor in its application is that it is not reproducible, meaning that all families must be of the same size. To increase the applicability of this model, we propose a hybrid approach in which analysis is based on the assumption of the quadratic exponential model for a selected family size and combines a missing data approach for smaller families with a marginalization approach for larger families. We apply our approach to a family study of colorectal cancer that was sponsored by the Cancer Genetics Network of the National Institutes of Health. We investigate the properties of our approach in simulation studies. Our approach applies more generally to clustered binary data.
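The reproducibility issue mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes an exchangeable form of the quadratic exponential model in which every family member shares one main effect (`theta`) and every pair shares one association parameter (`lam`). The function names and parameter values are hypothetical and chosen only for illustration. The sketch sums one member out of a size-4 family and checks whether the resulting size-3 distribution still has quadratic exponential form, which for three binary outcomes is equivalent to a zero three-way log-linear interaction.

```python
import itertools
import math


def qem_pmf(n, theta, lam):
    """Joint pmf of n binary outcomes under an exchangeable quadratic exponential model."""
    weights = {}
    for y in itertools.product((0, 1), repeat=n):
        s = sum(y)  # number of affected members
        # theta applies to each affected member, lam to each of the s*(s-1)/2 affected pairs
        weights[y] = math.exp(theta * s + lam * s * (s - 1) / 2)
    z = sum(weights.values())  # normalizing constant
    return {y: w / z for y, w in weights.items()}


def marginalize_last(pmf):
    """Sum out the last family member."""
    out = {}
    for y, p in pmf.items():
        out[y[:-1]] = out.get(y[:-1], 0.0) + p
    return out


def three_way_term(pmf3):
    """Three-way log-linear interaction of a strictly positive pmf on three binary outcomes.

    The value is zero exactly when the pmf has quadratic exponential form.
    """
    return sum((-1) ** (3 - sum(y)) * math.log(p) for y, p in pmf3.items())


theta, lam = -1.0, 0.8  # illustrative values, not estimates from the article
direct = qem_pmf(3, theta, lam)                       # model specified for size 3
marginal = marginalize_last(qem_pmf(4, theta, lam))   # size 4 with one member summed out

print("three-way term, size-3 model:   ", three_way_term(direct))
print("three-way term, size-4 marginal:", three_way_term(marginal))
```

Under these assumptions the first term is zero up to rounding and the second is not, so the marginal of a size-4 model is not itself a size-3 quadratic exponential model. Setting `lam = 0` (independence) makes both terms vanish, which shows that the association parameter is what breaks reproducibility.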

Original language: English (US)
Pages (from-to): 847-862
Number of pages: 16
Journal: Journal of the Royal Statistical Society. Series C: Applied Statistics
Volume: 54
Issue number: 5
DOI: 10.1111/j.1467-9876.2005.00521.x
State: Published - Nov 7 2005

Keywords

  • Clustered binary data
  • Missing data
  • Quadratic exponential model
  • Reproducibility

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Analysis of familial aggregation in the presence of varying family sizes. / Matthews, Abigail G.; Finkelstein, Dianne M.; Betensky, Rebecca.

In: Journal of the Royal Statistical Society. Series C: Applied Statistics, Vol. 54, No. 5, 07.11.2005, p. 847-862.

@article{2c9d4864305745feb84818b14c88db52,
title = "Analysis of familial aggregation in the presence of varying family sizes",
keywords = "Clustered binary data, Missing data, Quadratic exponential model, Reproducibility",
author = "Matthews, {Abigail G.} and Finkelstein, {Dianne M.} and Betensky, Rebecca",
year = "2005",
month = "11",
day = "7",
doi = "10.1111/j.1467-9876.2005.00521.x",
language = "English (US)",
volume = "54",
pages = "847--862",
journal = "Journal of the Royal Statistical Society. Series C: Applied Statistics",
issn = "0035-9254",
publisher = "Wiley-Blackwell",
number = "5",
}

TY - JOUR
T1 - Analysis of familial aggregation in the presence of varying family sizes
AU - Matthews, Abigail G.
AU - Finkelstein, Dianne M.
AU - Betensky, Rebecca
PY - 2005/11/7
Y1 - 2005/11/7
KW - Clustered binary data
KW - Missing data
KW - Quadratic exponential model
KW - Reproducibility
UR - http://www.scopus.com/inward/record.url?scp=27344443871&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=27344443871&partnerID=8YFLogxK
U2 - 10.1111/j.1467-9876.2005.00521.x
DO - 10.1111/j.1467-9876.2005.00521.x
M3 - Article
AN - SCOPUS:27344443871
VL - 54
SP - 847
EP - 862
JO - Journal of the Royal Statistical Society. Series C: Applied Statistics
JF - Journal of the Royal Statistical Society. Series C: Applied Statistics
SN - 0035-9254
IS - 5
ER -