Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation

Biao Zeng, Luke R. Lloyd-Jones, Grant W. Montgomery, Andres Metspalu, Tonu Esko, Lude Franke, Urmo Vosa, Annique Claringbould, Kenneth L. Brigham, Arshed A. Quyyumi, Youssef Idaghdhour, Jian Yang, Peter M. Visscher, Joseph E. Powell, Greg Gibson

    Research output: Contribution to journal › Article

    Abstract

    Expression QTL (eQTL) detection has emerged as an important tool for unraveling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies are predicated on the assumption that only a single causal variant explains the association signal in each interval. This greatly simplifies the statistical modeling, but is liable to bias in scenarios where multiple local causal variants are responsible. Here, our primary goal was to address the prevalence of secondary cis-eQTL signals regulating peripheral blood gene expression locally, utilizing two large human cohort studies, each with >2500 samples and accompanying whole-genome genotypes. The CAGE (Consortium for the Architecture of Gene Expression) dataset is a compendium of Illumina microarray studies, and the Framingham Heart Study is a two-generation Affymetrix dataset. We also describe Bayesian colocalization analysis of the extent of sharing of cis-eQTL detected in both studies as well as with the BIOS RNAseq dataset. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ∼40% of over 3500 eGenes in both microarray datasets, and that the number of loci with additional signals falls by approximately two-thirds with each conditioning step. Although <20% of the peak signals across platforms fine-map to the same credible interval, the colocalization analysis finds that as many as 50-60% of the primary eQTL are actually shared. Subsequently, colocalization of eQTL signals with GWAS hits detected 1349 genes whose expression in peripheral blood is associated with 591 human phenotype traits or diseases, including enrichment for genes with regulatory functions. At least 10%, and possibly as many as 40%, of eQTL-trait colocalized signals are due to nonprimary cis-eQTL peaks, but just one-quarter of these colocalization signals replicated across the gene expression datasets. Our results are provided as a web-based resource for visualization of multi-site regulation of gene expression and its association with human complex traits and disease states.
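
The stepwise conditional modeling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the simulated genotypes, effect sizes, SNP indices, function names, and the significance threshold are all assumptions chosen for illustration. At each step the most strongly associated SNP is selected, and expression and every remaining SNP are then residualized on it, so that the next scan tests for signals independent of those already found.

```python
import random


def center(v):
    """Subtract the mean so that dot products behave like covariances."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualize(v, on):
    """Remove from v its projection onto the centered vector `on`."""
    beta = dot(v, on) / dot(on, on)
    return [x - beta * z for x, z in zip(v, on)]


def stepwise_conditional(genotypes, expression, threshold=30.0, max_steps=5):
    """Forward stepwise conditional scan (illustrative only).

    The test statistic is n * r^2, approximately chi-square with 1 df under
    the null; a threshold of 30 is an assumed cutoff, roughly p < 5e-8.
    Returns SNP indices in the order in which they were selected.
    """
    n = len(expression)
    y = center(expression)
    cols = {j: center(g) for j, g in enumerate(genotypes)}
    selected = []
    for _ in range(max_steps):
        best_j, best_stat = None, 0.0
        yy = dot(y, y)
        for j, g in cols.items():
            gg = dot(g, g)
            if gg < 1e-9:  # monomorphic, or fully explained by selected SNPs
                continue
            stat = n * dot(g, y) ** 2 / (gg * yy)
            if stat > best_stat:
                best_j, best_stat = j, stat
        if best_j is None or best_stat < threshold:
            break
        selected.append(best_j)
        lead = cols.pop(best_j)
        # Condition on the new lead SNP: residualize expression and all
        # remaining SNPs, so later scans only see independent signal.
        y = residualize(y, lead)
        cols = {j: residualize(g, lead) for j, g in cols.items()}
    return selected


def simulate(n=1000, m=20, seed=1):
    """Hypothetical locus: SNPs 3 and 12 are causal, SNP 4 tags SNP 3."""
    rng = random.Random(seed)

    def draw():
        # Genotype = number of minor alleles, minor allele frequency 0.3.
        return sum(rng.random() < 0.3 for _ in range(2))

    genotypes = [[draw() for _ in range(n)] for _ in range(m)]
    # SNP 4 copies SNP 3 for about 70% of samples, mimicking LD.
    genotypes[4] = [g if rng.random() < 0.7 else draw() for g in genotypes[3]]
    expression = [
        0.8 * genotypes[3][i] + 0.5 * genotypes[12][i] + rng.gauss(0, 1)
        for i in range(n)
    ]
    return genotypes, expression


if __name__ == "__main__":
    genotypes, expression = simulate()
    print(stepwise_conditional(genotypes, expression))
```

In this toy setting the tagging SNP is associated with expression only through its correlation with the lead SNP, so conditioning on the lead SNP removes it, while the second causal SNP remains as an independent secondary signal. Real analyses would additionally adjust for covariates and relatedness, which this sketch omits.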

    Original language: English (US)
    Pages (from-to): 905-918
    Number of pages: 14
    Journal: Genetics
    Volume: 212
    Issue number: 3
    DOI: https://doi.org/10.1534/genetics.119.302091
    State: Published - Jul 1 2019

    Keywords

    • colocalization
    • conditional association
    • fine mapping
    • gene regulation
    • linkage disequilibrium
    • PolyQTL

    ASJC Scopus subject areas

    • Genetics

    Cite this

    Zeng, B., Lloyd-Jones, L. R., Montgomery, G. W., Metspalu, A., Esko, T., Franke, L., ... Gibson, G. (2019). Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation. Genetics, 212(3), 905-918. https://doi.org/10.1534/genetics.119.302091

    @article{6773c620483145459d221bb4f48d5d67,
    title = "Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation",
    keywords = "colocalization, conditional association, fine mapping, gene regulation, linkage disequilibrium, PolyQTL",
    author = "Biao Zeng and Lloyd-Jones, {Luke R.} and Montgomery, {Grant W.} and Andres Metspalu and Tonu Esko and Lude Franke and Urmo Vosa and Annique Claringbould and Brigham, {Kenneth L.} and Quyyumi, {Arshed A.} and Youssef Idaghdhour and Jian Yang and Visscher, {Peter M.} and Powell, {Joseph E.} and Greg Gibson",
    year = "2019",
    month = "7",
    day = "1",
    doi = "10.1534/genetics.119.302091",
    language = "English (US)",
    volume = "212",
    pages = "905--918",
    journal = "Genetics",
    issn = "0016-6731",
    publisher = "Genetics Society of America",
    number = "3",

    }

    TY - JOUR
    T1 - Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation
    AU - Zeng, Biao
    AU - Lloyd-Jones, Luke R.
    AU - Montgomery, Grant W.
    AU - Metspalu, Andres
    AU - Esko, Tonu
    AU - Franke, Lude
    AU - Vosa, Urmo
    AU - Claringbould, Annique
    AU - Brigham, Kenneth L.
    AU - Quyyumi, Arshed A.
    AU - Idaghdhour, Youssef
    AU - Yang, Jian
    AU - Visscher, Peter M.
    AU - Powell, Joseph E.
    AU - Gibson, Greg
    PY - 2019/7/1
    Y1 - 2019/7/1
    KW - colocalization
    KW - conditional association
    KW - fine mapping
    KW - gene regulation
    KW - linkage disequilibrium
    KW - PolyQTL
    UR - http://www.scopus.com/inward/record.url?scp=85069626111&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=85069626111&partnerID=8YFLogxK
    U2 - 10.1534/genetics.119.302091
    DO - 10.1534/genetics.119.302091
    M3 - Article
    VL - 212
    SP - 905
    EP - 918
    JO - Genetics
    JF - Genetics
    SN - 0016-6731
    IS - 3
    ER -