Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality

Xinyu Zhang, Ying Hu, Bradley E. Aouizerat, Gang Peng, Vincent C. Marconi, Michael J. Corley, Todd Hulgan, Kendall J. Bryant, Hongyu Zhao, John H. Krystal, Amy C. Justice, Ke Xu

Research output: Contribution to journal › Article

Abstract

Background: The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. Applying a machine learning approach to the analysis of CpG methylation across the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we report that a set of smoking-associated methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population.

Results: We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E-07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample using 408,583 CpGs. A set of 698 CpGs was selected. This set predicted high HIV frailty in a testing sample (area under the curve (AUC) = 0.73, 95% CI 0.63 to 0.83), and the prediction was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46, 95% CI 1.06 to 2.02, p = 0.02). Interestingly, the 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E-05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion.

Conclusion: We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV.
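As a reading aid for the AUC values quoted in the abstract: an AUC can be interpreted as the probability that a randomly chosen high-frailty individual receives a higher model score than a randomly chosen low-frailty individual. The sketch below computes that quantity directly from pairwise comparisons. It is not the authors' pipeline; the function name `auc` and the scores are hypothetical and invented purely for illustration.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores (made up, not data from the study)
high_frailty = [0.9, 0.8, 0.6, 0.4]
low_frailty = [0.7, 0.5, 0.3, 0.2]

print(auc(high_frailty, low_frailty))  # 13 of 16 pairs ranked correctly -> 0.8125
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.73 and 0.78 should be read.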

Original language: English (US)
Article number: 155
Journal: Clinical Epigenetics
Volume: 10
Issue number: 1
DOI: https://doi.org/10.1186/s13148-018-0591-z
State: Published - Dec 13 2018

Keywords

  • DNA methylation
  • Ensemble machine learning
  • HIV frailty
  • Mortality
  • Tobacco smoking

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
  • Developmental Biology
  • Genetics (clinical)

Cite this

Zhang, X., Hu, Y., Aouizerat, B. E., Peng, G., Marconi, V. C., Corley, M. J., Hulgan, T., Bryant, K. J., Zhao, H., Krystal, J. H., Justice, A. C., & Xu, K. (2018). Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality. Clinical Epigenetics, 10(1), [155]. https://doi.org/10.1186/s13148-018-0591-z