Predicting childhood obesity using electronic health records and publicly available data

Robert Hammond, Rodoniki Athanasiadou, Silvia Curado, Yindalon Aphinyanaphongs, Courtney Abrams, Mary Jo Messito, Rachel Gross, Michelle Katzow, Melanie Jay, Narges Razavian, Brian D. Elbel

Research output: Contribution to journal › Article

Abstract

Background: Because of the strong link between childhood obesity and adulthood obesity comorbidities, and the difficulty in decreasing body mass index (BMI) later in life, effective strategies are needed to address this condition in early childhood. The ability to predict obesity before age five could be a useful tool, allowing prevention strategies to focus on high-risk children. The few existing prediction models for obesity in childhood have primarily employed data from longitudinal cohort studies, relying on difficult-to-collect data that are not readily available to all practitioners. Instead, we utilized real-world unaugmented electronic health record (EHR) data from the first two years of life to predict obesity status at age five, an approach not yet taken in pediatric obesity research.

Methods and findings: We trained a variety of machine learning algorithms to perform both binary classification and regression. Following previous studies demonstrating different obesity determinants for boys and girls, we developed separate models for the two groups. In both the boys' and the girls' models, we found that weight-for-length z-score, BMI between 19 and 24 months, and the last BMI measure recorded before age two were the most important features for prediction. The best-performing models were able to predict obesity with an Area Under the Receiver Operating Characteristic Curve (AUC) of 81.7% for girls and 76.1% for boys.

Conclusions: We were able to predict obesity at age five using EHR data with an AUC comparable to cohort-based studies, reducing the need for investment in additional data collection. Our results suggest that machine learning approaches for predicting future childhood obesity using EHR data could improve the ability of clinicians and researchers to drive future policy, intervention design, and the decision-making process in a clinical setting.
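To make the reported metric concrete, the following is a minimal sketch of how an AUC can be computed separately for girls and boys, mirroring the sex-stratified evaluation described in the abstract. It is not the authors' code: the function names `auc` and `auc_by_group`, the record layout, and the toy risk scores are all hypothetical, and the AUC is computed with the plain pairwise (Mann-Whitney) definition rather than any particular library.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """records: iterable of (group, label, score) tuples.
    Returns a dict mapping each group to its own AUC."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: auc(ys, ss) for g, (ys, ss) in grouped.items()}


# Hypothetical toy data, NOT from the study:
# (sex, obese at age five (1/0), predicted risk from a model)
toy = [
    ("girl", 1, 0.9), ("girl", 0, 0.2), ("girl", 1, 0.6), ("girl", 0, 0.7),
    ("boy", 1, 0.8), ("boy", 0, 0.3), ("boy", 0, 0.8), ("boy", 1, 0.4),
]

result = auc_by_group(toy)
for group, value in sorted(result.items()):
    print(f"{group}: AUC = {value:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported 81.7% and 76.1% can be read as 0.817 and 0.761 on this scale.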

Original language: English (US)
Article number: e0215571
Journal: PLoS ONE
Volume: 14
Issue number: 4
DOIs: https://doi.org/10.1371/journal.pone.0215571
State: Published - Apr 1 2019

Fingerprint

childhood obesity
Electronic Health Records
Pediatric Obesity
electronics
obesity
Obesity
Health
body mass index
Learning systems
Body Mass Index
artificial intelligence
Pediatrics
Area Under Curve
Cohort Studies
Learning algorithms
Aptitude
prediction
operator regions
Decision making
adulthood

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Hammond, R., Athanasiadou, R., Curado, S., Aphinyanaphongs, Y., Abrams, C., Messito, M. J., ... Elbel, B. D. (2019). Predicting childhood obesity using electronic health records and publicly available data. PLoS ONE, 14(4), [e0215571]. https://doi.org/10.1371/journal.pone.0215571

@article{ae3f830d7d394ad9a45096bb35bcc436,
title = "Predicting childhood obesity using electronic health records and publicly available data",
abstract = "Background Because of the strong link between childhood obesity and adulthood obesity comorbidities, and the difficulty in decreasing body mass index (BMI) later in life, effective strategies are needed to address this condition in early childhood. The ability to predict obesity before age five could be a useful tool, allowing prevention strategies to focus on high risk children. The few existing prediction models for obesity in childhood have primarily employed data from longitudinal cohort studies, relying on difficult to collect data that are not readily available to all practitioners. Instead, we utilized real-world unaugmented electronic health record (EHR) data from the first two years of life to predict obesity status at age five, an approach not yet taken in pediatric obesity research. Methods and findings We trained a variety of machine learning algorithms to perform both binary classification and regression. Following previous studies demonstrating different obesity determinants for boys and girls, we similarly developed separate models for both groups. In each of the separate models for boys and girls we found that weight for length z-score, BMI between 19 and 24 months, and the last BMI measure recorded before age two were the most important features for prediction. The best performing models were able to predict obesity with an Area Under the Receiver Operator Characteristic Curve (AUC) of 81.7{\%} for girls and 76.1{\%} for boys. Conclusions We were able to predict obesity at age five using EHR data with an AUC comparable to cohort-based studies, reducing the need for investment in additional data collection. Our results suggest that machine learning approaches for predicting future childhood obesity using EHR data could improve the ability of clinicians and researchers to drive future policy, intervention design, and the decision-making process in a clinical setting.",
author = "Robert Hammond and Rodoniki Athanasiadou and Silvia Curado and Yindalon Aphinyanaphongs and Courtney Abrams and Messito, {Mary Jo} and Rachel Gross and Michelle Katzow and Melanie Jay and Narges Razavian and Elbel, {Brian D.}",
year = "2019",
month = "4",
day = "1",
doi = "10.1371/journal.pone.0215571",
language = "English (US)",
volume = "14",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "4",

}

TY - JOUR

T1 - Predicting childhood obesity using electronic health records and publicly available data

AU - Hammond, Robert

AU - Athanasiadou, Rodoniki

AU - Curado, Silvia

AU - Aphinyanaphongs, Yindalon

AU - Abrams, Courtney

AU - Messito, Mary Jo

AU - Gross, Rachel

AU - Katzow, Michelle

AU - Jay, Melanie

AU - Razavian, Narges

AU - Elbel, Brian D.

PY - 2019/4/1

Y1 - 2019/4/1

N2 - Background Because of the strong link between childhood obesity and adulthood obesity comorbidities, and the difficulty in decreasing body mass index (BMI) later in life, effective strategies are needed to address this condition in early childhood. The ability to predict obesity before age five could be a useful tool, allowing prevention strategies to focus on high risk children. The few existing prediction models for obesity in childhood have primarily employed data from longitudinal cohort studies, relying on difficult to collect data that are not readily available to all practitioners. Instead, we utilized real-world unaugmented electronic health record (EHR) data from the first two years of life to predict obesity status at age five, an approach not yet taken in pediatric obesity research. Methods and findings We trained a variety of machine learning algorithms to perform both binary classification and regression. Following previous studies demonstrating different obesity determinants for boys and girls, we similarly developed separate models for both groups. In each of the separate models for boys and girls we found that weight for length z-score, BMI between 19 and 24 months, and the last BMI measure recorded before age two were the most important features for prediction. The best performing models were able to predict obesity with an Area Under the Receiver Operator Characteristic Curve (AUC) of 81.7% for girls and 76.1% for boys. Conclusions We were able to predict obesity at age five using EHR data with an AUC comparable to cohort-based studies, reducing the need for investment in additional data collection. Our results suggest that machine learning approaches for predicting future childhood obesity using EHR data could improve the ability of clinicians and researchers to drive future policy, intervention design, and the decision-making process in a clinical setting.

UR - http://www.scopus.com/inward/record.url?scp=85064808939&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064808939&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0215571

DO - 10.1371/journal.pone.0215571

M3 - Article

C2 - 31009509

AN - SCOPUS:85064808939

VL - 14

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 4

M1 - e0215571

ER -