A concept-wide association study of clinical notes to discover new predictors of kidney failure

Karandeep Singh, Rebecca Betensky, Adam Wright, Gary C. Curhan, David W. Bates, Sushrut S. Waikar

Research output: Contribution to journal › Article

Abstract

Background and objectives: Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements: We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results: Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q, 0.001). Conclusions: Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.

Original language: English (US)
Pages (from-to): 2150-2158
Number of pages: 9
Journal: Clinical Journal of the American Society of Nephrology
Volume: 11
Issue number: 12
DOI: 10.2215/CJN.02420316
State: Published - Jan 1 2016

Fingerprint

Chronic Kidney Failure
Renal Insufficiency
Nephrology
Confidence Intervals
Fast Foods
Electronic Health Records
Kidney Diseases
Ambulatory Care Facilities
Tertiary Care Centers
Ascorbic Acid
Disease Progression
Linear Models
Creatinine
Cohort Studies
Outpatients
Language
Transplants
Clinical Studies
Nephrologists

Keywords

  • Adult
  • Ascorbic Acid
  • Chronic kidney disease
  • Cohort Studies
  • Creatinine
  • Disease Progression
  • Electronic health record
  • Electronic Health Records
  • End stage kidney disease
  • Fast foods
  • Follow-Up Studies
  • Humans
  • Informatics
  • Kidney
  • Kidney Diseases
  • Kidney Failure, Chronic
  • Natural language processing
  • Natural Language Processing
  • Nephrology
  • Outpatients
  • Renal Insufficiency
  • Retrospective Studies
  • Risk factors
  • Tertiary Care Centers

ASJC Scopus subject areas

  • Epidemiology
  • Critical Care and Intensive Care Medicine
  • Nephrology
  • Transplantation

Cite this

A concept-wide association study of clinical notes to discover new predictors of kidney failure. / Singh, Karandeep; Betensky, Rebecca; Wright, Adam; Curhan, Gary C.; Bates, David W.; Waikar, Sushrut S.

In: Clinical Journal of the American Society of Nephrology, Vol. 11, No. 12, 01.01.2016, p. 2150-2158.

Singh, Karandeep ; Betensky, Rebecca ; Wright, Adam ; Curhan, Gary C. ; Bates, David W. ; Waikar, Sushrut S. / A concept-wide association study of clinical notes to discover new predictors of kidney failure. In: Clinical Journal of the American Society of Nephrology. 2016 ; Vol. 11, No. 12. pp. 2150-2158.
@article{f803276d6432481eaa5d7dc624ed486f,
title = "A concept-wide association study of clinical notes to discover new predictors of kidney failure",
abstract = "Background and objectives: Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, {\&} measurements: We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5{\%} threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results: Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95{\%} confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95{\%} confidence interval, 2.55 to 7.40; q, 0.001). Conclusions: Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.",
keywords = "Adult, Ascorbic Acid, Chronic kidney disease, Cohort Studies, Creatinine, Disease Progression, Electronic health record, Electronic Health Records, End stage kidney disease, Fast foods, Follow-Up Studies, Humans, Informatics, Kidney, Kidney Diseases, Kidney Failure, Chronic, Natural language processing, Natural Language Processing, Nephrology, Outpatients, Renal Insufficiency, Retrospective Studies, Risk factors, Tertiary Care Centers",
author = "Karandeep Singh and Rebecca Betensky and Adam Wright and Curhan, {Gary C.} and Bates, {David W.} and Waikar, {Sushrut S.}",
year = "2016",
month = "1",
day = "1",
doi = "10.2215/CJN.02420316",
language = "English (US)",
volume = "11",
pages = "2150--2158",
journal = "Clinical journal of the American Society of Nephrology : CJASN",
issn = "1555-9041",
publisher = "American Society of Nephrology",
number = "12",

}

TY - JOUR

T1 - A concept-wide association study of clinical notes to discover new predictors of kidney failure

AU - Singh, Karandeep

AU - Betensky, Rebecca

AU - Wright, Adam

AU - Curhan, Gary C.

AU - Bates, David W.

AU - Waikar, Sushrut S.

PY - 2016/1/1

Y1 - 2016/1/1

N2 - Background and objectives: Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements: We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results: Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q, 0.001). Conclusions: Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.

AB - Background and objectives: Identifying predictors of kidney disease progression is critical toward the development of strategies to prevent kidney failure. Clinical notes provide a unique opportunity for big data approaches to identify novel risk factors for disease. Design, setting, participants, & measurements: We used natural language processing tools to extract concepts from the preceding year’s clinical notes among patients newly referred to a tertiary care center’s outpatient nephrology clinics and retrospectively evaluated these concepts as predictors for the subsequent development of ESRD using proportional subdistribution hazards (competing risk) regression. The primary outcome was time to ESRD, accounting for a competing risk of death. We identified predictors from univariate and multivariate (adjusting for Tangri linear predictor) models using a 5% threshold for false discovery rate (q value, 0.05). We included all patients seen by an adult outpatient nephrologist between January 1, 2004 and June 18, 2014 and excluded patients seen only by transplant nephrology, with preexisting ESRD, with fewer than five clinical notes, with no follow-up, or with no baseline creatinine values. Results: Among the 4013 patients selected in the final study cohort, we identified 960 concepts in the unadjusted analysis and 885 concepts in the adjusted analysis. Novel predictors identified included high-dose ascorbic acid (adjusted hazard ratio, 5.48; 95% confidence interval, 2.80 to 10.70; q, 0.001) and fast food (adjusted hazard ratio, 4.34; 95% confidence interval, 2.55 to 7.40; q, 0.001). Conclusions: Novel predictors of human disease may be identified using an unbiased approach to analyze text from the electronic health record.

KW - Adult

KW - Ascorbic Acid

KW - Chronic kidney disease

KW - Cohort Studies

KW - Creatinine

KW - Disease Progression

KW - Electronic health record

KW - Electronic Health Records

KW - End stage kidney disease

KW - Fast foods

KW - Follow-Up Studies

KW - Humans

KW - Informatics

KW - Kidney

KW - Kidney Diseases

KW - Kidney Failure, Chronic

KW - Natural language processing

KW - Natural Language Processing

KW - Nephrology

KW - Outpatients

KW - Renal Insufficiency

KW - Retrospective Studies

KW - Risk factors

KW - Tertiary Care Centers

UR - http://www.scopus.com/inward/record.url?scp=85021740923&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021740923&partnerID=8YFLogxK

U2 - 10.2215/CJN.02420316

DO - 10.2215/CJN.02420316

M3 - Article

VL - 11

SP - 2150

EP - 2158

JO - Clinical journal of the American Society of Nephrology : CJASN

JF - Clinical journal of the American Society of Nephrology : CJASN

SN - 1555-9041

IS - 12

ER -