Using search queries for malaria surveillance, Thailand

Alex J. Ocampo, Rumi Chunara, John S. Brownstein

Research output: Contribution to journal › Article

Abstract

Background: Internet search query trends have been shown to correlate with incidence trends for select infectious diseases and countries. This study investigates the first use of Google search queries for malaria surveillance. The research focuses on Thailand, where real-time malaria surveillance is crucial because malaria is re-emerging and developing resistance to pharmaceuticals in the region.

Methods: Official Thai malaria case data for 2005 to 2009 were acquired from the World Health Organization (WHO). Search queries potentially related to malaria prevalence were identified using Google Correlate, an openly available online tool, and by surveying Thai physicians. Four linear regression models were built from different subsets of malaria-related queries for use in future predictions. Model accuracy was evaluated by the ability to predict the malaria outbreak in 2009, by correlation with the entire available malaria case data, and by the Akaike information criterion (AIC).

Results: Each model captured the bulk of the variability in officially reported malaria incidence. Correlation in the validation set ranged from 0.75 to 0.92, and AIC values ranged from 808 to 586 across the models. While models using malaria-related and general health terms were successful, one model using only microscopy-related terms obtained equally high correlations with malaria case data trends. The model built strictly from queries provided by Thai physicians was the only one that consistently captured the well-documented second seasonal malaria peak in Thailand.

Conclusions: Models built from Google search queries were able to adequately estimate malaria activity trends in Thailand from 2005 to 2010, according to official malaria case counts reported by WHO. Although they have their own limitations, these search queries may be valid real-time indicators of malaria incidence in the population, as correlations were on par with those of related studies for other infectious diseases. Additionally, this methodology provides a cost-effective description of malaria prevalence that can complement traditional public health surveillance. This and future studies will continue to identify ways to leverage web-based data to improve public health.
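
The evaluation described in the Methods (fit a linear regression on search-query volume, then score it by validation-set correlation and AIC) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a single composite query index as the only predictor, uses made-up monthly numbers, and computes AIC in the common least-squares form n*ln(RSS/n) + 2k; the abstract does not say which AIC form or parameter count the study used. All function and variable names are hypothetical.

```python
import math


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_aic(y, fitted, n_params):
    """AIC of a least-squares fit, up to an additive constant (assumed form)."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return n * math.log(rss / n) + 2 * n_params


# Made-up monthly values for illustration only (not data from the study).
query_index = [12.0, 15.0, 30.0, 42.0, 38.0, 20.0,
               14.0, 18.0, 33.0, 45.0, 40.0, 22.0]
case_counts = [210.0, 260.0, 480.0, 700.0, 610.0, 330.0,
               250.0, 300.0, 560.0, 720.0, 650.0, 380.0]

# Fit on the first 8 months; hold out the last 4 as a validation set.
train_x, train_y = query_index[:8], case_counts[:8]
valid_x, valid_y = query_index[8:], case_counts[8:]

intercept, slope = fit_simple_ols(train_x, train_y)
train_fit = [intercept + slope * xi for xi in train_x]
valid_pred = [intercept + slope * xi for xi in valid_x]

validation_r = pearson_r(valid_pred, valid_y)
# Two parameters counted here: intercept and slope (an assumption).
train_aic = gaussian_aic(train_y, train_fit, n_params=2)
print(f"validation r = {validation_r:.2f}, training AIC = {train_aic:.1f}")
```

Comparing several candidate models this way means repeating the fit for each query subset and preferring the model with higher validation correlation and lower AIC; lower AIC indicates a better trade-off between fit and number of parameters.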

Original language: English (US)
Article number: 390
Journal: Malaria Journal
Volume: 12
Issue number: 1
DOI: 10.1186/1475-2875-12-390
State: Published - 2013

Fingerprint

Thailand
Malaria
Communicable Diseases
Linear Models
Incidence
Public Health Surveillance
Physicians
Internet
Disease Outbreaks
Microscopy

Keywords

  • Epidemiology
  • Malaria
  • Search query
  • Surveillance
  • Thailand

ASJC Scopus subject areas

  • Infectious Diseases
  • Parasitology

Cite this

Using search queries for malaria surveillance, Thailand. / Ocampo, Alex J.; Chunara, Rumi; Brownstein, John S.

In: Malaria Journal, Vol. 12, No. 1, 390, 2013.

@article{b7974f269cd345a8a25d775c2f6c3df6,
title = "Using search queries for malaria surveillance, Thailand",
keywords = "Epidemiology, Malaria, Search query, Surveillance, Thailand",
author = "Ocampo, {Alex J.} and Chunara, Rumi and Brownstein, {John S.}",
year = "2013",
doi = "10.1186/1475-2875-12-390",
language = "English (US)",
volume = "12",
journal = "Malaria Journal",
issn = "1475-2875",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Using search queries for malaria surveillance, Thailand

AU - Ocampo, Alex J.

AU - Chunara, Rumi

AU - Brownstein, John S.

PY - 2013

Y1 - 2013

KW - Epidemiology

KW - Malaria

KW - Search query

KW - Surveillance

KW - Thailand

UR - http://www.scopus.com/inward/record.url?scp=84886738403&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84886738403&partnerID=8YFLogxK

U2 - 10.1186/1475-2875-12-390

DO - 10.1186/1475-2875-12-390

M3 - Article

VL - 12

JO - Malaria Journal

JF - Malaria Journal

SN - 1475-2875

IS - 1

M1 - 390

ER -