Using big data to understand the human condition: The Kavli HUMAN project

Okan Azmak, Hannah Bayer, Andrew Caplin, Miyoung Chun, Paul Glimcher, Steven Koonin, Aristides Patrinos

Research output: Contribution to journal › Article

Abstract

Until now, most large-scale studies of humans have either focused on very specific domains of inquiry or have relied on between-subjects approaches. While these previous studies have been invaluable for revealing important biological factors in cardiac health or social factors in retirement choices, no single repository contains anything like a complete record of the health, education, genetics, environmental, and lifestyle profiles of a large group of individuals at the within-subject level. This seems critical today because emerging evidence about the dynamic interplay between biology, behavior, and the environment points to a pressing need for just the kind of large-scale, long-term synoptic dataset that does not yet exist at the within-subject level. At the same time that the need for such a dataset is becoming clear, there is also growing evidence that just such a synoptic dataset may now be obtainable - at least at moderate scale - using contemporary big data approaches. To this end, we introduce the Kavli HUMAN Project (KHP), an effort to aggregate data from 2,500 New York City households in all five boroughs (roughly 10,000 individuals) whose biology and behavior will be measured using an unprecedented array of modalities over 20 years. It will also richly measure environmental conditions and events that KHP members experience using a geographic information system database of unparalleled scale, currently under construction in New York. In this manner, KHP will offer both synoptic and granular views of how human health and behavior coevolve over the life cycle and why they evolve differently for different people. In turn, we argue that this will allow for new discovery-based scientific approaches, rooted in big data analytics, to improving the health and quality of human life, particularly in urban contexts.

Original language: English (US)
Pages (from-to): 173-188
Number of pages: 16
Journal: Big Data
Volume: 3
Issue number: 3
DOI: 10.1089/big.2015.0012
State: Published - Sep 1 2015

Fingerprint

Health
Geographic information systems
Life cycle
Education
Big data
Genetics
Social factors
Human health
Repository
Lifestyle
Environmental conditions
Household
Retirement
Data base
Aggregate data
Pressing
Factors
Health education
Large groups
Human behavior

Keywords

  • big data analytics
  • semistructured data
  • unstructured data

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Information Systems and Management

Cite this

Using big data to understand the human condition: The Kavli HUMAN project. / Azmak, Okan; Bayer, Hannah; Caplin, Andrew; Chun, Miyoung; Glimcher, Paul; Koonin, Steven; Patrinos, Aristides.

In: Big Data, Vol. 3, No. 3, 01.09.2015, p. 173-188.

@article{86e8000575e947e5926749e493654ced,
title = "Using big data to understand the human condition: The Kavli HUMAN project",
keywords = "big data analytics, semistructured data, unstructured data",
author = "Okan Azmak and Hannah Bayer and Andrew Caplin and Miyoung Chun and Paul Glimcher and Steven Koonin and Aristides Patrinos",
year = "2015",
month = "9",
day = "1",
doi = "10.1089/big.2015.0012",
language = "English (US)",
volume = "3",
pages = "173--188",
journal = "Big Data",
issn = "2167-6461",
publisher = "Mary Ann Liebert Inc.",
number = "3",

}

TY - JOUR

T1 - Using big data to understand the human condition

T2 - The Kavli HUMAN project

AU - Azmak, Okan

AU - Bayer, Hannah

AU - Caplin, Andrew

AU - Chun, Miyoung

AU - Glimcher, Paul

AU - Koonin, Steven

AU - Patrinos, Aristides

PY - 2015/9/1

Y1 - 2015/9/1

KW - big data analytics

KW - semistructured data

KW - unstructured data

UR - http://www.scopus.com/inward/record.url?scp=84991744743&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84991744743&partnerID=8YFLogxK

U2 - 10.1089/big.2015.0012

DO - 10.1089/big.2015.0012

M3 - Article

VL - 3

SP - 173

EP - 188

JO - Big Data

JF - Big Data

SN - 2167-6461

IS - 3

ER -