Quantifying predictability through information theory: Small sample estimation in a non-Gaussian framework

Kyle Haven, Andrew Majda, Rafail Abramov

Research output: Contribution to journal › Article

Abstract

Many situations in complex systems require quantitative estimates of the lack of information in one probability distribution relative to another. In short-term climate and weather prediction, examples of these issues might involve the lack of information in the historical climate record compared with an ensemble prediction, or the lack of information in a particular Gaussian ensemble prediction strategy involving the first and second moments compared with the non-Gaussian ensemble itself. The relative entropy is a natural way to quantify the predictive utility in this information, and recently a systematic, computationally feasible hierarchical framework has been developed. In practical systems with many degrees of freedom, computational overhead limits ensemble predictions to relatively small sample sizes. Here the notion of predictive utility, in a relative entropy framework, is extended to small random samples through the definition of a sample utility, a measure of the unlikeliness that a random sample was produced by a given prediction strategy. The sample utility is the minimum predictability implied by the data at a given statistical level of confidence. Two practical algorithms for measuring such a sample utility are developed here. The first technique is based on the statistical method of null-hypothesis testing, while the second is based upon a central limit theorem for the relative entropy of moment-based probability densities. These techniques are tested on known probability densities with parameterized bimodality and skewness, and then applied to the Lorenz '96 model, a recently developed "toy" climate model with chaotic dynamics mimicking the atmosphere. The results show detection of non-Gaussian tendencies of prediction densities at small ensemble sizes of between 50 and 100 members, with a 95% confidence level.
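To make the first technique concrete, the sketch below illustrates the general idea in Python: fit a Gaussian to a small ensemble from its first two moments, estimate the relative entropy (Kullback-Leibler divergence) of the sample against that fit with a simple histogram estimator, and obtain a significance level by comparing against surrogate ensembles drawn from the Gaussian itself. The function names, the histogram estimator, and all numerical choices here are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal sketch (illustrative, not the paper's exact algorithm) of a
# null-hypothesis test for non-Gaussianity based on relative entropy.
import numpy as np
from scipy.stats import norm

def relative_entropy_vs_gaussian(sample, n_bins=12):
    """Histogram estimate of the relative entropy between the empirical
    density of `sample` and the Gaussian sharing its first two moments."""
    mu, sigma = sample.mean(), sample.std(ddof=1)
    edges = np.linspace(mu - 4 * sigma, mu + 4 * sigma, n_bins + 1)
    counts, _ = np.histogram(sample, bins=edges)
    p = counts / counts.sum()                        # empirical bin masses
    q = np.diff(norm.cdf(edges, loc=mu, scale=sigma))
    q /= q.sum()                                     # Gaussian bin masses
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def gaussianity_pvalue(sample, n_null=2000, seed=0):
    """p-value: fraction of same-size Gaussian surrogate ensembles whose
    relative entropy exceeds the observed one. A small p-value means the
    sample is unlikely under the Gaussian prediction strategy."""
    rng = np.random.default_rng(seed)
    d_obs = relative_entropy_vs_gaussian(sample)
    mu, sigma, n = sample.mean(), sample.std(ddof=1), sample.size
    d_null = np.array([relative_entropy_vs_gaussian(rng.normal(mu, sigma, n))
                       for _ in range(n_null)])
    return d_obs, float(np.mean(d_null >= d_obs))

# Example: a bimodal 'prediction ensemble' of 100 members.
rng = np.random.default_rng(42)
ensemble = np.concatenate([rng.normal(-2.0, 1.0, 50),
                           rng.normal(2.0, 1.0, 50)])
d, p = gaussianity_pvalue(ensemble)
print(f"D(sample || Gaussian fit) = {d:.3f}, p-value = {p:.3f}")
# p < 0.05 would reject Gaussianity at the 95% confidence level quoted above.
```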
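The Lorenz '96 test bed mentioned in the abstract is likewise simple enough to sketch. The tendency below follows the standard formulation dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F with cyclic indices; the 40-variable, F = 8 configuration, the RK4 integrator, and the ensemble-generation details are common illustrative choices rather than the paper's exact experimental setup.

```python
# Minimal sketch of the Lorenz '96 model:
#   dx_j/dt = (x_{j+1} - x_{j-2}) * x_{j-1} - x_j + F,  j cyclic.
import numpy as np

def l96_tendency(x, F=8.0):
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Spin a state onto the attractor, then build a small prediction ensemble
# by perturbing it; each member's coordinates can be fed to a
# non-Gaussianity test such as the one sketched above.
rng = np.random.default_rng(0)
x = 8.0 + 0.01 * rng.standard_normal(40)
for _ in range(1000):                     # spin-up
    x = rk4_step(x)
members = [x + 1e-3 * rng.standard_normal(40) for _ in range(100)]
for _ in range(200):                      # forward prediction
    members = [rk4_step(m) for m in members]
ensemble = np.array(members)
print(ensemble[:, 0].mean(), ensemble[:, 0].std(ddof=1))
```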

Original language: English (US)
Pages (from-to): 334-362
Number of pages: 29
Journal: Journal of Computational Physics
Volume: 206
Issue number: 1
DOIs: 10.1016/j.jcp.2004.12.008
State: Published - Jun 10 2005

Fingerprint

  • information theory
  • predictions
  • entropy
  • climate
  • climate models
  • confidence
  • null hypothesis
  • moments
  • skewness
  • complex systems
  • weather
  • probability distributions
  • large scale systems
  • statistical methods
  • tendencies
  • theorems
  • degrees of freedom

ASJC Scopus subject areas

  • Computer Science Applications
  • Physics and Astronomy (miscellaneous)

Cite this

Quantifying predictability through information theory: Small sample estimation in a non-Gaussian framework. / Haven, Kyle; Majda, Andrew; Abramov, Rafail.

In: Journal of Computational Physics, Vol. 206, No. 1, 10.06.2005, p. 334-362.

Research output: Contribution to journal › Article

@article{92dd1e9e7d72442d8feb75ca252ee221,
title = "Quantifying predictability through information theory: Small sample estimation in a non-Gaussian framework",
abstract = "Many situations in complex systems require quantitative estimates of the lack of information in one probability distribution relative to another. In short term climate and weather prediction, examples of these issues might involve the lack of information in the historical climate record compared with an ensemble prediction, or the lack of information in a particular Gaussian ensemble prediction strategy involving the first and second moments compared with the non-Gaussian ensemble itself. The relative entropy is a natural way to quantify the predictive utility in this information, and recently a systematic computationally feasible hierarchical framework has been developed. In practical systems with many degrees of freedom, computational overhead limits ensemble predictions to relatively small sample sizes. Here the notion of predictive utility, in a relative entropy framework, is extended to small random samples by the definition of a sample utility, a measure of the unlikeliness that a random sample was produced by a given prediction strategy. The sample utility is the minimum predictability, with a statistical level of confidence, which is implied by the data. Two practical algorithms for measuring such a sample utility are developed here. The first technique is based on the statistical method of null-hypothesis testing, while the second is based upon a central limit theorem for the relative entropy of moment-based probability densities. These techniques are tested on known probability densities with parameterized bimodality and skewness, and then applied to the Lorenz '96 model, a recently developed {"}toy{"} climate model with chaotic dynamics mimicking the atmosphere. The results show a detection of non-Gaussian tendencies of prediction densities at small ensemble sizes with between 50 and 100 members, with a 95{\%} confidence level.",
author = "Kyle Haven and Andrew Majda and Rafail Abramov",
year = "2005",
month = "6",
day = "10",
doi = "10.1016/j.jcp.2004.12.008",
language = "English (US)",
volume = "206",
pages = "334--362",
journal = "Journal of Computational Physics",
issn = "0021-9991",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Quantifying predictability through information theory

T2 - Small sample estimation in a non-Gaussian framework

AU - Haven, Kyle

AU - Majda, Andrew

AU - Abramov, Rafail

PY - 2005/6/10

Y1 - 2005/6/10

UR - http://www.scopus.com/inward/record.url?scp=25444532012&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=25444532012&partnerID=8YFLogxK

U2 - 10.1016/j.jcp.2004.12.008

DO - 10.1016/j.jcp.2004.12.008

M3 - Article

AN - SCOPUS:25444532012

VL - 206

SP - 334

EP - 362

JO - Journal of Computational Physics

JF - Journal of Computational Physics

SN - 0021-9991

IS - 1

ER -