A new estimator for the number of species in a population

Lorenzo Cecconi, Alberto Gandolfi, Chelluri C.A. Sastri

Research output: Contribution to journal › Article

Abstract

We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use this clarification to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ². Our method involves simultaneous estimation of T, γ², and the parameter λ in the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ² ≤ 1. We also obtain confidence intervals for T and an estimate of the species distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford's law, showing good performance of the new estimator, even beyond γ² = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level. AMS (2000) subject classification. Primary 62G05; Secondary 62P10, 62P30, 62P35.
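For orientation, the following is a minimal sketch of the classical coverage-based Chao-Lee point estimate that the abstract takes as its starting point, written in the form in which it is commonly stated in the literature. It is not the new estimator proposed in the article, whose formula the abstract does not give. The function name `chao_lee_estimate` and the variable names are illustrative assumptions, and the sample is assumed to be a flat list of species labels, one per sampled individual.

```python
from collections import Counter


def chao_lee_estimate(sample):
    """Classical Chao-Lee style estimate of the number of species T.

    `sample` is a list of species labels (assumed input format).
    Returns (t_hat, gamma2_hat, coverage_hat).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")

    # Abundance of each observed species, then "frequency of frequencies":
    # f[k] = number of species seen exactly k times.
    abundances = Counter(sample)
    d = len(abundances)                  # number of distinct species observed
    f = Counter(abundances.values())

    # Turing's estimate of sample coverage: 1 - (singletons / sample size).
    coverage = 1.0 - f.get(1, 0) / n
    if coverage == 0.0:
        raise ValueError("every species is a singleton; coverage estimate is 0")

    # Estimate under equal species probabilities (gamma^2 = 0).
    t0 = d / coverage

    # Estimated squared coefficient of variation of species probabilities,
    # truncated at zero.
    pair_sum = sum(k * (k - 1) * fk for k, fk in f.items())
    gamma2 = max(t0 * pair_sum / (n * (n - 1)) - 1.0, 0.0)

    # Correct the homogeneous estimate for heterogeneity.
    t_hat = t0 + n * (1.0 - coverage) / coverage * gamma2
    return t_hat, gamma2, coverage


if __name__ == "__main__":
    # Hypothetical toy sample: one dominant species and two singletons.
    toy = ["a"] * 6 + ["b", "c"]
    print(chao_lee_estimate(toy))
```

When the estimated γ² is zero the result reduces to the observed species count divided by the estimated coverage; larger estimated γ² pushes the estimate upward, which is the dependence on interspecies variance that the abstract discusses.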

Original language: English (US)
Pages (from-to): 80-100
Number of pages: 21
Journal: Sankhya: The Indian Journal of Statistics
Volume: 74
Issue number: 1 A
DOIs: 10.1007/s13171-012-0012-x
State: Published - Dec 1 2012

Keywords

  • Bayesian posterior
  • Confidence interval
  • Dirichlet prior
  • Point estimator
  • Simple random sample
  • Unobserved probability
  • Unobserved species

ASJC Scopus subject areas

  • Statistics, Probability and Uncertainty
  • Statistics and Probability

Cite this

A new estimator for the number of species in a population. / Cecconi, Lorenzo; Gandolfi, Alberto; Sastri, Chelluri C.A.

In: Sankhya: The Indian Journal of Statistics, Vol. 74, No. 1 A, 01.12.2012, p. 80-100.

@article{5b64279dedc34754b76c14b0629701eb,
title = "A new estimator for the number of species in a population",
abstract = "We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use this clarification to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ². Our method involves simultaneous estimation of T, γ², and the parameter λ in the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ² ≤ 1. We also obtain confidence intervals for T and an estimate of the species distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford's law, showing good performance of the new estimator, even beyond γ² = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level. AMS (2000) subject classification. Primary 62G05; Secondary 62P10, 62P30, 62P35.",
keywords = "Bayesian posterior, Confidence interval, Dirichlet prior, Point estimator, Simple random sample, Unobserved probability, Unobserved species",
author = "Lorenzo Cecconi and Alberto Gandolfi and Sastri, {Chelluri C.A.}",
year = "2012",
month = "12",
day = "1",
doi = "10.1007/s13171-012-0012-x",
language = "English (US)",
volume = "74",
pages = "80--100",
journal = "Sankhya: The Indian Journal of Statistics",
issn = "0972-7671",
publisher = "Indian Statistical Institute",
number = "1 A",

}

TY - JOUR

T1 - A new estimator for the number of species in a population

AU - Cecconi, Lorenzo

AU - Gandolfi, Alberto

AU - Sastri, Chelluri C.A.

PY - 2012/12/1

Y1 - 2012/12/1

AB - We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use this clarification to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ². Our method involves simultaneous estimation of T, γ², and the parameter λ in the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ² ≤ 1. We also obtain confidence intervals for T and an estimate of the species distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford's law, showing good performance of the new estimator, even beyond γ² = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level. AMS (2000) subject classification. Primary 62G05; Secondary 62P10, 62P30, 62P35.

KW - Bayesian posterior

KW - Confidence interval

KW - Dirichlet prior

KW - Point estimator

KW - Simple random sample

KW - Unobserved probability

KW - Unobserved species

UR - http://www.scopus.com/inward/record.url?scp=84872131633&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872131633&partnerID=8YFLogxK

U2 - 10.1007/s13171-012-0012-x

DO - 10.1007/s13171-012-0012-x

M3 - Article

AN - SCOPUS:84872131633

VL - 74

SP - 80

EP - 100

JO - Sankhya: The Indian Journal of Statistics

JF - Sankhya: The Indian Journal of Statistics

SN - 0972-7671

IS - 1 A

ER -