### Abstract

We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can be obtained by Bayesian methods with a Dirichlet prior, and then use this observation to develop a new estimator. Numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ^{2}. Our method involves simultaneous estimation of T, γ^{2}, and the parameter λ of the Dirichlet prior; the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ^{2} ≤ 1. We also obtain confidence intervals for T and an estimate of the species' distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford's law; these show good performance of the new estimator, even beyond γ^{2} = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level. AMS (2000) subject classification. Primary 62G05; Secondary 62P10, 62P30, 62P35.
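For orientation, the Chao-Lee estimator mentioned in the abstract can be sketched in a few lines. The code below uses the commonly cited coverage-based form, T̂ = D/Ĉ + n(1 − Ĉ)/Ĉ · γ̂^{2}, where D is the number of distinct species observed, n the sample size, and Ĉ = 1 − f_1/n the Good-Turing coverage estimate (f_i is the number of species seen exactly i times). This is an illustrative sketch under those assumptions, not the new estimator developed in the article, and the function name `chao_lee_estimate` is hypothetical.

```python
from collections import Counter


def chao_lee_estimate(sample):
    """Return (t_hat, coverage_hat, gamma2_hat) for a list of species labels.

    Assumed form (commonly cited Chao-Lee, coverage-based):
        t_hat = D / C + n * (1 - C) / C * gamma2
    This is NOT the new estimator proposed in the article.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")

    counts = Counter(sample)          # species label -> times observed
    d = len(counts)                   # D: distinct species observed
    freq = Counter(counts.values())   # i -> f_i, species seen exactly i times

    f1 = freq.get(1, 0)
    coverage = 1.0 - f1 / n           # Good-Turing sample coverage estimate
    if coverage == 0.0:
        raise ValueError("all species are singletons; coverage estimate is 0")

    t_equal = d / coverage            # estimate when all species are equally likely

    # Estimate of the normalized interspecies variance gamma^2, truncated at 0.
    s = sum(i * (i - 1) * f for i, f in freq.items())
    gamma2 = max(t_equal * s / (n * (n - 1)) - 1.0, 0.0)

    t_hat = t_equal + n * (1.0 - coverage) / coverage * gamma2
    return t_hat, coverage, gamma2


if __name__ == "__main__":
    observed = ["a"] * 3 + ["b"] * 2 + ["c", "d"]
    t_hat, coverage, gamma2 = chao_lee_estimate(observed)
    print(f"T_hat={t_hat:.3f}  coverage={coverage:.3f}  gamma2={gamma2:.3f}")
```

When γ̂^{2} is truncated to 0 the estimate reduces to D/Ĉ, the value appropriate for equally abundant species; larger γ̂^{2} inflates the estimate to account for rare species that the sample is likely to have missed.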

Original language | English (US) |
---|---|
Pages (from-to) | 80-100 |
Number of pages | 21 |
Journal | Sankhya: The Indian Journal of Statistics |
Volume | 74 |
Issue number | 1 A |
DOIs | https://doi.org/10.1007/s13171-012-0012-x |
State | Published - Dec 1 2012 |

### Keywords

- Bayesian posterior
- Confidence interval
- Dirichlet prior
- Point estimator
- Simple random sample
- Unobserved probability
- Unobserved species

### ASJC Scopus subject areas

- Statistics, Probability and Uncertainty
- Statistics and Probability

### Cite this

Cecconi, Lorenzo; Gandolfi, Alberto; Sastri, Chelluri C.A. (2012). **A new estimator for the number of species in a population.** *Sankhya: The Indian Journal of Statistics*, *74*(1 A), 80-100. https://doi.org/10.1007/s13171-012-0012-x

Research output: Contribution to journal › Article

TY - JOUR

T1 - A new estimator for the number of species in a population

AU - Cecconi, Lorenzo

AU - Gandolfi, Alberto

AU - Sastri, Chelluri C.A.

PY - 2012/12/1

Y1 - 2012/12/1

KW - Bayesian posterior

KW - Confidence interval

KW - Dirichlet prior

KW - Point estimator

KW - Simple random sample

KW - Unobserved probability

KW - Unobserved species

UR - http://www.scopus.com/inward/record.url?scp=84872131633&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872131633&partnerID=8YFLogxK

U2 - 10.1007/s13171-012-0012-x

DO - 10.1007/s13171-012-0012-x

M3 - Article

AN - SCOPUS:84872131633

VL - 74

SP - 80

EP - 100

JO - Sankhya: The Indian Journal of Statistics

JF - Sankhya: The Indian Journal of Statistics

SN - 0972-7671

IS - 1 A

ER -