### Abstract

We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use this insight to develop a new estimator. Numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ^{2}. Our method involves simultaneous estimation of T, of γ^{2}, and of the parameter λ in the Dirichlet prior; the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ^{2} ≤ 1. We also obtain confidence intervals for T and an estimate of the species’ distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford’s law; these show good performance of the new estimator, even beyond γ^{2} = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level.
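
For orientation, the following is a minimal sketch of the coverage-based estimator usually attributed to Chao and Lee, which the abstract takes as its starting point. It is not the new estimator of the paper, whose formulas are not given in this record. The form used here, T̂ = D/Ĉ + n(1 − Ĉ)/Ĉ · γ̂^{2} with Good-Turing coverage Ĉ = 1 − f₁/n, is the commonly cited one and is assumed rather than taken from the abstract; the function name `chao_lee_estimate` and its return layout are illustrative.

```python
from collections import Counter


def chao_lee_estimate(counts):
    """Coverage-based species-richness estimate (commonly cited Chao-Lee form).

    counts: iterable with the number of sampled individuals per observed species.
    Returns (t_hat, gamma2_hat, coverage_hat).
    """
    counts = [c for c in counts if c > 0]
    n = sum(counts)                      # sample size
    d = len(counts)                      # number of distinct species observed
    freq = Counter(counts)               # freq[i] = number of species seen exactly i times
    f1 = freq.get(1, 0)                  # number of singletons
    if n < 2 or f1 == n:
        raise ValueError("need n >= 2 and at least one species seen more than once")

    coverage = 1.0 - f1 / n              # Good-Turing estimate of sample coverage
    t0 = d / coverage                    # estimate under equal species abundances

    # Estimated normalized interspecies variance, truncated at zero.
    pair_sum = sum(i * (i - 1) * fi for i, fi in freq.items())
    gamma2 = max(t0 * pair_sum / (n * (n - 1)) - 1.0, 0.0)

    # Correction for heterogeneous abundances.
    t_hat = t0 + n * (1.0 - coverage) / coverage * gamma2
    return t_hat, gamma2, coverage


if __name__ == "__main__":
    # Hypothetical sample: six observed species with these individual counts.
    print(chao_lee_estimate([5, 3, 2, 1, 1, 1]))
```

When every observed species appears the same number of times and there are no singletons, γ̂^{2} is truncated to zero and the estimate reduces to the observed number of species.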

Original language | English (US) |
---|---|
Pages (from-to) | 80-100 |
Number of pages | 21 |
Journal | Sankhya A |
Volume | 74 |
Issue number | 1 |
DOIs | https://doi.org/10.1007/s13171-012-0012-x |
State | Published - Feb 1 2012 |

### ASJC Scopus subject areas

- Statistics, Probability and Uncertainty
- Statistics and Probability

### Cite this

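Cecconi, L., Gandolfi, A., & Sastri, C. C. A. (2012). A new estimator for the number of species in a population. *Sankhya A*, *74*(1), 80-100. https://doi.org/10.1007/s13171-012-0012-x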

**A new estimator for the number of species in a population.** / Cecconi, Lorenzo; Gandolfi, Alberto; Sastri, Chelluri C.A.

Research output: Contribution to journal › Article

TY - JOUR

T1 - A new estimator for the number of species in a population

AU - Cecconi, Lorenzo

AU - Gandolfi, Alberto

AU - Sastri, Chelluri C.A.

PY - 2012/2/1

Y1 - 2012/2/1

AB - We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use such clarification to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ^{2}. Our method involves simultaneous estimation of T, γ^{2}, and of the parameter λ in the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior which imposes the restriction γ^{2} ≤ 1. We also obtain confidence intervals for T and an estimation of the species’ distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford’s law, showing good performances of the new estimator, even beyond γ^{2} = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level.

UR - http://www.scopus.com/inward/record.url?scp=85034584276&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85034584276&partnerID=8YFLogxK

U2 - 10.1007/s13171-012-0012-x

DO - 10.1007/s13171-012-0012-x

M3 - Article

VL - 74

SP - 80

EP - 100

JO - Sankhya A

JF - Sankhya A

SN - 0976-836X

IS - 1

ER -