### Abstract

A time to event, $X$, is left-truncated by $T$ if $X$ can be observed only if $T<X$. This often results in oversampling of large values of $X$, and necessitates adjustment of estimation procedures to avoid bias. Simple risk-set adjustments can be made to standard risk-set-based estimators to accommodate left truncation when $T$ and $X$ are quasi-independent. We derive a weaker factorization condition for the conditional distribution of $T$ given $X$ in the observable region that permits risk-set adjustment for estimation of the distribution of $X$, but not of the distribution of $T$. Quasi-independence results when the analogous factorization condition for $X$ given $T$ holds also, in which case the distributions of $X$ and $T$ are easily estimated. While we can test for factorization, if the test does not reject, we cannot identify which factorization condition holds, or whether quasi-independence holds. Hence we require an unverifiable assumption in order to estimate the distribution of $X$ or $T$ based on truncated data. This contrasts with the common understanding that truncation is different from censoring in requiring no unverifiable assumptions for estimation. We illustrate these concepts through a simulation of left-truncated and right-censored data.

Original language | English (US) |
---|---|
Pages (from-to) | 724-731 |
Number of pages | 8 |
Journal | Biometrika |
Volume | 106 |
Issue number | 3 |
State | Published - Sep 1 2019 |

### Keywords

- Censoring (Statistics)
- Databases
- Data
- Constant-sum condition
- Kendall's tau
- Left truncation
- Right censoring
- Survival data

### Cite this

**Nonidentifiability in the presence of factorization for truncated data.** / Vakulenko-Lagun, Bella; Qian, J; Chiou, S H; Betensky, R A. In: *Biometrika*, vol. 106, no. 3, 2019, pp. 724-731.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Nonidentifiability in the presence of factorization for truncated data.

AU - Vakulenko-Lagun, Bella

AU - Qian, J

AU - Chiou, S H

AU - Betensky, R A

PY - 2019/9/1

Y1 - 2019/9/1

N2 - A time to event, $X$, is left-truncated by $T$ if $X$ can be observed only if $T<X$. This often results in oversampling of large values of $X$, and necessitates adjustment of estimation procedures to avoid bias. Simple risk-set adjustments can be made to standard risk-set-based estimators to accommodate left truncation when $T$ and $X$ are quasi-independent. We derive a weaker factorization condition for the conditional distribution of $T$ given $X$ in the observable region that permits risk-set adjustment for estimation of the distribution of $X$, but not of the distribution of $T$. Quasi-independence results when the analogous factorization condition for $X$ given $T$ holds also, in which case the distributions of $X$ and $T$ are easily estimated. While we can test for factorization, if the test does not reject, we cannot identify which factorization condition holds, or whether quasi-independence holds. Hence we require an unverifiable assumption in order to estimate the distribution of $X$ or $T$ based on truncated data. This contrasts with the common understanding that truncation is different from censoring in requiring no unverifiable assumptions for estimation. We illustrate these concepts through a simulation of left-truncated and right-censored data.

KW - Censoring (Statistics)

KW - Databases

KW - Data

KW - Constant-sum condition

KW - Kendall's tau

KW - Left truncation

KW - Right censoring

KW - Survival data

M3 - Article

VL - 106

SP - 724

EP - 731

JO - Biometrika

JF - Biometrika

SN - 0006-3444

IS - 3

ER -
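
The abstract's first two claims, that left truncation oversamples large values of $X$ and that a risk-set adjustment removes the resulting bias under quasi-independence, can be illustrated with a small simulation. The sketch below is not the authors' simulation: the distributions ($X$ exponential, $T$ uniform with a point mass at zero), the function names, and the use of a product-limit estimator with delayed entry are all assumptions chosen for illustration. It covers only the independent, uncensored case and does not implement the paper's factorization conditions or tests.

```python
import numpy as np


def simulate_left_truncated(n, rng):
    """Draw independent (T, X) pairs and keep only pairs with T < X.

    Assumed setup (not from the paper): X ~ Exp(1) and
    T = max(U, 0) with U ~ Uniform(-1, 2), so about one third of
    subjects enter at time 0 and early risk sets are not empty.
    """
    x = rng.exponential(scale=1.0, size=n)
    t = np.maximum(rng.uniform(-1.0, 2.0, size=n), 0.0)
    keep = t < x  # left truncation: X is observed only if T < X
    return t[keep], x[keep]


def naive_survival(x, u):
    """Empirical P(X > u) that ignores truncation."""
    return float(np.mean(x > u))


def risk_set_adjusted_survival(t, x, u):
    """Product-limit estimate of P(X > u) with delayed-entry risk sets.

    Subject j is at risk at event time x_i only if t_j < x_i <= x_j.
    Assumes no censoring and distinct event times.
    """
    x_sorted = np.sort(x)
    t_sorted = np.sort(t)
    # Number of subjects who have entered strictly before each event time.
    entered = np.searchsorted(t_sorted, x_sorted, side="left")
    # Number of subjects whose event occurred strictly earlier.
    exited = np.arange(x_sorted.size)
    at_risk = entered - exited  # always >= 1: subject i is in its own risk set
    factors = 1.0 - 1.0 / at_risk[x_sorted <= u]
    return float(np.prod(factors))  # empty product is 1.0


rng = np.random.default_rng(2019)
t_obs, x_obs = simulate_left_truncated(20_000, rng)
u = 1.0
print("true S(1)          :", np.exp(-u))
print("naive estimate     :", naive_survival(x_obs, u))
print("risk-set adjusted  :", risk_set_adjusted_survival(t_obs, x_obs, u))
```

Under these assumptions the naive estimate should sit well above the true value of $e^{-1}$, while the adjusted estimate should be close to it. The point mass of $T$ at zero is a design choice: without it, the risk set at the earliest observed event can contain a single subject, which forces the product-limit estimate to zero.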