Nonidentifiability in the presence of factorization for truncated data.

Bella Vakulenko-Lagun, J Qian, S H Chiou, R A Betensky

Research output: Contribution to journal › Article

Abstract

A time to event, $X$, is left-truncated by $T$ if $X$ can be observed only if $T<X$. This often results in oversampling of large values of $X$, and necessitates adjustment of estimation procedures to avoid bias. Simple risk-set adjustments can be made to standard risk-set-based estimators to accommodate left truncation when $T$ and $X$ are quasi-independent. We derive a weaker factorization condition for the conditional distribution of $T$ given $X$ in the observable region that permits risk-set adjustment for estimation of the distribution of $X$, but not of the distribution of $T$. Quasi-independence results when the analogous factorization condition for $X$ given $T$ also holds, in which case the distributions of $X$ and $T$ are easily estimated. While we can test for factorization, if the test does not reject, we cannot identify which factorization condition holds, or whether quasi-independence holds. Hence we require an unverifiable assumption in order to estimate the distribution of $X$ or $T$ based on truncated data. This contrasts with the common understanding that truncation is different from censoring in requiring no unverifiable assumptions for estimation. We illustrate these concepts through a simulation of left-truncated and right-censored data.
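
The risk-set adjustment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation: the distributions below (an exponential $X$, and a $T$ that is zero for about half the subjects and uniform on (0, 1) otherwise) are assumptions chosen only for illustration, $T$ and $X$ are independent by construction, and right censoring is omitted. The function names are hypothetical. The sketch compares the naive proportion of observed subjects with $X > 1$ against a product-limit estimate in which a subject is counted as at risk at time $u$ only if $T \le u \le X$.

```python
import numpy as np


def simulate_left_truncated(n, seed):
    """Draw n independent (T, X) pairs and keep only those with T < X.

    Assumed, illustrative distributions: X ~ Exponential(1);
    T = max(0, U) with U ~ Uniform(-1, 1), so roughly half of the
    subjects enter at time zero and early risk sets are not empty.
    """
    rng = np.random.default_rng(seed)
    x = rng.exponential(1.0, n)
    t = np.maximum(0.0, rng.uniform(-1.0, 1.0, n))
    observed = t < x  # left truncation: the pair is seen only if T < X
    return t[observed], x[observed]


def risk_set_adjusted_survival(t_obs, x_obs, at):
    """Product-limit estimate of P(X > at) with delayed-entry risk sets.

    At each distinct event time u <= at, the risk set contains the
    subjects with T <= u <= X, i.e. those already under observation.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    surv = 1.0
    for u in np.unique(x_obs[x_obs <= at]):
        at_risk = np.sum((t_obs <= u) & (x_obs >= u))
        events = np.sum(x_obs == u)
        surv *= 1.0 - events / at_risk
    return float(surv)


t_obs, x_obs = simulate_left_truncated(20000, seed=0)
naive = float(np.mean(x_obs > 1.0))  # ignores truncation, oversamples large X
adjusted = risk_set_adjusted_survival(t_obs, x_obs, at=1.0)
print(f"true P(X > 1)      = {np.exp(-1.0):.3f}")
print(f"naive estimate     = {naive:.3f}")
print(f"risk-set adjusted  = {adjusted:.3f}")
```

Under these assumed distributions the naive proportion should overstate P(X > 1), while the adjusted estimate should fall close to exp(-1). The adjustment is justified here only because independence was built into the simulation; the abstract's point is that with real truncated data this kind of condition cannot be fully verified.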
Original language: English (US)
Pages (from-to): 724-731
Number of pages: 8
Journal: Biometrika
Volume: 106
Issue number: 3
State: Published - Sep 1 2019

Keywords

  • Censoring (Statistics)
  • Databases
  • Data
  • Constant-sum condition
  • Kendall's tau
  • Left truncation
  • Right censoring
  • Survival data

Cite this

Vakulenko-Lagun, B., Qian, J., Chiou, S. H., & Betensky, R. A. (2019). Nonidentifiability in the presence of factorization for truncated data. Biometrika, 106(3), 724-731.

@article{c72689c5a9ce4d7cb4a6568e012a00a7,
title = "Nonidentifiability in the presence of factorization for truncated data",
keywords = "Censoring (Statistics), Databases, Data, Constant-sum condition, Kendall's tau, Left truncation, Right censoring, Survival data",
author = "Bella Vakulenko-Lagun and J Qian and Chiou, {S H} and Betensky, {R A}",
year = "2019",
month = "9",
day = "1",
language = "English (US)",
volume = "106",
pages = "724--731",
journal = "Biometrika",
issn = "0006-3444",
publisher = "Oxford University Press",
number = "3"
}
