Measurement of inter-rater agreement for transient events using Monte Carlo sampled permutations

Robert G. Norman, Marc A. Scott

Research output: Contribution to journal (Article)

Abstract

In this paper we demonstrate the adverse effect of serially observed data sequences containing transient events on the calculation of Cohen's κ as an index of inter-rater agreement in the detection of these events. We develop and use a Monte-Carlo-based permutation technique to produce an empiric distribution of κ in the presence of serial dependence. We find that the empiric confidence intervals for κ tend to be wider than parametrically derived intervals and in the case of longer event lengths, are markedly so. We evaluate the effect of number and length of events, and further, describe and evaluate three permutation methods which match specific rating situations. Finally, we apply these techniques to the measurement of inter-rater agreement for sleep disordered breathing events, a transient event identified during nocturnal polysomnography, for which traditionally computed confidence intervals for κ are incorrect.
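The abstract does not spell out the three permutation methods, so the following is only a minimal sketch of the general idea, under stated assumptions: each rater's scoring is a binary sequence (1 = event present in an epoch), events are maximal runs of 1s, and a permuted sequence is built by shuffling the event lengths and the gap lengths separately so that serial structure within events is preserved. The function names (`cohen_kappa`, `runs`, `permute_runs`, `kappa_null_distribution`, `central_interval`) are hypothetical, and the resulting distribution is a chance-agreement reference distribution, which may differ from the interval construction used in the paper.

```python
import random


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length binary (0/1) sequences."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when both raters are constant
    return (p_obs - p_exp) / (1 - p_exp)


def runs(seq):
    """Split a sequence into maximal runs: [[value, length], ...]."""
    out = []
    for v in seq:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return out


def permute_runs(seq, rng):
    """Assumed scheme: shuffle event lengths and gap lengths separately,
    then re-interleave them, keeping every event intact."""
    r = runs(seq)
    events = [length for v, length in r if v == 1]
    gaps = [length for v, length in r if v == 0]
    rng.shuffle(events)
    rng.shuffle(gaps)
    first = r[0][0]
    out = []
    for i in range(len(r)):
        value = first if i % 2 == 0 else 1 - first
        length = events.pop() if value == 1 else gaps.pop()
        out.extend([value] * length)
    return out


def kappa_null_distribution(a, b, n_perm=1000, seed=0):
    """Monte Carlo sample of kappa with rater b's events randomly relocated."""
    rng = random.Random(seed)
    return sorted(cohen_kappa(a, permute_runs(b, rng)) for _ in range(n_perm))


def central_interval(sorted_dist, alpha=0.05):
    """Empirical central (1 - alpha) interval from a sorted sample."""
    k = int(len(sorted_dist) * alpha / 2)
    return sorted_dist[k], sorted_dist[len(sorted_dist) - 1 - k]


if __name__ == "__main__":
    rater_a = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
    rater_b = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
    dist = kappa_null_distribution(rater_a, rater_b, n_perm=2000, seed=1)
    print("observed kappa:", cohen_kappa(rater_a, rater_b))
    print("central 95% of permuted kappa:", central_interval(dist))
```

Permuting whole runs rather than individual epochs is the design choice that matters here: shuffling single epochs would destroy the serial dependence that the abstract identifies as the reason parametric intervals are too narrow.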

Original language: English (US)
Pages (from-to): 931-942
Number of pages: 12
Journal: Statistics in Medicine
Volume: 26
Issue number: 4
DOI: 10.1002/sim.2568
State: Published - Feb 20 2007

Keywords

  • Inter-rater agreement
  • Monte Carlo
  • Permutation test
  • Transient events

ASJC Scopus subject areas

  • Epidemiology

Cite this

Measurement of inter-rater agreement for transient events using Monte Carlo sampled permutations. / Norman, Robert G.; Scott, Marc A.

In: Statistics in Medicine, Vol. 26, No. 4, 20.02.2007, p. 931-942.

@article{c269e15637be44d59b1983871418a75a,
title = "Measurement of inter-rater agreement for transient events using Monte Carlo sampled permutations",
abstract = "In this paper we demonstrate the adverse effect of serially observed data sequences containing transient events on the calculation of Cohen's κ as an index of inter-rater agreement in the detection of these events. We develop and use a Monte-Carlo-based permutation technique to produce an empiric distribution of κ in the presence of serial dependence. We find that the empiric confidence intervals for κ tend to be wider than parametrically derived intervals and in the case of longer event lengths, are markedly so. We evaluate the effect of number and length of events, and further, describe and evaluate three permutation methods which match specific rating situations. Finally, we apply these techniques to the measurement of inter-rater agreement for sleep disordered breathing events, a transient event identified during nocturnal polysomnography, for which traditionally computed confidence intervals for κ are incorrect.",
keywords = "Inter-rater agreement, Monte Carlo, Permutation test, Transient events",
author = "Norman, {Robert G.} and Scott, {Marc A.}",
year = "2007",
month = "2",
day = "20",
doi = "10.1002/sim.2568",
language = "English (US)",
volume = "26",
pages = "931--942",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "4",
}

TY - JOUR

T1 - Measurement of inter-rater agreement for transient events using Monte Carlo sampled permutations

AU - Norman, Robert G.

AU - Scott, Marc A.

PY - 2007/2/20

Y1 - 2007/2/20

N2 - In this paper we demonstrate the adverse effect of serially observed data sequences containing transient events on the calculation of Cohen's κ as an index of inter-rater agreement in the detection of these events. We develop and use a Monte-Carlo-based permutation technique to produce an empiric distribution of κ in the presence of serial dependence. We find that the empiric confidence intervals for κ tend to be wider than parametrically derived intervals and in the case of longer event lengths, are markedly so. We evaluate the effect of number and length of events, and further, describe and evaluate three permutation methods which match specific rating situations. Finally, we apply these techniques to the measurement of inter-rater agreement for sleep disordered breathing events, a transient event identified during nocturnal polysomnography, for which traditionally computed confidence intervals for κ are incorrect.

KW - Inter-rater agreement

KW - Monte Carlo

KW - Permutation test

KW - Transient events

UR - http://www.scopus.com/inward/record.url?scp=33846815145&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33846815145&partnerID=8YFLogxK

U2 - 10.1002/sim.2568

DO - 10.1002/sim.2568

M3 - Article

C2 - 16612834

AN - SCOPUS:33846815145

VL - 26

SP - 931

EP - 942

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 4

ER -