Discovering anomalous patterns in large digital pathology images

Sriram Somanchi, Daniel Neill, Anil V. Parwani

Research output: Contribution to journal › Article

Abstract

Advances in medical imaging technology have created opportunities for computer-aided diagnostic tools to assist human practitioners in identifying relevant patterns in massive, multiscale digital pathology slides. This work presents Hierarchical Linear Time Subset Scanning, a novel statistical method for pattern detection. Hierarchical Linear Time Subset Scanning exploits the hierarchical structure inherent in data produced through virtual microscopy in order to accurately and quickly identify regions of interest for pathologists to review. We take a digital image at various resolution levels, identify the most anomalous regions at a coarse level, and continue to analyze the data at increasingly granular resolutions until we accurately identify its most anomalous subregions. We demonstrate the performance of our novel method in identifying cancerous locations on digital slides of prostate biopsy samples and show that our methods detect regions of cancer in minutes with high accuracy, both as measured by the ROC curve (measuring ability to distinguish between benign and cancerous slides) and by the spatial precision-recall curve (measuring ability to pick out the malignant areas on a slide that contains cancer). Existing methods need small-scale images (small areas of a slide preselected by the pathologist for analysis, e.g., 32 × 32 pixels) and may not work effectively on large, raw digitized images of size 100K × 100K pixels. In this work, we provide a methodology to fill this significant gap by analyzing large digitized images and identifying regions of interest that may be indicative of cancer.
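The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hierarchical Linear Time Subset Scanning: the scoring rule below (a z-score of each block's mean against the whole-image mean) is a simple stand-in chosen for illustration, and the function names `block_mean` and `coarse_to_fine`, the parameters `coarse`, `min_size` and `z`, and the plain list-of-lists image are all hypothetical. The sketch assumes image dimensions divisible by `coarse`, and `coarse` equal to `min_size` times a power of two.

```python
import statistics


def block_mean(img, r, c, size):
    """Mean of the size x size block whose top-left corner is (r, c)."""
    return statistics.fmean(v for row in img[r:r + size] for v in row[c:c + size])


def coarse_to_fine(img, coarse=16, min_size=2, z=3.0):
    """Illustrative coarse-to-fine search (stand-in scoring, not the paper's).

    Blocks are scored at the coarsest level first; only blocks whose score
    exceeds `z` are split into four children and re-scored at the next level.
    Returns (surviving block corners, final block size, blocks evaluated).
    """
    h, w = len(img), len(img[0])
    flat = [v for row in img for v in row]
    mu, sigma = statistics.fmean(flat), statistics.pstdev(flat)
    size = coarse
    if sigma == 0:  # constant image: nothing can stand out
        return [], size, 0
    candidates = [(r, c) for r in range(0, h, size) for c in range(0, w, size)]
    evaluated = 0
    while True:
        survivors = []
        for r, c in candidates:
            evaluated += 1
            # sigma / size approximates the standard error of a block mean,
            # assuming independent pixels (an assumption of this sketch).
            score = (block_mean(img, r, c, size) - mu) / (sigma / size)
            if score > z:
                survivors.append((r, c))
        if size <= min_size or not survivors:
            break
        half = size // 2
        # Refine: only children of surviving blocks are examined next.
        candidates = [(r + dr, c + dc)
                      for r, c in survivors
                      for dr in (0, half)
                      for dc in (0, half)]
        size = half
    return survivors, size, evaluated
```

On a 64 × 64 image with a single elevated 16 × 16 block-aligned region, a call such as `coarse_to_fine(img, coarse=16, min_size=2)` would score on the order of a hundred blocks instead of all 1,024 blocks of size 2 × 2, which is the kind of saving that pruning at coarse levels is meant to provide.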

Original language: English (US)
Pages (from-to): 3599-3615
Number of pages: 17
Journal: Statistics in Medicine
Volume: 37
Issue number: 25
DOI: 10.1002/sim.7828
State: Published - Nov 10 2018

Keywords

  • anomalous pattern detection
  • pathology informatics
  • prostate cancer
  • subset scanning

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

Discovering anomalous patterns in large digital pathology images. / Somanchi, Sriram; Neill, Daniel; Parwani, Anil V.

In: Statistics in Medicine, Vol. 37, No. 25, 10.11.2018, p. 3599-3615.

@article{30de55af10484ee0b559726cd314b502,
title = "Discovering anomalous patterns in large digital pathology images",
keywords = "anomalous pattern detection, pathology informatics, prostate cancer, subset scanning",
author = "Somanchi, Sriram and Neill, Daniel and Parwani, {Anil V.}",
year = "2018",
month = "11",
day = "10",
doi = "10.1002/sim.7828",
language = "English (US)",
volume = "37",
pages = "3599--3615",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "25"
}

TY - JOUR
T1 - Discovering anomalous patterns in large digital pathology images
AU - Somanchi, Sriram
AU - Neill, Daniel
AU - Parwani, Anil V.
PY - 2018/11/10
Y1 - 2018/11/10
KW - anomalous pattern detection
KW - pathology informatics
KW - prostate cancer
KW - subset scanning
UR - http://www.scopus.com/inward/record.url?scp=85054668922&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85054668922&partnerID=8YFLogxK
U2 - 10.1002/sim.7828
DO - 10.1002/sim.7828
M3 - Article
C2 - 29900578
AN - SCOPUS:85054668922
VL - 37
SP - 3599
EP - 3615
JO - Statistics in Medicine
JF - Statistics in Medicine
SN - 0277-6715
IS - 25
ER -