CAPRI: Efficient inference of cancer progression models from cross-sectional data

Daniele Ramazzotti, Giulio Caravagna, Loes Olde Loohuis, Alex Graudenzi, Ilya Korsunsky, Giancarlo Mauri, Marco Antoniotti, Bhubaneswar Mishra

Research output: Contribution to journal › Article

Abstract

We devise a novel inference algorithm to solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems.

Motivation: Several cancer-related genomic datasets have become available (e.g. The Cancer Genome Atlas, TCGA), typically involving hundreds of patients. At present, most of these data are aggregated in a cross-sectional fashion, providing all measurements at the time of diagnosis. Our goal is to infer cancer 'progression' models from such data. These models are represented as directed acyclic graphs (DAGs) of collections of 'selectivity' relations, where a mutation in a gene A 'selects' for a later mutation in a gene B. Gaining insight into the structure of such progressions has the potential to improve both the stratification of patients and personalized therapy choices.

Results: The CAPRI algorithm relies on a scoring method based on a probabilistic theory developed by Suppes, coupled with bootstrap and maximum likelihood inference. The resulting algorithm is efficient, achieves high accuracy, and has good complexity and convergence properties. CAPRI performs especially well in the presence of noise in the data and with limited sample sizes. Moreover, CAPRI, in contrast to other approaches, robustly reconstructs different types of confluent trajectories despite irregularities in the data. We also report on an ongoing investigation using CAPRI to study atypical Chronic Myeloid Leukemia, in which we uncovered non-trivial selectivity relations and exclusivity patterns among key genomic events.
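To make the 'selectivity' idea concrete, the following is a minimal sketch of the two conditions commonly associated with Suppes' probabilistic causation: temporal priority (the earlier event is more frequent, P(A) > P(B)) and probability raising (P(B|A) > P(B|not A)). It is an illustration under stated assumptions, not the authors' implementation: the function name `prima_facie_edges`, the dictionary-based sample encoding, and the toy data are all hypothetical, and the bootstrap and maximum likelihood steps mentioned in the abstract are omitted entirely.

```python
from itertools import permutations


def prima_facie_edges(samples, genes):
    """Return ordered pairs (a, b) where a passes both Suppes-style checks for b.

    `samples` is a list of dicts mapping gene name -> 0/1 (mutation absent/present).
    This encoding is an assumption made for the sketch.
    """
    n = len(samples)
    # Marginal frequency of each mutation across the cross-sectional samples.
    freq = {g: sum(s[g] for s in samples) / n for g in genes}
    edges = []
    for a, b in permutations(genes, 2):
        with_a = [s for s in samples if s[a]]
        without_a = [s for s in samples if not s[a]]
        if not with_a or not without_a:
            continue  # a conditional probability would be undefined
        p_b_given_a = sum(s[b] for s in with_a) / len(with_a)
        p_b_given_not_a = sum(s[b] for s in without_a) / len(without_a)
        temporal_priority = freq[a] > freq[b]
        probability_raising = p_b_given_a > p_b_given_not_a
        if temporal_priority and probability_raising:
            edges.append((a, b))
    return edges


# Toy cross-sectional data (fabricated for illustration only): B occurs only with A.
samples = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 1},
    {"A": 1, "B": 0},
    {"A": 0, "B": 0},
    {"A": 0, "B": 0},
]
edges = prima_facie_edges(samples, ["A", "B"])
print(edges)
```

On this toy input only the pair (A, B) passes both checks: B raises the probability of A as well, but B is the rarer event and so fails temporal priority. Pairs that pass are only candidate relations; a real analysis would still need statistical testing and model fitting on top of this filter.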

Original language: English (US)
Pages (from-to): 3016-3026
Number of pages: 11
Journal: Bioinformatics
Volume: 31
Issue number: 18
DOI: https://doi.org/10.1093/bioinformatics/btv296
State: Published - Jul 31 2014

Fingerprint

Progression
Cancer
Genes
Neoplasms
Model
Selectivity
Genomics
Leukemia, Myeloid, Chronic, Atypical, BCR-ABL Negative
Maximum likelihood
Mutation
Gene
Trajectories
Likelihood Inference
Atlases
Atlas
Directed Acyclic Graph
Leukemia
Empirical Analysis
Irregularity
Stratification

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Ramazzotti, D., Caravagna, G., Olde Loohuis, L., Graudenzi, A., Korsunsky, I., Mauri, G., ... Mishra, B. (2014). CAPRI: Efficient inference of cancer progression models from cross-sectional data. Bioinformatics, 31(18), 3016-3026. https://doi.org/10.1093/bioinformatics/btv296

CAPRI: Efficient inference of cancer progression models from cross-sectional data. / Ramazzotti, Daniele; Caravagna, Giulio; Olde Loohuis, Loes; Graudenzi, Alex; Korsunsky, Ilya; Mauri, Giancarlo; Antoniotti, Marco; Mishra, Bhubaneswar.

In: Bioinformatics, Vol. 31, No. 18, 31.07.2014, p. 3016-3026.

Ramazzotti, D, Caravagna, G, Olde Loohuis, L, Graudenzi, A, Korsunsky, I, Mauri, G, Antoniotti, M & Mishra, B 2014, 'CAPRI: Efficient inference of cancer progression models from cross-sectional data', Bioinformatics, vol. 31, no. 18, pp. 3016-3026. https://doi.org/10.1093/bioinformatics/btv296
Ramazzotti D, Caravagna G, Olde Loohuis L, Graudenzi A, Korsunsky I, Mauri G et al. CAPRI: Efficient inference of cancer progression models from cross-sectional data. Bioinformatics. 2014 Jul 31;31(18):3016-3026. https://doi.org/10.1093/bioinformatics/btv296
Ramazzotti, Daniele; Caravagna, Giulio; Olde Loohuis, Loes; Graudenzi, Alex; Korsunsky, Ilya; Mauri, Giancarlo; Antoniotti, Marco; Mishra, Bhubaneswar. / CAPRI: Efficient inference of cancer progression models from cross-sectional data. In: Bioinformatics. 2014; Vol. 31, No. 18. pp. 3016-3026.
@article{5d5798b846064541bd055476c431b4b6,
title = "CAPRI: Efficient inference of cancer progression models from cross-sectional data",
abstract = "We devise a novel inference algorithm to effectively solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems. Motivation: Several cancer-related genomic data have become available (e.g. The Cancer Genome Atlas, TCGA) typically involving hundreds of patients. At present, most of these data are aggregated in a cross-sectional fashion providing all measurements at the time of diagnosis. Our goal is to infer cancer 'progression' models from such data. These models are represented as directed acyclic graphs (DAGs) of collections of 'selectivity' relations, where a mutation in a gene A 'selects' for a later mutation in a gene B. Gaining insight into the structure of such progressions has the potential to improve both the stratification of patients and personalized therapy choices. Results: The CAPRI algorithm relies on a scoring method based on a probabilistic theory developed by Suppes, coupled with bootstrap and maximum likelihood inference. The resulting algorithm is efficient, achieves high accuracy and has good complexity, also, in terms of convergence properties. CAPRI performs especially well in the presence of noise in the data, and with limited sample sizes. Moreover CAPRI, in contrast to other approaches, robustly reconstructs different types of confluent trajectories despite irregularities in the data. We also report on an ongoing investigation using CAPRI to study atypical Chronic Myeloid Leukemia, in which we uncovered non trivial selectivity relations and exclusivity patterns among key genomic events.",
author = "Daniele Ramazzotti and Giulio Caravagna and {Olde Loohuis}, Loes and Alex Graudenzi and Ilya Korsunsky and Giancarlo Mauri and Marco Antoniotti and Bhubaneswar Mishra",
year = "2014",
month = "7",
day = "31",
doi = "10.1093/bioinformatics/btv296",
language = "English (US)",
volume = "31",
pages = "3016--3026",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "18",

}

TY - JOUR

T1 - CAPRI

T2 - Efficient inference of cancer progression models from cross-sectional data

AU - Ramazzotti, Daniele

AU - Caravagna, Giulio

AU - Olde Loohuis, Loes

AU - Graudenzi, Alex

AU - Korsunsky, Ilya

AU - Mauri, Giancarlo

AU - Antoniotti, Marco

AU - Mishra, Bhubaneswar

PY - 2014/7/31

Y1 - 2014/7/31

N2 - We devise a novel inference algorithm to effectively solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems. Motivation: Several cancer-related genomic data have become available (e.g. The Cancer Genome Atlas, TCGA) typically involving hundreds of patients. At present, most of these data are aggregated in a cross-sectional fashion providing all measurements at the time of diagnosis. Our goal is to infer cancer 'progression' models from such data. These models are represented as directed acyclic graphs (DAGs) of collections of 'selectivity' relations, where a mutation in a gene A 'selects' for a later mutation in a gene B. Gaining insight into the structure of such progressions has the potential to improve both the stratification of patients and personalized therapy choices. Results: The CAPRI algorithm relies on a scoring method based on a probabilistic theory developed by Suppes, coupled with bootstrap and maximum likelihood inference. The resulting algorithm is efficient, achieves high accuracy and has good complexity, also, in terms of convergence properties. CAPRI performs especially well in the presence of noise in the data, and with limited sample sizes. Moreover CAPRI, in contrast to other approaches, robustly reconstructs different types of confluent trajectories despite irregularities in the data. We also report on an ongoing investigation using CAPRI to study atypical Chronic Myeloid Leukemia, in which we uncovered non trivial selectivity relations and exclusivity patterns among key genomic events.

UR - http://www.scopus.com/inward/record.url?scp=84941792023&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84941792023&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btv296

DO - 10.1093/bioinformatics/btv296

M3 - Article

C2 - 25971740

AN - SCOPUS:84941792023

VL - 31

SP - 3016

EP - 3026

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 18

ER -