Piphillin

Improved prediction of metagenomic content by direct inference from human microbiomes

Shoko Iwai, Thomas Weinmaier, Brian Schmidt, Donna Albertson, Neil J. Poloso, Karim Dabbagh, Todd Z. DeSantis

Research output: Contribution to journal (Article)

Abstract

Functional analysis of a clinical microbiome facilitates the elucidation of mechanisms by which microbiome perturbation can cause a phenotypic change in the patient. The direct approach for the analysis of the functional capacity of the microbiome is via shotgun metagenomics. An inexpensive method to estimate the functional capacity of a microbial community is through collecting 16S rRNA gene profiles then indirectly inferring the abundance of functional genes. This inference approach has been implemented in the PICRUSt and Tax4Fun software tools. However, those tools have important limitations since they rely on outdated functional databases and uncertain phylogenetic trees and require very specific data pre-processing protocols. Here we introduce Piphillin, a straightforward algorithm independent of any proposed phylogenetic tree, leveraging contemporary functional databases and not obliged to any singular data pre-processing protocol. When all three inference tools were evaluated against actual shotgun metagenomics, Piphillin was superior in predicting gene composition in human clinical samples compared to both PICRUSt and Tax4Fun (p<0.01 and p<0.001, respectively) and Piphillin's ability to predict disease associations with specific gene orthologs exhibited a 15% increase in balanced accuracy compared to PICRUSt. From laboratory animal samples, no performance advantage was observed for any one of the tools over the others and for environmental samples all produced unsatisfactory predictions. Our results demonstrate that functional inference using the direct method implemented in Piphillin is preferable for clinical biospecimens. Piphillin is publicly available for academic use at http://secondgenome.com/Piphillin.
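The abstract describes inferring functional gene abundance directly from 16S rRNA gene profiles and scoring disease-association predictions by balanced accuracy. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes each OTU has already been matched to a reference genome (the matching step is not shown), and that per-genome 16S copy numbers and gene-ortholog copy counts are available. All function names, parameter names, and identifiers such as "K00001" are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def infer_gene_abundance(otu_counts, otu_to_genome, copy_number_16s, genome_gene_copies):
    """Estimate gene-ortholog abundance for one sample (illustrative sketch).

    otu_counts:         {otu_id: read count in the sample}
    otu_to_genome:      {otu_id: reference genome id}  (assumed precomputed)
    copy_number_16s:    {genome_id: number of 16S rRNA gene copies}
    genome_gene_copies: {genome_id: {ortholog_id: copies in that genome}}
    """
    gene_abundance = defaultdict(float)
    for otu, count in otu_counts.items():
        genome = otu_to_genome.get(otu)
        if genome is None:
            continue  # OTU without a matched reference genome contributes nothing
        # Correct for genomes that carry several 16S copies.
        genome_abundance = count / copy_number_16s[genome]
        for ortholog, copies in genome_gene_copies[genome].items():
            gene_abundance[ortholog] += genome_abundance * copies
    return dict(gene_abundance)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Toy data, invented for this example.
    abundance = infer_gene_abundance(
        otu_counts={"otu1": 10, "otu2": 6, "otu3": 4},
        otu_to_genome={"otu1": "gA", "otu2": "gB"},
        copy_number_16s={"gA": 5, "gB": 2},
        genome_gene_copies={"gA": {"K00001": 1, "K00002": 2}, "gB": {"K00001": 3}},
    )
    print(abundance)
    print(balanced_accuracy(tp=3, fn=1, tn=1, fp=1))
```

In this sketch an OTU with no matched genome (otu3 in the toy data) is simply dropped, which is one possible design choice; how unmatched sequences are handled in the published tool is not stated in the abstract.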

Original language: English (US)
Article number: e0166104
Journal: PLoS One
Volume: 11
Issue number: 11
DOI: 10.1371/journal.pone.0166104
State: Published - Nov 1 2016

Fingerprint

Metagenomics
Microbiota
Genes
Firearms
prediction
Databases
genes
Laboratory Animals
rRNA Genes
Network protocols
Functional analysis
phylogeny
Software
Processing
sampling
laboratory animals
microbial communities
Animals
ribosomal RNA
microbiome

ASJC Scopus subject areas

  • Medicine(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Piphillin: Improved prediction of metagenomic content by direct inference from human microbiomes. / Iwai, Shoko; Weinmaier, Thomas; Schmidt, Brian; Albertson, Donna; Poloso, Neil J.; Dabbagh, Karim; DeSantis, Todd Z.

In: PLoS One, Vol. 11, No. 11, e0166104, 01.11.2016.

Iwai, Shoko; Weinmaier, Thomas; Schmidt, Brian; Albertson, Donna; Poloso, Neil J.; Dabbagh, Karim; DeSantis, Todd Z. / Piphillin: Improved prediction of metagenomic content by direct inference from human microbiomes. In: PLoS One. 2016; Vol. 11, No. 11.

@article{761a59d953ec46dc8326623e5fed0d76,
title = "Piphillin: Improved prediction of metagenomic content by direct inference from human microbiomes",
abstract = "Functional analysis of a clinical microbiome facilitates the elucidation of mechanisms by which microbiome perturbation can cause a phenotypic change in the patient. The direct approach for the analysis of the functional capacity of the microbiome is via shotgun metagenomics. An inexpensive method to estimate the functional capacity of a microbial community is through collecting 16S rRNA gene profiles then indirectly inferring the abundance of functional genes. This inference approach has been implemented in the PICRUSt and Tax4Fun software tools. However, those tools have important limitations since they rely on outdated functional databases and uncertain phylogenetic trees and require very specific data pre-processing protocols. Here we introduce Piphillin, a straightforward algorithm independent of any proposed phylogenetic tree, leveraging contemporary functional databases and not obliged to any singular data pre-processing protocol. When all three inference tools were evaluated against actual shotgun metagenomics, Piphillin was superior in predicting gene composition in human clinical samples compared to both PICRUSt and Tax4Fun (p<0.01 and p<0.001, respectively) and Piphillin's ability to predict disease associations with specific gene orthologs exhibited a 15{\%} increase in balanced accuracy compared to PICRUSt. From laboratory animal samples, no performance advantage was observed for any one of the tools over the others and for environmental samples all produced unsatisfactory predictions. Our results demonstrate that functional inference using the direct method implemented in Piphillin is preferable for clinical biospecimens. Piphillin is publicly available for academic use at http://secondgenome.com/Piphillin.",
author = "Shoko Iwai and Thomas Weinmaier and Brian Schmidt and Donna Albertson and Poloso, {Neil J.} and Karim Dabbagh and DeSantis, {Todd Z.}",
year = "2016",
month = "11",
day = "1",
doi = "10.1371/journal.pone.0166104",
language = "English (US)",
volume = "11",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "11",

}

TY - JOUR

T1 - Piphillin

T2 - Improved prediction of metagenomic content by direct inference from human microbiomes

AU - Iwai, Shoko

AU - Weinmaier, Thomas

AU - Schmidt, Brian

AU - Albertson, Donna

AU - Poloso, Neil J.

AU - Dabbagh, Karim

AU - DeSantis, Todd Z.

PY - 2016/11/1

Y1 - 2016/11/1

N2 - Functional analysis of a clinical microbiome facilitates the elucidation of mechanisms by which microbiome perturbation can cause a phenotypic change in the patient. The direct approach for the analysis of the functional capacity of the microbiome is via shotgun metagenomics. An inexpensive method to estimate the functional capacity of a microbial community is through collecting 16S rRNA gene profiles then indirectly inferring the abundance of functional genes. This inference approach has been implemented in the PICRUSt and Tax4Fun software tools. However, those tools have important limitations since they rely on outdated functional databases and uncertain phylogenetic trees and require very specific data pre-processing protocols. Here we introduce Piphillin, a straightforward algorithm independent of any proposed phylogenetic tree, leveraging contemporary functional databases and not obliged to any singular data pre-processing protocol. When all three inference tools were evaluated against actual shotgun metagenomics, Piphillin was superior in predicting gene composition in human clinical samples compared to both PICRUSt and Tax4Fun (p<0.01 and p<0.001, respectively) and Piphillin's ability to predict disease associations with specific gene orthologs exhibited a 15% increase in balanced accuracy compared to PICRUSt. From laboratory animal samples, no performance advantage was observed for any one of the tools over the others and for environmental samples all produced unsatisfactory predictions. Our results demonstrate that functional inference using the direct method implemented in Piphillin is preferable for clinical biospecimens. Piphillin is publicly available for academic use at http://secondgenome.com/Piphillin.

UR - http://www.scopus.com/inward/record.url?scp=84994638542&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84994638542&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0166104

DO - 10.1371/journal.pone.0166104

M3 - Article

VL - 11

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 11

M1 - e0166104

ER -