Extensive sequencing of seven human genomes to characterize benchmark reference materials

Justin M. Zook, David Catoe, Jennifer McDaniel, Lindsay Vang, Noah Spies, Arend Sidow, Ziming Weng, Yuling Liu, Christopher E. Mason, Noah Alexander, Elizabeth Henaff, Alexa B.R. McIntyre, Dhruva Chandramohan, Feng Chen, Erich Jaeger, Ali Moshrefi, Khoa Pham, William Stedman, Tiffany Liang, Michael Saghbini, Zeljko Dzakula, Alex Hastie, Han Cao, Gintaras Deikus, Eric Schadt, Robert Sebra, Ali Bashir, Rebecca M. Truty, Christopher C. Chang, Natali Gulbahce, Keyan Zhao, Srinka Ghosh, Fiona Hyland, Yutao Fu, Mark Chaisson, Chunlin Xiao, Jonathan Trow, Stephen T. Sherry, Alexander W. Zaranek, Madeleine Ball, Jason Bobe, Preston Estep, George M. Church, Patrick Marks, Sofia Kyriazopoulou-Panagiotopoulou, Grace X.Y. Zheng, Michael Schnall-Levin, Heather S. Ordonez, Patrice A. Mudivarti, Kristina Giorda, Ying Sheng, Karoline Bjarnesdatter Rypdal, Marc Salit

Research output: Contribution to journal (Article)

Abstract

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

Original language: English (US)
Article number: 160025
Journal: Scientific Data
Volume: 3
DOI: https://doi.org/10.1038/sdata.2016.25
State: Published - Jun 7 2016

ASJC Scopus subject areas

  • Statistics and Probability
  • Information Systems
  • Education
  • Computer Science Applications
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences

Cite this

Zook, J. M., Catoe, D., McDaniel, J., Vang, L., Spies, N., Sidow, A., ... Salit, M. (2016). Extensive sequencing of seven human genomes to characterize benchmark reference materials. Scientific Data, 3, Article 160025. https://doi.org/10.1038/sdata.2016.25

@article{19068d89889f49fc92fbf0994cca2be0,
title = "Extensive sequencing of seven human genomes to characterize benchmark reference materials",
abstract = "The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.",
author = "Zook, {Justin M.} and David Catoe and Jennifer McDaniel and Lindsay Vang and Noah Spies and Arend Sidow and Ziming Weng and Yuling Liu and Mason, {Christopher E.} and Noah Alexander and Elizabeth Henaff and McIntyre, {Alexa B.R.} and Dhruva Chandramohan and Feng Chen and Erich Jaeger and Ali Moshrefi and Khoa Pham and William Stedman and Tiffany Liang and Michael Saghbini and Zeljko Dzakula and Alex Hastie and Han Cao and Gintaras Deikus and Eric Schadt and Robert Sebra and Ali Bashir and Truty, {Rebecca M.} and Chang, {Christopher C.} and Natali Gulbahce and Keyan Zhao and Srinka Ghosh and Fiona Hyland and Yutao Fu and Mark Chaisson and Chunlin Xiao and Jonathan Trow and Sherry, {Stephen T.} and Zaranek, {Alexander W.} and Madeleine Ball and Jason Bobe and Preston Estep and Church, {George M.} and Patrick Marks and Sofia Kyriazopoulou-Panagiotopoulou and Zheng, {Grace X.Y.} and Michael Schnall-Levin and Ordonez, {Heather S.} and Mudivarti, {Patrice A.} and Kristina Giorda and Ying Sheng and Rypdal, {Karoline Bjarnesdatter} and Marc Salit",
year = "2016",
month = "6",
day = "7",
doi = "10.1038/sdata.2016.25",
language = "English (US)",
volume = "3",
journal = "Scientific Data",
issn = "2052-4463",
publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - Extensive sequencing of seven human genomes to characterize benchmark reference materials

AU - Zook, Justin M.

AU - Catoe, David

AU - McDaniel, Jennifer

AU - Vang, Lindsay

AU - Spies, Noah

AU - Sidow, Arend

AU - Weng, Ziming

AU - Liu, Yuling

AU - Mason, Christopher E.

AU - Alexander, Noah

AU - Henaff, Elizabeth

AU - McIntyre, Alexa B.R.

AU - Chandramohan, Dhruva

AU - Chen, Feng

AU - Jaeger, Erich

AU - Moshrefi, Ali

AU - Pham, Khoa

AU - Stedman, William

AU - Liang, Tiffany

AU - Saghbini, Michael

AU - Dzakula, Zeljko

AU - Hastie, Alex

AU - Cao, Han

AU - Deikus, Gintaras

AU - Schadt, Eric

AU - Sebra, Robert

AU - Bashir, Ali

AU - Truty, Rebecca M.

AU - Chang, Christopher C.

AU - Gulbahce, Natali

AU - Zhao, Keyan

AU - Ghosh, Srinka

AU - Hyland, Fiona

AU - Fu, Yutao

AU - Chaisson, Mark

AU - Xiao, Chunlin

AU - Trow, Jonathan

AU - Sherry, Stephen T.

AU - Zaranek, Alexander W.

AU - Ball, Madeleine

AU - Bobe, Jason

AU - Estep, Preston

AU - Church, George M.

AU - Marks, Patrick

AU - Kyriazopoulou-Panagiotopoulou, Sofia

AU - Zheng, Grace X.Y.

AU - Schnall-Levin, Michael

AU - Ordonez, Heather S.

AU - Mudivarti, Patrice A.

AU - Giorda, Kristina

AU - Sheng, Ying

AU - Rypdal, Karoline Bjarnesdatter

AU - Salit, Marc

PY - 2016/6/7

Y1 - 2016/6/7

AB - The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly.

UR - http://www.scopus.com/inward/record.url?scp=84976413217&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84976413217&partnerID=8YFLogxK

U2 - 10.1038/sdata.2016.25

DO - 10.1038/sdata.2016.25

M3 - Article

VL - 3

JO - Scientific Data

JF - Scientific Data

SN - 2052-4463

M1 - 160025

ER -