High resolution annotation of zebrafish transcriptome using long-read sequencing

German Nudelman, Antonio Frasca, Brandon Kent, Kirsten Sadler Edepli, Stuart C. Sealfon, Martin J. Walsh, Elena Zaslavsky

Research output: Contribution to journal (Article)

Abstract

With the emergence of zebrafish as an important model organism, a concerted effort has been made to study its transcriptome. This effort is limited, however, by gaps in zebrafish annotation, which are especially pronounced concerning transcripts dynamically expressed during zygotic genome activation (ZGA). To date, short-read sequencing has been the principal technology for zebrafish transcriptome annotation. In part because these sequence reads are too short for assembly methods to resolve the full complexity of the transcriptome, the current annotation is rudimentary. By providing direct observation of full-length transcripts, recently refined long-read sequencing platforms can dramatically improve annotation coverage and accuracy. Here, we leveraged the SMRT platform to study the transcriptome of zebrafish embryos before and after ZGA. Our analysis revealed additional novelty and complexity in the zebrafish transcriptome, identifying 2539 high-confidence novel transcripts that originated from previously unannotated loci and 1835 high-confidence new isoforms in previously annotated genes. We validated these findings using a suite of computational approaches including structural prediction, sequence homology, and functional conservation analyses, as well as by confirmatory transcript quantification with short-read sequencing data. Our analyses provided insight into new homologs and paralogs of functionally important proteins and noncoding RNAs, isoform switching occurrences, and different classes of novel splicing events. Several novel isoforms representing distinct splicing events were validated through PCR experiments, including the discovery and validation of a novel 8-kb transcript spanning multiple mir-430 elements, an important driver of early development. Our study provides a significantly improved zebrafish transcriptome annotation resource.
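The abstract distinguishes two kinds of novelty: transcripts from previously unannotated loci, and new isoforms of previously annotated genes. The following is a minimal sketch of that distinction, not the authors' pipeline. It assumes a simplified rule (same-strand span overlap plus intron-chain comparison), and the `Transcript` type, the `classify` function, and all coordinates are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical transcript model: exons are sorted, half-open (start, end) pairs."""
    chrom: str
    strand: str
    exons: Tuple[Tuple[int, int], ...]


def intron_chain(t: Transcript) -> Tuple[Tuple[int, int], ...]:
    """Return the introns as (donor, acceptor) pairs between consecutive exons."""
    return tuple(
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(t.exons, t.exons[1:])
    )


def overlaps(a: Transcript, b: Transcript) -> bool:
    """True if both transcripts lie on the same chromosome and strand and their spans intersect."""
    if a.chrom != b.chrom or a.strand != b.strand:
        return False
    a_start, a_end = a.exons[0][0], a.exons[-1][1]
    b_start, b_end = b.exons[0][0], b.exons[-1][1]
    return a_start < b_end and b_start < a_end


def classify(read: Transcript, annotation: List[Transcript]) -> str:
    """Assumed simplified rule: no overlap means a novel locus; overlap with a
    matching intron chain means annotated; overlap otherwise means a novel isoform."""
    hits = [ref for ref in annotation if overlaps(read, ref)]
    if not hits:
        return "novel_locus"
    if any(intron_chain(read) == intron_chain(ref) for ref in hits):
        return "annotated"
    return "novel_isoform"


# Toy annotation with one three-exon gene model (made-up coordinates).
annotation = [Transcript("chr1", "+", ((100, 200), (300, 400), (500, 600)))]

same_splicing = Transcript("chr1", "+", ((120, 200), (300, 400), (500, 580)))
exon_skipping = Transcript("chr1", "+", ((100, 200), (500, 600)))
elsewhere = Transcript("chr2", "+", ((100, 200), (300, 400)))

for name, t in [("same_splicing", same_splicing),
                ("exon_skipping", exon_skipping),
                ("elsewhere", elsewhere)]:
    print(name, classify(t, annotation))
```

Comparing intron chains rather than exon boundaries tolerates variable transcript ends, which is why the first toy read still counts as annotated. A real comparison would also need to handle single-exon reads, antisense overlap, and partial chains, none of which this sketch attempts.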

Original language: English (US)
Pages (from-to): 1415-1425
Number of pages: 11
Journal: Genome Research
Volume: 28
Issue number: 9
DOIs: https://doi.org/10.1101/gr.223586.117
State: Published - Sep 1 2018

Fingerprint

Zebrafish
Transcriptome
Protein Isoforms
Genome
RNA Isoforms
Untranslated RNA
Sequence Homology
Embryonic Structures
Observation
Technology
Polymerase Chain Reaction
Genes
Proteins

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Nudelman, G., Frasca, A., Kent, B., Sadler Edepli, K., Sealfon, S. C., Walsh, M. J., & Zaslavsky, E. (2018). High resolution annotation of zebrafish transcriptome using long-read sequencing. Genome Research, 28(9), 1415-1425. https://doi.org/10.1101/gr.223586.117

@article{bf49406823444abcaeddf3a6b51a3732,
title = "High resolution annotation of zebrafish transcriptome using long-read sequencing",
author = "German Nudelman and Antonio Frasca and Brandon Kent and {Sadler Edepli}, Kirsten and Sealfon, {Stuart C.} and Walsh, {Martin J.} and Elena Zaslavsky",
year = "2018",
month = "9",
day = "1",
doi = "10.1101/gr.223586.117",
language = "English (US)",
volume = "28",
pages = "1415--1425",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "9",

}

TY - JOUR

T1 - High resolution annotation of zebrafish transcriptome using long-read sequencing

AU - Nudelman, German

AU - Frasca, Antonio

AU - Kent, Brandon

AU - Sadler Edepli, Kirsten

AU - Sealfon, Stuart C.

AU - Walsh, Martin J.

AU - Zaslavsky, Elena

PY - 2018/9/1

Y1 - 2018/9/1

UR - http://www.scopus.com/inward/record.url?scp=85052756186&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85052756186&partnerID=8YFLogxK

U2 - 10.1101/gr.223586.117

DO - 10.1101/gr.223586.117

M3 - Article

C2 - 30061115

AN - SCOPUS:85052756186

VL - 28

SP - 1415

EP - 1425

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 9

ER -