LINE insertion polymorphisms are abundant but at low frequencies across populations of Anolis carolinensis

Robert P. Ruggiero, Yann Bourgeois, Stephane Boissinot

    Research output: Contribution to journal, Article

    Abstract

    Vertebrate genomes differ considerably in size and structure. Among the features that show the most variation is the abundance of Long Interspersed Nuclear Elements (LINEs). Mammalian genomes contain hundreds of thousands of LINEs that belong to a single clade, L1, and in most species a single family is usually active at a time. In contrast, non-mammalian vertebrates (fish, amphibians and reptiles) contain multiple active families, belonging to several clades, but each of them is represented by a small number of recently inserted copies. It is unclear why vertebrate genomes harbor such drastic differences in LINE composition. To address this issue, we conducted whole-genome resequencing to investigate the population genomics of LINEs across 13 genomes of the lizard Anolis carolinensis sampled from two geographically and genetically distinct populations in the Eastern Florida and Gulf Atlantic regions of the United States. We used the Mobile Element Locator Tool to identify and genotype polymorphic insertions from five major clades of LINEs (CR1, L1, L2, RTE and R4) and the 41 subfamilies that constitute them. Across these groups we found large variation in the frequency of polymorphic insertions and in the observed length distributions of these insertions, suggesting that these groups vary in their activity and in how frequently they successfully generate full-length, potentially active copies. Though we found an abundance of polymorphic insertions (over 45,000), most of these were observed at low frequencies and typically appeared as singletons. Site frequency spectra for most LINEs showed a significant shift toward low-frequency alleles compared to the spectra observed for total genomic single nucleotide polymorphisms. Using Tajima's D, FST and the mean number of pairwise differences in LINE insertion polymorphisms, we found evidence that negative selection is acting on LINE families in a length-dependent manner, its effects being stronger in the larger Eastern Florida population. Our results suggest that a large effective population size and negative selection limit the expansion of polymorphic LINE insertions across these populations and that the probability of LINE polymorphisms reaching fixation is extremely low.
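
    The summary statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the input layout (one row per diploid individual, one column per insertion site, entries 0, 1 or 2 insertion-bearing chromosomes) and the function names are assumptions made for illustration, and the insertion is assumed to be the derived allele. The sketch builds an unfolded site frequency spectrum, computes the mean number of pairwise differences, and applies the standard Tajima (1989) formula for D.

```python
from math import sqrt


def site_frequency_spectrum(genotypes):
    """Unfolded SFS from insertion dosages (hypothetical input layout).

    genotypes: one row per diploid individual, one column per site,
    each entry the number of insertion-bearing chromosomes (0, 1 or 2).
    Returns sfs where sfs[k] is the number of sites at which the insertion
    is carried by k of the n = 2 * len(genotypes) sampled chromosomes.
    """
    n = 2 * len(genotypes)
    sfs = [0] * (n + 1)
    for site in zip(*genotypes):  # iterate over columns (sites)
        sfs[sum(site)] += 1
    return sfs


def mean_pairwise_differences(sfs):
    """Mean number of differences between two chromosomes, summed over sites."""
    n = len(sfs) - 1
    # A site with k insertion copies differs in k * (n - k) of the
    # n * (n - 1) / 2 possible chromosome pairs.
    return sum(2 * k * (n - k) * sfs[k] for k in range(1, n)) / (n * (n - 1))


def tajimas_d(sfs):
    """Tajima's D from an SFS; None when D is undefined."""
    n = len(sfs) - 1
    if n < 3:
        return None  # the variance term is zero for n = 2
    s = sum(sfs[1:n])  # segregating sites: 0 < k < n
    if s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # Difference between pi and Watterson's estimator, scaled by its s.d.
    return (mean_pairwise_differences(sfs) - s / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy data: three individuals, four sites, every insertion a singleton.
    toy = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 1, 0]]
    spectrum = site_frequency_spectrum(toy)
    print(spectrum, mean_pairwise_differences(spectrum), tajimas_d(spectrum))
```

    An excess of singletons, as in the toy data, pulls the mean pairwise difference below Watterson's estimate and so gives a negative D, which is the direction of the shift toward low-frequency alleles described above.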

    Original language: English (US)
    Article number: 44
    Journal: Frontiers in Genetics
    Volume: 8
    Issue number: APR
    DOI: 10.3389/fgene.2017.00044
    State: Published - Apr 13 2017

    Fingerprint

    Genome
    Vertebrates
    Population
    Long Interspersed Nucleotide Elements
    Metagenomics
    Reptiles
    Lizards
    Amphibians
    Population Density
    Nuclear Family
    Gene Frequency
    Single Nucleotide Polymorphism
    Fishes
    Genotype

    Keywords

    • Anolis carolinensis
    • Genome resequencing
    • LINE
    • Retrotransposon
    • Selection
    • Transposable element

    ASJC Scopus subject areas

    • Molecular Medicine
    • Genetics
    • Genetics (clinical)

    Cite this

    LINE insertion polymorphisms are abundant but at low frequencies across populations of Anolis carolinensis. / Ruggiero, Robert P.; Bourgeois, Yann; Boissinot, Stephane.

    In: Frontiers in Genetics, Vol. 8, No. APR, 44, 13.04.2017.

    @article{493a0c85f55442c998ed31827cf7fbbb,
    title = "LINE insertion polymorphisms are abundant but at low frequencies across populations of Anolis carolinensis",
    keywords = "Anolis carolinensis, Genome resequencing, LINE, Retrotransposon, Selection, Transposable element",
    author = "Ruggiero, {Robert P.} and Yann Bourgeois and Stephane Boissinot",
    year = "2017",
    month = "4",
    day = "13",
    doi = "10.3389/fgene.2017.00044",
    language = "English (US)",
    volume = "8",
    journal = "Frontiers in Genetics",
    issn = "1664-8021",
    publisher = "Frontiers Media S. A.",
    number = "APR",

    }

    TY - JOUR

    T1 - LINE insertion polymorphisms are abundant but at low frequencies across populations of Anolis carolinensis

    AU - Ruggiero, Robert P.

    AU - Bourgeois, Yann

    AU - Boissinot, Stephane

    PY - 2017/4/13

    Y1 - 2017/4/13

    KW - Anolis carolinensis

    KW - Genome resequencing

    KW - LINE

    KW - Retrotransposon

    KW - Selection

    KW - Transposable element

    UR - http://www.scopus.com/inward/record.url?scp=85019085037&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85019085037&partnerID=8YFLogxK

    U2 - 10.3389/fgene.2017.00044

    DO - 10.3389/fgene.2017.00044

    M3 - Article

    VL - 8

    JO - Frontiers in Genetics

    JF - Frontiers in Genetics

    SN - 1664-8021

    IS - APR

    M1 - 44

    ER -