Biology LibreTexts

14.9: On the Evolution of Transposons, Genes and Genomes


    We noted that transposons in bacteria carry antibiotic resistance genes, a clear example of the benefits of transposition in prokaryotes. Of course, prokaryotic genomes are small, as is the typical bacterial transposon load. Yeast species also have low transposon load. But what can we make of the high transposon load in eukaryotes? To many geneticists, the fact that genes encoding proteins typically represent only 1–2% of a eukaryotic genome once suggested that the rest of the genome was informationally nonessential. Even though transposons turned out to be much of the noncoding DNA in some eukaryotic genomes, they seemed to serve no purpose other than their own replication. These large amounts of transposon DNA were dubbed “selfish DNA” and their genes, “selfish genes.”

    So, are transposons just junk DNA, some kind of invasive or leftover genomic baggage? Given their propensity to jump around and their potential to wreak havoc in genomes, how do organisms tolerate and survive them? Is the sole “mission” of transposons really just to reproduce themselves? Or are transposons in fact neither selfish nor junk? We will see that, by their sheer abundance and activity in eukaryotic genomes, transposons have dispersed into and reshaped genomic landscapes. Does the relocation and dispersal of transposons in a genome (with the resulting disruption of genes and structural reorganization of genomes) have any functional or evolutionary value?

    All of these questions are reasonable responses to the phenomenon of jumping genes. A rational hypothesis might be that, like all genetic change, the origin of transposition was a random accident. But the spread and ubiquity of transposons in genomes of higher organisms must in the long term have been selected by virtue of some benefit they provide to their host cells and organisms.

    Let’s briefly look at the evolutionary history of transposons to see if this assumption has some merit.

    14.9.1 A Common Ancestry of DNA and RNA Transposons

    Transposases catalyze transposition of bacterial IS (and related) elements and eukaryotic Class II (DNA) transposons. They are structurally similar and may share a common ancestry. Figure 14.25 compares the amino-acid sequences of the so-called integration domain of transposases from different transposons.

    Figure 14.25: The alignment of consensus amino-acid sequences among bacterial IS transposases, Mu phage, and Tc1 mariner transposases reveals conservation of D, D, and E amino acids (uppercase) at key positions in the sequence. Other amino acids shared between some but not all of the sequences are in lowercase. Slashes denote variable-length gaps in the alignments.

    A universally conserved alignment of D…D…E amino acids at key positions in different transposase enzymes defines a DDE domain, supporting the common ancestry of bacterial and eukaryotic transposases. This, along with shared structural features of these transposons (e.g., the flanking direct insertion-site repeats), further supports a common ancestry of the transposons encoding the enzymes.
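    The idea of a residue being conserved "at key positions" can be made concrete with a minimal sketch. The code below scans an alignment column by column and reports the columns in which every sequence carries the same residue. The function name `conserved_columns` and the three short sequences are hypothetical and made up for illustration; they are not real transposase sequences, and real DDE domains are far longer and need variable-length gaps.

    ```python
    def conserved_columns(alignment, gap="-"):
        """Return (index, residue) for each column where all sequences
        share the same non-gap residue (case-insensitive)."""
        if not alignment:
            return []
        length = len(alignment[0])
        if any(len(seq) != length for seq in alignment):
            raise ValueError("all aligned sequences must have the same length")
        hits = []
        for i in range(length):
            # Collect the distinct residues seen in this column.
            column = {seq[i].upper() for seq in alignment}
            if len(column) == 1 and gap not in column:
                hits.append((i, column.pop()))
        return hits


    # Toy, invented sequences (NOT real data); uppercase marks the
    # residues that were deliberately kept identical in every row.
    toy_alignment = [
        "aDgt-kDlvE",
        "sDnr-hDyiE",
        "qDkm-wDfcE",
    ]

    hits = conserved_columns(toy_alignment)
    motif = "".join(residue for _, residue in hits)
    print(hits, motif)
    ```

    In this toy case only three columns are fully conserved, and reading them in order spells out the motif. A column made up entirely of gaps is deliberately not counted as conserved.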

    Other sequence comparisons reveal that the transposons themselves comprise distinct families of more closely related elements. This allows us to speculate on the origins of these families in different species. For example, the Tc1/mariner (DNA) transposon is found in virtually all organisms that have been examined (except diatoms and green algae). Based on sequence analysis, there is even an insertion element in bacteria that is related to the mariner element. Clearly, mariner is an ancient transposon.

    This amount and diversity of conservation bespeaks an early evolution of the enzymes of transposition and of transposition itself within (and even between) species. Linear or vertical descent (the “vertical” transmission of transposons from parents to progeny) is the rule. However, horizontal transfer best explains the presence of similar, related transposons in diverse species.

    In contrast to linear descent of sequences within a species, horizontal transfer defines an interspecific sharing of DNA. That is, a transposon in one organism must have been the “gift” of an organism of a different species! This is further discussed next.

    Clearly, moveable genes have been a part of life for a long time, speaking more to an adaptive value for organisms than to the parasitic action of a selfish, rogue DNA!

    14.9.2 Retroviruses and LTR Retrotransposons Share a Common Ancestry

    The integration domains of retrotransposon transposases and retrovirus integrases also share significant sequence similarities, as shown in the amino-acid sequence alignment in Figure 14.26.

    Figure 14.26: A comparison of amino acid sequences of the COPIA retrotransposon and a retroviral (HIV) integrase with typical transposase sequences: the alignments reveal conservation of the D, D, and E amino acids in the DDE domain of the enzymes. Other amino acids are shared between some but not all of the sequences (lowercase). Slashes denote variable gaps in the alignments.

    Here the conserved D…D…E alignment of amino acids at key positions in DDE domains supports the common ancestry of retroviral and Class I (RNA transposon) integrases with Class II (DNA transposon) transposases. In other words, all transposons may share a common ancestry. But the common ancestry of retrotransposons and retroviruses raises yet other questions: did transposons (specifically retrotransposons) arise as defective versions of integrated retrovirus DNA (i.e., reverse transcripts of retroviral RNA)? Or did the retroviruses emerge when retrotransposons evolved a way to leave their host cells? To approach this question, let’s first compare mechanisms of retroviral infection and retrotransposition.

    LTR retrotransposons and retroviruses both contain flanking long terminal repeats in addition to the structural similarities of the enzymes they encode. But retrotransposition occurs within the nucleus of a cell, while retroviruses must first infect a host cell before the retroviral DNA can be replicated and new viruses produced.

    A key structural difference between retrotransposons and most retroviruses is the ENV gene-encoded protein envelope surrounding the retroviral RNA genome. After infection, the incoming retrovirus sheds its envelope proteins, and viral RNA is reverse-transcribed.

    After the reverse transcripts enter the nucleus, transcription of genes and translation of enzymes necessary for the replication of the viral cDNA leads to the production of new enveloped infectious viruses that will eventually lyse the infected cell. But here are two curious phenomena:

    First, retroviral DNA, like any genomic DNA, is mutable. If a mutation inactivates one of the genes required for infection and retroviral release, it could become an LTR retrotransposon. Such a genetically damaged retroviral integrate might still be transcribed, and its mRNAs might still be translated. If detected by its own reverse transcriptase, the erstwhile viral genomes would be copied. The cDNAs, instead of being packaged into infectious viral particles, would become a source of so-called endogenous retroviruses (ERVs). In fact, ERVs exist and make up a substantial portion of the mammalian genome (8% in humans)—and they do, in fact, behave like LTR retrotransposons!

    Second, yeast Ty elements transcribe several genes during retrotransposition (see the list in section 14.7.1 above) to produce not only a reverse transcriptase and an integrase but also a protease and a structural protein called Gag (Group-specific antigen). All the translated proteins enter the nucleus. Mimicking the retroviral ENV protein, the Gag protein makes up most of a coat protein called VLP (virus-like particle). VLP encapsulates additional retrotransposon RNAs in the cytoplasm, along with the other proteins. Double-stranded reverse transcripts (cDNAs) of the retrotransposon RNA are then made within the VLPs. But instead of bursting out of the cell, these encapsulated cDNAs (i.e., new retrotransposons) shed their VLP coat and reenter the nucleus, where they can now integrate into genomic target DNA.

    Compare this to the description of retroviral infection. During infection, retroviral envelope proteins attach to cell membranes and release their RNA into the cytoplasm. There, the reverse transcriptase copies viral RNA into double-stranded cDNAs that then enter the nucleus, where they can integrate into host-cell DNA. When transcribed, the integrated retroviral DNA produces transcripts that are translated in the cytoplasm into the proteins necessary to form an infectious viral particle. The resulting viral RNAs are encapsulated by an ENV (envelope) protein encoded in the viral genome. Of course, unlike VLP-coated retrotransposon RNAs, the enveloped viral RNAs do eventually lyse the host cell, releasing infectious particles. Nevertheless, while VLP-coated Ty elements are not infectious, they sure do look like a retrovirus!

    In much the same way as early biologists compared the morphological characteristics of plants and animals to show their evolutionary relationships, comparisons of aligned retroviral and retrotransposon reverse transcriptase gene DNA sequences reveal the phylogenetic relationships of genes. A phylogenetic tree of retrotransposon and viral genes is shown in Figure 14.27 (below).

    The data in the analysis support the evolution of retroviruses from retrotransposon ancestors. From the “tree,” TY3 and a few other retrotransposons share common ancestry with Ted, 17.6, and Gypsy ERVs (shown boxed in red in Figure 14.27) in the Gypsy-TY3 subgroup. Further, this subgroup shares common ancestry with more-distantly-related retroviruses (e.g., MMTV and HTLV), as well as the even-more-distantly-related (older, longer-diverged!) Copia-TY1 transposon subgroup.

    Figure 14.27: Retroviral and retrotransposon reverse transcriptases share a common evolutionary ancestor.

    This and similar analyses strongly suggest that retroviruses evolved from a retrotransposon lineage. For a review of retrotransposon and retrovirus evolution, check out Lerat E. & Capy P., 1999. Retrotransposons and retroviruses: analysis of the envelope gene. Mol. Biol. Evol. 16: 1198-1207.
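    Trees such as the one in Figure 14.27 are built from comparisons of aligned sequences. As a rough sketch of the simplest such comparison, the code below computes the proportion of differing sites (the p-distance) between pairs of aligned sequences and reports the closest pair. The element names and sequences are hypothetical and invented for illustration, and this is not the method used to produce the figure; real phylogenetic analyses use substitution models and tree-building algorithms well beyond this.

    ```python
    from itertools import combinations


    def p_distance(a, b, gap="-"):
        """Proportion of differing sites between two aligned sequences,
        skipping any position where either sequence has a gap."""
        if len(a) != len(b):
            raise ValueError("aligned sequences must have the same length")
        compared = differing = 0
        for x, y in zip(a.upper(), b.upper()):
            if x == gap or y == gap:
                continue
            compared += 1
            if x != y:
                differing += 1
        if compared == 0:
            raise ValueError("no ungapped positions to compare")
        return differing / compared


    def closest_pair(seqs):
        """Return the pair of names whose sequences have the smallest p-distance."""
        return min(
            combinations(sorted(seqs), 2),
            key=lambda pair: p_distance(seqs[pair[0]], seqs[pair[1]]),
        )


    # Toy, invented aligned sequences (NOT real reverse transcriptase genes).
    toy = {
        "element_A": "ATGGCTGATC",
        "element_B": "ATGGCTGTTC",
        "element_C": "ATGACAGTAC",
    }

    for first, second in combinations(sorted(toy), 2):
        print(first, second, p_distance(toy[first], toy[second]))
    print("closest:", closest_pair(toy))
    ```

    Sequences separated by a smaller distance are placed closer together on a tree, which is the intuition behind grouping elements into subgroups that share a more recent common ancestor.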

    14.9.3 Transposon Acquisition by Horizontal Gene Transfer

    As noted, transposons are inherited vertically, meaning that they are passed from cell to cell or parents to progeny by reproduction. But they also may have spread between species, in which case an individual of species A inadvertently picks up a transposon from species B (or even from an individual of the same species), becoming transformed and adding a new transposon to its genome. Accidental mobility of transposons between species would have been rare, but an exchange of genes by horizontal gene transfer would have accelerated with the evolution of retroviruses. Once again, despite the potential to disrupt the health of an organism, retroviral activity might also have supported a degree of genomic diversity useful to organisms.



    This page titled 14.9: On the Evolution of Transposons, Genes and Genomes is shared under a not declared license and was authored, remixed, and/or curated by Gerald Bergtrom.
