As science has continued its investigation into the inner workings of biological systems on the molecular level, we have unearthed more and more complexity in the relationships not only within our own bodies, but also in the world around us. Every organism has both symbiotic and antagonistic relationships with its surroundings, with changes on the genetic level reflecting those connections. One of the deeper layers of study has focused on the mobile portion of gene systems, whereby our own gene sequences and those of pathogens such as bacteria and viruses can alter their state, function, and location according to their own internal conditions.
Combined, these mobile sequences have been referred to as the mobilome, the total pool of mobile genetic material on a cellular level. For now, the units contained within this definition include transposons (the so-called “jumping genes”), plasmids, bacteriophages, and molecular viruses with the capability to self-replicate. Each of these deserves its own discussion and focus, so we will be limiting ourselves to only one group today: transposons, also known as transposable elements (TEs).
A History of Genes That Jump
Transposons are complicated features that appear all across the ancestral span of living organisms and especially in eukaryotes such as plants and animals. While prokaryotic transposons also exist in bacteria, these often interact with plasmids and, thus, are outside the boundaries of this article, for now. Transposons were originally believed to be simple mutagenic elements cropping up in the genomes of organisms, or potentially even the result of viral or other parasitic infection. Still, some famous researchers, such as Barbara McClintock, theorized in the 1950’s that they may have a role in controlling the expression of other genomic components. The mere fact that they can spontaneously move along chromosomes and aren’t locked into place like other genetic material was a marvel at the time, though it seems a minor detail now.
Now we know that McClintock’s suspicions were correct and, indeed, transposons seem to have been a strong source of evolutionary pressure throughout history. For most of the last century, transposons were taken at face value as yet more junk DNA: sequences able to move anywhere in the genome that only find ways to selfishly insert themselves into important genetic regions, ensuring they are passed on to the next generation. But their regulatory functions truly make them a centerpiece in the workings of genes across every organism’s genome. They are more than just junk.
Sadly, the suppositions presented by McClintock were ignored or passed over for the longest time, with the focus remaining on her other, flashier exploits. Even when similar mobile elements were shown to exist within bacterial cells in 1969, they were thought of as a pointless curiosity of genetic obsolescence. Many did not think moving genes could actually affect the rest of the genome, and many didn’t even believe they moved at all, or held that they did so too rarely to matter in the grand scheme of things.
The 70’s and 80’s: The Truth Emerges
The humble fruit fly would be the beginning of that disbelief’s undoing, as hybrid crosses tested in the 1970’s brought forward drastic genome modifications, seemingly at random inducing conditions such as sterility and greatly increased rates of general mutation. But even this strange result would not be shown to be the work of transposons until much later, as it involved the actions of several types of those elements, many of which we continue to learn about to this day. There have been questions raised on whether all of them should be classified under the “transposon” umbrella or deserve their own unique designations. However, as is often found in science, we like to stick with the traditional terms that we know and expand upon them, so that has been the case with every new class of transposon discovered.
The two primary players in the fly genome catastrophe were DNA transposons, the common kind that work via their own inserted genes, and non-long terminal repeat (non-LTR) retrotransposons, which work via an RNA intermediate and are a subset of the broader RNA transposon category. Their mobile actions in these hybrid fruit fly crosses were wreaking havoc in the genes, which to some extent supported prior claims of these elements being selfish, in that their own survival outweighed any deleterious impact on the host. Unfortunately, because these were the first transposons to be properly described, in 1980, this led to the idea that all transposons had this same selfish nature and thus were always bad for the organisms in question. In such a case, their evolutionary effect would actually be diminished and not be constructive toward future pressures.
The fly model organism Drosophila continued to be the center of further investigation, with specific transposons in it being located and isolated in 1981, alongside ongoing attempts to estimate just how common they were in the fly genome in general. This also helped to prove the polymorphic nature of the transposons, where they appeared in different places and forms, showing that their mobility was an ongoing feature and not something that had happened only in the past. Though Drosophila has a comparatively low number of transposable element sites in its genome, this allowed for closer inspection of how this movement occurred. One thought was that, in inbred lines with homozygous traits, there would be less TE movement due to the duplicated traits.
Instead, just as much, if not more, mobile activity of TEs was seen in the inbred lines. What this meant is that these inbred lines aren’t truly as homozygous as expected based on their genes themselves, as transposon mobility meant that changes to the genome could happen in a single generation regardless and could influence trait expression even in controlled scenarios involving artificial selection. Population genetics attempted in the mid-80’s to map out these experimental results with a theoretical framework that could predict future alterations.
While possibly useful at the time, these models can now be seen to be far too simple to be practical, as they omitted so much about the transposons that exist and what they do that errors would inevitably have crept into their calculations. Furthermore, transposons vary far more significantly over time than the models allowed for, showcasing a fundamental misunderstanding of what these elements do and how they are used within the cell. The models did, at least, cause scientists to double check their work to ensure there wasn’t a hidden variable being introduced due to the actions of transposons.
Even with all this evidence being provided on the part of flies and bacteria, many scientists continued to believe that transposons were only at work in “lesser” organisms that were not as complicated as humans and also not in the completely different structures of plants. They were wrong, but those beliefs would last for quite some time. One by one they were debunked in the late 1980’s as irrefutable proof built up in study after study. New information though led to even more debate and not a small amount of confusion over just where transposons had come from. How did they even develop in the first place?
The 90’s Until Now: An Interactive Evolutionary Network
A portion of transposons share some amount of homology with the sequences of known retroviruses, leading to these TEs being referred to as retrotransposons. The big question that arose in the 90’s was this: are these transposons originally from retroviruses and simply altered over time, or is the similarity a coincidence, or does it perhaps relate to the nature of how an insertion sequence has to be ordered to function? Perhaps the order of events is backwards, and retroviruses actually emerged from retrotransposons far back in biological history. Since some retrotransposons can even create virus-like particles from transcription of their sequence, it’s clear that there has to be some sort of relation between them, but of what sort wasn’t clear.
Regardless of the truth behind this discussion, research into retroviruses and retrotransposons still led to some of the earlier forms of genetic engineering technologies, as the elements and pathogens allowed for cross-infectivity into other organisms. This allowed for the transfer of desired genes into other species via an existing mechanism. But as other alternatives arose, TEs fell out of fashion even for this and the only research that kept up into the early 2000’s was on very specific mechanisms involving these elements and how they regulated other genes and copied themselves.
For a fair number of years, the idea of transposons having played a grand role on the population level as a driver of evolution fell out of the public scientific eye. It wasn’t until around the time that next generation sequencing emerged that interest in this broader topic was renewed, thanks to the availability of full genome sequences for entire species. The annotations that came with these sequences showed transposable elements all over the place, in far greater numbers than had been believed to exist. Even selfish, self-interested replication systems wouldn’t lead to such complexity and abundance. They could only stick around in this number if they were actually positively involved in biological processes that necessitated their conservation in the genome.
Of course, it’s possible that some or all started off with this selfish feature and then later formed regulatory connections in the rest of the genome due to random mutations or a happenstance insertion location. Regardless, many scientists found the idea of transposons having a genomic function to be laughable as, up to that point, there had been none found that had reached fixation in the population, fixation meaning the property that they consistently appeared and remained in the species. All TEs at the time were different from each other and might be found in some other members of the group, but not at the level that showed they had an influence on the evolution and direction of the overall population.
It was then revealed, however, that the reason for this lack was a fault in the detection system, which was inherently only able to find large TEs and not smaller sequences. Many of the gene regulatory transposons are shorter in length, but have indeed met a fixation point in the overall species, especially when it comes to immune regulation and DNA protective elements.
Therefore, the current understanding of these transposable elements is that they are intimately connected to various cellular and organismal regulation systems, in such a complex manner that they can even control and manipulate genes far across the genome from their insertion point. Not all of their insertions are positive for their cell and can often be neutral or even detrimental, as the TE landscape is a constantly changing one.
But, on an overall scale, TEs have influenced and played a part in natural selection and the evolution of all creatures on Earth. Many early sources of confusion in genetics related to spontaneous generation of mutations and appearances of phenotypes suddenly without known gene changes can be attributed to transposable elements and their interruption of other sequences and processes when they move to a new location in the genome.
Let’s take a closer look at how TEs were involved in mammalian development, and in particular systems in the evolution of humans, from the very beginning.
Transposons and You
An interesting discovery that came about from the deeper dives into genetics that more recent technologies have allowed is that, while biological complexity usually coincides with a rise in genome size, it doesn’t necessitate an increase in the total number of genes. Therefore, the share of genomic space that genes take up shrinks the larger the genome and the more complex the organism. That doesn’t mean the rest of the genome isn’t important. We’ve found that the so-called “junk DNA” does indeed appear to have purposes vast in measure, but we have yet to figure out what all of those purposes are and how they are enacted. We’ve learned some, but there’s always more work to do.
Transposons follow the opposite trend. The larger the genome, the higher the amount of transposable elements. Perhaps they multiplied through things like duplication events and other methods, and their “selfish” nature then led to them sticking around, while other genes made in such a fashion lost their function due to mutations over time, keeping the number of genes static for the most part. While the numbers are still being run, it is believed that sequences descended from or formed due to the actions of transposons make up somewhere around 70% of the human genome, with transposon sequences themselves making up a fair amount of that. Compare this to actual genes and their piddling 2% share.
Because of these derived parts of the genome, it’s plain to see that transposons have been instrumental in some way in shaping the genomes of mammals throughout all of our evolutionary history. In fact, these derived sequences may indicate that transposons are inherently a force for enhanced diversity, as McClintock first proposed, forming variation within a population that helps create organisms that can better survive in their niche roles in the environment. Now, whether transposons have that same sort of regulatory and gene expression role within actively living individuals is still being ascertained with current research, but we should hopefully know more soon.
With that note, let’s talk about some of the more specific forms and aspects of transposable elements in mammals in general and humans in particular. These types also relate to transposons in all organisms and serve as an introduction to this broader topic.
To begin, it should be pointed out that the primary distinguishing factor in the scientific literature between classes of transposons is how they extract and re-insert themselves into the genome. The intermediary by which they work determines which group they fall into. For DNA transposons, that intermediate component is clearly a strand of DNA, whether single or double-stranded. It is also this process by which much of the activity of transposons is conducted, as these intermediate forms can serve as surfaces for recombination of chromosomes and of new sequence copying, along with the altering of a reading frame for sequencing the genome. Once inserted, they can contribute to deletion of sequences internally, though any sort of change that would be deleterious to the host is minimized through certain evolved characteristics.
These defenses can include transposons being preferentially inserted into portions of the genome that are non-coding or into introns so that they won’t interrupt the production of RNAs from working genes. Cells may also methylate transposon sequences if their activity is too high, inactivating them. It is also possible for RNA interference and protein activity to work against particular transposons in order to prevent them from damaging necessary parts of the genome.
There are additionally known instances where transposons appear to have been “domesticated”, if that term can be considered apt, so that they lose their properties of mobility and instead take up a definitive gene function directly and continuously for the genome. And for mammals, all DNA transposons take this form where they have lost their mobility, which is something interesting to note. The only known exception in mammals is a select number of bat species that still retain mobile DNA transposons.
As for the mammalian genome, this domesticated state appears to be the case for the genes that code for RAG proteins used in specific antibody-related functions for the immune system, as the genes themselves have strong sequence resemblances to other DNA transposons, indicating that they too were once one and the same and lost that activity in the ancestral past. Active genes are believed to have around 4% of their exons that were descended from previous DNA transposons.
Another unique feature of this class outside of mammals is how they are duplicated once re-inserted into the genome, with further class subdivisions based on whether the DNA transposon in question causes double-strand breaks in the genome during its movement. Some only break one strand and are duplicated from there, while others break both strands in order to be cut out whole and pasted directly into a new site.
Other cases involve sub-classes that do not encode their own enzymes (known as transposases) to extract them from the genome. They are considered non-autonomous because of this. It is suspected that other autonomous transposons supply the missing machinery and shepherd these companions to new locations in the genome. Hopefully more will be learned about this special method of transposon repositioning in due time.
Now, onto the other major class of transposable elements.
While we discussed the situation of DNA transposons first, they are actually considered Class 2 in the overall hierarchy. We can then move back a step to Class 1, where things get a bit more complicated. To start, RNA transposons clearly work via an RNA intermediary form versus the DNA step that their counterparts utilize. These RNAs have to use reverse transcription in order to be properly re-inserted back into the genome, in a manner similar to how retroviruses function, hence their alternative title of retrotransposons. As previously pointed out, there may also be a connection between some RNA transposons and historical forms of retroviruses, but we’ll get to that in more depth later.
This class can then be subdivided into two more distinct families, the long terminal repeat (LTR) retrotransposons and the not so cleverly named non-LTR retrotransposons. The entirety of this division is based around the presence of long repeating sections within the transposable elements. The LTR group are the ones that seem similar to known retroviruses in their structural sequences and how they go about their insertions. That has led to them being referred to by an additional name, endogenous retroviruses (ERVs). Their structure features an ERV segment surrounded on both sides by those long repeats, which help enhance the transcription of the ERV portion.
Since the full length forms also have active retroviral pol and gag genes within them that help facilitate their conversion from RNA intermediate into double-stranded DNA, one can consider them only barely a step removed from being an ongoing retrovirus. But, since they lack the envelope (env) gene, they do not directly produce viral copies and thus do not do active harm to their hosts. As LTR retrotransposons are prone to recombination, it is not rare for them to lose their ERV section and create LTRs that stand by themselves. The mobilization of an existing ERV segment uses the stated RNA intermediate and a copy-paste action mimicking retroviruses, with this process often causing a high rate of mutation accumulation. Luckily, ERVs largely have no mobilization capabilities in the human genome, unlike some other mammal species.
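To make the difference between the two movement styles concrete, here is a minimal toy sketch in Python. It treats the genome as a plain string and the element as the marker "TE", which is purely an illustrative assumption: real transposition involves enzymes, target sites, and (for retrotransposons) an RNA intermediate, none of which are modeled. The function names are made up for this example, and the cut-and-paste function only covers the DNA transposons that excise themselves entirely.

```python
def cut_and_paste(genome: str, element: str, target: int) -> str:
    """Toy DNA-transposon move: excise the element, re-insert it at `target`.

    `target` is an index into the genome as it looks after excision.
    """
    start = genome.index(element)  # raises ValueError if the element is absent
    excised = genome[:start] + genome[start + len(element):]
    return excised[:target] + element + excised[target:]


def copy_and_paste(genome: str, element: str, target: int) -> str:
    """Toy retrotransposon move: the original stays, a new copy lands at `target`.

    The RNA intermediate and reverse transcription are collapsed into a
    simple string copy here.
    """
    if element not in genome:
        raise ValueError("element not found in genome")
    return genome[:target] + element + genome[target:]


genome = "aaaaTEcccc"
moved = cut_and_paste(genome, "TE", 8)
copied = copy_and_paste(genome, "TE", 10)

print(moved, moved.count("TE"))    # aaaaccccTE 1
print(copied, copied.count("TE"))  # aaaaTEccccTE 2
```

The point of the sketch is only the bookkeeping: a cut-and-paste move leaves the copy number unchanged, while every copy-paste move adds one more copy to the genome.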
Retrotransposons overall are far more numerous in mammals specifically, making up around 40% of the typical genome and, as noted, they largely retain their mobility unlike mammalian DNA transposons. The LTR retrotransposons, though, due to the above problems, in almost every case lack the autonomy to mobilize on their own. That is not true for the other sub-class we will now consider, the non-LTR retrotransposons.
For these, the common case is the long interspersed element (LINE) group, chiefly LINE-1, which makes up 17% of the human genome and thus nearly half of all retrotransposons. Their mobility is haphazard, with some copies in the genome retaining it and others existing as mere relics of long past insertions in our eons-back ancestors. The short interspersed element (SINE) group lacks its own autonomy and conscripts the proteins made by LINE retrotransposons to move as well.
Non-LTR retrotransposons use a copy-paste insertion mechanism as well, but with a fundamental twist. They encode their own open reading frame proteins that do the reverse transcription themselves. Due to this requirement and limitation, they usually only have a single open reading frame in their entire sequence, two at most. The endonuclease domain in their sequences can assist in their insertion by cutting the target site, though it isn’t always necessary for this process.
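The classification described so far boils down to two questions: which intermediate does the element use, and does it carry long terminal repeats? A small Python sketch of that decision tree follows. The function name and the label strings are invented for illustration, and the scheme is deliberately simplified to the groups named in this article.

```python
def classify_transposon(intermediate: str, has_ltr: bool = False) -> str:
    """Return the group label used in this article for a transposable element.

    `intermediate` is "DNA" or "RNA"; `has_ltr` only matters for RNA elements.
    """
    kind = intermediate.upper()
    if kind == "DNA":
        return "Class 2: DNA transposon"
    if kind == "RNA":
        if has_ltr:
            return "Class 1: LTR retrotransposon (ERV)"
        return "Class 1: non-LTR retrotransposon (LINE or SINE)"
    raise ValueError(f"unknown intermediate: {intermediate!r}")


for args in [("DNA",), ("RNA", True), ("RNA", False)]:
    print(args, "->", classify_transposon(*args))
```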
Due to the way they control their own insertion, the promoter sequence for non-LTR retrotransposons is largely not conserved and it is not uncommon for their promoter sequences to be replaced over evolutionary time with a different one. So separate species probably will have unrelated promoter sequences.
Because the 3’ end of the sequence often has a recognition site tied to the open reading frame protein that conducts the reverse transcription, this site is strongly retained in the sequence. Otherwise, the element might entirely lose its transposon properties. Comparatively, the 5’ end has no such importance and, with the way that the reverse transcription is conducted, it is an incomplete mechanism that routinely fails to copy the 5’ end. For example, in humans, out of 500,000 total copies of the L1 transposon in the genome, only 7,000 retain their full length.
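Those two numbers work out to a very small fraction, and the loss of the 5’ end can be illustrated with a toy simulation. The Python sketch below first computes the full-length share from the figures quoted above, then models reverse transcription as a copy that starts at the 3’ end and may stop early at every step. The `stop_probability` parameter is a hypothetical knob for illustration, not a measured value, and the element is just a string of letters.

```python
import random

TOTAL_L1_COPIES = 500_000      # figure quoted in the text
FULL_LENGTH_COPIES = 7_000     # figure quoted in the text

full_length_fraction = FULL_LENGTH_COPIES / TOTAL_L1_COPIES
print(f"{full_length_fraction:.1%} of L1 copies are full length")  # 1.4%


def reverse_transcribe(element: str, stop_probability: float,
                       rng: random.Random) -> str:
    """Copy `element` from its 3' end toward its 5' end, possibly stopping early.

    The string is written 5' -> 3', so copying walks it right to left.
    The returned copy is always a suffix of the element: the 3' end
    survives, the 5' end may be missing.
    """
    copied = []
    for base in reversed(element):
        copied.append(base)
        if rng.random() < stop_probability:  # hypothetical per-base failure rate
            break
    return "".join(reversed(copied))


rng = random.Random(42)
element = "ABCDEFGHIJ"  # stand-in for a full-length element, 5' -> 3'
copies = [reverse_transcribe(element, 0.2, rng) for _ in range(1000)]
intact = sum(copy == element for copy in copies)
print(f"{intact} of {len(copies)} simulated copies kept their 5' end")
```

Even in this crude model, every copy keeps its 3’ end while only a minority reach the 5’ end, which is the same qualitative pattern as the many truncated L1 copies in our genome.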
As with other transposon classes, non-LTR retrotransposons have played major roles in certain organisms and their evolutionary histories. For humans alone, the aforementioned L1 transposon has been the instigator for the creation of around one-third of our genome and so understanding the role of this one transposon in light of the others is deeply relevant to the study of the genomic and molecular development of the human race.
Transpovirons, Polintons, And The Extended Viral Transposon World
In the previous sections, we’ve danced around the topic of viruses and their involvement in the genomes of other organisms. Obviously retroviruses exist and still plague us to this day, but their historic assaults on our single-celled ancestors and everything since have, at times, created a melding of our genetic makeups. But that fact shouldn’t take away from the very real danger and capabilities that viruses have. The past few years have seen the discovery of not only new forms and shapes of viruses that live in the world, but also new classes of transposons that they create. Things in the microorganism world can get pretty bizarre.
The first big find in this field recently was the unearthing of what are known as giant viruses. As their name implies, these viruses have an enormous capsid that can sometimes reach sizes rivaling small bacteria. The primary hosts they target are amoeba species, which can be considered in a way “giant bacteria”, though really they are just overly large eukaryotes. With all of the hefty sizes going on, it isn’t surprising that some new tricks have arisen to prey upon such viruses themselves. After all, it’s not like normal sized viruses stopped existing.
These virus-infecting viruses are known as virophages and act in similar ways to their bacteriophage cousins. Only a handful are known, though new varieties are being reported in the literature more and more often in the past few years. With these noted, it is time to talk about transpovirons.
Transpovirons: A Plasmid and Transposon In One
If one is to find an entirely new class of viruses as large as bacteria, it is not so surprising that they would have their own massive group of transposons. The transpoviron is another mobile genetic element that is quite abundant in the genomes of giant viruses, containing on average 6 to 8 genes in their sequence. For now, they have only been seen to be a part of the Mimiviridae family of giant viruses and none of the other familial groups.
The intriguing part is that there are four of the genes that are shared between separate transpovirons and two of those four are significant matches in homology to genes found in a virophage, the only virophage known that targets Mimiviruses. These connections give a hint to the evolutionary origin of these special transposable elements, where they must have been obtained from the virophage in the past. The question is whether the genes were stolen as defenses or purposefully inserted into the giant virus genome in a retroviral attack.
As for the transpoviron sequences themselves, they are entirely linear in structure and in the form of double-stranded DNA with plasmid similarities. During the process of Mimiviruses reproducing, a large number of transpovirons are transcribed, become a part of the particle production pathway, and then are integrated into the genome in even more frequent locations. The proteins that are encoded from them aren’t outright virulent, but do show polymerase transcribing capabilities.
The discussion we’ve been having thus far on these viral transposons has been skipping around another major facet of their existence, namely their counterpart mobile elements known as Polintons. Let’s talk about those for a quick minute.
Polintons: Massive ‘Virus-Like’ Transposons
Across protist species and animals themselves, there is a class of transposons that are rather large and self-replicating known as Polintons (or Mavericks). They can be found in some amount across a number of genomes, but reach incredibly high densities in some protists. The fact that they are 20,000 nucleotide bases long at a minimum makes them stand out and has fueled ideas of them being viral in origin. Additionally, the existence of several virus genes, including a DNA polymerase, an ATPase, and a protease, has further cemented this idea.
The past few years have shown that they also encode viral capsid proteins, suggesting that under the right conditions they would be able to successfully produce viral particles called virions that could infect other cells. So, are they mere transposons or do they count as dormant viruses? If the latter, how dormant is dormant when one is talking about an evolutionary history of tens of millions of years?
As of yet, their production of such virions has never been seen and documented, so it remains only a hypothesis. If they ever did, then perhaps a renaming and reclassification of Polintons to polintoviruses might be appropriate. Another notable feature of Polintons is how they share the highest number of genes with other transposons, plasmids, and mobile elements in general. In appearance, it looks like they are the central hub of the network that makes up such elements. These shared genes also include homologues with bacteriophages and virophages, hence their relation to transpovirons.
The mobilome that interconnects all of these transposable genetic sequences is still being deciphered and the info we do have on them has only come about in the past five years or so. What they mean in the larger scheme of evolutionary history is, for now, obscured. But the research will continue on.
This was meant to just be a short taste of the broader world of mobile genetic elements outside of transposons and how so many of them relate back to viral starting points. In some ways, one could say we are more virus than eukaryote. Just how much so will require more data.
A Universe Of Moving Genes
Transposons have been a part of science and scientific inquiry since nearly the beginning of genetics itself, though it took a long time for us to parse their true nature. Now that we have, it seems like the mysteries behind them are never-ending, opening up layer upon layer to show us that they have played a role in nearly all genetic developments in the history of life on Earth, and that this entire time period has allowed them to become more varied and interrelated in unique ways within the tree of life.
We have work to do if we are ever to come to any true conclusion on their purposes and how they came about, and to find what role viruses truly have in how they formed or whether viruses themselves were a side product of them in some cases. The questions are on our lips; now we need only reach to find the answers.
Photo CCs: 18110 from the Public Health Image Library