Control over the genome and the epigenome remains a prime focal point across the life sciences, and new genome editing technologies and their related epigenetic tools have only intensified that spotlight. However, there are other points at which expression can be controlled. Some techniques, such as RNA interference (RNAi), prevent expression not by altering genes in any fashion, but by stopping their RNA transcripts from going on to protein production or their other functions.
When it comes to CRISPR, there are several options in this regard as well. The classification known as Type VI has become the realm of RNA-targeting systems, covering all of the so-called Cas13 protein variants. The first to be discovered was known as C2c2 before eventually being renamed Cas13a. But that was several years ago, and CRISPR research has been anything but slow or quiet since. It is incredibly difficult to keep up with the new discoveries, but here I’ll do my best to brief you on the newest tool revealed by the Salk Institute.
A Short History of RNA Control
One of the major issues with RNA-controlling technologies is that each is limited in some way, usually by which RNA sequences its RNA-binding domain is able to recognize. An early binding domain taken from the bacteriophage MS2 can only recognize a specific 21-nucleotide sequence, making its use very restrictive. Another older option is the Pumilio homology domain found in Puf proteins, which is modular and can be engineered toward specific targets. However, those targets can only ever be 8 nucleotides long, so they often lack the specificity needed for accurate targeting.
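To make the specificity argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes bases are independent and evenly distributed, which real transcriptomes are not, and the pool size is a purely illustrative number rather than a measured one. The function name is hypothetical.

```python
# Rough specificity estimate: how often would a target site of a given
# length appear purely by chance in a pool of RNA sequence?
# Assumptions: bases are independent and uniformly distributed, and the
# pool size below is illustrative only.

def expected_chance_matches(site_length: int, pool_size_nt: int) -> float:
    """Expected number of exact matches to one fixed site in random sequence."""
    if site_length <= 0 or pool_size_nt < site_length:
        return 0.0
    positions = pool_size_nt - site_length + 1  # possible start positions
    return positions / (4 ** site_length)       # each start matches with p = 4^-L


HYPOTHETICAL_POOL_NT = 100_000_000  # illustrative pool size, not a real figure

if __name__ == "__main__":
    for length in (8, 21, 22, 30):
        hits = expected_chance_matches(length, HYPOTHETICAL_POOL_NT)
        print(f"{length:>2} nt site: ~{hits:.3g} chance matches expected")
```

Under these simplifying assumptions, an 8-nucleotide site would be expected to turn up by chance more than a thousand times in such a pool, while a site of 21 nucleotides or more would be expected far less than once. That gap is the intuition behind why short recognition sequences struggle with accurate targeting.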
Even the other CRISPR systems that have been adapted for RNA targeting, such as the specialized Cas9 in Type II and the other effectors in Type VI, have problems. They have a much longer programmable targeting region at 20 to 30 nucleotides, but the proteins themselves are fairly bulky at around 1,200 amino acids in length. This size makes it difficult to deliver them into cells through the usual viral vector method.
An Algorithm For Detecting Cas Proteins
The Salk Institute researchers hypothesized that many more CRISPR systems aimed at RNA have yet to be found. So they developed a bioinformatics program for analyzing bacterial genomes that looks for sequences resembling known Class 2 systems like Type II and Type VI, since both rely on a single effector protein as their main active nuclease. Such systems are more likely to be both small in size and usable on RNA.
When they ran their algorithm on the genomic databases, it came up with 21,175 possibilities. They narrowed the search by removing all hits that matched known proteins, like Cas9, and by requiring that each single protein be greater than 750 amino acids in length and that its coding gene sit within five genes of the CRISPR repeat array. These requirements were added because a bigger single protein is more likely to contain the multiple domains and abilities needed for this tool, while still being small enough to be used by itself if required. With these filters in place, the results dropped to 408 possible candidates.
A few more refinements to the algorithm, including changes to how close together the sequences had to be, lowered that number further, finally reducing it to a handful. The one they ultimately decided on was a ribonuclease CRISPR family with the smallest single effector protein found yet, at 930 amino acids long. It features the signature HEPN cleavage domains, which separate Cas13 systems from Cas9 (the latter uses HNH and RuvC cleavage domains). However, other than these expected domains, the system the scientists found has no significant similarity to the other Cas13 systems, so they gave it its own letter classification: Cas13d.
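The filtering described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the researchers’ actual program: the `Candidate` record, its field names, and the toy entries are all hypothetical, and only the two thresholds (750 amino acids, five genes) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one protein found near a CRISPR repeat array."""
    protein_id: str
    length_aa: int                # protein length in amino acids
    genes_from_repeat_array: int  # distance to the repeat array, in genes
    matches_known_effector: bool  # True if it matches a known protein like Cas9


MIN_LENGTH_AA = 750    # proteins must be greater than this length
MAX_GENE_DISTANCE = 5  # coding gene must be within this many genes of the array


def passes_filters(c: Candidate) -> bool:
    """Apply the three criteria described in the text to one candidate."""
    return (
        not c.matches_known_effector
        and c.length_aa > MIN_LENGTH_AA
        and c.genes_from_repeat_array <= MAX_GENE_DISTANCE
    )


def filter_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    return [c for c in candidates if passes_filters(c)]


# Invented toy data, purely to show the filter in action.
TOY_HITS = [
    Candidate("known_cas9_like", 1368, 1, True),   # dropped: already known
    Candidate("too_small", 420, 2, False),         # dropped: under 750 aa
    Candidate("too_far", 980, 9, False),           # dropped: over five genes away
    Candidate("novel_effector", 930, 1, False),    # kept
]

if __name__ == "__main__":
    for hit in filter_candidates(TOY_HITS):
        print(hit.protein_id)
```

Each criterion is a cheap check on its own, so the filters can be applied in any order. The real pipeline presumably also had to find the repeat arrays and predict the neighboring genes first, which this sketch takes as given.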
The Characteristics of Cas13d
After some genomic sequencing, they identified that the Cas13d systems are found within gut bacteria in the genus Ruminococcus and have some other features of note in their CRISPR complex. The system lacks the Cas1 spacer-acquisition protein. This absence is a feature of Class 2 CRISPRs, since their all-in-one effector protein removes the need for that component. The bacterial family using Cas13d also appears to keep the length and structure of the repeats heavily conserved.
To test how Cas13d is expressed, they cloned its gene sequences into an expression plasmid and saw that it did indeed begin producing guide RNAs (gRNAs). Then, to find out more about its properties, they purified a sample of it from the species Eubacterium siraeum and found that it is sufficient by itself to begin gRNA processing without needing any additional helper nucleases, again like several other Class 2 systems.
The next test was to see if Cas13d needed to bind with its gRNAs in order to form a ribonucleoprotein for cleaving foreign RNA. They found that it could cleave both in this combined state and before the guide RNAs were processed to a mature form, but it did require the guides in order to find target sequences. They also found that the minimum spacer length at which the complex could still target and cleave RNA was 22 nucleotides. Where it cleaves also appears to vary between the different sequences that the guide RNAs bring it to, with a bias toward cleaving at uracil bases.
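As a small illustration of these properties, the Python sketch below builds a spacer for a chosen stretch of target RNA and lists where the uracils fall in that stretch. It assumes the spacer is simply the reverse complement of the target site, enforces the 22-nucleotide minimum mentioned above, and uses an invented target sequence. The function names are hypothetical, and the uracil list only marks candidate positions; it does not predict actual cut sites.

```python
MIN_SPACER_NT = 22  # minimum spacer length reported in the text

# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def spacer_for_target(target_rna: str, start: int, length: int = MIN_SPACER_NT) -> str:
    """Return a spacer that base-pairs with target_rna[start:start + length].

    Assumes the spacer is the reverse complement of the target site.
    """
    if length < MIN_SPACER_NT:
        raise ValueError(f"spacer must be at least {MIN_SPACER_NT} nt long")
    site = target_rna[start:start + length]
    if start < 0 or len(site) < length:
        raise ValueError("target site runs past the end of the RNA")
    return site.translate(COMPLEMENT)[::-1]


def uracil_positions(rna: str) -> list:
    """Zero-based positions of uracil bases in an RNA string."""
    return [i for i, base in enumerate(rna) if base == "U"]


# Invented example sequence, not taken from any real transcript.
EXAMPLE_TARGET = "AUGGCUUACGGAUCCGUUAGCAAUGC"

if __name__ == "__main__":
    spacer = spacer_for_target(EXAMPLE_TARGET, 0)
    print("spacer:", spacer)
    print("uracils in target site:", uracil_positions(EXAMPLE_TARGET[:MIN_SPACER_NT]))
```

Shorter spacers are rejected outright here because the text reports that targeting and cleavage stop working below that length.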
Making A New RNA System
With its properties characterized, the researchers then decided to see if they could turn Cas13d into a tool for programmable RNA cleavage and binding in mammalian cells. The hope was that it could work similarly to RNA interference techniques, but with a far more amenable modular setup. By using several screening methods and combining different portions of the complex from separate bacterial species with Cas13d, they arrived at the most active version of the system. They called this specific combination, the basis of their tool, CasRx.
The first test of the tool used four spacer targets, two against mRNAs and two against long non-coding RNAs (lncRNAs). CasRx managed to knock down all of these RNAs at an efficiency of greater than 90%. Next, they compared its efficiency against two other RNA transcript repression tools, dCas9 and shRNAs. It outperformed both across multiple runs, achieving 96% repression while the other two saw 53% and 65%, respectively. It even outperformed the recently described Cas13a and Cas13b systems (97% vs. 80% and 66%).
Off-target effects were also checked against an shRNA control, which itself saw around 500 significant unintended changes. CasRx, meanwhile, showed zero significant off-target changes. Lastly in this set of performance comparisons, they put it up against 11 RNAs from genes with a diverse range of functions, and it achieved a median 96% reduction of these RNAs, with the lowest result out of all of them still being higher than 80%.
Controlling RNA Splicing
With the performance of CasRx confirmed, the scientists could then turn to the true purpose of this tool. While it can clearly be used in a wide variety of ways with RNA, as their prior testing showed, the researchers wanted to use it to control and redirect RNA processing. Since CasRx depends on its HEPN domains for cleavage, the catalytically dead form of it (dCasRx) is of little use for reducing RNA levels. It doesn’t affect protein translation at all. However, what if this binding-only form could be used to manipulate how pre-mRNA is turned into its mature form when the introns are spliced out?
That was the goal. The scientists theorized that simply binding dCasRx to the correct location on the pre-mRNA would be enough to perturb how it is spliced, pushing it toward a desired splicing path rather than the alternatives. A number of disorders arise when an alternative splicing path is taken on the mRNA and the wrong protein sequence is produced as a result. These RNA-to-protein errors are often behind neurological disorders, and the researchers decided to use dCasRx against the disease known as frontotemporal dementia (FTD).
FTD is a Parkinson’s-related disorder characterized by neurodegeneration. Normally in the brain’s neurons, a protein named Tau is abundant in two specific forms called 4R and 3R. These isoforms have to be kept balanced in number in a healthy brain, and which one is produced depends on whether a particular exon is kept in the mature mRNA or not. FTD occurs when several point mutations in the gene cause the intron just after that exon to not be spliced, increasing the production of 4R specifically. This in turn causes the degenerative condition.
Because Cas13d is several hundred amino acids shorter than all of its CRISPR competition, there was no issue in fitting a pre-programmed dCasRx complex into a viral vector. The goal was for it to target that precise exon and have it excluded from the mature mRNA, bringing the two Tau forms closer to balance. To test this, human stem cells with FTD were differentiated into neurons before dCasRx was introduced.
Overall, they found that dCasRx was able to alleviate the imbalance of 4R and 3R around 50% better than the comparative control, reaching levels that more or less matched neurons without FTD. Thus, their goal was accomplished.
The Future For CasRx and Cas13d
The creation of a tool like CasRx out of Cas13d is, in a way, an epigenetic mechanism for the RNA transcriptome. It allows for highly precise alteration of how RNAs are processed and which splicing path they go down. And, of course, that is just one facet of what this tool is capable of doing; it is simply the one that the researchers chose to pursue in this particular study. Who knows what other scientists will end up managing with it in the future?
Either way, this means we have yet another piece in our arsenal against disease and on our path to better understand and control the entire process of genome expression, from genes themselves to RNA, proteins, and beyond. We are ever closer to universal eradication of disease and improvement of the human condition.
Photo CCs: Neuron cluster from Wikimedia Commons