This experiment aims to study splicing in mES-derived neurons. QuantSeqFlex Targeted RNASeq was used with targeted first strand synthesis primers in Ank3 and control primers in Actb and B2m to study splicing in differentiated mES cells.
RNA sequencing of T-cells labelled with 4-thiouridine (4SU) for 24 hours (pulse-chase and short pulse). The library preparation was performed using QuantSeq 3' mRNA-Seq (Lexogen).
Data were collected from ENCODE and processed with a custom pipeline using STAR for mapping and the GRCh38 GENCODE annotation.
This experiment aims to study splicing in mES-derived neurons. QuantSeqFlex Targeted RNASeq was used with targeted first strand synthesis primers to study splicing in differentiated mES cells.
iCLIP was performed as previously described (ref. 3), with only the minor modifications described below. Briefly, F-CDK11 293 Flp-in cells as well as F-CDK11 (226-783) 293 Flp-in cells were plated onto 150 cm² plates to reach 75% confluency on the day of crosslinking. CDK11 expression was induced with 1 µg/ml of doxycycline 24 h before crosslinking. We used two methods to crosslink RNA and proteins: either UV-C (254 nm, 200 mJ/cm²), or UV-A (365 nm, 200 mJ/cm²) after 4-thiouridine (Sigma, T4509) had been added to the cells at a final concentration of 100 µM 6-8 h prior to crosslinking. Cells from one 150 cm² plate were used per immunoprecipitation, and two technical replicates were performed for each condition; the replicates were pooled after the reverse transcription step. The composition of all buffers was the same as described in ref. 3. Each cell pellet (originally from one 150 cm² plate) was lysed in 1 ml of lysis buffer and the lysate was homogenized by passing it three times through an insulin syringe (B. Braun, Omnican U-100, 32G). The lysate was treated with 4 U/ml Turbo DNase (Thermo Fisher Scientific, AM2238) and 12 U/ml RNase I (Thermo Fisher Scientific, AM2295), shaking at 1100 rpm at 37°C for 3 min. Clarified extracts (21000g for 30 min) were incubated for 2 h with 2 µg of FLAG antibody (Sigma, F1804) pre-bound to 50 µl of protein G Dynabeads. After a series of stringent washes, an adenylated L3 RNA adapter was ligated to the 3′ end of the crosslinked RNAs. Crosslinked protein-RNA complexes were resolved by SDS-PAGE (NuPAGE 4-12% Bis-Tris Protein Gel, Thermo Fisher Scientific, NP0322) and transferred to a nitrocellulose membrane. The region of the membrane containing the radioactively labeled crosslinked protein–RNA complexes was excised, and the RNA was isolated and reverse transcribed to cDNA (technical replicates were mixed together after this step).
cDNA was size-selected using urea denaturing gel electrophoresis, and three fractions running between 70-85 nt (L, low), 85-120 nt (M, medium) and 120-200 nt (H, high) were isolated. Each fraction was independently circularized by single-stranded DNA ligase, annealed to an oligonucleotide complementary to the restriction site and cut between the two adapter regions by BamHI. After final PCR amplification using P3 and P5 Solexa primers, all three fractions were pooled in a ratio of 1:5:5 (L:M:H). Multiplexed libraries were sequenced as 50 bp single-end reads on an Illumina sequencer (EMBL, Heidelberg).
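The size windows and pooling ratio above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the handling of the shared boundaries (85 and 120 nt are assigned to the larger fraction, 200 nt is excluded) is an assumption, since gel excision is approximate and the protocol does not specify it. The protocol also does not state whether the 1:5:5 ratio is by volume or by molar amount, so the sketch only splits an abstract total into parts.

```python
from typing import Dict, Optional

# Size windows (in nt) quoted in the protocol. Treating each window as
# half-open [low, high) is an assumption made for this sketch.
FRACTIONS = {"L": (70, 85), "M": (85, 120), "H": (120, 200)}

# Pooling ratio L:M:H quoted in the protocol.
POOL_RATIO = {"L": 1, "M": 5, "H": 5}


def assign_fraction(cdna_length_nt: int) -> Optional[str]:
    """Return the fraction label for a cDNA length, or None if it falls outside all windows."""
    for label, (low, high) in FRACTIONS.items():
        if low <= cdna_length_nt < high:
            return label
    return None


def pooling_parts(total: float) -> Dict[str, float]:
    """Split a total amount into L, M and H shares according to POOL_RATIO."""
    ratio_sum = sum(POOL_RATIO.values())
    return {label: total * parts / ratio_sum for label, parts in POOL_RATIO.items()}
```

For example, `pooling_parts(11.0)` assigns one part to the low fraction and five parts each to the medium and high fractions.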
Santiago's first eiCLIP following Sibley's protocol.
Autosomal-recessive loss of the NSUN2 gene has been identified as a causative link to intellectual disability disorders in humans. NSun2 is an RNA methyltransferase modifying cytosine-5 in transfer RNAs (tRNAs), yet the identification of cytosine methylation in other RNA species has been hampered by the lack of sensitive and reliable molecular techniques. Here, we describe miCLIP as an additional approach for identifying RNA methylation sites in transcriptomes. miCLIP is a customized version of the individual-nucleotide-resolution crosslinking and immunoprecipitation (iCLIP) method. We confirm site-specific methylation in tRNAs and additional messenger and noncoding RNAs (ncRNAs). Among these, vault ncRNAs contained six NSun2-methylated cytosines, three of which were confirmed by RNA bisulfite sequencing. Using patient cells lacking the NSun2 protein, we further show that loss of cytosine-5 methylation in vault RNAs causes aberrant processing into Argonaute-associated small RNA fragments that can function as microRNAs. Thus, impaired processing of vault ncRNA may contribute to the etiology of NSun2-deficiency human disorders.
UV cross-linking and immunoprecipitation (CLIP) and individual-nucleotide resolution CLIP (iCLIP) are methods to study protein-RNA interactions in untreated cells and tissues. Here, we analyzed six published and two novel data sets to confirm that both methods identify protein-RNA cross-link sites, and to identify a slight uridine preference of UV-C-induced cross-linking. Comparing Nova CLIP and iCLIP data revealed that cDNA deletions have a preference for TTT motifs, whereas iCLIP cDNA truncations are more likely to identify clusters of YCAY motifs as the primary Nova binding sites. In conclusion, we demonstrate how each method impacts the analysis of protein-RNA binding specificity.
Fused in sarcoma (FUS) and TAR DNA-binding protein 43 (TDP-43) are RNA-binding proteins pathogenetically linked to amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD), but it is not known if they regulate the same transcripts. We addressed this question using crosslinking and immunoprecipitation (iCLIP) in mouse brain, which showed that FUS binds along the whole length of the nascent RNA with limited sequence specificity to GGU and related motifs. A saw-tooth binding pattern in long genes demonstrated that FUS remains bound to pre-mRNAs until splicing is completed. Analysis of FUS−/− brain demonstrated a role for FUS in alternative splicing, with increased crosslinking of FUS in introns around the repressed exons. We did not observe a significant overlap in the RNA binding sites or the exons regulated by FUS and TDP-43. Nevertheless, we found that both proteins regulate genes that function in neuronal development.
There are ∼650,000 Alu elements in transcribed regions of the human genome. These elements contain cryptic splice sites, so they are in constant danger of aberrant incorporation into mature transcripts. Despite posing a major threat to transcriptome integrity, little is known about the molecular mechanisms preventing their inclusion. Here, we present a mechanism for protecting the human transcriptome from the aberrant exonization of transposable elements. Quantitative iCLIP data show that the RNA-binding protein hnRNP C competes with the splicing factor U2AF65 at many genuine and cryptic splice sites. Loss of hnRNP C leads to formation of previously suppressed Alu exons, which severely disrupt transcript function. Minigene experiments explain disease-associated mutations in Alu elements that hamper hnRNP C binding. Thus, by preventing U2AF65 binding to Alu elements, hnRNP C plays a critical role as a genome-wide sentinel protecting the transcriptome. The findings have important implications for human evolution and disease.
It is generally believed that splicing removes introns as single units from pre-mRNA transcripts. However, some long D. melanogaster introns contain a cryptic site, called a recursive splice site (RS-site), that enables a multi-step process of intron removal termed recursive splicing. The extent to which recursive splicing occurs in other species and its mechanistic basis remain unclear. Here we identify highly conserved RS-sites in genes expressed in the mammalian brain that encode proteins functioning in neuronal development. Moreover, the RS-sites are found in some of the longest introns across vertebrates. We find that vertebrate recursive splicing requires initial definition of an “RS-exon” that follows the RS-site. The RS-exon is then excluded from the dominant mRNA isoform due to competition with a reconstituted 5′ splice site formed at the RS-site after the first splicing step. Conversely, the RS-exon is included when preceded by cryptic exons or promoters that are prevalent in long introns, but which fail to reconstitute an efficient 5′ splice site. Most RS-exons contain a premature stop codon such that their inclusion may decrease mRNA stability. Thus, by establishing a binary splicing switch, RS-sites demarcate different mRNA isoforms emerging from long genes by coupling inclusion of cryptic elements with RS-exons.
mRNA structure is important for post-transcriptional regulation, largely because it affects binding of trans-acting factors. However, little is known about the in vivo structure of full-length mRNAs. Here we present hiCLIP, a high-throughput technique to identify RNA secondary structures interacting with RNA-binding proteins (RBPs) in vivo. Using this technique to investigate RNA structures bound by Staufen 1 (STAU1), we uncover a dominance of intra-molecular RNA duplexes, a depletion of duplexes from coding regions of highly translated mRNAs, an unforeseen prevalence of long-range duplexes in 3′ untranslated regions (UTRs), and a decreased incidence of SNPs in duplex-forming regions. We also discover a duplex spanning 858 nt in the 3′ UTR of the X-box binding protein 1 (XBP1) mRNA that regulates its cytoplasmic splicing and stability. Our study reveals the fundamental role of mRNA secondary structures in gene regulation and introduces hiCLIP as a widely applicable method for discovering novel, especially long-range, RNA duplexes.
Many RNA-binding proteins (RBPs) regulate both alternative exons and poly(A) site selection. To understand their regulatory principles, we developed expressRNA, a web platform encompassing computational tools for integration of iCLIP and RNA motif analyses with RNA-seq and 3′ mRNA sequencing. This reveals at nucleotide resolution the “RNA maps” describing how the RNA binding positions of RBPs relate to their regulatory functions. We use this approach to examine how TDP-43, an RBP involved in several neurodegenerative diseases, binds around its regulated poly(A) sites. Binding close to the poly(A) site generally represses, whereas binding further downstream enhances use of the site, which is similar to TDP-43 binding around regulated exons. Our RNAmotifs2 software also identifies sequence motifs that cluster together with the binding motifs of TDP-43. We conclude that TDP-43 directly regulates diverse types of pre-mRNA processing according to common position-dependent principles.
TDP-43 is a predominantly nuclear RNA-binding protein that forms inclusion bodies in frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis (ALS). The mRNA targets of TDP-43 in the human brain and its role in RNA processing are largely unknown. Using individual-nucleotide resolution UV-crosslinking and immunoprecipitation (iCLIP), we demonstrated that TDP-43 preferentially binds long clusters of UG-rich sequences in vivo. Analysis of TDP-43 RNA binding in FTLD-TDP brains revealed the greatest increases in binding to MALAT1 and NEAT1 non-coding RNAs. We also showed that TDP-43 binding on pre-mRNAs influences alternative splicing in a similar position-dependent manner to Nova proteins. In addition, we identified unusually long clusters of TDP-43 binding at deep intronic positions downstream of silenced exons. A significant proportion of alternative mRNA isoforms regulated by TDP-43 encode proteins that regulate neuronal development or are implicated in neurological diseases, highlighting the importance of TDP-43 for splicing regulation in the brain.
Alu elements are retrotransposons that frequently form new exons during primate evolution. Here, we assess the interplay of splicing repression by hnRNPC and nonsense-mediated mRNA decay (NMD) in the quality control and evolution of new Alu-exons. We identify 3100 new Alu-exons and show that NMD more efficiently recognises transcripts with Alu-exons compared to other exons with premature termination codons. However, some Alu-exons escape NMD, especially when an adjacent intron is retained, highlighting the importance of concerted repression by splicing and NMD. We show that evolutionary progression of 3' splice sites is coupled with longer repressive uridine tracts. Once the 3' splice site at ancient Alu-exons reaches a stable phase, splicing repression by hnRNPC decreases, but the exons generally remain sensitive to NMD. We conclude that repressive motifs are strongest next to cryptic exons and that gradual weakening of these motifs contributes to the evolutionary emergence of new alternative exons.
Ultraviolet (UV) crosslinking and immunoprecipitation (CLIP) identifies the sites on RNAs that are in direct contact with RNA-binding proteins (RBPs). Several variants of CLIP exist, which require different computational approaches for analysis. This variety of approaches can create challenges for a novice user and can hamper insights from multi-study comparisons. Here, we produce data with multiple variants of CLIP and evaluate the data with various computational methods to better understand their suitability. We perform experiments for PTBP1 and eIF4A3 using individual-nucleotide resolution CLIP (iCLIP), employing either UV-C or photoactivatable 4-thiouridine (4SU) combined with UV-A crosslinking, and compare the results with published data. As previously noted, the positions of complementary DNA (cDNA)-starts depend on cDNA length in several iCLIP experiments, and we now find that this is caused by constrained cDNA-ends, which can result from the sequence and structure constraints of RNA fragmentation. These constraints are overcome when fragmentation by RNase I is efficient and when a broad cDNA size range is obtained. Our study also shows that if RNase does not efficiently cut within the binding sites, the original CLIP method is less capable of identifying the longer binding sites of RBPs. In contrast, we show that a broad size range of cDNAs in iCLIP allows the cDNA-starts to efficiently delineate the complete RNA-binding sites. We demonstrate the advantage of iCLIP and related methods that can amplify cDNAs that truncate at crosslink sites, and we show that computational analyses based on cDNA-starts are appropriate for such methods.
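A cDNA-start-based analysis of the kind referred to above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the `Alignment` type and both function names are hypothetical, coordinates are assumed to be 0-based and half-open (BED-style), reads are assumed to be in the sense orientation of the RNA, and the crosslink site is taken to be the nucleotide immediately upstream of the cDNA-start, the usual convention for cDNAs that truncate at the crosslink site.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Alignment(NamedTuple):
    """A mapped read in 0-based, half-open coordinates (hypothetical type)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def crosslink_position(aln: Alignment) -> int:
    """Return the 0-based position one nucleotide upstream of the cDNA-start."""
    if aln.strand == "+":
        # The read begins at aln.start; upstream is the preceding base.
        return aln.start - 1
    if aln.strand == "-":
        # On the minus strand the read begins at aln.end - 1; upstream is aln.end.
        return aln.end
    raise ValueError(f"unknown strand: {aln.strand!r}")


def count_crosslinks(alignments: Iterable[Alignment]) -> Counter:
    """Count cDNA-starts per (chrom, position, strand) crosslink site."""
    counts: Counter = Counter()
    for aln in alignments:
        key: Tuple[str, int, str] = (aln.chrom, crosslink_position(aln), aln.strand)
        counts[key] += 1
    return counts
```

Because only the cDNA-start is used, two reads of different lengths that begin at the same position contribute to the same crosslink site. The sketch does not collapse PCR duplicates, which a real analysis would need to handle separately.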
This experiment compares spliceosome positioning in K562, HepG2 and HEK293 cell lines using iCLIP with an antibody against SmB/B’. Mild and medium-stringency washing conditions are compared. Cells were UV-crosslinked on ice and subjected to iCLIP analysis. RNase I was added to the cell lysate for RNA fragmentation. Anti-SmB/B’ antibody pre-coupled to magnetic Dynabeads was used to isolate protein-RNA complexes, and the RNA was ligated to the L3 adaptor. The complexes were then size-separated by SDS-PAGE and visualised. cDNA was synthesized with SuperScript IV Reverse Transcriptase and then circularised. After PCR amplification, the libraries underwent quality control before sequencing.