Draft September 15, 2018 The spliceosome in constitutive splicing

The spliceosome in constitutive splicing

Patrizia Fabrizio and Reinhard Lührmann

Key words: Splicing complexes, Steps of splicing, Catalytic mechanism, Evolution of the spliceosome, RNPs

Metazoan genes are transcribed into precursor mRNAs (pre-mRNAs) that are converted to messenger RNA (mRNA) by an essential step of eukaryotic gene expression, termed splicing. During pre-mRNA splicing, noncoding sequences (introns) are removed from the primary transcript and coding sequences (exons) are ligated together to form mRNA. Some pre-mRNAs are constitutively spliced, which means that all exons are joined together in the order in which they occur in the pre-mRNA. Many exons, on the other hand, are ‘alternatively’ spliced (i.e. some exons may be skipped during the splicing reaction), and this means that a given pre-mRNA can give rise to any one of a variety of possible mRNAs. The proteomic complexity of many higher eukaryotes is achieved in part by alternative splicing, which expands the number of mRNAs and proteins that are generated from a single gene [1].

Splicing is catalysed by an elaborate and dynamic multi-megadalton ribonucleoprotein (RNP) machine, termed the spliceosome. Most pre-mRNA introns are removed by the U2-dependent (major) spliceosome that is found in all eukaryotes. The less abundant U12-dependent (minor) spliceosome, on the other hand, splices a rare class of pre-mRNA introns that is found in only a subset of eukaryotes. Here, we will focus on the function of the major spliceosome in constitutive splicing and we will refer to both the human and the yeast (Saccharomyces cerevisiae) systems, since pre-mRNA splicing has been extensively characterised in both of these organisms.

The mechanism of splicing

Within an assembled spliceosome, intron removal from the pre-mRNA substrate proceeds by way of two transesterification reactions (Figure 1A). In the first step, the 2’ hydroxyl group of the so-called ‘branch site’ (BS) adenosine carries out a nucleophilic attack on the 5’ splice site (5’SS), breaking the phosphodiester bond and simultaneously forming the 2’-5’ phosphodiester linkage between the BS and the 5' terminal nucleotide of the intron. The products of the first step are thus the free 5’ exon and the intron-3’ exon intermediate. In the second step, the newly released 3’ hydroxyl of the 5’ exon created in the first step attacks the 3’ splice site (3’SS), breaking the phosphodiester bond there, and forming a new phosphodiester bond between the 5’ exon and the 3’ exon. Thus, the products of the second step are the ligated exons and the free intron, the latter being released in the form of a ‘lariat’. Following completion of the second step, the spliceosome dissociates and its components are recycled for further rounds of splicing.
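The bookkeeping of the two steps can be followed on a plain RNA string. The sketch below is only an illustration and makes several assumptions that are not part of the text above: the function name `splice`, the index conventions, and the short example sequence (with a yeast-like UACUAAC branch site) are all hypothetical, and no chemistry or splice-site recognition is modelled.

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Toy model of the two transesterification steps on an RNA string.

    five_ss  : index of the first intron nucleotide (hypothetical convention)
    branch   : index of the branch-site adenosine
    three_ss : index of the first nucleotide of the 3' exon
    """
    if not 0 < five_ss < branch < three_ss < len(pre_mrna):
        raise ValueError("expected 0 < five_ss < branch < three_ss < len(pre_mrna)")
    if pre_mrna[branch] != "A":
        raise ValueError("the branch-site nucleotide must be an adenosine")

    # Step 1: the branch-site A attacks the 5'SS. Products: the free 5' exon
    # and the intron-3' exon intermediate (the intron is now a lariat).
    five_exon = pre_mrna[:five_ss]
    intron_3exon = pre_mrna[five_ss:]

    # Step 2: the 3' OH of the 5' exon attacks the 3'SS. Products: the
    # ligated exons and the excised lariat intron.
    intron_length = three_ss - five_ss
    three_exon = intron_3exon[intron_length:]
    mrna = five_exon + three_exon

    # The lariat consists of a loop (5' end of the intron up to and including
    # the branch-site A) and a linear tail (the rest of the intron).
    lariat_loop = pre_mrna[five_ss:branch + 1]
    lariat_tail = pre_mrna[branch + 1:three_ss]

    return {
        "step1": (five_exon, intron_3exon),
        "mrna": mrna,
        "lariat_loop": lariat_loop,
        "lariat_tail": lariat_tail,
    }


# Hypothetical example sequence, chosen only to make the indices easy to follow.
EXON1 = "AUGGCC"
INTRON = "GUAUGU" + "UACUAAC" + "UUUUAG"
EXON2 = "GGUUAA"

five_ss = len(EXON1)
branch = five_ss + INTRON.index("UACUAAC") + 5  # second A of the UACUAAC box
three_ss = five_ss + len(INTRON)

products = splice(EXON1 + INTRON + EXON2, five_ss, branch, three_ss)
print(products["mrna"])
```

The loop and tail together account for the whole intron, which reflects the fact that no nucleotides are lost or gained in either transesterification.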

The chemistry of splicing, i.e., two consecutive transesterification reactions (in which one phosphodiester linkage is traded for another), is quite straightforward. Indeed, such reactions can occur without the assistance of any protein cofactors [2]. Why, then, does such a “simple” reaction require such an elaborate array of cofactors? Although RNA may be directly involved in catalysis, spliceosomal proteins are not mere passive building blocks in this process; proteins carry out essential recognition and catalytic functions during the assembly of the spliceosome and the catalytic reactions [3, 4], including conformational changes, and there is evidence that the spliceosome is actually an RNP enzyme [3]. Indeed, the 5′SS, BS, and 3′SS of nuclear pre-mRNA introns are defined by only very short consensus sequences that in metazoans, in contrast to yeast, are very poorly conserved (Figure 1B). As a consequence, introns contain relatively little secondary or tertiary structural information. Therefore, the folding of nuclear pre-mRNA introns in a manner conducive to splicing is dependent upon a multitude of trans-acting factors that make up the spliceosome. During splicing, the spliceosome must overcome a number of challenges. These include the correct recognition and pairing of the splice sites within a multitude of similar sequences, as well as the positioning of these splice sites – within atomic distance from one another – that allows the transesterification reactions to proceed. Solutions to these problems come from the large number of subunits in the spliceosome and the principles by which the various protein and RNA players are brought together on the substrate pre-mRNA [5, 6].

As most splice-site consensus sequences are relatively degenerate in higher eukaryotes, where alternative splicing is predominant, it follows that splice sites alone are not capable of efficiently directing spliceosome assembly. The recognition and selection of splice sites is in most cases also influenced by flanking pre-mRNA regulatory sequences – so-called intronic and exonic splicing enhancers or silencers – that can have positive or negative effects on splice-site usage [7]. These cis-acting elements mediate their effects primarily by functioning as binding sites for trans-acting regulatory factors that in turn recruit the snRNP subunits of the splicing machinery to the adjacent splice site or, in the case of negative regulators, prevent their association. Exonic splicing enhancers (ESEs) are often bound by serine/arginine-rich (SR) proteins, whereas exonic splicing silencers (ESSs) are typically bound by heterogeneous nuclear ribonucleoproteins (hnRNPs). Ultimately, it is the sum of numerous factors, some exerting positive effects and others exerting negative effects, that decides whether a particular site is recognised by the spliceosome for inclusion of the adjacent exon in the mRNA product [6].

The stepwise assembly pathway of the spliceosome

Unlike most other enzymes, the spliceosome does not have a pre-formed active site: on the contrary, the catalytic centre must be assembled anew on each pre-mRNA intron by the stepwise interaction of the U1, U2, U4/U6 and U5 snRNPs and numerous non-snRNP splicing factors [5]. The snRNPs are the main building blocks of the spliceosome. Each of these consists of an snRNA molecule (or two in the case of the U4/U6 snRNP), seven Sm proteins (B/B’, D3, D2, D1, E, F, and G) that are shared by all of the spliceosomal snRNPs and several other, particle-specific proteins (Figure 2) [5]. The sequences of the spliceosomal snRNAs – in particular those regions engaging in base-pairing interactions – and to some extent their secondary structures, are highly conserved evolutionarily [8]. To date all of the major spliceosomal snRNPs have been purified from human and yeast cell extracts and their protein compositions have been determined. Proteins associated with the human U1, U2 and U4/U6.U5 snRNPs under physiological conditions (i.e., 150 mM salt) are summarised in Figure 2. S. cerevisiae U1 snRNP contains seven additional proteins designated Prp39, Prp40, Prp42, Snu56, Snu71, Nam8 and Luc7 [9, 10]. With the exception of the protein 52K (in yeast, Lin1), all human U5 proteins are also found in purified U4/U6.U5 tri-snRNPs. Unlike human Prp28, the S. cerevisiae protein Prp28 is not detected in yeast tri-snRNPs [11, 12]. All human U4/U6 proteins are also present in the tri-snRNP, whereas the yeast Prp31 is found only in purified yeast tri-snRNP and not in U4/U6. Prp6 is associated with the S. cerevisiae tri-snRNP, but not with U5 [11, 12].

The RNA and protein components of the snRNPs play critical roles in splice-site recognition, in the assembly and catalytic activation of the spliceosome, and in splicing catalysis per se. The basic steps of the splicing cycle, whereby a single intron is removed from the pre-mRNA being processed, are shown in Figure 3. Initially, the U1 snRNP interacts with the 5’SS of the pre-mRNA to form the so-called E complex. The U2 snRNP then associates stably with the branch site, generating the A complex (Figure 3). Subsequent recruitment of the U4/U6 and U5 snRNPs, in the form of a preformed U4/U6.U5 tri-snRNP, yields the B complex. After major conformational and compositional rearrangements, including the release of U1 and U4, an activated complex termed Bact is formed. Bact must be remodelled and transformed into the catalytically activated complex B*; this complex is able to catalyse the first step of splicing, which generates complex C. After additional rearrangements, including a conformational change in the U2 snRNA and repositioning of the reaction intermediate within the catalytic centre of the spliceosome [13], complex C catalyses the second step, after which the spliceosome dissociates: the mRNA product is released, and the excised intron remains bound to U2, U5 and U6. Finally, these snRNPs also dissociate and can then take part in the next round of splicing.

Dynamics of the spliceosomal RNA-RNA rearrangements

The spliceosome is a particularly dynamic RNP machine that undergoes many changes in composition and conformation. At the molecular level most of these changes consist of the remodelling of base-pairing patterns between pre-mRNA and snRNA, and among the snRNAs. Before activation of the spliceosome – i.e., at the stage of complex B – the U1 snRNA base-pairs with the conserved sequence at the 5'SS, and the U2 snRNA pairs with the BS (Figure 4A). Most of the highly conserved sequences of the U6 snRNA are essential components of the spliceosome’s active site, but in order to prevent premature activation, these regions of U6 are sequestered by base-pairing with U4, which thus functions as an ‘anti-sense negative regulator’. When appropriate signals have been recognised, U4 is actively displaced by an ATP-dependent mechanism (see below), allowing U6 to refold into a catalytically active conformation and to base-pair with intron nucleotides at the 5’SS, replacing U1 in the process. U6 also forms short RNA–RNA duplexes with U2 and an intramolecular U6 stem-loop (U6-ISL). These RNA structures involving U2 and U6 snRNA play crucial roles in the catalytic core of the spliceosome, with nucleotides of U6 directly involved in the catalysis of pre-mRNA splicing (the so-called “U6 ribozyme hypothesis”) [14]. The U5 snRNA is initially in contact with nucleotides of the 5' exon near the 5'SS, and later also the 3' exon, which assists in correctly positioning the exons for the second step of splicing (see below, Figure 5A and B). Little is known about the precise timing of the formation of the various RNA–RNA interactions and their rearrangements during splicing. Moreover, both in human and yeast spliceosomes, there is still a paucity of information about conformational RNA interactions.

Splice-site recognition and pairing involves the co-ordinated action of RNA and proteins

A major task of the spliceosome is the recognition and pairing of the correct 5’ and 3’SS. During spliceosome assembly, the splice sites and branch site are recognised several times by both proteins and snRNAs, and thus both contribute to ensuring the remarkable precision of the splicing reaction. Many functionally important interactions within the spliceosome are weak, but the overall stability of a particular complex is enhanced by the combination of several weak interactions. This ensures that the spliceosome responds quickly to regulatory signals. Specifically, several recognition events occur at the 5’SS. The U1 snRNP binds to the 5’SS of the intron through base-pairing interactions of the 5′ end of the U1 snRNA. This interaction in higher eukaryotes is stabilised by members of the SR protein family and by the U1-associated 70K and C proteins. Indeed, since most of the functionally important RNA–RNA interactions formed within the spliceosome are weak, they generally require the assistance of proteins to enhance their stability. In addition to the U1-5′SS interaction, the earliest assembly phase of the spliceosome also involves the binding of the SF1/BBP protein and the U2 auxiliary factor (U2AF) to the BS and the polypyrimidine tract just downstream of the BS, respectively (Figure 4B). These proteins bind cooperatively, with SF1/BBP interacting with the 65 kDa subunit of U2AF (U2AF65). In addition, the 35 kDa subunit of U2AF (U2AF35) binds the AG dinucleotide of the 3′SS. Together, these molecular interactions yield the spliceosomal E complex and play crucial roles in the initial recognition of the 5′SS and 3′SS of an intron [6].
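The statement that several weak interactions combine into a stable complex can be made concrete with a standard thermodynamic idealisation. The relation below is a sketch that assumes the individual contacts contribute independently and additively to the binding free energy; the symbols and the numerical values are illustrative assumptions, not measurements for any spliceosomal complex.

```latex
\Delta G_{\text{total}} \approx \sum_{i=1}^{n} \Delta G_i ,
\qquad
K_{\text{total}} = e^{-\Delta G_{\text{total}}/RT}
                 = \prod_{i=1}^{n} e^{-\Delta G_i/RT}
                 = \prod_{i=1}^{n} K_i .

% Illustrative numbers: n = 3 contacts of -3 kcal/mol each, RT \approx 0.6 kcal/mol
K_i = e^{3/0.6} = e^{5} \approx 1.5 \times 10^{2},
\qquad
K_{\text{total}} = e^{15} \approx 3.3 \times 10^{6} .
```

Under these assumptions each contact on its own is weak and readily reversible, whereas the combination is several orders of magnitude more stable, while the loss of a single contact is enough to destabilise the complex again.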

During formation of the spliceosomal A complex, the U2 snRNA engages in an ATP-dependent manner in a base-pairing interaction with the pre-mRNA's BS, displacing SF1/BBP. This base-pairing interaction is stabilised by heteromeric protein complexes of the U2 snRNP, namely SF3a and SF3b, and also by U2AF65 (Figure 4B). In higher eukaryotes the BS adenosine is now bound by the protein SF3b14a/p14, while U2AF65 interacts with the protein SF3b155. These RNP rearrangements, occurring as they do at an early stage of the splicing process, are relatively well understood; however, RNP rearrangements associated with subsequent steps of spliceosome assembly and catalytic activation are less well characterised. For example, little is known about how, during the activation of the spliceosome, U1 is replaced at the 5’SS by the U6 snRNA, or about what determines the contact of U5 snRNA and Prp8 with nucleotides at or near the 5’SS and the 5’ exon. Also poorly understood are the steps leading to the dissociation of U2AF35 from the 3′SS and the replacement of a 3′SS interaction in later stages of splicing by a different set of factors after the first transesterification reaction.

Driving forces and molecular switches required during spliceosome activation and catalysis

During spliceosome assembly, the dynamic network of RNA–RNA interactions (e.g. Figure 4A) plays a central part in juxtaposing the reactive groups of the pre-mRNA. The dynamic remodelling of RNA–RNA and RNA–protein interactions during spliceosome assembly, dissociation and catalysis requires appropriate driving forces and molecular switches. These functions are carried out primarily by DExD/H-type RNA-dependent ATPases/helicases. Eight of these helicases (Sub2/UAP56, Prp5, Prp28/U5-100K, Brr2/U5-200K, Prp2, Prp16, Prp22, and Prp43) are evolutionarily conserved between yeast and human and act at specific steps of splicing during formation of the spliceosomal RNA/RNP network [4, 15]. By stimulating conformational transitions within the spliceosome, DExD/H-type ATPases play an integral part in the maintenance of splicing fidelity.

Initially, when U1 snRNA base-pairs with the 5’SS, Prp5 and probably Sub2/UAP56 mediate the entry or stabilisation of U2 snRNP at the BS. Two additional helicases and two evolutionarily conserved proteins which are components of the U5 and U4/U6.U5 tri-snRNP, Prp8 and Snu114, are required for the transition from the B to the Bact complex, during which activation takes place. First, Prp28 mediates the transfer of the 5'SS from U1 to U6. Unlike the other RNA helicases, which interact only transiently with the spliceosome, Brr2 is a core component of the U5 and U4/U6.U5 tri-snRNP [11], and a component of the spliceosome throughout the splicing cycle [16, 17], suggesting that Brr2 requires regulation at several steps. Brr2 is required for the unwinding of the U4/U6 duplex, a process that allows the U6 snRNA to base-pair with the U2 snRNA [18, 19], and again during the dissociation of the spliceosome [20]. Prp8, one of the spliceosomal proteins most highly conserved in evolution, interacts with Brr2, and the ubiquitinated form of Prp8 represses Brr2 helicase activity [21]. Thus, post-translational modification of Prp8 probably acts as a switch to regulate the activity of Brr2. Snu114, the homologue of the ribosomal translocase EF-2 GTPase [22], also modulates Brr2 activity [20, 23]. It has been shown that the GTPase Snu114 mediates the regulation of spliceosome activation [23] and disassembly [20]. Specifically, both the unwinding of U4/U6 and the dismantling of the post-splicing U2/U6.U5 intron complex are repressed by Snu114 bound to GDP and activated by Snu114 bound to GTP [20]. Despite the fact that Snu114 is homologous to the ribosomal translocase EFG/EF-2 [22], these findings suggest that Snu114 functions as a classical regulatory G protein. In summary, the combined action of these enzymes yields the Bact complex. The final catalytic activation of Bact to yield B* requires the RNA helicase Prp2. The B* complex catalyses step 1 of splicing, yielding the C complex.
After a further remodelling step, which requires the RNA helicase Prp16, complex C catalyses the second step. The spliced mRNA is released from the excised intron/post-spliceosomal complex, a process which requires the RNA helicase Prp22. Finally, Prp43 in cooperation with Brr2 and Snu114 promotes the dissociation of the U2, U5 and U6 snRNPs from the excised intron (Figure 2) [20].
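The order in which the helicases act can be summarised as a simple checkpoint table. The sketch below is a toy representation only: the complex names E, A, B, Bact, B* and C and the helicase assignments follow the text, whereas the labels for the later stages, the names `TRANSITIONS` and `furthest_stage`, and the all-or-nothing treatment of each requirement are assumptions made for illustration. Regulators that are not helicases (Prp8, Snu114) are left out.

```python
# Each entry: (complex before, complex after, helicases required for the transition).
# An empty set means that the text names no helicase for that transition.
TRANSITIONS = [
    ("E", "A", {"Prp5", "Sub2/UAP56"}),        # U2 snRNP entry at the branch site
    ("A", "B", set()),                         # tri-snRNP recruitment
    ("B", "Bact", {"Prp28", "Brr2"}),          # 5'SS handed from U1 to U6, U4/U6 unwound
    ("Bact", "B*", {"Prp2"}),                  # final catalytic activation
    ("B*", "C", set()),                        # first catalytic step
    ("C", "post-catalytic", {"Prp16"}),        # remodelling, then second step
    ("post-catalytic", "mRNA released", {"Prp22"}),
    ("mRNA released", "disassembled", {"Prp43", "Brr2"}),
]

ALL_HELICASES = set().union(*(required for _, _, required in TRANSITIONS))


def furthest_stage(available, start="E"):
    """Return the last stage reached when only `available` helicases are present."""
    available = set(available)
    stage = start
    for source, target, required in TRANSITIONS:
        if source != stage:
            continue
        if not required <= available:
            break  # the cycle stalls at the first unmet requirement
        stage = target
    return stage


print(furthest_stage(ALL_HELICASES))             # complete cycle
print(furthest_stage(ALL_HELICASES - {"Prp2"}))  # stalls before catalytic activation
```

Removing a single helicase from the set shows at which complex this simplified cycle would stall, which is the same logic used experimentally to arrest spliceosomes at defined stages.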

Several DExD/H-box proteins such as Prp5, Prp2, Prp16 and Prp22 couple RNP rearrangements with proof-reading functions that ensure the fidelity of the splicing process. These enzymes facilitate the progression of the splicing process when a given step is accurately carried out and/or allow the discard of substrates or intermediates that are aberrant and thus not rapidly used as substrates during the subsequent step [24]. While these observations provide highly interesting initial insight into the problem of how the spliceosome may discriminate against aberrant substrates, more mechanistically oriented questions, such as how the DExD/H-box ATPases use ATP to enhance fidelity, or how aberrant substrates are ejected from the spliceosome, cannot currently be answered.

A conformational two-state model for the spliceosome’s catalytic centre

Since the substrates for the two chemical reactions are different (Figure 5A and B), a spatial rearrangement of the substrate(s) and/or enzyme at the catalytic centre is necessary to reposition the splicing intermediates generated during the first catalytic step, so that the reactive groups involved in the second step are brought closer together. Thus, the spliceosome must be pictured as existing in two distinct conformational states during the catalytic phase, binding the substrates differently for the two steps [25]. Understanding this repositioning may also help in identifying key spliceosomal components involved in catalysis [26].

Consistent with this idea, a large number of mutations in spliceosomal factors (Prp8, Prp16, U6 snRNA, Isy1) alter the relative efficiencies of the first and second steps. Analogously to the ribosome, where the decoding by tRNA involves transitions between open and closed conformations at the 30S subunit's A site that are modulated by the stability of interface contacts, it has been suggested that the catalytic centre of the spliceosome may likewise ‘toggle’ between open and closed states during the catalytic phase. Similarly, it has been suggested that the first and second catalytic steps require different conformational states of the spliceosome during the catalytic phase [25]. As the ATPases Prp2 and Prp16 are required for activating the spliceosome prior to the first and second catalytic steps, respectively, the equilibrium between these conformations is probably modulated by these factors, which most probably play a major role in facilitating conformational changes of the catalytic centre and thus also in positioning the substrates at the active site. Specifically, the Prp16 ATPase facilitates the transition between the first and second steps, and, as a result, it also provides an opportunity for discarding substrates that do not proceed efficiently to the second step. This modulation of transition and opportunity for discarding probably occurs at several points in both the assembly and post-catalytic phases. However, very little is currently known about the nature of these remodelling steps. The two-state model of the catalytic spliceosome has also been extremely helpful in reconciling the effects of various splicing factors with respect to the fidelity with which the spliceosome discriminates against aberrant introns. Guthrie and collaborators proposed and tested the idea that the DExD/H-box ATPase Prp16 functions as an ATP-dependent “proofreading clock” [27].
This paradigm, linking fidelity and ATP hydrolysis, remains a very exciting theme in the splicing field [28, 29].
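The idea of a proofreading clock can be illustrated with a minimal kinetic partitioning calculation. The sketch below assumes that, for a given substrate, a productive step with rate `k_step` competes with an ATP-dependent rejection with rate `k_reject`, and that both behave as simple first-order processes; the function name, the rate names and all numerical values are hypothetical and are not taken from the cited studies.

```python
def fraction_proceeding(k_step, k_reject):
    """Probability that the productive step occurs before rejection.

    Assumes two competing first-order processes, so the outcome is decided
    by the ratio of the two rates.
    """
    if k_step < 0 or k_reject <= 0:
        raise ValueError("k_step must be non-negative and k_reject positive")
    return k_step / (k_step + k_reject)


# Hypothetical rates in arbitrary units: the same clock acts on both substrates.
K_REJECT = 1.0
optimal = fraction_proceeding(k_step=10.0, k_reject=K_REJECT)
aberrant = fraction_proceeding(k_step=0.1, k_reject=K_REJECT)

print(f"optimal substrate proceeds:  {optimal:.3f}")
print(f"aberrant substrate proceeds: {aberrant:.3f}")
```

In this simplified picture the ATPase does not need to recognise an aberrant substrate directly: a substrate that reacts slowly relative to the clock is preferentially discarded, which is how a link between ATP hydrolysis and fidelity can arise.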