14.4 mRNA Processing
Melissa Hardy and Elizabeth Dahlhoff
Learning Objectives
By the end of this section, you will be able to do the following:
- Describe the different steps in RNA processing.
- Understand the significance of exons, introns, and splicing.
mRNA Processing
After transcription, eukaryotic pre-mRNA undergoes extensive processing before it is ready to be translated. Eukaryotic protein-coding sequences are not continuous, as they are in prokaryotes. The coding sequences, called exons, are interrupted by noncoding introns, which must be removed to make a translatable mRNA. The additional steps involved in eukaryotic mRNA maturation also create a molecule with a much longer half-life than a prokaryotic mRNA. Eukaryotic mRNAs last for several hours, whereas the typical E. coli mRNA lasts no more than five seconds.
Pre-mRNAs are first coated in RNA-stabilizing proteins; these protect the pre-mRNA from degradation while it is processed and exported out of the nucleus. The three most important steps of pre-mRNA processing are the addition of stabilizing and signaling factors at the 5′ and 3′ ends of the molecule, and the removal of the introns.

5′ Capping
While the pre-mRNA is still being synthesized, a 7-methylguanosine cap, also called the 5′ cap, is added to the 5′ end of the growing transcript by a 5′-to-5′ triphosphate linkage. The cap protects the nascent mRNA from degradation. In addition, factors involved in protein synthesis recognize the cap to help initiate translation by ribosomes.
3′ Poly-A Tail
Once elongation is complete, the pre-mRNA is cleaved by an endonuclease between an AAUAAA consensus sequence and a GU-rich sequence, leaving the AAUAAA sequence on the pre-mRNA. An enzyme called poly-A polymerase then adds a string of approximately 200 A residues, called the poly-A tail. This modification further protects the pre-mRNA from degradation and is also the binding site for a protein necessary for exporting the processed mRNA to the cytoplasm.
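The cleavage and polyadenylation steps described above can be sketched as a small string operation. This is a toy model, not a description of the real enzymes: the function name, the `gap` parameter (how many nucleotides are kept after the AAUAAA signal), and the example transcript are all assumptions made for illustration.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(pre_mrna: str, tail_length: int = 200, gap: int = 15) -> str:
    """Cut the transcript downstream of the first AAUAAA and append a poly-A tail.

    `gap` is a hypothetical number of nucleotides retained after the signal;
    the text only says the cut falls between AAUAAA and a GU-rich sequence.
    """
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found in transcript")
    # The cut site lies after the signal, so AAUAAA stays on the pre-mRNA.
    cut = min(signal_start + len(POLY_A_SIGNAL) + gap, len(pre_mrna))
    return pre_mrna[:cut] + "A" * tail_length


# Made-up transcript: coding region, AAUAAA signal, spacer, GU-rich region.
transcript = "GGCAUGCC" + "AAUAAA" + "C" * 15 + "GUGUGUGUUU"
processed = cleave_and_polyadenylate(transcript)
print(len(processed))           # length of the retained transcript plus the tail
print("AAUAAA" in processed)    # the signal is retained
print("GUGUGU" in processed)    # the GU-rich region is cut away
```

In this sketch the tail length defaults to 200 to match the approximate figure given in the text; real tails vary in length.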
RNA Editing
Other editing may occur in mRNA. The trypanosomes are a group of protozoa that include the pathogen Trypanosoma brucei, which causes sleeping sickness in humans. Trypanosomes, and virtually all other eukaryotes, have organelles called mitochondria that supply the cell with chemical energy. Mitochondria express their own DNA and are believed to be the remnants of a symbiotic relationship between a eukaryote and an engulfed prokaryote. The mitochondrial DNA of trypanosomes exhibits an interesting exception to the central dogma: its pre-mRNAs do not have the correct information to specify a functional protein. Usually, this is because the pre-mRNA is missing several U nucleotides. The cell performs an additional RNA processing step called RNA editing to remedy this.
(Figure credit: modification of work by Torsten Ochsenreiter)
Other genes in the mitochondrial genome encode 40- to 80-nucleotide guide RNAs. One or more of these molecules interacts by complementary base pairing with some of the nucleotides in the pre-mRNA transcript. However, the guide RNA has more A nucleotides than the pre-mRNA has U nucleotides to bind with. In these regions, the guide RNA loops out. The 3′ ends of guide RNAs have a long poly-U tail, and these U bases are inserted in regions of the pre-mRNA transcript at which the guide RNAs are looped. This process is entirely mediated by RNA molecules. That is, guide RNAs—rather than proteins—serve as the catalysts in RNA editing.
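The insertion logic in this paragraph can be illustrated with a toy model. It is only a sketch under strong assumptions: the guide RNA is written in the order its bases face the pre-mRNA (so antiparallel orientation is already accounted for), only standard A-U and G-C pairs are allowed, and the function name and sequences are invented for illustration.

```python
# Standard base pairs as (pre-mRNA base, guide base); wobble pairs are ignored here.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def edit_with_guide(pre_mrna: str, guide_in_pairing_order: str) -> str:
    """Insert U wherever a guide A has no partner in the pre-mRNA."""
    edited = []
    i = 0  # current position in the pre-mRNA
    for guide_base in guide_in_pairing_order:
        if i < len(pre_mrna) and (pre_mrna[i], guide_base) in PAIRS:
            edited.append(pre_mrna[i])  # bases pair: keep the pre-mRNA base
            i += 1
        elif guide_base == "A":
            edited.append("U")          # guide loops out here: insert a U
        else:
            raise ValueError("guide does not match the pre-mRNA in this toy model")
    edited.extend(pre_mrna[i:])         # anything beyond the guide is unchanged
    return "".join(edited)


unedited = "GAGAC"                      # made-up transcript missing three U's
guide = "CUAACAUG"                      # made-up guide, in pairing order
print(edit_with_guide(unedited, guide))
```

The edited transcript ends up fully complementary to the guide, which is the point of the mechanism as described: the guide carries the information that the pre-mRNA lacks.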
RNA editing is not just a phenomenon of trypanosomes. In the mitochondria of some plants, almost all pre-mRNAs are edited. RNA editing has also been identified in mammals such as rats, rabbits, and even humans. What could be the evolutionary reason for this additional step in pre-mRNA processing? One possibility is that the mitochondria, being remnants of ancient prokaryotes, have an equally ancient RNA-based method for regulating gene expression. In support of this hypothesis, edits made to pre-mRNAs differ depending on cellular conditions. Although speculative, the process of RNA editing may be a holdover from a primordial time when RNA molecules, instead of proteins, were responsible for catalyzing reactions.
Splicing
Eukaryotic genes are composed of exons, which correspond to protein-coding sequences (ex-on signifies that they are expressed), and intervening sequences called introns (int-ron denotes their intervening role), which may be involved in gene regulation but are removed from the pre-mRNA during processing. Intron sequences in mRNA do not encode functional proteins.
The discovery of introns came as a surprise to researchers in the 1970s who expected that pre-mRNAs would specify protein sequences without further processing, as they had observed in prokaryotes. The genes of higher eukaryotes very often contain one or more introns. These regions may correspond to regulatory sequences; however, the biological significance of having many introns or having very long introns in a gene is unclear. It is possible that introns slow down gene expression because it takes longer to transcribe pre-mRNAs with lots of introns. Alternatively, introns may be nonfunctional sequence remnants left over from the fusion of ancient genes throughout the course of evolution. This is supported by the fact that separate exons often encode separate protein subunits or domains. For the most part, the sequences of introns can be mutated without ultimately affecting the protein product.
All of a pre-mRNA’s introns must be completely and precisely removed before protein synthesis. If the process errs by even a single nucleotide, the reading frame of the rejoined exons would shift, and the resulting protein would be dysfunctional. The process of removing introns and reconnecting exons is called splicing. Introns are removed and degraded while the pre-mRNA is still in the nucleus. Splicing occurs by a sequence-specific mechanism that ensures introns will be removed and exons rejoined with single-nucleotide accuracy. Although the intron itself is noncoding, the beginning and end of each intron are marked with specific nucleotides: GU at the 5′ end and AG at the 3′ end of the intron. The splicing of pre-mRNAs is conducted by complexes of proteins and RNA molecules called spliceosomes.
Note that a single gene can contain more than 70 introns, and each one has to be spliced out, in addition to 5′ capping and the addition of a poly-A tail, just to generate a single, translatable mRNA molecule.
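The need for single-nucleotide accuracy in splicing can be seen in a short sketch. The sequences, function names, and the use of explicit intron coordinates are assumptions made for illustration; a real spliceosome recognizes more sequence context than the GU and AG boundaries checked here.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open index pairs and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Each intron is expected to begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def codons(mrna: str) -> list[str]:
    """Split an mRNA into complete three-nucleotide codons, dropping any remainder."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Made-up pre-mRNA: exon 1, one intron (GU...AG), exon 2.
exon1, intron, exon2 = "AUGGCU", "GUAAGUCCUUCAG", "GAAUGGUAA"
pre = exon1 + intron + exon2
start, end = len(exon1), len(exon1) + len(intron)

mature = splice(pre, [(start, end)])
print(codons(mature))    # ['AUG', 'GCU', 'GAA', 'UGG', 'UAA']

# A cut that removes one extra nucleotide shifts every downstream codon.
shifted = pre[:start] + pre[end + 1:]
print(codons(shifted))   # ['AUG', 'GCU', 'AAU', 'GGU']
```

In the shifted version the first two codons are unchanged, but every codon after the splice junction differs and the stop codon is lost, which is why a one-nucleotide error is enough to ruin the protein.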
Section Summary
Eukaryotic pre-mRNAs are modified with a 5′ methylguanosine cap and a poly-A tail. These structures protect the mature mRNA from degradation and help export it from the nucleus. Pre-mRNAs also undergo splicing, in which introns are removed and exons are reconnected with single-nucleotide accuracy. Only finished mRNAs that have undergone 5′ capping, 3′ polyadenylation, and intron splicing are exported from the nucleus to the cytoplasm. Rarely, RNA editing is also performed to insert missing bases after an mRNA has been synthesized.
Review Question
Errors in splicing are implicated in cancers and other human diseases. What kinds of mutations might lead to splicing errors? Think of different possible outcomes if splicing errors occur.
Mutations in the spliceosome recognition sequence at each end of the intron, or in the proteins and RNAs that make up the spliceosome, may impair splicing. Mutations may also add new spliceosome recognition sites. Splicing errors could lead to introns being retained in spliced RNA, exons being excised, or changes in the location of the splice site.
Glossary
- 7-methylguanosine cap: modification added to the 5′ end of pre-mRNAs to protect mRNA from degradation and assist translation
- anticodon: three-nucleotide sequence in a tRNA molecule that corresponds to an mRNA codon
- exon: sequence present in protein-coding mRNA after completion of pre-mRNA splicing
- intron: non–protein-coding intervening sequences that are spliced from mRNA during processing
- poly-A tail: modification added to the 3′ end of pre-mRNAs to protect mRNA from degradation and assist mRNA export from the nucleus
- RNA editing: direct alteration of one or more nucleotides in an mRNA that has already been synthesized
- splicing: process of removing introns and reconnecting exons in a pre-mRNA
Text adapted from OpenStax Biology 2e and used under a Creative Commons Attribution License 4.0.
Access for free at https://openstax.org/books/biology-2e/pages/1-introduction
Media Attributions
- mrna processing © OpenStax is licensed under a CC BY (Attribution) license
The Central Dogma: DNA Encodes RNA; RNA Encodes Protein
The flow of genetic information in cells from DNA to mRNA to protein is described by the central dogma, which states that genes specify the sequence of mRNAs, which in turn specify the sequence of amino acids making up all proteins. Gene expression is the process of using information from a gene to make a functional product, such as a protein. Because the information stored in DNA is so central to cellular function, it makes sense that the cell would make mRNA copies of this information for protein synthesis, while keeping the DNA itself intact and protected.
Transcription, the copying of DNA to RNA, is relatively straightforward in terms of information flow, with one nucleotide being added to the mRNA strand for every nucleotide read in the DNA strand. The translation to protein is a bit more complex because three mRNA nucleotides correspond to one amino acid in the polypeptide sequence. Nucleotides 1 to 3 correspond to amino acid 1, nucleotides 4 to 6 correspond to amino acid 2, and so on.
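The one-to-one and three-to-one relationships described above can be sketched in a few lines. The template sequence and function names are made up for illustration, and the template is assumed to be written in the order it is read (3′ to 5′), so the mRNA comes out 5′ to 3′.

```python
# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Add one RNA nucleotide for every DNA nucleotide read from the template."""
    return "".join(DNA_TO_RNA[base] for base in template)


def codon_for_amino_acid(mrna: str, n: int) -> str:
    """Return the codon for amino acid n (1-based): nucleotides 3n-2 through 3n."""
    start = 3 * (n - 1)
    return mrna[start:start + 3]


template = "TACCGAAAGATT"            # made-up template strand
mrna = transcribe(template)
print(mrna)                           # AUGGCUUUCUAA
print(codon_for_amino_acid(mrna, 1))  # AUG (nucleotides 1 to 3)
print(codon_for_amino_acid(mrna, 2))  # GCU (nucleotides 4 to 6)
```

The mRNA has exactly as many nucleotides as the template, while the number of codons is one third of that.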

https://youtu.be/gG7uCskUOrA
The Genome
The cell's entire genetic content is its genome. Genomes consist of one or more chromosomes. Each chromosome is a single, double-stranded molecule of DNA. Prokaryotes generally have a single circular chromosome. Eukaryotes generally have multiple linear chromosomes that are enclosed within a membrane-bound nucleus. Each eukaryotic species has a characteristic number of chromosomes per cell. For example, humans have 46 chromosomes per cell. Eukaryotic genomes also include mitochondrial DNA and/or plastid DNA. Mitochondria and plastids have their own DNA because they were originally derived from free-living bacteria.
Chromatin and Chromosomes
Chromatin is the material that makes up a chromosome. It consists of DNA and proteins. The major proteins in chromatin are histone proteins, which function to package and condense the DNA molecule. Each chromosome contains one double-stranded DNA molecule before DNA replication and two identical copies, called sister chromatids, after replication. The word chromosome is composed of two parts: “chromo” meaning colored or stained and “some” meaning object or body. It is important to recognize that chromosome refers to a complete object. A single chromosome may contain thousands of genes.

When a eukaryotic cell is actively undergoing cell division, the chromatin is tightly packaged into a compact chromosome structure. When a cell is not actively dividing, the chromatin is more relaxed and spread out so that gene expression can take place.

Genes
A gene is defined as a sequence of DNA that codes for a functional product. Many genes contain the information to make protein products. Other genes code for RNA products. Each chromosome carries hundreds to thousands of genes, which together are responsible for determining the genotype and phenotype of the individual. The human genome contains about 3 billion base pairs and has around 20,000 genes that code for proteins.
