2.1: Overview of Transcription
Learning Objectives
- Identify the key steps of transcription, the function of the promoter and the function of RNA polymerase.
- Distinguish between coding (RNA-like) and non-coding (template) strands of DNA. Understand that within a single piece of DNA, either strand can be used as the template for different genes, but the RNA will still be produced from 5’ → 3’.
- Draw a line diagram showing a segment of DNA from a gene and its RNA transcript, indicating which DNA strand is the template, the direction of transcription and the polarities of all DNA and RNA strands.
- Give examples of non-coding RNA molecules.
What is transcription?
Consider that all of the cells in a multicellular organism have arisen by division from a single fertilized egg and therefore all have the same DNA. In humans, division of that original fertilized egg produces over a trillion cells by the time a baby is born (that's a lot of DNA replication!). Yet we also know that a baby is not a giant ball of a trillion identical cells, but has the many different kinds of cells that make up tissues like skin, muscle, bone, and nerves. How did cells that have identical DNA turn out so different?
The answer lies in gene expression, which is the process by which the information in DNA is used. Although all the cells in a baby have the same DNA, each different cell type uses a different subset of the genes in that DNA to direct the synthesis of a distinctive set of RNAs and proteins. The first step in gene expression is transcription, the process of copying information from DNA sequences into RNA sequences. This process is also known as DNA-dependent RNA synthesis. When a sequence of DNA is transcribed, only one of the two DNA strands is copied into RNA. When this RNA encodes a protein, it is known as messenger RNA (mRNA).
Important features of transcription
- All RNA, mRNA as well as tRNA, rRNA, microRNA and more, is produced by transcription.
- Only one strand of DNA is used as a template by enzymes called RNA polymerases.
- RNA is synthesized from 5' to 3'.
- RNA polymerases do not need primers to begin transcription.
- The four ribonucleotide triphosphates (rNTPs) are ATP, GTP, UTP, and CTP.
- RNA polymerases begin transcription at DNA sequences called promoters.
- RNA polymerases end transcription at sequences called terminators.
In transcription, an RNA polymerase uses only one strand of a gene's DNA, called the template strand, to catalyze synthesis of a complementary, antiparallel RNA strand. RNA polymerases use ribonucleotide triphosphate (NTP) precursors, in contrast to DNA polymerases, which use deoxyribonucleotide triphosphate (dNTP) precursors (compared on page 1.1: The Structure of DNA). In addition, RNA polymerases incorporate uracil (U) nucleotides into RNA strands instead of the thymine (T) nucleotides used in DNA. RNA polymerases also differ from DNA polymerases in that they do not require primers. With the help of transcription initiation factors, RNA polymerase locates the transcription start site of a gene and begins synthesis of a new RNA strand from scratch by joining the two ribonucleotides that are complementary to the first two bases of the template strand.
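The pairing rules in the paragraph above (complementary, antiparallel, U in place of T) can be sketched in a few lines of Python. This is only an illustration of the base-pairing logic, not a model of the enzyme; the function name `transcribe` and the example sequence are made up for this sketch.

```python
# RNA base placed opposite each DNA template base (U is used instead of T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA, written 5'->3', made from a template strand written 5'->3'."""
    rna = []
    # The RNA is antiparallel to the template, so its 5' end pairs with the
    # template's 3' end: walk the template backwards (3'->5').
    for base in reversed(template_5to3.upper()):
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


# Hypothetical template strand, written 5'->3'.
template = "TACGGATTCAT"
print(transcribe(template))  # AUGAAUCCGUA
```

Note that the result is identical to the non-template strand of the same DNA segment (5'-ATGAATCCGTA-3') with every T replaced by U.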
Overview of the Stages of Transcription
The basic steps of transcription are initiation, elongation, and termination. Here we can identify several of the DNA sequences that characterize a gene. The promoter is the binding site for RNA polymerase. It usually lies 5’ to, or upstream of, the transcription start site. Binding of the RNA polymerase positions the enzyme near the transcription start site, where it will start unwinding the double helix and begin synthesizing new RNA. The transcribed grey DNA region in each of the three panels is the transcription unit of the gene. Termination sites are typically 3’ to, or downstream from, the transcribed region of the gene. By convention, upstream refers to DNA 5’ to a given reference point on the DNA (e.g., the transcription start site of a gene). Downstream, then, refers to DNA 3’ to a given reference point on the DNA.
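The upstream/downstream convention can be made concrete with a small Python sketch. It assumes positions are counted along the coding strand written 5'→3', that the transcription start site is numbered +1, and that the base just before it is -1 with no position 0 (the usual numbering convention, which is not spelled out in the text above). The function names and index values are hypothetical.

```python
def relative_position(index: int, tss_index: int) -> int:
    """Convert a 0-based index on the coding strand (5'->3') into a position
    relative to the transcription start site (TSS), which is numbered +1."""
    offset = index - tss_index
    # Assumed convention: there is no position 0, so -1 is followed directly by +1.
    return offset + 1 if offset >= 0 else offset


def region(index: int, tss_index: int) -> str:
    """Label a base as upstream (5' of the TSS), the start site, or downstream."""
    pos = relative_position(index, tss_index)
    if pos < 0:
        return "upstream"
    return "start site" if pos == 1 else "downstream"


tss = 50  # hypothetical 0-based index of the start site in some sequence
for i in (15, 40, 49, 50, 60):
    print(i, relative_position(i, tss), region(i, tss))
```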
RNA polymerase
Building an RNA strand is very similar to building a DNA strand. This is not surprising, knowing that DNA and RNA are very similar molecules. What enzyme carries out transcription? Transcription is catalyzed by the enzyme RNA polymerase. "RNA polymerase" is a general term for an enzyme that makes RNA. There are many different RNA polymerases.
Like DNA polymerases, RNA polymerases synthesize new strands only in the 5' to 3' direction, but because they are making RNA, they use ribonucleotides (i.e., RNA nucleotides) rather than deoxyribonucleotides. Ribonucleotides are joined in exactly the same way as deoxyribonucleotides, which is to say that the 3'OH of the last nucleotide on the growing chain is joined to the 5' phosphate on the incoming nucleotide.
One important difference between DNA polymerases and RNA polymerases is that the latter do not require a primer to start making RNA. Once RNA polymerases are in the right place to start copying DNA, they just begin making RNA by stringing together RNA nucleotides complementary to the DNA template.
This, of course, brings us to an obvious question: how do RNA polymerases "know" where to start copying on the DNA? Unlike the situation in replication, where every nucleotide of the parental DNA must eventually be copied, transcription, as we have already noted, only copies selected genes into RNA at any given time. What indicates to an RNA polymerase where to start copying DNA to make a transcript? Signals in DNA indicate to RNA polymerase where it should start (and end) transcription. These signals are special sequences in DNA that are recognized by the RNA polymerase or by proteins that help RNA polymerase determine where it should bind the DNA to start transcription. A DNA sequence at which the RNA polymerase binds to start transcription is called a promoter.
A promoter is generally situated upstream of the gene that it controls. What this means is that on the DNA strand that the gene is on, the promoter sequence is "before" the gene. Remember that, by convention, DNA sequences are read from 5' to 3'. So the promoter lies 5' to the start point of transcription.
Also notice that the promoter is said to "control" the gene it is associated with. This is because expression of the gene is dependent on the binding of RNA polymerase to the promoter sequence to begin transcription. If the RNA polymerase and its helper proteins do not bind the promoter, the gene cannot be transcribed and will therefore not be expressed.
What is special about a promoter sequence? In an effort to answer this question, scientists looked at many genes and their surrounding sequences. It makes sense that because the same RNA polymerase has to bind to many different promoters, the promoters should have some similarities in their sequences. Sure enough, common sequence patterns were seen to be present in many promoters. We will first take a look at prokaryotic promoters. When prokaryotic genes were examined, the following features commonly emerged:
- A transcription start site (this is the base in the DNA across from which the first RNA nucleotide is paired).
- A -10 sequence: this is a 6 bp region centered about 10 bp upstream of the start site. The consensus sequence at this position is TATAAT. In other words, if you count back from the transcription start site, which by convention is called +1, the sequence found at -10 in the majority of promoters studied is TATAAT.
- A -35 sequence: this is a sequence about 35 bp upstream of the start of transcription. The consensus sequence at this position is TTGACA.
What is the significance of these sequences? It turns out that the sequences at -10 and -35 are recognized and bound by a subunit of prokaryotic RNA polymerase before transcription can begin.
The RNA polymerase of E. coli, for example, has a subunit called the sigma (σ) subunit (or sigma factor) in addition to the core polymerase, which is the part of the enzyme that actually makes RNA. Together, the sigma subunit and core polymerase make up what is termed the RNA polymerase holoenzyme. The sigma subunit of the polymerase can recognize and bind to the -10 and -35 sequences in the promoter, thus positioning the RNA polymerase at the right place to initiate transcription. Once transcription begins, the core polymerase and the sigma subunit separate, with the core polymerase continuing RNA synthesis and the sigma subunit wandering off to escort another core polymerase molecule to a promoter. The sigma subunit can be thought of as a sort of usher that leads the polymerase to its "seat" on the promoter.
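As a rough illustration of how the -35 and -10 consensus sequences described above can be looked for in a DNA sequence, here is a minimal Python sketch. It simply counts matching bases against TTGACA and TATAAT; the allowed spacer lengths (15 to 19 bp between the two hexamers) and the score threshold are assumptions chosen for this example, and real promoter prediction is considerably more involved. All function names and the test sequence are hypothetical.

```python
MINUS_35 = "TTGACA"  # consensus for the -35 sequence
MINUS_10 = "TATAAT"  # consensus for the -10 sequence


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with a consensus sequence."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def find_promoter(seq: str, spacers=range(15, 20), min_score: int = 10):
    """Scan the coding strand (5'->3') for the best -35/-10 pair.

    Returns (score, start_of_minus_35, start_of_minus_10) using 0-based
    indices, or None if no pair reaches min_score (maximum score is 12).
    The spacer range and threshold are assumptions for this sketch.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        score_35 = matches(seq[i:i + 6], MINUS_35)
        for spacer in spacers:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break                   # longer spacers would run off the end
            total = score_35 + matches(seq[j:j + 6], MINUS_10)
            if total >= min_score and (best is None or total > best[0]):
                best = (total, i, j)
    return best


# Made-up sequence: perfect -35 and -10 elements separated by 17 bp.
example = "GC" + "TTGACA" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCGCGCA"
print(find_promoter(example))
```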
As already mentioned, an RNA chain, complementary to the DNA template, is built by the RNA polymerase by the joining of the 5' phosphate of an incoming ribonucleotide to the 3'OH on the last nucleotide of the growing RNA strand. How does the polymerase know where to stop? A sequence of nucleotides called the terminator is the signal to the RNA polymerase to stop transcription and dissociate from the template.
Although the process of RNA synthesis is the same in eukaryotes as in prokaryotes, there are some additional issues to keep in mind in eukaryotes. One is that in eukaryotes, the DNA template exists as chromatin, where the DNA is tightly associated with histones and other proteins. The "packaging" of the DNA must therefore be opened up to allow the RNA polymerase access to the template in the region to be transcribed.
A second difference is that eukaryotes have multiple RNA polymerases, not one as in bacterial cells. The different polymerases transcribe different genes. For example, RNA polymerase I transcribes the ribosomal RNA genes, while RNA polymerase III copies tRNA genes. The RNA polymerase we will focus on most is RNA polymerase II, which transcribes protein-coding genes to make mRNAs.
All three eukaryotic RNA polymerases need additional proteins to help them get transcription started. In prokaryotes, RNA polymerase by itself can initiate transcription (remember that the sigma subunit is a subunit of the prokaryotic RNA polymerase). The additional proteins needed by eukaryotic RNA polymerases are referred to as transcription factors.
Finally, in eukaryotic cells, transcription is separated in space and time from translation. Transcription happens in the nucleus, and the mRNAs produced are processed further before they are sent into the cytoplasm. Protein synthesis (translation) happens in the cytoplasm. In prokaryotic cells, mRNAs can be translated as they are coming off the DNA template, and because there is no nucleus, transcription and protein synthesis occur in a single cellular compartment.
Like genes in prokaryotes, eukaryotic genes also have promoters. Eukaryotic promoters commonly have a TATA box, a sequence about 25 base pairs upstream of the start of transcription that is recognized and bound by proteins that help the RNA polymerase to position itself correctly to begin transcription. (Some eukaryotic promoters lack TATA boxes and have, instead, other recognition sequences to help the RNA polymerase find the spot on the DNA where it binds and initiates transcription.)
We noted earlier that eukaryotic RNA polymerases need additional proteins to bind promoters and start transcription. What are these additional proteins that are needed to start transcription? General transcription factors are proteins that help eukaryotic RNA polymerases find transcription start sites and initiate RNA synthesis. We will focus on the transcription factors that assist RNA polymerase II. These transcription factors are named TFIIA, TFIIB and so on (TF = transcription factor, II = RNA polymerase II, and the letters distinguish individual transcription factors).
Transcription in eukaryotes requires the general transcription factors and the RNA polymerase to form a complex at the TATA box called the basal transcription complex or transcription initiation complex. This is the minimum requirement for any gene to be transcribed. The first step in the formation of this complex is the binding of the TATA box by a transcription factor called the TATA Binding Protein or TBP. Binding of the TBP causes the DNA to bend at this spot and take on a structure that is suitable for the binding of additional transcription factors and RNA polymerase. As shown in the figure at left, a number of different general transcription factors, together with RNA polymerase (Pol II) form a complex at the TATA box.
The final step in the assembly of the basal transcription complex is the binding of a general transcription factor called TFIIH. TFIIH is a multifunctional protein that has helicase activity (i.e., it is capable of opening up a DNA double helix) as well as kinase activity. The kinase activity of TFIIH adds a phosphate onto the C-terminal domain (CTD) of the RNA polymerase. This phosphorylation appears to be the signal that releases the RNA polymerase from the basal transcription complex and allows it to move forward and begin transcription.
Either DNA strand can be a template
The promoter is the sequence of DNA that encodes the information about where to begin transcription for each gene. Depending on the promoter, either strand of DNA can be used as the template strand.
Watch this video to see how either strand of DNA can be used as a template for different genes on the same chromosome.
Template vs. non-template strands summary
- The template strand is the one that RNA polymerase uses as the basis to build the RNA. This strand is also called the non-coding strand or the antisense strand.
- The non-template strand has the same sequence as the RNA (except for the substitution of U for T). This strand is also called the coding strand or sense strand.
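The relationship between the two strands, and the fact that either one can serve as the template, can be illustrated with a short Python sketch. It assumes a double-stranded segment is represented by its "top" strand written 5'→3'; the labels "top" and "bottom", the function names, and the example sequence are hypothetical and used only for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the opposite DNA strand, also written 5'->3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def transcript(top_strand: str, template: str) -> str:
    """Return the RNA (5'->3') made when the given strand is the template.

    template is "top" or "bottom". The RNA always has the sequence of the
    non-template (coding) strand, with U in place of T.
    """
    if template == "bottom":
        coding = top_strand.upper()               # top strand is the coding strand
    elif template == "top":
        coding = reverse_complement(top_strand)   # bottom strand is the coding strand
    else:
        raise ValueError("template must be 'top' or 'bottom'")
    return coding.replace("T", "U")


top = "ATGGCCTTA"  # made-up top strand, written 5'->3'
print(transcript(top, "bottom"))  # AUGGCCUUA
print(transcript(top, "top"))     # UAAGGCCAU
```

In both cases the RNA is written 5'→3'; only the choice of template strand, and therefore the direction in which the polymerase travels along the DNA, differs.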
Major Types of Cellular RNA
Cells make several different kinds of RNA:
- mRNAs that code for proteins
- rRNAs that form part of ribosomes
- tRNAs that serve as adaptors between mRNA and amino acids during translation
- MicroRNAs that regulate gene expression
- Other small RNAs that have a variety of functions.