Biology LibreTexts

15.1: The Genetic Code


    The Central Dogma: DNA Encodes RNA; RNA Encodes Protein

    To summarize what we know to this point, the cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and uracil (U). Translation of the mRNA template converts nucleotide-based genetic information into a protein product. This flow of genetic information in cells from DNA to mRNA to protein is described by the Central Dogma (Figure \(\PageIndex{1}\)), which states that genes specify the sequence of mRNAs, which in turn specify the sequence of proteins. The decoding of one molecule to another is performed by specific proteins and RNAs. Because the information stored in DNA is so central to cellular function, it makes intuitive sense that the cell would make mRNA copies of this information for protein synthesis, while keeping the DNA itself intact and protected.

    The Central Dogma does have exceptions, but we will not discuss them here.

    To make a protein, genetic information encoded by the DNA must be transcribed onto an mRNA molecule. The RNA is then processed by splicing to remove introns and by the addition of a 5' cap and a poly-A tail. A ribosome then reads the sequence on the mRNA and uses this information to string amino acids into a protein.
    Figure \(\PageIndex{1}\): Instructions on DNA are transcribed onto messenger RNA. Ribosomes are able to read the genetic information inscribed on a strand of messenger RNA and use this information to string amino acids together into a protein.
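
    The change of alphabet during transcription can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and chosen only for illustration. The sketch assumes both DNA strands are written 5' to 3', and it ignores RNA processing (splicing, capping, and the poly-A tail).

```python
# Minimal sketch: transcription as a change of alphabet (DNA -> mRNA).
# Function names and example sequences are hypothetical, for illustration only.

def transcribe_coding_strand(dna: str) -> str:
    """The mRNA matches the coding strand, with uracil (U) in place of thymine (T)."""
    return dna.upper().replace("T", "U")


# Base pairing used when RNA is built against the template strand.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template_strand(template: str) -> str:
    """Build the mRNA (5'->3') from a template strand that is also written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template.upper()))


coding = "ATGGCTTCTGAATAA"    # hypothetical coding strand
template = "TTATTCAGAAGCCAT"  # its reverse complement (the template strand)

print(transcribe_coding_strand(coding))
print(transcribe_template_strand(template))
```

    Both calls print the same mRNA, because the template strand is complementary to the coding strand.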

    Amino Acid Structure

    Protein sequences consist of 20 commonly occurring amino acids (Figure \(\PageIndex{2}\)); therefore, it can be said that the protein alphabet consists of 20 letters. Different amino acids have different chemistries (such as acidic versus basic, or polar and non-polar) and different structural constraints. Variation in amino acid sequence gives rise to enormous variation in protein structure and function.

    Structures of the twenty amino acids are given. Six amino acids (glycine, alanine, valine, leucine, methionine, and isoleucine) are nonpolar and aliphatic, meaning they do not have a ring. Six amino acids (serine, threonine, cysteine, proline, asparagine, and glutamine) are polar but uncharged. Three amino acids (lysine, arginine, and histidine) are positively charged. Two amino acids, glutamate and aspartate, are negatively charged. Three amino acids (phenylalanine, tyrosine, and tryptophan) are nonpolar and aromatic.
    Figure \(\PageIndex{2}\): Structures of the 20 amino acids found in proteins are shown. Each amino acid is composed of an amino group (NH3+), a carboxyl group (COO-), and a side chain (blue). The side chain may be nonpolar, polar, or charged, as well as large or small. It is the variety of amino acid side chains that gives rise to the incredible variation of protein structure and function.

    Genetic Code

    Each amino acid is defined by a three-nucleotide sequence called the triplet codon. The relationship between a nucleotide codon and its corresponding amino acid is called the genetic code. Given the different numbers of “letters” in the mRNA (4 – A, U, C, G) and protein “alphabets” (20 different amino acids) one nucleotide could not correspond to one amino acid. Nucleotide doublets would also not be sufficient to specify every amino acid because there are only 16 possible two-nucleotide combinations (\(4^2\)). In contrast, there are 64 possible nucleotide triplets (\(4^3\)), which is far more than the number of amino acids. Scientists theorized that amino acids were encoded by nucleotide triplets and that the genetic code was degenerate. In other words, a given amino acid could be encoded by more than one nucleotide triplet (Figure \(\PageIndex{3}\)). These nucleotide triplets are called codons.
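
    The counting argument can be checked by enumerating every possible "word" of one, two, or three nucleotides. The following is a minimal sketch; the function name `count_words` is hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGU"       # the four-letter mRNA alphabet
AMINO_ACID_COUNT = 20      # the twenty commonly occurring amino acids


def count_words(length: int) -> int:
    """Count every distinct nucleotide sequence of the given length."""
    return sum(1 for _ in product(NUCLEOTIDES, repeat=length))


for length in (1, 2, 3):
    n = count_words(length)
    enough = n >= AMINO_ACID_COUNT
    print(f"word length {length}: {n} combinations, enough for 20 amino acids: {enough}")
```

    Only the triplet length provides at least 20 combinations, and the surplus (64 versus 20) is what leaves room for a degenerate code.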

    The same codon will always specify the insertion of one specific amino acid. The chart seen in Figure \(\PageIndex{3}\) can be used to translate an mRNA sequence into an amino acid sequence. For example, the codon UUU will always cause the insertion of the amino acid phenylalanine (Phe), while the codon UUA will cause the insertion of leucine (Leu).

    Figure shows all 64 codons. Sixty-one of these code for amino acids, and three are stop codons.
    Figure \(\PageIndex{3}\): This figure shows the genetic code for translating each nucleotide triplet in mRNA into an amino acid or a termination signal in a nascent protein. (credit: modification of work by NIH)
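
    Reading the codon chart amounts to a table lookup, which can be sketched in Python. This is an illustrative sketch, not a definitive implementation: the names `CODON_TABLE` and `translate` are hypothetical, amino acids are written with their one-letter abbreviations (F for phenylalanine, L for leucine), and `*` marks a stop codon. The table is the standard genetic code, built from a compact 64-character string ordered by the bases U, C, A, G.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the standard genetic code, ordered
# UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA codon by codon, starting at the first nucleotide."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)
    )


print(translate("UUUUUA"))  # UUU -> F (Phe), UUA -> L (Leu)

# Degeneracy: how many codons specify each amino acid (or stop).
codons_per_amino_acid = Counter(CODON_TABLE.values())
print(codons_per_amino_acid["L"])  # leucine is specified by six codons
print(codons_per_amino_acid["*"])  # three stop codons
```

    Each codon maps to exactly one amino acid, but the reverse is not true: counting the table entries shows that most amino acids are reachable from several codons.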

    Each set of three bases (one codon) causes the insertion of one specific amino acid into the growing protein. This means that the insertion or deletion of one or two nucleotides can completely change the triplet “reading frame”, thereby altering the message for every subsequent amino acid (Figure \(\PageIndex{4}\)). By contrast, the insertion of three nucleotides adds one extra amino acid during translation but leaves the reading frame, and therefore the rest of the protein, intact.

    Illustration shows a frameshift mutation in which the reading frame is altered by the deletion of two nucleotides.
    Figure \(\PageIndex{4}\): The deletion of two nucleotides shifts the reading frame of an mRNA and changes the entire protein message, creating a nonfunctional protein or terminating protein synthesis altogether.
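
    The effect of a frameshift can be reproduced with a short Python sketch. The mRNA sequence and the function name `translate` are hypothetical and chosen only for illustration. Translation here simply reads triplets from the first nucleotide, writes amino acids as one-letter codes, and marks stop codons with `*` instead of halting, so that everything downstream of the mutation stays visible.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acid codes; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read triplets from the first nucleotide; leftover nucleotides are ignored."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)
    )


original = "AUGGCUUCUGAAUAA"                        # hypothetical mRNA
two_deleted = original[:3] + original[5:]           # delete two nucleotides
one_inserted = original[:3] + "C" + original[3:]    # insert one nucleotide
three_inserted = original[:3] + "CCC" + original[3:]  # insert one whole codon

print(translate(original))        # MASE*
print(translate(two_deleted))     # MF*I   (frame shifted, early stop)
print(translate(one_inserted))    # MRF*I  (frame shifted, early stop)
print(translate(three_inserted))  # MPASE* (one extra amino acid, frame kept)
```

    In this example, deleting two nucleotides or inserting one changes every codon after the mutation and introduces a premature stop, whereas inserting three nucleotides only adds a single amino acid.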

    Three of the 64 codons (UAA, UAG, and UGA) terminate protein synthesis and release the polypeptide from the translation machinery. These triplets are called stop codons. Another codon, AUG, also has a special function. In addition to specifying the amino acid methionine, it serves as the start codon to initiate translation. The reading frame for translation is set by the AUG start codon near the 5′ end of the mRNA. The genetic code is nearly universal. With a few exceptions, virtually all species use the same genetic code for protein synthesis, which is powerful evidence that all life on Earth shares a common origin.
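
    The roles of the start and stop codons can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name `translate_from_start` and the example mRNA are hypothetical, the first AUG found from the 5′ end is taken as the start codon, and amino acids are written as one-letter codes. Real translation initiation involves more than locating the first AUG.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one-letter amino acid codes; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_CODONS = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")


def translate_from_start(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (or the end of the mRNA)."""
    start = mrna.find("AUG")  # simplification: the first AUG sets the reading frame
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends translation and adds no amino acid
        protein.append(amino_acid)
    return "".join(protein)


print(STOP_CODONS)                                  # ['UAA', 'UAG', 'UGA']
print(translate_from_start("GGAUGGCUUCUGAAUAAGG"))  # MASE
```

    The nucleotides before the AUG and after the stop codon are not translated, and the resulting protein begins with methionine (M).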

    References

    Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

    OpenStax, Biology. OpenStax CNX. January 2, 2017 https://cnx.org/contents/GFy_h8cu@10...fig-ch15_01_05


    15.1: The Genetic Code is shared under a CC BY license and was authored, remixed, and/or curated by LibreTexts.