10.4: The Genetic Code


We have blithely described the purpose of the DNA chromosomes as carrying the information for building the proteins of the cell, and the RNA as the intermediary for doing so. Exactly how is it, though, that a molecule made up of just four different nucleotides joined together (albeit thousands and even millions of them) can tell the cell which of twenty-odd amino acids to string together to form a functional protein? Since there are not enough unique nucleotides for each one to code for a single amino acid, the obvious solution was that combinations of nucleotides must designate particular amino acids. A doublet code would allow for only 16 different combinations (4 possible nucleotides in the first position × 4 possible nucleotides in the second position = 16 combinations) and would not be enough to encode the 20 amino acids. However, a triplet code would yield 64 combinations (4 × 4 × 4), easily enough to encode 20 amino acids. So would a quadruplet or quintuplet code, for that matter, but those would be wasteful of resources, and thus less likely. Further investigation proved the existence of a triplet code as described in the table below.
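The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `codeword_count` and `minimum_codeword_length` are invented for this example and do not come from any library.

```python
from itertools import product

NUCLEOTIDES = "ACGU"   # the four RNA nucleotides
AMINO_ACIDS = 20       # the standard amino acids that must be encoded


def codeword_count(length: int) -> int:
    """Number of distinct nucleotide 'words' of the given length."""
    return len(NUCLEOTIDES) ** length


def minimum_codeword_length(n_meanings: int) -> int:
    """Shortest word length whose combinations cover n_meanings."""
    length = 1
    while codeword_count(length) < n_meanings:
        length += 1
    return length


# Enumerate the doublets explicitly to confirm the 4 x 4 count.
doublets = ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]

for n in range(1, 6):
    enough = "enough" if codeword_count(n) >= AMINO_ACIDS else "too few"
    print(f"word length {n}: {codeword_count(n)} combinations ({enough})")

print("shortest sufficient length:", minimum_codeword_length(AMINO_ACIDS))
```

Lengths 4 and 5 also cover 20 amino acids, but with 256 and 1024 combinations respectively, which is the wastefulness the text refers to.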

With so many combinations and only 20 amino acids, what does the cell do with the other possibilities? The genetic code is a degenerate code, which means that there is redundancy: most amino acids are encoded by more than one triplet combination (codon). Although it is a redundant code, it is not an ambiguous code: under normal circumstances, a given codon encodes one and only one amino acid. In addition to the codons for the 20 amino acids, there are also three “stop codons” dedicated to ending translation. The three stop codons also have colloquial names: UAA (ochre), UAG (amber), and UGA (opal), with UAA being the most common in prokaryotic genes.
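To make the redundancy concrete, the sketch below builds the standard codon table and counts how many codons map to each amino acid. It assumes the compact 64-letter encoding used in the NCBI genetic code listings (bases ordered U, C, A, G, third position varying fastest; `*` marks a stop codon). The variable names are chosen for this example only.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for UUU, UUC, UUA, UUG, UCU, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last position fastest, matching the string's order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

stop_codons = sorted(codons_for["*"])
single_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 1)

print("stop codons:", stop_codons)
print("amino acids with only one codon:", single_codon)
for aa in sorted(codons_for, key=lambda a: -len(codons_for[a])):
    print(aa, len(codons_for[aa]), " ".join(codons_for[aa]))
```

Because the table is a plain dictionary keyed by codon, each codon has exactly one value: the code is redundant (many keys share a value) but not ambiguous (no key has two values).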

The colloquial names were started when the discoverers of UAG decided to name the codon after a friend whose last name translated into “amber”. Opal and ochre were named to continue the idea of giving stop codons color names.

The stop codons are sometimes also used to encode what are now considered the 21st and 22nd amino acids, selenocysteine (UGA) and pyrrolysine (UAG). These amino acids have been discovered to be consistently encoded in some species of bacteria and archaea.

Note that there are no dedicated start codons: instead, AUG codes for both methionine and the start of translation, depending on the context. The initial amino acid is a methionine, but in prokaryotes it is a specially modified formyl-methionine (f-Met). The initiator tRNA is also specialized and is different from the tRNA that carries methionine to the ribosome for addition to a growing polypeptide. Therefore, in referring to a loaded initiator tRNA, the usual nomenclature is fMet-tRNA_i or fMet-tRNA_f. There also seems to be a little more leeway in defining the start site in prokaryotes than in eukaryotes, as some bacteria use GUG or UUG. Though these codons normally encode valine and leucine, respectively, when they are used as start codons, the initiator tRNA brings in f-Met.
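The dual role of the start codon can be illustrated with a toy translator. This is a deliberately simplified sketch: it starts at the first start codon it finds when scanning left to right, which ignores how real ribosomes choose an initiation site. The function `translate` and its `start_codons` parameter are hypothetical names for this example, and the codon table uses the same compact NCBI-style encoding assumed above (`*` = stop).

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Start codons mentioned in the text for some bacteria.
BACTERIAL_STARTS = frozenset({"AUG", "GUG", "UUG"})


def translate(mrna: str, start_codons=frozenset({"AUG"})) -> str:
    """Translate from the first start codon to the first in-frame stop."""
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] in start_codons:
            break
    else:
        return ""  # no start codon found

    # Whatever the start codon is, the initiator tRNA delivers Met
    # (f-Met in prokaryotes), so the peptide always begins with M.
    peptide = ["M"]
    for i in range(start + 3, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


print(translate("GGAUGGUGAAAUAAGG"))                  # AUG start, internal GUG
print(translate("CCGUGAAAGUGUAG", BACTERIAL_STARTS))  # GUG used as a start
```

In the second call the first GUG is read as the start and contributes Met, while the later in-frame GUG is read as valine, which mirrors the context dependence described in the text.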

Although the genetic code as described is nearly universal, there are some situations in which it has been modified, and the modifications retained in evolutionarily stable environments. The mitochondria in a broad range of organisms demonstrate stable changes to the genetic code, including converting AGA from an arginine codon into a stop codon and changing AAA from encoding lysine to encoding asparagine. Rarely, a change is found in translation of an organismic (nuclear) genome, but most of those rare alterations are conversions to or from stop codons.
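One convenient way to think about such variants is as a small set of overrides applied to the standard table. The sketch below models only the two reassignments named above, as two separate partial override sets, since they are not claimed to occur together. These sets are illustrative and are not complete mitochondrial codes; `variant_code` and `differences` are hypothetical helper names, and the standard table again assumes the compact NCBI-style encoding (`*` = stop).

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Partial, illustrative overrides taken from the examples in the text.
AGA_AS_STOP = {"AGA": "*"}   # arginine codon reassigned to stop
AAA_AS_ASN = {"AAA": "N"}    # lysine codon reassigned to asparagine


def variant_code(overrides: dict) -> dict:
    """Return a copy of the standard table with the overrides applied."""
    table = dict(STANDARD)
    table.update(overrides)
    return table


def differences(table: dict) -> dict:
    """Map each changed codon to (standard meaning, variant meaning)."""
    return {
        codon: (STANDARD[codon], aa)
        for codon, aa in table.items()
        if STANDARD[codon] != aa
    }


for name, overrides in [("AGA as stop", AGA_AS_STOP), ("AAA as Asn", AAA_AS_ASN)]:
    print(name, differences(variant_code(overrides)))
```

Each variant differs from the standard table in a single entry out of 64, which is one way to see why the code is still described as nearly universal.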

Other minor alterations to the genetic code exist as well, but the universality of the code in general remains. Some mitochondrial DNAs can use different start codons: human mitochondrial ribosomes can use AUA and AUU. In some yeast species, the CGA and CGC codons for arginine are unused. Many of these changes have been cataloged by the National Center for Biotechnology Information (NCBI) based on work by Jukes and Osawa at the University of California at Berkeley (USA) and the University of Nagoya (Japan), respectively.