A. The (Nearly) Universal, Degenerate Genetic Code
The genetic code is the information for linking amino acids into polypeptides in an order set by the sequence of 3-base code words (codons) in a gene and its messenger RNA (mRNA). With a few exceptions (some prokaryotes, mitochondria, chloroplasts), the genetic code is universal: it’s the same in all organisms from viruses and bacteria to humans. The table of the Standard Universal Genetic Code on the next page shows the RNA version of triplet codons and their corresponding amino acids. Two amino acids (methionine and tryptophan) are each specified by a single codon, but each of the other 18 amino acids is specified by two or more codons. For the latter reason, we say that the genetic code is degenerate.
The three stop codons in the Standard Genetic Code ‘tell’ ribosomes the location of the last amino acid to add to a polypeptide. The last amino acid itself can be any amino acid consistent with the function of the polypeptide being synthesized. However, evolution has selected AUG as the start codon for all polypeptides, regardless of function, as well as the codon for methionine placed within a polypeptide. Thus, all polypeptides begin life with a methionine at their amino-terminal end.
As we will see in more detail, the mRNA translation machine is the ribosome and the decoding device is tRNA. Each amino acid attaches to a tRNA whose short sequence contains a 3-base anticodon that is complementary to an mRNA codon. Enzyme-catalyzed dehydration synthesis (condensation) reactions then link the amino acids by peptide bonds in the order specified by the codons in the mRNA.
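The decoding rules described above can be sketched in a few lines of Python. This is only an illustration of how a codon table is read, not a model of the ribosome. The names `CODON_TABLE` and `translate` and the short mRNA string are made up for this example. Amino acids are written in one-letter code, `*` marks a stop codon, and the function simply assumes that translation begins at the first AUG in the string.

```python
from collections import Counter
from itertools import product

# Standard genetic code, RNA version. Codons are generated in the order
# UUU, UUC, UUA, UUG, UCU, ... (third base varies fastest), and the string
# below lists the matching amino acids in one-letter code ('*' = stop).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Assumes an upper-case RNA string containing only U, C, A and G.
    Returns an empty string if no AUG is present.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step through the reading frame three bases at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: no amino acid is added
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Degeneracy: how many codons specify each amino acid (stops excluded).
codons_per_amino_acid = Counter(
    aa for aa in CODON_TABLE.values() if aa != "*"
)

print(translate("GGAUGUUUUGGUAACC"))   # MFW (Met-Phe-Trp, then UAA stop)
print(codons_per_amino_acid["M"],      # 1 codon for methionine
      codons_per_amino_acid["W"],      # 1 codon for tryptophan
      codons_per_amino_acid["L"])      # 6 codons for leucine
```

Generating the table from a 64-letter string keeps the example short. A hand-written dictionary of all 64 codons would work equally well.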
B. Comments on the Nature and Evolution of Genetic Information
The near-universality of the genetic code from bacteria to humans implies that the code originated early in evolution. It is probable that portions of the code were in place even before life began. Once in place, however, the genetic code was highly constrained against evolutionary change. The degeneracy of the genetic code contributed to this constraint by permitting many base changes that do not affect the amino acid encoded in a codon.
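One way to see how degeneracy absorbs base changes is to count, for every amino acid codon in the standard code, how many single-base substitutions leave the encoded amino acid unchanged. The sketch below does this. The function name `synonymous_by_position` is made up for this example, and the count treats all substitutions as equally likely, which real mutation does not.

```python
from itertools import product

# Standard genetic code, RNA version ('*' = stop), built as a dictionary.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_by_position():
    """Count single-base changes that keep the same amino acid.

    Returns a list of three counts, one per codon position. Only the
    61 amino acid (sense) codons are used as starting points.
    """
    counts = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # skip stop codons
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue  # not a change
                mutant = codon[:position] + base + codon[position + 1:]
                if CODON_TABLE[mutant] == amino_acid:
                    counts[position] += 1
    return counts


counts = synonymous_by_position()
total_changes = 61 * 9  # 61 sense codons, 9 possible single-base changes each
print(counts)                      # [8, 0, 126]
print(sum(counts), total_changes)  # 134 549
```

Under these assumptions about a quarter of all possible single-base changes are silent, and nearly all of them fall at the third codon position.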
The near-universality of the genetic code and its resistance to change are features of our genomes that allow us to compare gene and other DNA sequences to establish evolutionary relationships between organisms (species), groups of organisms (genus, family, order, etc.) and even individuals within a species.
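The simplest form of such a comparison is the fraction of positions at which two aligned sequences differ. The sketch below computes it for three short sequences. The sequences and the labels `species_a`, `species_b` and `species_c` are invented for illustration and are not real data. The function name `p_distance` is likewise just a label for this example. Real analyses first align the sequences and usually apply corrections that this sketch omits.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ.

    Assumes the sequences are already aligned and of equal, non-zero length.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Hypothetical 12-base stretches of the same gene in three species.
species_a = "ATGGCCAAGTTC"
species_b = "ATGGCTAAGTTC"
species_c = "ATGGCTAAATTT"

print(p_distance(species_a, species_b))  # 1 difference in 12 positions
print(p_distance(species_a, species_c))  # 0.25 (3 differences in 12)
print(p_distance(species_b, species_c))  # 2 differences in 12 positions
```

In this toy example, a and b are the most similar pair, so they would be grouped together before c. All of the differences were placed at third codon positions, where they do not change the encoded amino acids.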
In addition to constraints imposed by a universal genetic code, some organisms show codon bias, a more recent constraint on which of the universal codons an organism uses. Codon bias is seen in organisms that preferentially use AT-rich codons, or in organisms that favor codons richer in G and C. Interestingly, codon bias in genes often accompanies a corresponding genomic nucleotide bias. An organism with an AT codon bias may also have an AT-rich genome (likewise GC-rich codons in GC-rich genomes). You can recognize genome nucleotide bias in Chargaff’s base ratios!
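Both kinds of bias can be measured by simple counting, as the sketch below suggests. The function names `gc_content`, `codon_usage` and `gc3` and the short coding sequence are made up for this example. Because A pairs with T and G pairs with C in double-stranded DNA, the G+C fraction of one strand is enough to describe the base composition of the whole molecule.

```python
from collections import Counter


def gc_content(dna):
    """Fraction of bases that are G or C. Assumes a non-empty DNA string."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def codon_usage(coding_sequence):
    """Count how often each codon appears, reading from the first base.

    Assumes the sequence starts in frame; a trailing partial codon is ignored.
    """
    seq = coding_sequence.upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


def gc3(coding_sequence):
    """G+C fraction at third codon positions, where most synonymous
    choices are made."""
    third_bases = coding_sequence.upper()[2::3]
    return gc_content(third_bases)


# Hypothetical AT-leaning coding sequence: ATG GCT GCT AAA TTT TAA
toy_gene = "ATGGCTGCTAAATTTTAA"

print(gc_content(toy_gene))          # 5 of 18 bases are G or C
print(gc3(toy_gene))                 # 1 of 6 third positions is G or C
print(codon_usage(toy_gene)["GCT"])  # 2
```

Here alanine is encoded both times by GCT rather than GCC, GCA or GCG. That is the kind of preference a codon usage table would reveal across a whole genome.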
Finally, we often think of genetic information as genes for proteins, but not all genetic information encodes polypeptides. Obvious examples of non-coding genetic information include the genes for rRNAs and tRNAs, common to all organisms. The amount of these kinds of informational DNA (i.e., genes for polypeptides, tRNAs and rRNAs) as a proportion of total DNA varies across species, although it is higher in prokaryotes than in eukaryotes. For example, ~88% of the E. coli circular chromosome encodes polypeptides, while that figure is less than ~1.5% for humans. Some less obvious informative DNA sequences in higher organisms are transcribed (e.g., introns). Other informative DNA in the genome is never transcribed. The latter includes regulatory DNA sequences, DNA sequences that support chromosome structure and other DNAs that contribute to development and phenotype. As for truly non-informative (useless) DNA in a eukaryotic genome, the amount so classified is steadily shrinking as we sequence entire genomes, identify novel DNA sequences and discover novel RNAs (topics covered elsewhere in this text).