Proteins are polymers of a bifunctional monomer, the amino acid. The twenty common naturally-occurring amino acids each contain an α-carbon, an α-amino group, an α-carboxylic acid group, and a side chain (or side group) attached to the α-carbon. These side chains (or R groups) may be nonpolar, polar and uncharged, or charged, depending on the pH and the pKa of the ionizable group. Two other amino acids occasionally appear in proteins. One is selenocysteine, which is found in Archaea, eubacteria, and animals. The other, discovered more recently, is pyrrolysine, found in Archaea. Schultz et al. have gone one step further. They have engineered bacteria to incorporate two new amino acids, O-methyl-tyrosine and p-aminophenylalanine. More recently, they (Chin et al.) have engineered the yeast Saccharomyces cerevisiae to incorporate five new unnatural amino acids (using the TAG nonsense codon and new, modified tRNAs and tRNA synthetases) with keto groups that allow chemical modifications to the protein. We will concentrate only on the 20 abundant, naturally-occurring amino acids.
- Structure and Property of the Naturally-Occurring Amino Acids (Too large to include in text: print separately)
- Learning Amino Acids Structure: YouTube - Part 1 | Part 2
Amino acids form polymers through a nucleophilic attack by the amino group of one amino acid on the electrophilic carbonyl carbon of the carboxyl group of another amino acid. The carboxyl group of the amino acid must first be activated to provide a better leaving group than OH⁻. (We will discuss this activation by ATP later in the course.) The resulting link between the amino acids is an amide link, which biochemists call a peptide bond. In this reaction, water is released. In the reverse reaction, the peptide bond can be cleaved by water (hydrolysis).
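Because one water molecule is released for each peptide bond formed, the mass of a peptide is the sum of the free amino acid masses minus one water per bond. The following is a minimal sketch of that bookkeeping, not part of the original text. The function name `peptide_mass` and the small mass table are illustrative assumptions; the masses are approximate average molecular masses for glycine, alanine, and water.

```python
# Approximate average molecular masses in daltons (assumed reference values).
WATER_MASS = 18.015
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,  # glycine
    "Ala": 89.09,  # alanine
}


def peptide_mass(residues):
    """Estimate the mass of a linear peptide from its three-letter codes.

    Each peptide bond releases one water, so a chain of n amino acids
    loses (n - 1) water molecules relative to the free amino acids.
    """
    if not residues:
        return 0.0
    total_free = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    bonds = len(residues) - 1
    return total_free - bonds * WATER_MASS


# Dipeptide Gly-Ala: two amino acids, one peptide bond, one water released.
gly_ala = peptide_mass(["Gly", "Ala"])
print(f"Gly-Ala: {gly_ala:.3f} Da")
```

Hydrolysis is the same calculation run in reverse: adding one water per cleaved bond recovers the masses of the free amino acids.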
When two amino acids link together to form an amide link, the resulting structure is called a dipeptide. Likewise, we can have tripeptides, tetrapeptides, and other polypeptides. At some point, when the structure is long enough, it is called a protein. The average molecular weight of proteins in yeast is about 50,000, corresponding to about 450 amino acids. The largest protein might be titin, with a molecular weight of about 3 million (about 27,000 amino acids). A new class of very small proteins (30 or fewer amino acids, and perhaps better named polypeptides) encoded by smORFs (small open reading frames) has recently been discovered to have significant biological activity (Science, doi:10.1126/science.1238802, 2013). These are encoded directly in the genome and are produced by the same processes that produce regular proteins (DNA transcription and RNA translation). They are not simply the result of selective cleavage of a larger protein into smaller peptide fragments.
There are many different ways to represent the structure of a polypeptide or protein, each showing differing amounts of information.
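The two size figures quoted above for yeast proteins and for titin can be cross-checked with a quick calculation: dividing molecular weight by the number of amino acids gives the average mass per residue, which comes out at roughly 110 in both cases. This is a minimal sketch using only the approximate numbers from the text; the function name `mean_residue_mass` is an illustrative assumption.

```python
def mean_residue_mass(molecular_weight, n_residues):
    """Average mass contributed by each amino acid residue, in daltons."""
    return molecular_weight / n_residues


# Approximate figures quoted in the text.
yeast_average = mean_residue_mass(50_000, 450)
titin = mean_residue_mass(3_000_000, 27_000)

print(f"average yeast protein: {yeast_average:.1f} Da per residue")
print(f"titin:                 {titin:.1f} Da per residue")
```

A rule of thumb of about 110 per residue therefore gives a rough estimate of a protein's molecular weight from its length, or of its length from its molecular weight.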
Figure: Different Representations of a Polypeptide (heptapeptide)
Figure: Amino Acids React to Form Proteins
(Note: the above picture represents the amino acid in an unlikely protonation state, with the weak acid protonated and the weak base deprotonated, for simplicity in showing the removal of water on peptide bond formation and the hydrolysis reaction.) Proteins are polymers of twenty naturally occurring amino acids. In contrast, nucleic acids are polymers of just 4 different monomeric nucleotides. Both the sequence of a protein and its total length differentiate one protein from another. Just for an octapeptide, there are over 25 billion different possible arrangements of amino acids (20⁸). Compare this to just 65,536 different oligonucleotides (4 different monomeric deoxynucleotides) of 8 monomeric units, an 8mer (4⁸). Hence the diversity of possible proteins is enormous.
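The counts above follow from a simple rule: with a choice of k monomers at each of n positions, there are kⁿ possible linear sequences. A minimal sketch of the arithmetic is shown here; the function name `sequence_count` is an illustrative assumption.

```python
def sequence_count(alphabet_size, length):
    """Number of distinct linear sequences of a given length.

    Each position can hold any of `alphabet_size` monomers independently,
    so the total is alphabet_size ** length.
    """
    return alphabet_size ** length


octapeptides = sequence_count(20, 8)  # 20 amino acids, 8 positions
octamers = sequence_count(4, 8)       # 4 deoxynucleotides, 8 positions

print(f"octapeptides: {octapeptides:,}")
print(f"DNA 8mers:    {octamers:,}")
print(f"ratio:        {octapeptides // octamers:,}")
```

The gap widens rapidly with length, since the ratio between the two counts is itself 5ⁿ for sequences of n units.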
Please consult the Jmol site below dealing with amino acids. Please learn the three-letter codes for the amino acids.
Jsmol: Amino Acids from Charles S. Gasser, UC Davis Jmol: Amino Acids