As you might remember from chemistry, carbon atoms (C) typically form four bonds. We can think of an amino acid as a (highly) modified form of methane (CH4), with the C referred to as the alpha carbon (Cα). Instead of four hydrogens attached to the central C, there is one H, an amino group (-NH2), a carboxylic acid group (-COOH), and a final, variable (R) group attached to the central Cα atom. The four groups attached to the α-carbon are arranged at the vertices of a tetrahedron. If all four groups attached to the α-carbon are different from one another, as they are in all amino acids except glycine, the resulting amino acid can exist in two possible forms, known as enantiomeric stereoisomers. Enantiomers are mirror images of one another and are referred to as the L- and D-forms. Only L-type amino acids are found in proteins, even though there is no obvious chemical reason that proteins could not have also been made using both types of amino acids or using only D-amino acids [229]. The universal use of L-type amino acids in the polypeptides found in biological systems is another example of the evolutionary relatedness of organisms; it appears to be a homologous trait. Even though there are hundreds of different amino acids known, only 22 amino acids (the 20 common amino acids and two others, selenocysteine and pyrrolysine) are found in proteins.
Amino acids differ from one another by their R-groups, which are often referred to as "side-chains". Some of these R-groups are large and some are small; some are hydrophobic and some are hydrophilic; and some of the hydrophilic R-groups contain weak acidic or basic groups. The extent to which these weak acidic or basic groups are positively or negatively charged will change in response to environmental pH. Changes in charge will (as we will see) influence the structure of the polypeptide/protein in which they find themselves. The different R-groups provide proteins with a broad range of chemical properties, which are further extended by the presence of co-factors.
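To make the dependence of charge on pH concrete, here is a minimal sketch based on the standard Henderson-Hasselbalch relationship for a weak acid. The function name and the pKa value are illustrative assumptions, not values taken from this text; real side-chain pKa values vary with the local environment within a protein.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a weak acid group that has lost its proton at a given pH.

    Follows from the Henderson-Hasselbalch equation:
    pH = pKa + log10([A-] / [HA]).
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# Assumed, approximate pKa for a carboxylic acid side-chain (illustrative only).
PKA_ACIDIC_SIDE_CHAIN = 4.0

for pH in (2.0, 4.0, 7.0):
    frac = fraction_deprotonated(pH, PKA_ACIDIC_SIDE_CHAIN)
    # For a carboxylic acid group, the deprotonated form carries the negative charge.
    print(f"pH {pH}: {frac:.3f} of groups negatively charged")
```

When the pH equals the pKa, half of the groups are charged; well above the pKa nearly all of them are, and well below it almost none are.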
As we noted for nucleic acids, a polymer is a chain of subunits; in a polypeptide, the subunits are amino acid monomers linked together by peptide bonds. Under the conditions that exist inside the cell, peptide bond formation is a thermodynamically unfavorable dehydration reaction, and so must be coupled to a thermodynamically favorable reaction. A molecule formed from two amino acids, joined together by a peptide bond, is known as a dipeptide. As in the case of each amino acid, the dipeptide has an N-terminal (amino) end and a C-terminal (carboxylic acid) end. To generate a polypeptide, new amino acids are added sequentially (and exclusively) to the C-terminal end of the polymer. A peptide bond forms between the amino group of the added amino acid and the carboxylic acid group of the polymer; the formation of a peptide bond is associated with the release of a water molecule. When complete, the reaction generates a new C-terminal carboxylic acid group. It is important to note that while some amino acids have a carboxylic acid group as part of their R-groups, new amino acids are not added there. Because of this fact, polypeptides are unbranched, linear polymers. This process of amino acid addition can continue, theoretically without limit. Biological polypeptides range from the very short (5-10 amino acids) to the very long (many hundreds to thousands of amino acids). For example, the protein Titin [230] (found in muscle cells) can be more than 30,000 amino acids in length. Because there is no theoretical constraint on which amino acid occurs at a particular position within a polypeptide, there is an enormous universe of possible polypeptides that can exist. In the case of a 100 amino acid long polypeptide built from the 20 common amino acids, there are 20^100 possible different polypeptides that could, in theory, be formed.
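The size of this universe, 20 raised to the power 100 for a 100 amino acid long polypeptide, is easy to check directly. The short sketch below assumes each position can independently be any one of 20 amino acids; the function name is illustrative.

```python
def count_polypeptides(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of the given length.

    Each position is chosen independently from `alphabet_size` amino acids,
    so the count is alphabet_size multiplied by itself `length` times.
    """
    return alphabet_size ** length


n = count_polypeptides(100)
print(f"{n:.3e}")   # about 1.268e+130
print(len(str(n)))  # number of decimal digits: 131
```

Even a dipeptide already has 20 x 20 = 400 possible sequences, and each added position multiplies the total by another factor of 20.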