At the end of this laboratory, students should be able to:
identify amino acids by their 1-letter code.
explain the differences between high and low scores on the BLOSUM 62 matrix.
use the BLASTP algorithm to compare protein sequences.
identify conserved regions in a multiple sequence alignment.
As species evolve, their proteins change. The rate at which an individual protein sequence changes varies widely, reflecting the evolutionary pressures that organisms experience and the physiological role of the protein. Our goal this semester is to determine whether the proteins involved in Met and Cys biosynthesis have been functionally conserved between S. pombe and S. cerevisiae, species that are separated by close to a billion years of evolution. In this lab, you will search databases for homologs of S. cerevisiae sequences in several species, including S. pombe. Homologs are similar DNA sequences that are descended from a common gene. When homologs are found in different species, they are referred to as orthologs.
Homologs within the same genome are referred to as paralogs. Paralogs arise by gene duplication, but diversify over time and assume distinct functions. Although a whole genome duplication occurred during the evolution of S. cerevisiae (Kellis et al., 2004), only a few genes in the methionine superpathway have paralogs. Interestingly, MET17 is paralogous to three genes involved in sulfur transfer: STR1 (CYS3), STR2 and STR4, reflecting multiple gene duplications. The presence of these four distinct enzymes confers unusual flexibility to S. cerevisiae in its use of sulfur sources. The SAM1 and SAM2 genes are also paralogs, but their sequences have remained almost identical, providing functional redundancy if one gene is inactivated (Chapter 6).
A protein's function is intimately related to its structure. You will recall that the final folded form of a protein is determined by its primary sequence, the sequence of amino acids. Protein functionality changes less rapidly during evolution when the amino acid substitutions are conservative. Conservative substitutions occur when the size and chemistry of a new amino acid side chain are similar to those of the one it is replacing. In this lab, we will begin with a discussion of amino acid side chains. You will then use the BLASTP algorithm to identify orthologs in several model organisms. You will perform a multiple sequence alignment that will distinguish regions that are more highly conserved than others.
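The idea of a conservative substitution can be made concrete with a substitution matrix such as BLOSUM 62, which assigns higher scores to amino acid pairs that replace one another frequently in related proteins. The Python sketch below is only an illustration: it hard-codes a small subset of BLOSUM 62 scores rather than the full 20 x 20 matrix, and the names BLOSUM62_SUBSET, score and is_conservative are hypothetical helpers introduced here, not part of BLASTP or any library.

```python
# A handful of BLOSUM 62 scores (subset only; the real matrix covers all
# 20 x 20 amino acid pairs). Keys are pairs of 1-letter codes.
BLOSUM62_SUBSET = {
    ("L", "I"): 2,    # both aliphatic and similar in size
    ("D", "E"): 2,    # both acidic
    ("K", "R"): 2,    # both basic
    ("F", "Y"): 3,    # both aromatic
    ("D", "L"): -4,   # acidic side chain vs. hydrophobic side chain
    ("W", "W"): 11,   # identity for a rare, bulky residue
    ("C", "C"): 9,    # identity for cysteine
}


def score(a, b):
    """Return the score for a pair in either order, or None if the
    pair is not in this illustrative subset."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET.get((b, a))


def is_conservative(a, b):
    """Treat a substitution as conservative if the two residues differ
    and their score is positive (an assumption made for this sketch)."""
    s = score(a, b)
    return s is not None and a != b and s > 0


print(score("I", "L"))            # leucine/isoleucine swap
print(is_conservative("E", "D"))  # glutamate/aspartate swap
print(is_conservative("L", "D"))  # leucine/aspartate swap
```

In this subset, swaps between chemically similar side chains (L/I, D/E, K/R, F/Y) receive positive scores, while replacing an acidic residue with a hydrophobic one (D/L) receives a negative score.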
As you work through the exercises, you will note that protein sequences in databases are written in the 1-letter code. Familiarity with the 1-letter code is an essential skill for today’s molecular biologists.
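As a study aid, the 1-letter code for the 20 standard amino acids can be written as a simple lookup table. The Python sketch below assumes a hyphen-separated string of 3-letter abbreviations as input; the names THREE_TO_ONE and to_one_letter are hypothetical and were chosen only for this illustration.

```python
# 3-letter abbreviation -> 1-letter code for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated 3-letter sequence (e.g. 'Met-Cys-Ser')
    into the 1-letter form used in sequence databases."""
    residues = three_letter_seq.split("-")
    return "".join(THREE_TO_ONE[r] for r in residues)


print(to_one_letter("Met-Cys-Ser"))  # prints MCS
```

Note that several codes are not the first letter of the amino acid name (for example, K for lysine, W for tryptophan and Q for glutamine), which is why the code is worth memorizing.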