5.3 Cloning and Recombinant Expression
To accomplish the applications described above, biochemists must be able to extract, manipulate, and analyze nucleic acids. To understand the basic techniques used to work with nucleic acids, remember that nucleic acids are macromolecules made of nucleotides (a sugar, a phosphate, and a nitrogenous base). The phosphate groups on these molecules each have a net negative charge. An entire set of DNA molecules in the nucleus of eukaryotic organisms is called the genome. DNA has two complementary strands linked by hydrogen bonds between the paired bases.
Unlike DNA in eukaryotic cells, RNA molecules leave the nucleus. Messenger RNA (mRNA) is analyzed most frequently because it represents the protein-coding genes that are being expressed in the cell.
DNA isolation techniques have been described in section 5.1 and are the first step used to study or manipulate nucleic acids. RNA can also be extracted and is studied to understand gene expression patterns in cells. RNA is naturally very unstable because enzymes that break down RNA (RNases) are commonly present in nature. Some are even secreted by our own skin and are very difficult to inactivate. During RNA extraction, RNase inhibitors and the special treatment of glassware are used to reduce the risk of destroying the sample during isolation.
Because nucleic acids are negatively charged ions at neutral or alkaline pH in an aqueous environment, they can be moved by an electric field. Gel electrophoresis is a technique used to separate charged molecules on the basis of size and charge. The nucleic acids can be separated as whole chromosomes or as fragments. The nucleic acids are loaded into a slot at one end of a gel matrix, an electric current is applied, and negatively charged molecules are pulled toward the opposite end of the gel (the end with the positive electrode). Smaller molecules move through the pores in the gel faster than larger molecules; this difference in the rate of migration separates the fragments on the basis of size. The nucleic acids in a gel matrix are invisible until they are stained with a compound that allows them to be seen, such as a dye. Distinct fragments of nucleic acids appear as bands at specific distances from the top of the gel (the negative electrode end) that are based on their size (Figure 5.15). A mixture of many fragments of varying sizes appears as a long smear, whereas uncut genomic DNA is usually too large to run through the gel and forms a single large band at the top of the gel.
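The size dependence of migration described above is often used in practice to estimate the size of an unknown band from marker bands of known size. The following is a minimal sketch, assuming that log10(fragment size) falls roughly linearly with migration distance over the useful range of a gel. That relation is only an approximation, and the ladder distances and sizes below are hypothetical, idealized numbers chosen for illustration, not measurements.

```python
import math


def fit_standard_curve(ladder):
    """Least-squares fit of log10(size in bp) against migration distance.

    ladder: list of (distance_mm, size_bp) pairs for marker bands of known size.
    Returns (slope, intercept) of the fitted straight line.
    """
    xs = [distance for distance, _ in ladder]
    ys = [math.log10(size) for _, size in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_bp(distance_mm, slope, intercept):
    """Estimate the size of a band from how far it migrated."""
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical, idealized ladder: (distance from the well in mm, size in bp).
LADDER = [(10.0, 10000), (30.0, 1000), (50.0, 100)]

slope, intercept = fit_standard_curve(LADDER)
unknown_bp = estimate_size_bp(40.0, slope, intercept)
print(f"Estimated size of the band at 40 mm: {unknown_bp:.0f} bp")
```

The negative slope reflects the behavior described in the text: the farther a band has traveled from the well, the smaller the fragment.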
Polymerase Chain Reaction (PCR)
The details of PCR are discussed in section 5.1. This technique is used in DNA cloning to rapidly increase the number of copies of specific regions of DNA.
In general, cloning means the creation of a perfect replica. Typically, the word is used to describe the creation of a genetically identical copy. In biology, the re-creation of a whole organism is referred to as “reproductive cloning.” Long before attempts were made to clone an entire organism, researchers learned how to copy short stretches of DNA—a process that is referred to as molecular cloning.
Molecular cloning allows for the creation of multiple copies of genes, expression of genes, and study of specific genes. To get the DNA fragment into a bacterial cell in a form that will be copied or expressed, the fragment is first inserted into a cloning vector.
A cloning vector is a small piece of DNA that can be stably maintained in an organism, and into which a foreign DNA fragment can be inserted for cloning purposes. The cloning vector may be DNA taken from a virus, the cell of a higher organism, or it may be the plasmid of a bacterium. The vector therefore contains features that allow for the convenient insertion or removal of a DNA fragment to or from the vector, for example by treating the vector and the foreign DNA with a restriction enzyme that cuts the DNA. DNA fragments thus generated contain either blunt ends or overhangs known as sticky ends, and vector DNA and foreign DNA with compatible ends can then be joined together by molecular ligation. After a DNA fragment has been cloned into a cloning vector, it may be further subcloned into another vector designed for more specific use.
There are many types of cloning vectors, but the most commonly used ones are genetically engineered plasmids. Cloning is generally first performed using Escherichia coli, and cloning vectors in E. coli include plasmids, bacteriophages (such as phage λ), cosmids, and bacterial artificial chromosomes (BACs). Some DNA, however, cannot be stably maintained in E. coli, for example very large DNA fragments. For these studies, other organisms such as yeast may be used. Cloning vectors in yeast include yeast artificial chromosomes (YACs).
Figure 5.16 Example of a Common Cloning Vector.
Image by Ayacop and Yikrazuul
All commonly used cloning vectors in molecular biology have key features necessary for their function, such as a suitable cloning site containing restriction enzyme recognition sites and a selectable marker. Others may have additional features specific to their use. For reasons of ease and convenience, cloning is often performed using E. coli. Thus, the cloning vectors used often have elements necessary for their propagation and maintenance in E. coli, such as a functional origin of replication (ori). The ColE1 origin of replication is found in many plasmids. Some vectors also include elements that allow them to be maintained in another organism in addition to E. coli, and these vectors are called shuttle vectors.
All cloning vectors have features that allow a gene to be conveniently inserted into the vector or removed from it. This may be a multiple cloning site (MCS) or polylinker, which contains many unique restriction sites. The restriction sites in the MCS are first cleaved by restriction enzymes, then a PCR-amplified target gene also digested with the same enzymes is ligated into the vectors using DNA ligase. The target DNA sequence can be inserted into the vector in a specific direction if so desired. The restriction sites may be further used for sub-cloning into another vector if necessary.
Other cloning vectors may use topoisomerase instead of ligase, and cloning may be done more rapidly without the need for a restriction digest of the vector or insert. In this TOPO cloning method, a linearized vector is activated by attaching topoisomerase I to its ends, and this "TOPO-activated" vector may then accept a PCR product by ligating to both 5' ends of the PCR product, releasing the topoisomerase and forming a circular vector in the process. Another method of cloning without the use of DNA digestion and ligase is DNA recombination, for example as used in the Gateway cloning system. The gene, once cloned into the cloning vector (called the entry clone in this method), may be conveniently introduced into a variety of expression vectors by recombination.
Restriction enzymes (also called restriction endonucleases) recognize specific DNA sequences and cut them in a predictable manner; they are naturally produced by bacteria as a defense mechanism against foreign DNA.
As the name implies, restriction endonucleases (or restriction enzymes) are “restricted” in their ability to cut or digest DNA. The recognition sequence that is useful to biochemists is usually palindromic. A palindrome reads the same forwards and backwards; examples include RACE CAR, CIVIC, and A MAN A PLAN A CANAL PANAMA. DNA has two strands that run antiparallel to each other, so a DNA palindrome is a sequence that is identical to its own reverse complement.
As with a word palindrome, a palindromic DNA sequence reads the same forward and backward, but across the two strands: the sequence read 5′ to 3′ on one strand is the same as the sequence read 5′ to 3′ on the complementary strand. Restriction enzymes (REs) often cut DNA in a staggered pattern. When a staggered cut is made in a palindromic sequence, the resulting overhangs are complementary (Figure 5.17).
Figure 5.17 Restriction Enzyme Recognition Sequences. In this (a) six-nucleotide restriction enzyme recognition site, notice that the sequence of six nucleotides reads the same in the 5′ to 3′ direction on one strand as it does in the 5′ to 3′ direction on the complementary strand. This is known as a palindrome. (b) The restriction enzyme makes breaks in the DNA strands, and (c) the cut in the DNA results in “sticky ends”. Another piece of DNA cut on either end by the same restriction enzyme could attach to these sticky ends and be inserted into the gap made by this cut.
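The idea of a DNA palindrome can be checked mechanically. The sketch below, with helper names chosen only for this illustration, computes the reverse complement of a sequence and tests whether the sequence equals it. GAATTC, CCCGGG, and GATATC are the recognition sequences of EcoRI, SmaI, and EcoRV; GATTACA is an arbitrary non-palindromic string included for contrast.

```python
# Watson-Crick base-pairing partners.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


for site in ("GAATTC", "CCCGGG", "GATATC", "GATTACA"):
    print(site, reverse_complement(site), is_palindromic_site(site))
```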
Molecular biologists tend to use restriction enzymes that recognize palindromes of 6 or 8 base pairs. The recognition sequences of these 6-cutters and 8-cutters occur rarely in large stretches of DNA, but often enough to be useful.
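How rare these sites are can be estimated with a simple probability argument. The sketch below assumes that the four bases occur with equal frequency and independently of one another, which real genomes only approximate; under that assumption a specific n-base sequence is expected about once every 4^n bases. The 48,500 bp length is the approximate size of the lambda phage genome and is used here only to give a sense of scale; the counts are expectations under the model, not the actual number of sites in any genome.

```python
def expected_spacing_bp(site_length):
    """Average distance between occurrences of one specific site,
    assuming equal and independent base frequencies."""
    return 4 ** site_length


def expected_site_count(genome_length_bp, site_length):
    """Expected number of occurrences of the site in a genome of given length."""
    return genome_length_bp / expected_spacing_bp(site_length)


EXAMPLE_GENOME_BP = 48_500  # roughly the size of the lambda phage genome

for n in (4, 6, 8):
    spacing = expected_spacing_bp(n)
    count = expected_site_count(EXAMPLE_GENOME_BP, n)
    print(f"{n}-cutter: one site per {spacing} bp on average, "
          f"about {count:.1f} expected in {EXAMPLE_GENOME_BP} bp")
```

Under this model a 6-cutter is expected about once every 4,096 bp and an 8-cutter about once every 65,536 bp, which is why these enzymes produce fragments of a size that is practical to clone.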
Figure 5.18 Restriction Enzymes. Restriction enzymes recognize palindromic sequences in DNA and hydrolyze covalent phosphodiester bonds of the DNA to leave either “sticky/cohesive” ends or “blunt” ends. This distinction in cutting is important because an EcoRI sticky end can be used to match up a piece of DNA cut with the same enzyme in order to glue or ligate them back together. While endonucleases cut DNA, ligases join the pieces back together. DNA digested with EcoRI can be ligated back together with another piece of DNA digested with EcoRI, but not to a piece digested with SmaI, which leaves blunt ends. Another blunt cutter is EcoRV with a recognition sequence of GAT | ATC.
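A restriction digest can be sketched as a simple string operation. The example below is a simplified model that tracks only the top strand of a linear DNA molecule; the enzyme table lists the recognition sequence and the position of the top-strand cut for EcoRI (G|AATTC), SmaI (CCC|GGG), and EcoRV (GAT|ATC). The function names and the demonstration sequence are hypothetical and exist only for this illustration.

```python
# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G|AATTC
    "SmaI": ("CCCGGG", 3),   # CCC|GGG
    "EcoRV": ("GATATC", 3),  # GAT|ATC
}


def cut_positions(seq, enzyme):
    """Return the top-strand cut positions for the enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzyme):
    """Return the top-strand fragments produced by a complete digest."""
    fragments = []
    previous = 0
    for position in cut_positions(seq, enzyme):
        fragments.append(seq[previous:position])
        previous = position
    fragments.append(seq[previous:])
    return fragments


def end_type(enzyme):
    """A cut in the exact middle of a palindromic site leaves blunt ends;
    any other position leaves an overhang (sticky end)."""
    site, offset = ENZYMES[enzyme]
    return "blunt" if 2 * offset == len(site) else "sticky"


# Hypothetical linear sequence with two EcoRI sites and one SmaI site.
DEMO_SEQ = "AAAAGAATTCTTTTTTCCCGGGAAGAATTCGG"

for name in ENZYMES:
    pieces = digest(DEMO_SEQ, name)
    print(name, end_type(name), [len(piece) for piece in pieces])
```

Because each EcoRI fragment in this model begins with AATTC on the top strand, any two EcoRI ends carry matching overhangs, which is what allows them to be ligated to one another.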
A selectable marker is carried by the vector to allow the selection of positively transformed cells. Antibiotic resistance is often used as a marker, an example being the beta-lactamase gene, which confers resistance to the penicillin group of beta-lactam antibiotics such as ampicillin. Some vectors contain two selectable markers; for example, the plasmid pACYC177 has both ampicillin and kanamycin resistance genes. Shuttle vectors, which are designed to be maintained in two different organisms, may also require two selectable markers, although some selectable markers such as resistance to zeocin and hygromycin B are effective in different cell types. Auxotrophic selection markers that allow an auxotrophic organism to grow in minimal growth medium may also be used; examples of these are LEU2 and URA3, which are used with their corresponding auxotrophic strains of yeast.
Another kind of selectable marker allows for the positive selection of plasmids carrying a cloned gene. This may involve the use of a gene lethal to the host cells, such as barnase, CcdB, and the parD/parE toxins. This typically works by disrupting or removing the lethal gene during the cloning process: unsuccessful clones in which the lethal gene remains intact kill the host cells, so only successful clones are selected.
Reporter genes are used in some cloning vectors to facilitate the screening of successful clones by using features of these genes that allow successful clones to be easily identified. Such features present in cloning vectors may be the lacZα fragment for α complementation in blue-white selection, and/or marker or reporter genes in frame with and flanking the MCS to facilitate the production of fusion proteins. Examples of fusion partners that may be used for screening are the green fluorescent protein (GFP) and luciferase.
Figure 5.19 Reporter Genes. In this diagram, the green fluorescence protein is used as a reporter gene to study upstream regulatory sequences.
Image by TransControl
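The combined logic of antibiotic selection and blue-white screening can be written as a small decision function. This is an idealized sketch under several assumptions that are not spelled out in the text: the vector carries an ampicillin resistance gene, the MCS lies within lacZα so that an insert disrupts α complementation, and the cells are plated on medium containing ampicillin and the chromogenic substrate X-gal. The function name is hypothetical, and real plates also show intermediate cases that this model ignores.

```python
def colony_outcome(has_plasmid, has_insert):
    """Predict what a transformed cell produces on an ampicillin + X-gal plate.

    has_plasmid: the cell took up the vector (and so is ampicillin resistant).
    has_insert:  the vector carries foreign DNA inside the lacZ-alpha fragment.
    """
    if not has_plasmid:
        # No resistance gene, so the antibiotic prevents growth.
        return "no growth"
    if has_insert:
        # lacZ-alpha is disrupted, so X-gal is not converted to a blue product.
        return "white colony"
    # Intact lacZ-alpha allows alpha complementation and a blue product.
    return "blue colony"


for plasmid, insert in ((False, False), (True, False), (True, True)):
    print(plasmid, insert, colony_outcome(plasmid, insert))
```

In this scheme the selectable marker removes cells without any vector, and the reporter distinguishes empty vectors from recombinant ones, so white colonies are the candidates worth analyzing further.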
Elements for expression
If the expression of the targeted gene is desired, then a cloning vector also needs to contain suitable elements for the expression of the cloned target gene, including a promoter and ribosomal binding site (RBS). The target DNA may be inserted into a site that is under the control of a particular promoter necessary for the expression of the target gene in the chosen host. Where the promoter is present, the expression of the gene is preferably tightly controlled and inducible so that proteins are only produced when required. Some commonly used promoters are the T7 and lac promoters. The presence of a promoter is necessary when screening techniques such as blue-white selection are used.
Cloning vectors without promoter and RBS for the cloned DNA sequence are sometimes used, for example when cloning genes whose products are toxic to E. coli cells. Promoter and RBS for the cloned DNA sequence are also unnecessary when first making a genomic or cDNA library of clones since the cloned genes are normally subcloned into a more appropriate expression vector if their expression is required.
Types of cloning vectors
A large number of cloning vectors are available, and choosing the right vector may depend on a number of factors, such as the size of the insert, copy number, and cloning method. Large DNA inserts may not be stably maintained in a general cloning vector, especially one with a high copy number, so cloning large fragments may require a more specialized cloning vector.
Plasmids are autonomously replicating circular extra-chromosomal DNA. They are the standard cloning vectors and the ones most commonly used. Most general plasmids may be used to clone DNA inserts of up to 15 kb in size. Many plasmids have a high copy number, for example pUC19, which has a copy number of 500-700 copies per cell; a high copy number is useful because it produces a greater yield of recombinant plasmid for subsequent manipulation. However, low-copy-number plasmids may be preferred in certain circumstances, for example when the protein from the cloned gene is toxic to the cells.
The bacteriophages most commonly used for cloning are the lambda (λ) phage and M13 phage. There is an upper limit on the amount of DNA that can be packed into a phage (a maximum of 53 kb). The average lambda phage genome is roughly 48.5 kb (Figure 5.20). Therefore to allow foreign DNA to be inserted into phage DNA, phage cloning vectors may need to have some of their non-essential genes deleted to make room for the foreign DNA.
There is also a lower size limit for DNA that can be packed into a phage, and vector DNA that is too small cannot be properly packaged into the phage. This property can be used for selection - vector without insert may be too small, therefore only vectors with insert may be selected for propagation.
Images A and C modified from: Nigro, O, Culley, A., and Steward, G.F. (2012) Standards in Genomic Science 6(3):415-26, and image B is from Jack Potte
Cosmids are plasmids that incorporate a segment of bacteriophage λ DNA carrying the cohesive end sites (cos), which contain the elements required for packaging DNA into λ particles. They are normally used to clone large DNA fragments of between 28 and 45 kb.
Bacterial artificial chromosome
Inserts of up to 350 kb can be cloned in a bacterial artificial chromosome (BAC). BACs are maintained in E. coli with a copy number of only 1 per cell. BACs have often been used to sequence the genomes of organisms in genome projects, including the Human Genome Project. A short piece of the organism's DNA is amplified as an insert in BACs, and then sequenced. Finally, the sequenced parts are reassembled in silico, resulting in the genomic sequence of the organism. BACs have largely been replaced in this capacity by faster and less laborious sequencing methods such as whole genome shotgun sequencing and, more recently, next-generation sequencing.
Yeast artificial chromosome
Yeast artificial chromosomes (YACs) are used as vectors to clone DNA fragments of more than 1 megabase (1 Mb = 1,000 kb = 1,000,000 bases) in size. They are useful for cloning the larger DNA fragments required in mapping genomes, as in the Human Genome Project. A YAC contains a telomeric sequence and an autonomously replicating sequence (features required to replicate linear chromosomes in yeast cells). These vectors also contain suitable restriction sites for cloning foreign DNA, as well as genes that serve as selectable markers.
Human artificial chromosome
Human artificial chromosomes may be useful as gene transfer vectors for gene delivery into human cells, and as tools for expression studies and for determining human chromosome function. They can carry very large DNA fragments (there is no upper limit on size for practical purposes), so they do not share the limited cloning capacity of other vectors, and they also avoid the possible insertional mutagenesis caused by integration of a viral vector into host chromosomes.
Animal and plant viral vectors
Viruses that infect plant and animal cells have also been manipulated to introduce foreign genes into plant and animal cells. The natural ability of viruses to adsorb to cells, introduce their DNA, and replicate has made them ideal vehicles for transferring foreign DNA into eukaryotic cells in culture. A vector based on simian virus 40 (SV40) was used in the first cloning experiment involving mammalian cells. A number of vectors based on other types of viruses, such as adenoviruses and papillomavirus, have been used to clone genes in mammals. At present, retroviral vectors are popular for cloning genes in mammalian cells. In the case of plant transformation, viruses including cauliflower mosaic virus, tobacco mosaic virus, and geminiviruses have been used with limited success.
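The insert-size guidance given for the different vector types can be collected into a small lookup. The sketch below is only a rough aid built from the figures quoted in this section (plasmids up to about 15 kb, cosmids 28 to 45 kb, BACs up to about 350 kb, YACs for fragments beyond that, including those over 1 Mb). The lower bounds of 0 for plasmids, BACs, and YACs, and the open-ended upper bound for YACs, are assumptions of this sketch; real vector choice also depends on copy number, host, and cloning method.

```python
# (vector name, minimum insert in kb, maximum insert in kb or None for open-ended).
# Ranges follow the figures in the text; zeros and None are assumptions.
VECTOR_RANGES_KB = [
    ("plasmid", 0, 15),
    ("cosmid", 28, 45),
    ("BAC", 0, 350),
    ("YAC", 0, None),
]


def suggest_vector(insert_kb):
    """Return the first listed vector whose size range contains the insert."""
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    for name, low, high in VECTOR_RANGES_KB:
        if insert_kb >= low and (high is None or insert_kb <= high):
            return name
    return None  # not reached while the last entry is open-ended


for size_kb in (5, 20, 40, 200, 1200):
    print(f"{size_kb} kb insert -> {suggest_vector(size_kb)}")
```

The list is ordered from smallest to largest capacity, so the function returns the least specialized vector that fits, in keeping with the point that large-insert vectors are used only when a general plasmid will not do.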
Summary of DNA Cloning
Figure 5.21 provides a summary of the basic cloning methods most widely used in biochemistry laboratories. Foreign DNA is isolated or amplified using PCR to obtain enough material for the cloning procedure. The DNA is purified and cut with restriction enzymes, and then mixed with a vector that has been cut with the same restriction enzymes. The fragments are then joined with DNA ligase. The ligated DNA is transformed into a host system, often bacteria, to grow large quantities of the plasmid containing the cloned DNA.
Restriction fragment patterning and DNA sequencing can be used to validate the cloned material.
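The cut-and-ligate steps summarized above can be sketched with plain strings. This is a deliberately simplified model: it tracks only the top strand, represents the circular plasmid as a linear string opened at an arbitrary point, and uses a single enzyme (EcoRI, G|AATTC) for both vector and insert. The function names and both sequences are hypothetical. With a single enzyme the insert could ligate in either orientation; only one orientation is modeled here.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # EcoRI cuts after the G: G|AATTC


def open_vector(vector):
    """Cut a vector that has exactly one EcoRI site; return (left arm, right arm)."""
    if vector.count(ECORI_SITE) != 1:
        raise ValueError("this sketch expects exactly one EcoRI site in the vector")
    position = vector.index(ECORI_SITE) + ECORI_CUT_OFFSET
    return vector[:position], vector[position:]


def excise_insert(source):
    """Return the fragment lying between the first two EcoRI cuts in the source DNA."""
    first = source.index(ECORI_SITE) + ECORI_CUT_OFFSET
    second = source.index(ECORI_SITE, first) + ECORI_CUT_OFFSET
    return source[first:second]


def ligate(left_arm, insert, right_arm):
    """Join compatible ends; in this model ligation is string concatenation."""
    return left_arm + insert + right_arm


# Hypothetical sequences for illustration only.
VECTOR = "CCCCGAATTCTTTT"
SOURCE_DNA = "AAGAATTCATGGCCTAAGAATTCGG"

left_arm, right_arm = open_vector(VECTOR)
insert = excise_insert(SOURCE_DNA)
recombinant = ligate(left_arm, insert, right_arm)
print(recombinant)
print("EcoRI sites in recombinant:", recombinant.count(ECORI_SITE))
```

The recombinant molecule regains an EcoRI site at each junction, so digesting it with EcoRI releases the insert again. That is the basis of the restriction fragment check mentioned above.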
Figure 5.21 Diagram Showing the Major Steps in Cloning.
For a Video Tutorial on DNA Cloning Visit: HHMI - BioInteractive
Plasmids with foreign DNA inserted into them are called recombinant DNA molecules because they contain new combinations of genetic material. Proteins that are produced from recombinant DNA molecules are called recombinant proteins. Not all recombinant plasmids are capable of expressing genes. Plasmids may also be engineered to express proteins only when stimulated by certain environmental factors, so that scientists can control the expression of the recombinant proteins.
Reproductive cloning is a method used to make a clone or an identical copy of an entire multicellular organism. Most multicellular organisms undergo reproduction by sexual means, which involves the contribution of DNA from two individuals (parents), making it impossible to generate an identical copy or a clone of either parent. Recent advances in biotechnology have made it possible to reproductively clone mammals in the laboratory.
Natural sexual reproduction involves the union, during fertilization, of a sperm and an egg. Each of these gametes is haploid, meaning they contain one set of chromosomes in their nuclei. The resulting cell, or zygote, is then diploid and contains two sets of chromosomes. This cell divides mitotically to produce a multicellular organism. However, the union of just any two cells cannot produce a viable zygote; there are components in the cytoplasm of the egg cell that are essential for the early development of the embryo during its first few cell divisions. Without these provisions, there would be no subsequent development. Therefore, to produce a new individual, both a diploid genetic complement and an egg cytoplasm are required. The approach to producing an artificially cloned individual is to take the egg cell of one individual and to remove the haploid nucleus. Then a diploid nucleus from a body cell of a second individual, the donor, is put into the egg cell. The egg is then stimulated to divide so that development proceeds. This sounds simple, but in fact it takes many attempts before each of the steps is completed successfully.
The first cloned agricultural animal was Dolly, a sheep who was born in 1996. The success rate of reproductive cloning at the time was very low. Dolly lived for six years and died of a lung tumor (Figure 5.22). There was speculation that because the cell DNA that gave rise to Dolly came from an older individual, the age of the DNA may have affected her life expectancy. Since Dolly, several species of animals (such as horses, bulls, and goats) have been successfully cloned.
There have been attempts at producing cloned human embryos as sources of embryonic stem cells. In the procedure, the DNA from an adult human is introduced into a human egg cell, which is then stimulated to divide. The technology is similar to the technology that was used to produce Dolly, but the embryo is never implanted into a surrogate mother. The cells produced are called embryonic stem cells because they have the capacity to develop into many different kinds of cells, such as muscle or nerve cells. The stem cells could be used to research and ultimately provide therapeutic applications, such as replacing damaged tissues. The benefit of cloning in this instance is that the cells used to regenerate new tissues would be a perfect match to the donor of the original DNA. For example, a leukemia patient would not require a sibling with a tissue match for a bone-marrow transplant.
Why was Dolly a Finn-Dorset and not a Scottish Blackface sheep?
Because even though the egg cell came from a Scottish Blackface sheep and the surrogate mother was a Scottish Blackface, the nuclear DNA came from a Finn-Dorset.
Using recombinant DNA technology to modify an organism’s DNA to achieve desirable traits is called genetic engineering. Addition of foreign DNA in the form of recombinant DNA vectors that are generated by molecular cloning is the most common method of genetic engineering. An organism that receives the recombinant DNA is called a genetically modified organism (GMO). If the foreign DNA that is introduced comes from a different species, the host organism is called transgenic. Bacteria, plants, and animals have been genetically modified since the early 1970s for academic, medical, agricultural, and industrial purposes.
Watch this short video explaining how scientists create a transgenic animal.
Although the classic methods of studying the function of genes began with a given phenotype and determined the genetic basis of that phenotype, modern techniques allow researchers to start at the DNA sequence level and ask: “What does this gene or DNA element do?” This technique, called reverse genetics, has resulted in reversing the classical genetic methodology. One example of this method is analogous to damaging a body part to determine its function. An insect that loses a wing cannot fly, which means that the wing’s function is flight. The classic genetic method compares insects that cannot fly with insects that can fly, and observes that the non-flying insects have lost wings. Similarly in a reverse genetics approach, mutating or deleting genes provides researchers with clues about gene function. Alternately, reverse genetics can be used to cause a gene to overexpress itself to determine what phenotypic effects may occur.
CRISPR stands for clustered regularly interspaced short palindromic repeats and represents a family of DNA sequences found within the genomes of prokaryotic organisms such as bacteria and archaea. These sequences are derived from DNA fragments of bacteriophages that have previously infected the prokaryote and are used to detect and destroy DNA from similar phages during subsequent infections. Hence these sequences play a key role in the antiviral defense system of prokaryotes.
Figure 5.23 Crystal Structure of a CRISPR RNA-guided Surveillance Complex, Cascade, Bound to a ssDNA Target. CRISPR system Cascade protein subunits CasA, CasB, CasC, CasD, and CasE (cyan) bound to CRISPR RNA (green) and viral DNA (red) based on PDB 4QYZ and rendered with PyMOL.
Image from Boghog
Cas9 (or "CRISPR-associated protein 9") is an enzyme that uses CRISPR sequences as a guide to recognize and cleave specific strands of DNA that are complementary to the CRISPR sequence. Cas9 enzymes together with CRISPR sequences form the basis of a technology known as CRISPR-Cas9 that can be used to edit genes within organisms. This editing process has a wide variety of applications including basic biological research, development of biotechnology products, and treatment of diseases.
Image by James Atmos
The CRISPR-Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements, such as those present within plasmids and phages, and provides a form of acquired immunity. RNA harboring the spacer sequence helps Cas (CRISPR-associated) proteins recognize and cut foreign pathogenic DNA. Other RNA-guided Cas proteins cut foreign RNA. CRISPR sequences are found in approximately 50% of sequenced bacterial genomes and nearly 90% of sequenced archaeal genomes.
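The guide-directed recognition described above for Cas9 can be sketched as a sequence search. The example below is a simplified model that looks for exact matches to a spacer sequence on either strand of a target DNA. It leaves out features of real Cas9 targeting, such as the requirement for a short adjacent motif (PAM) next to the target and the tolerance of some mismatches. The function names, the 20-base spacer, and the target sequence are all hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_target_sites(dna, spacer):
    """Return (strand, start) for each exact match to the spacer.

    dna is the top strand written 5' to 3'. A '+' hit means the top strand
    carries the spacer sequence (so the bottom strand is complementary to the
    guide RNA); a '-' hit means the bottom strand carries it. start is the
    0-based top-strand coordinate of the leftmost matched base.
    """
    dna = dna.upper()
    spacer = spacer.upper().replace("U", "T")  # accept RNA or DNA letters
    hits = []
    for strand, pattern in (("+", spacer), ("-", reverse_complement(spacer))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((strand, start))
            start = dna.find(pattern, start + 1)
    return hits


# Hypothetical 20-base spacer and a target carrying it once on each strand.
SPACER = "GACGTTACCGGATCAAGCTT"
TARGET_DNA = "TTTT" + SPACER + "AGGCCCC" + reverse_complement(SPACER) + "AA"

print(find_target_sites(TARGET_DNA, SPACER))
```

Searching both strands matters because the target can lie in either orientation; the spacer stored in the CRISPR array supplies the specificity, and the Cas protein supplies the cutting activity.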