Biology LibreTexts

9.3: Cloning and Recombinant Expression


    Learning Goals

    (Learning goals written by Claude, Sonnet 4.6, Anthropic)

    Molecular Cloning: Vectors, Restriction Enzymes, and Selection

    • Describe the essential features of a cloning vector — an origin of replication (ori, such as ColE1) to allow autonomous propagation in the host, a multiple cloning site (MCS) containing unique restriction enzyme recognition sequences for directional insertion of foreign DNA, and a selectable marker (such as the β-lactamase gene conferring ampicillin resistance) to identify transformed cells — and explain why the insert size, copy number, and downstream application determine the choice among plasmids (up to ~15 kb, 500–700 copies/cell for pUC19), bacteriophage λ (up to 53 kb), cosmids (28–45 kb), BACs (up to 350 kb, 1 copy/cell), YACs (>1 Mb), and human artificial chromosomes (no practical upper limit).
    • Explain the molecular mechanism of restriction endonuclease-mediated DNA cloning — describing how Type II restriction enzymes recognize palindromic DNA sequences (4–8 bp) and hydrolyze the phosphodiester backbone in a staggered pattern to generate either 5' or 3' single-stranded overhangs ("sticky ends," e.g., EcoRI: 5'-G↓AATTC-3') or flush cuts ("blunt ends," e.g., SmaI: 5'-CCC↓GGG-3'), how the complementarity of sticky ends generated by the same enzyme allows ligation by DNA ligase to form recombinant molecules, and why sticky-end ligation is directional and specific while blunt-end ligation requires no sequence compatibility.
    • Explain blue-white screening as a strategy for identifying recombinant clones — describing how insertion of foreign DNA into the lacZα fragment within the MCS of pUC19 disrupts α-complementation between the plasmid-encoded lacZα peptide and the chromosomally encoded ω fragment of β-galactosidase, so that bacteria carrying non-recombinant plasmids (intact lacZ) hydrolyze X-gal to produce blue colonies while bacteria carrying recombinant plasmids (disrupted lacZ) form white colonies — and connect this to the broader principle of reporter gene strategies (GFP fusion, luciferase) for identifying successful clones.

    Genetic Engineering, Reproductive Cloning, and CRISPR

    • Describe the key steps in reproductive cloning by somatic cell nuclear transfer (SCNT) — explaining why the enucleated recipient egg cell (not just the donor nucleus) is essential because the egg cytoplasm contains developmental factors required for early embryogenesis, how the donor diploid nucleus is fused with the enucleated egg by electrical stimulation, and how the resulting zygote is implanted in a surrogate — using Dolly the sheep (1996) as the example, noting that Dolly's breed was determined by her nuclear DNA donor (Finn-Dorset), not by the egg donor or surrogate (both Scottish Blackface), and connecting telomere shortening in older donor DNA to potential concerns about lifespan.
    • Explain the CRISPR-Cas9 system — describing its natural origin as a prokaryotic adaptive immune system in which phage-derived spacer sequences are integrated into the bacterial chromosome between palindromic repeats (CRISPRs), transcribed into crRNA, and used by Cas proteins (in the Cascade surveillance complex) to recognize and cleave complementary foreign DNA from re-infecting phages — and explain how this mechanism was repurposed as a genome-editing tool in which a synthetic guide RNA (sgRNA) directs the Cas9 nuclease to create a double-strand break at any genomic location complementary to the 20-nt guide sequence, enabling targeted gene disruption, correction, or insertion with broad applications in biological research, medicine, and biotechnology.

    To clone a gene from an organism and express it in either prokaryotic or eukaryotic cells, DNA from a target source must be isolated, purified, amplified, analyzed, and sequenced as described in previous sections.

    Cloning

    In general, cloning means creating a perfect replica. Typically, the word describes the creation of a genetically identical copy. In biology, re-creating a whole organism is called “reproductive cloning.” Long before attempts were made to clone an entire organism, researchers learned how to copy short stretches of DNA—a process called molecular cloning.

    Molecular cloning enables the creation of multiple copies of genes, their expression, and the study of specific genes. The DNA fragment of interest is first inserted into a cloning vector, which introduces it into a bacterial cell in a form that can be copied or expressed.

    Cloning vector

    A cloning vector is a small piece of DNA that can be stably maintained in an organism and into which a foreign DNA fragment can be inserted for cloning. The cloning vector may be DNA taken from a virus, a higher organism's cell, or a bacterium's plasmid. The vector contains features that allow convenient insertion or removal of a DNA fragment into or from the vector, for example, by treating the vector and the foreign DNA with a restriction enzyme that cuts the DNA. DNA fragments thus generated contain either blunt ends or overhangs known as sticky ends, and vector DNA and foreign DNA with compatible ends can then be joined together by molecular ligation. A DNA fragment cloned into a cloning vector may be further subcloned into another vector designed for more specific use.

    There are many types of cloning vectors, but the most commonly used are genetically engineered plasmids. Cloning is generally performed first in Escherichia coli, and cloning vectors include plasmids, bacteriophages (such as phage λ), cosmids, and bacterial artificial chromosomes (BACs). However, some DNA, such as very large DNA fragments, cannot be stably maintained in E. coli. Other organisms, such as yeast, may be used for these studies. Cloning vectors in yeast include yeast artificial chromosomes (YACs). The common bacterial cloning plasmid, pBR322, is shown in Figure \(\PageIndex{1}\).

    Figure \(\PageIndex{1}\): The cloning vector pBR322. Image by Ayacop and Yikrazuul

    All commonly used cloning vectors in molecular biology have key features necessary for their function, such as a suitable cloning site, restriction enzyme sites, and a selectable marker. Others may have additional features specific to their use. Cloning is often performed using E. coli for ease and convenience. Thus, cloning vectors often have elements necessary for their propagation and maintenance in E. coli, such as a functional origin of replication (ori). The ColE1 origin of replication is found in many plasmids. Some vectors also include elements that allow them to be maintained in another organism in addition to E. coli; these vectors are called shuttle vectors.

    Cloning site

    All cloning vectors have features that allow a gene to be conveniently inserted into or removed from the vector. This may be a multiple cloning site (MCS) or polylinker containing many unique restriction sites. The restriction sites in the MCS are first cleaved by restriction enzymes, and then a PCR-amplified target gene, digested with the same enzymes, is ligated into the vector using DNA ligase. If desired, the target DNA sequence can be inserted into the vector in a specific direction. The cloned fragment may be further subcloned into another vector if necessary.

    Other cloning vectors may use topoisomerase instead of ligase, allowing cloning to be done more rapidly without the need for restriction digestion of the vector or the insert. In this TOPO cloning method, a linearized vector is activated by attaching topoisomerase I to its ends, and this "TOPO-activated" vector may then accept a PCR product by ligating to both 5' ends of the PCR product, releasing the topoisomerase and forming a circular vector in the process. Another method of cloning without using a DNA digest and ligase is DNA recombination, for example, as used in the Gateway cloning system. Once cloned into the cloning vector (called the entry clone in this method), the gene may be conveniently introduced into a variety of expression vectors by recombination.

    Restriction Enzymes

    Restriction enzymes (restriction endonucleases) recognize and predictably cut specific DNA sequences; bacteria naturally produce them as a defense mechanism against foreign DNA.

    As the name implies, restriction endonucleases (or restriction enzymes) are “restricted” in their ability to cut or digest DNA. The restriction that is useful to biochemists is usually a palindromic DNA sequence. A palindrome reads the same forwards and backward. Some examples of palindromes: RACE CAR, CIVIC, A MAN A PLAN A CANAL PANAMA. Because DNA has two complementary strands, a DNA palindrome is defined across both strands: the reverse complement of the sequence is identical to the sequence itself.

    In other words, a palindromic DNA sequence reads the same in the 5′ to 3′ direction on one strand as it does in the 5′ to 3′ direction on the complementary strand. Restriction enzymes often cut DNA in a staggered pattern. When a staggered cut is made in a palindromic sequence, the resulting overhangs are complementary, as shown in Figure \(\PageIndex{2}\).
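
    The palindrome test described above can be checked with a few lines of code. The following is a minimal sketch, not part of the original text: the helper names `reverse_complement` and `is_palindromic_site` are made up for illustration, and the recognition sequences are the standard ones for EcoRI, SmaI, and HindIII.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(seq: str) -> bool:
    """A DNA palindrome equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


for name, site in [
    ("EcoRI", "GAATTC"),
    ("SmaI", "CCCGGG"),
    ("HindIII", "AAGCTT"),
    ("arbitrary 6-mer", "GATTAC"),
]:
    print(f"{name:16s} {site}  palindromic: {is_palindromic_site(site)}")
```

    Note that a DNA palindrome is not a palindrome in the everyday sense: GAATTC read backward as plain letters is CTTAAG, which is the complementary strand, not the same strand.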

    Diagram illustrating three columns with colored segments and arrows indicating direction, likely representing a process or sequence.

    Figure \(\PageIndex{2}\): Restriction Enzyme Recognition Sequences. In this (a) six-nucleotide restriction enzyme recognition site, notice that the sequence of six nucleotides reads the same in the 5′ to 3′ direction on one strand as it does in the 5′ to 3′ direction on the complementary strand. This is known as a palindrome. (b) The restriction enzyme breaks the DNA strands, and (c) the cut in the DNA results in “sticky ends”. Another piece of DNA cut on either end by the same restriction enzyme could attach to these sticky ends and be inserted into the gap made by this cut. http://opentextbc.ca/biology/wp-cont...e_10_01_04.jpg

    Molecular biologists tend to use restriction enzymes that recognize palindromes of 6 or 8 nucleotides. Recognition sites for these 6-cutters and 8-cutters occur rarely in a genome, but often enough to be useful. Figure \(\PageIndex{3}\) shows the recognition sequence and sticky-end cut for HindIII.


    HindIII Restriction site and sticky ends vector

    Figure \(\PageIndex{3}\): Sequence of HindIII sticky-end cuts.
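
    How rarely a 6-cutter or 8-cutter is expected to cut can be estimated with simple arithmetic. The sketch below assumes a random DNA sequence in which all four bases are equally likely, so a specific n-base site is expected about once every 4^n bases. That assumption, the function name `expected_spacing_bp`, and the use of the lambda genome as a yardstick are illustrative choices, not claims from the original text.

```python
def expected_spacing_bp(site_length: int) -> int:
    """Average distance between sites in random DNA with equal base frequencies."""
    return 4 ** site_length


LAMBDA_GENOME_BP = 48_500  # lambda phage genome size quoted in this section (48.5 kb)

for n in (4, 6, 8):
    spacing = expected_spacing_bp(n)
    expected_sites = LAMBDA_GENOME_BP / spacing
    print(f"{n}-cutter: one site per {spacing} bp on average, "
          f"about {expected_sites:.1f} expected sites in lambda DNA")
```

    Under this model, a 6-cutter cuts about once every 4,096 bp and an 8-cutter about once every 65,536 bp. Real genomes are not random sequences, so observed site counts can differ noticeably from these estimates.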

    Figure \(\PageIndex{4}\) shows restriction enzyme cuts that leave sticky or blunt ends.

    EcoRI restriction enzyme recognition site

    SmaI restriction enzyme recognition site

    Figure \(\PageIndex{4}\): Restriction Enzymes. Restriction enzymes recognize palindromic sequences in DNA and hydrolyze covalent phosphodiester bonds of the DNA to leave either “sticky/cohesive” or “blunt” ends. This distinction in cutting is important because an EcoRI sticky end can base-pair with the sticky end of another piece of DNA cut with the same enzyme, allowing the two pieces to be ligated together. While endonucleases cut DNA, ligases join the pieces back together. DNA digested with EcoRI can be ligated back together with another piece of DNA digested with EcoRI, but not with a piece digested with SmaI. Another blunt cutter is EcoRV, with a recognition sequence of GAT | ATC.
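
    The difference between sticky and blunt cutters can be made concrete by modeling each enzyme as a recognition sequence plus a cut position on the top strand. This is a simplified sketch: the `Enzyme` class and the helper functions are hypothetical names introduced here, only palindromic sites are handled, and only the top strand of a linear molecule is tracked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str        # recognition sequence, top strand, 5'->3'
    cut_offset: int  # top strand is cut after this many bases of the site


ECORI = Enzyme("EcoRI", "GAATTC", 1)  # G|AATTC
SMAI = Enzyme("SmaI", "CCCGGG", 3)    # CCC|GGG
ECORV = Enzyme("EcoRV", "GATATC", 3)  # GAT|ATC


def end_type(enzyme: Enzyme) -> str:
    """For a palindromic site, the bottom-strand cut mirrors the top-strand cut."""
    top = enzyme.cut_offset
    bottom = len(enzyme.site) - top
    if top == bottom:
        return "blunt"
    return "5' overhang" if top < bottom else "3' overhang"


def overhang(enzyme: Enzyme) -> str:
    """Single-stranded bases left between the two cuts ('' for a blunt cutter)."""
    lo, hi = sorted((enzyme.cut_offset, len(enzyme.site) - enzyme.cut_offset))
    return enzyme.site[lo:hi]


def digest(seq: str, enzyme: Enzyme) -> list[str]:
    """Cut a linear top strand at every occurrence of the recognition site."""
    fragments, start = [], 0
    pos = seq.find(enzyme.site)
    while pos != -1:
        cut = pos + enzyme.cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(enzyme.site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def compatible(a: Enzyme, b: Enzyme) -> bool:
    """Ends can be ligated if both are blunt or both carry the same overhang."""
    return end_type(a) == end_type(b) and overhang(a) == overhang(b)


dna = "TTGAATTCAACCCGGGTT"  # made-up sequence with one EcoRI and one SmaI site
for enz in (ECORI, SMAI):
    print(enz.name, end_type(enz), repr(overhang(enz)), digest(dna, enz))
print("EcoRI + EcoRI ligatable:", compatible(ECORI, ECORI))
print("EcoRI + SmaI ligatable: ", compatible(ECORI, SMAI))
print("SmaI + EcoRV ligatable: ", compatible(SMAI, ECORV))
```

    In this model, EcoRI leaves a four-base AATT overhang, while SmaI and EcoRV leave none. Any two blunt ends count as compatible, which reflects the fact that blunt-end ligation requires no sequence complementarity.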

    Selectable marker

    The vector carries a selectable marker to allow the selection of positively transformed cells. Antibiotic resistance is often used as a marker; for example, the beta-lactamase gene confers resistance to the penicillin group of beta-lactam antibiotics, such as ampicillin. Some vectors contain two selectable markers. For example, the plasmid pACYC177 has both ampicillin and kanamycin resistance genes. Shuttle vectors designed to be maintained in two organisms may also require two selectable markers. However, some selectable markers, such as resistance to zeocin and hygromycin B, are effective in different cell types. Auxotrophic selection markers that allow an auxotrophic organism to grow in a minimal medium may also be used; examples include LEU2 and URA3, which are used with their corresponding auxotrophic yeast strains.

    Another type of selectable marker enables the positive selection of plasmids carrying cloned genes. This may involve using a gene lethal to the host cells, such as barnase, CcdB, or the parD/parE toxins. The lethal gene is disrupted or removed during the cloning process, so unsuccessful clones, in which the lethal gene remains intact, kill their host cells. Therefore, only successful clones survive and are selected.

    Reporter genes

    Reporter genes are used in some cloning vectors to facilitate screening for successful clones by leveraging their features, which allow them to be easily identified. Such features in cloning vectors may include the lacZα fragment for α-complementation in the blue-white selection and/or marker or reporter genes in frame with and flanking the MCS to facilitate the production of fusion proteins. Examples of fusion partners that may be used for screening are the green fluorescent protein (GFP) and luciferase. Figure \(\PageIndex{5}\) shows such a construct with GFP.

    Diagram showing a regulatory sequence linked to a reporter gene, illustrating DNA to mRNA translation and protein expression measurement.

    Figure \(\PageIndex{5}\): Reporter Genes. In this diagram, the green fluorescence protein is used as a reporter gene to study upstream regulatory sequences. Image by TransControl

    Elements for expression

    If the target gene is to be expressed, the cloning vector also needs to contain suitable elements for expressing the cloned gene, including a promoter and a ribosomal binding site (RBS). The target DNA may be inserted into a site under the control of a particular promoter necessary to express the target gene in the chosen host. Where a promoter is present, the gene's expression is preferably tightly controlled and inducible, so that proteins are produced only when required. Commonly used promoters include the T7 and lac promoters. A promoter is also required when screening techniques such as blue-white selection are used.

    Cloning vectors without a promoter and RBS for the cloned DNA sequence are sometimes used, for example, when cloning genes whose products are toxic to E. coli cells. The promoter and RBS for the cloned DNA sequence are also unnecessary when making a genomic or cDNA library, since the cloned genes are normally subcloned into a more appropriate expression vector if their expression is required.

    Types of cloning vectors

    Many cloning vectors are available, and choosing the right vector may depend on factors such as insert size, copy number, and cloning method. Large DNA inserts may not be stably maintained in a general cloning vector, especially in high-copy-number vectors. Cloning large fragments may require a more specialized cloning vector.

    Plasmids

    Plasmids are autonomously replicating circular extra-chromosomal DNA. They are the standard cloning vectors and the ones most commonly used. Most general plasmids may be used to clone DNA inserts up to 15 kb in size. Many plasmids have high copy numbers. For example, pUC19 has a copy number of 500-700 copies per cell, and a high copy number is useful as it produces a greater yield of recombinant plasmid for subsequent manipulation. However, low-copy-number plasmids may be preferred in certain circumstances, such as when the protein encoded by the cloned gene is toxic to cells.

    Bacteriophage

    The bacteriophages most commonly used for cloning are the lambda (λ) phage and the M13 phage. There is an upper limit to the amount of DNA that can be packed into a phage (53 kb). The wild-type lambda phage genome is about 48.5 kb. Therefore, to allow foreign DNA to be inserted into phage DNA, phage cloning vectors may need to have some of their non-essential genes deleted to make room for the foreign DNA. Figure \(\PageIndex{6}\) shows the phage sequence and cartoon structure.

    There is also a lower limit on the size of DNA that can be packed into a phage. This property can be used for selection: vectors without inserts may be too small to be packaged, so only vectors with inserts are propagated.
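
    The need to delete non-essential phage genes follows from simple arithmetic on the two sizes quoted above. The sketch below uses the 53 kb packaging limit and the 48.5 kb genome size from the text. The function name `max_insert_kb` and the 15 kb deletion used in the example are hypothetical values chosen only for illustration.

```python
PACKAGING_LIMIT_KB = 53.0  # upper limit on packaged DNA, from the text
LAMBDA_GENOME_KB = 48.5    # lambda genome size, from the text


def max_insert_kb(deleted_kb: float = 0.0) -> float:
    """Room left for foreign DNA after removing `deleted_kb` of non-essential phage DNA."""
    vector_kb = LAMBDA_GENOME_KB - deleted_kb
    return PACKAGING_LIMIT_KB - vector_kb


print("No deletion:    ", max_insert_kb(), "kb of foreign DNA fits")
print("15 kb deleted*: ", max_insert_kb(15.0), "kb of foreign DNA fits")
# * 15 kb is an illustrative figure, not a value given in the text.
```

    With an intact genome, only about 4.5 kb of extra DNA fits inside the particle, which is why practical lambda vectors give up non-essential regions to gain capacity.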

    Diagram of λ phage genetic structure with labeled parts, alongside an electron micrograph of the phage particle.

    Figure \(\PageIndex{6}\): Lambda Phage. (A) Schematic representation of the circular genome of the lambda phage, (B) diagram of the lambda phage infectious particle, and (C) electron micrograph of the related bacteriophage, vibriophage VvAWI. The bar denotes 50 nm in length. Images A and C are modified from Nigro, O, Culley, A., and Steward, G.F. (2012) Standards in Genomic Science 6(3):415-26, and image B is from Jack Potte

    Cosmids

    Cosmids are plasmids that incorporate a segment of bacteriophage λ DNA containing the cohesive end sites (cos), which carry the elements required for packaging DNA into λ particles. They are typically used to clone large DNA fragments ranging from 28 to 45 kb.

    Bacterial artificial chromosome

    Inserts up to 350 kb can be cloned into a bacterial artificial chromosome (BAC). BACs are maintained in E. coli with a copy number of only 1 per cell. BACs have often been used to sequence the genomes of organisms in genome projects, including the Human Genome Project. A short piece of the organism's DNA is amplified as an insert in BACs and then sequenced. Finally, the sequenced parts are rearranged in silico to yield the organism's genomic sequence. BACs have largely been replaced in this capacity by faster, less labor-intensive sequencing methods such as whole-genome shotgun sequencing and, more recently, next-generation sequencing.

    Yeast artificial chromosome

    Yeast artificial chromosomes are used as vectors to clone DNA fragments larger than 1 megabase (1 Mb = 1,000 kb = 1,000,000 base pairs). They are useful for cloning larger DNA fragments, as required for genome mapping, such as in the Human Genome Project. They contain a telomeric sequence and an autonomously replicating sequence (features required for replicating linear chromosomes in yeast cells). These vectors also contain suitable restriction sites for cloning foreign DNA and genes that serve as selectable markers.

    Human artificial chromosome

    Human artificial chromosomes may be useful as gene transfer vectors for gene delivery into human cells and as tools for expression studies and for determining human chromosome function. They can carry very large DNA fragments (there is no practical upper limit). Therefore, they do not have the limited cloning capacity of other vectors, and they also avoid the possible insertional mutagenesis caused when a viral vector integrates into host chromosomes.

    Viruses that infect plant or animal cells have also been manipulated to introduce foreign genes into these cells. The natural ability of viruses to bind to cells, introduce their DNA, and replicate has made them ideal vehicles for transferring foreign DNA into eukaryotic cells in culture. A vector based on Simian virus 40 (SV40) was used in the first mammalian cloning experiment. Several vectors based on viruses, such as adenoviruses and papillomaviruses, have been used to clone genes in mammals. At present, retroviral vectors are popular for cloning genes in mammalian cells. In plant transformation, viruses, including cauliflower mosaic virus, tobacco mosaic virus, and geminiviruses, have been used with some success.
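
    The insert-size figures quoted in this section can be collected into a small lookup that suggests candidate vectors for a given insert. This is only a rough sketch: the table `CAPACITY_KB` and the function `candidate_vectors` are hypothetical names, the ranges are the ones stated in the text (with the YAC figure of "larger than 1 Mb" treated as a lower bound purely for illustration), and real vector choice also depends on copy number, host, and downstream use.

```python
import math

# (minimum, maximum) insert size in kb, taken from the figures quoted in the text.
CAPACITY_KB = {
    "plasmid": (0, 15),
    "cosmid": (28, 45),
    "BAC": (0, 350),
    "YAC": (1000, math.inf),  # text: fragments larger than 1 Mb (simplified to a lower bound)
    "HAC": (0, math.inf),     # text: no practical upper limit
}


def candidate_vectors(insert_kb: float) -> list[str]:
    """Return vector types whose quoted size range includes the insert."""
    return [name for name, (lo, hi) in CAPACITY_KB.items() if lo <= insert_kb <= hi]


for size in (5, 40, 200, 2000):
    print(f"{size:>5} kb insert -> {candidate_vectors(size)}")
```

    Several vectors usually qualify for a given size, so the final choice falls to the other factors named in the text, such as copy number and cloning method.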

    Summary of DNA Cloning

    Figure \(\PageIndex{7}\) summarizes the basic cloning methods most widely used in biochemistry laboratories. Foreign DNA is isolated or amplified using PCR to obtain enough material for cloning. The DNA is purified and cut with restriction enzymes, and then mixed with a vector cut with the same restriction enzymes. The DNA can then be stitched back together with DNA ligase. The DNA can then be transformed into a host system, often bacteria, to grow large quantities of the plasmid containing the cloned DNA.
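
    The cut-and-ligate steps in this summary can be followed at the sequence level. The sketch below is a toy model under several assumptions: the vector and source sequences are invented, only the top strand is tracked, the circular plasmid is written as a linear string whose ends are joined, and EcoRI (G|AATTC) is used for both cuts. All function names are hypothetical.

```python
SITE = "GAATTC"  # EcoRI recognition sequence
CUT = 1          # EcoRI cuts the top strand after the first G


def cut_positions(seq: str) -> list[int]:
    """Top-strand cut positions for every EcoRI site in a linear string."""
    positions, pos = [], seq.find(SITE)
    while pos != -1:
        positions.append(pos + CUT)
        pos = seq.find(SITE, pos + 1)
    return positions


def linearize(plasmid: str) -> str:
    """Open a circular plasmid at its single EcoRI site (site must not span the string ends)."""
    cuts = cut_positions(plasmid)
    if len(cuts) != 1:
        raise ValueError("expected exactly one EcoRI site in the vector")
    c = cuts[0]
    return plasmid[c:] + plasmid[:c]  # begins with AATTC..., ends with ...G


def excise(source: str) -> str:
    """Fragment released between the first two EcoRI cuts in linear source DNA."""
    cuts = cut_positions(source)
    if len(cuts) < 2:
        raise ValueError("need two EcoRI sites flanking the insert")
    return source[cuts[0]:cuts[1]]


def ligate(linear_vector: str, fragment: str) -> str:
    """Join the compatible ends; the result is circular, written as a linear string."""
    return linear_vector + fragment


def count_sites_circular(plasmid: str) -> int:
    """Count EcoRI sites, including one that spans the junction of the string ends."""
    return (plasmid + plasmid[: len(SITE) - 1]).count(SITE)


vector = "ATGCGAATTCTTAGC"            # made-up plasmid with one EcoRI site
source = "CCGAATTCAAATTTGGGGAATTCCC"  # made-up DNA with an insert flanked by EcoRI sites

fragment = excise(source)
recombinant = ligate(linearize(vector), fragment)
print("fragment:   ", fragment)
print("recombinant:", recombinant)
print("EcoRI sites in recombinant:", count_sites_circular(recombinant))
```

    Both EcoRI sites are regenerated at the junctions, so the insert can later be cut back out with the same enzyme. Because both ends of the fragment carry the same overhang, it can also ligate in the opposite orientation, which is why directional cloning uses two different enzymes.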

    Restriction fragment patterning and DNA sequencing can be used to validate the cloned material. For a Video Tutorial on DNA cloning, visit HHMI - BioInteractive.

    An illustration showing the steps in creating recombinant DNA plasmids, inserting them into bacteria, and then selecting only the bacteria that have successfully taken up the recombinant plasmid. The steps are as follows: both foreign DNA and a plasmid are cut with the same restriction enzyme. The restriction site occurs only once in the plasmid in the middle of a gene for an enzyme (lacZ). The restriction enzyme leaves complementary sticky ends on the foreign DNA fragment and the plasmid. This allows the foreign DNA to be inserted into the plasmid when the sticky ends anneal. Adding DNA ligase reattaches the DNA backbones. These are recombinant plasmids. The plasmids are combined with a culture of living bacteria. Many of the bacteria do not take any plasmids into their cells, many take plasmids that do not have the foreign DNA in them, and a few take up the recombinant plasmid. The bacteria that take up the recombinant plasmid cannot make the enzyme from the gene that the fragment was inserted into (lacZ). They also carry a gene for resistance to the antibiotic ampicillin, which was on the original plasmid. To find the bacteria with the recombinant plasmid, the bacteria are grown on a plate with the antibiotic ampicillin and a substance that changes color when exposed to the enzyme produced by the lacZ gene. The ampicillin will kill any bacteria that did not take up a plasmid. The color of the substance will not change when the gene for lacZ contains the foreign DNA insert. These are the bacteria with the recombinant plasmid that we want to grow.
    Figure \(\PageIndex{7}\): Diagram Showing the Major Steps in Cloning.
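
    The selection and screening logic in Figure \(\PageIndex{7}\) reduces to a small decision table. The following sketch assumes the plate contains ampicillin and a color-changing lacZ substrate such as X-gal, as in the figure description. The function name `colony_phenotype` is made up for illustration.

```python
def colony_phenotype(has_plasmid: bool, has_insert: bool) -> str:
    """Predict what a transformed cell does on an ampicillin + X-gal plate."""
    if not has_plasmid:
        return "no growth"  # no ampicillin-resistance gene, so the cell is killed
    if has_insert:
        return "white"      # insert disrupts lacZ, so the substrate is not cleaved
    return "blue"           # intact lacZ, so the substrate changes color


for has_plasmid, has_insert in [(False, False), (True, False), (True, True)]:
    print(f"plasmid={has_plasmid!s:5} insert={has_insert!s:5} -> "
          f"{colony_phenotype(has_plasmid, has_insert)}")
```

    Ampicillin performs the selection (only plasmid-bearing cells grow), while colony color performs the screen (white colonies carry the recombinant plasmid).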

    Plasmids with foreign DNA inserted into them are called recombinant DNA molecules because they contain new combinations of genetic material. Proteins that are produced from recombinant DNA molecules are called recombinant proteins. Not all recombinant plasmids can express genes. Plasmids may also be engineered to express proteins only when stimulated by specific environmental factors, allowing scientists to control the expression of recombinant proteins.

    Reproductive Cloning

    Reproductive cloning is a method used to create an identical copy of an entire multicellular organism. Most multicellular organisms reproduce sexually, which involves the contribution of DNA from two individuals (parents), making it impossible to generate an identical copy, or clone, of either parent. Recent advances in biotechnology have enabled the reproductive cloning of mammals in the laboratory.

    Natural sexual reproduction involves the union, during fertilization, of a sperm and an egg. Each of these gametes is haploid, meaning it contains one set of chromosomes in its nucleus. The resulting cell, or zygote, is then diploid and contains two sets of chromosomes. This cell divides mitotically to produce a multicellular organism. However, the union of any two cells cannot produce a viable zygote; the cytoplasm of the egg cell contains components essential for the embryo's early development during its first few cell divisions. Without these provisions, there would be no subsequent development. Therefore, a diploid genetic complement and an egg cytoplasm are required to produce a new individual. The approach to producing an artificially cloned individual is to take the egg cell of one individual and remove the haploid nucleus. Then, a diploid nucleus from the body cell of a second individual, the donor, is put into the egg cell. The egg is then stimulated to divide, allowing development to proceed. This sounds simple, but it takes many attempts to complete each step successfully.

    The first cloned agricultural animal was Dolly, a sheep that was born in 1996 (see Figure \(\PageIndex{8}\) below). The success rate of reproductive cloning at the time was very low. Dolly lived for six years and died of a lung tumor. There was speculation that because the cell DNA that gave rise to Dolly came from an older individual, the age of the DNA may have affected her life expectancy. Since Dolly, several species of animals (such as horses, bulls, and goats) have been successfully cloned.

    There have been attempts at producing cloned human embryos as sources of embryonic stem cells. In the procedure, the DNA from an adult human is introduced into a human egg cell, which is then stimulated to divide. The technology is similar to the technology used to produce Dolly, but the embryo is never implanted into a surrogate mother. The cells produced are called embryonic stem cells because they can develop into many different kinds of cells, such as muscle or nerve cells. The stem cells could be used for research and provide therapeutic applications, such as replacing damaged tissues. The benefit of cloning in this instance is that the cells used to regenerate new tissues would be a perfect match to the donor of the original DNA. For example, a leukemia patient would not require a sibling with a tissue match for a bone marrow transplant.

    The illustration shows the steps in cloning the sheep named Dolly. An enucleated egg cell from one sheep is fused with a mammary cell from another sheep. This fused cell then divides to the blastocyst stage and is placed in the uterus of the surrogate ewe, where it develops into the lamb, Dolly. Dolly is the genetic clone of the mammary cell donor.
    Figure \(\PageIndex{8}\): Cloning of Dolly, the first agricultural animal to be cloned.

    To create Dolly, the nucleus of a donor egg cell was removed. The enucleated egg was placed next to the donor body cell, and an electrical shock was applied to fuse the two cells. A second shock was then applied to start cell division. The cells were allowed to divide for several days until an early embryonic stage was reached, and the embryo was then implanted into a surrogate mother.

    Why was Dolly a Finn-Dorset and not a Scottish Blackface sheep?

    Because even though the egg cell came from a Scottish Blackface sheep and the surrogate mother was a Scottish Blackface, the nuclear DNA came from a Finn-Dorset.

    Genetic Engineering

    Genetic engineering uses recombinant DNA technology to modify an organism’s DNA to achieve desirable traits. Adding foreign DNA via recombinant DNA vectors generated by molecular cloning is the most common method of genetic engineering. An organism that receives the recombinant DNA is called a genetically modified organism (GMO). The host organism is called transgenic if the foreign DNA introduced is from a different species. Bacteria, plants, and animals have been genetically modified since the early 1970s for academic, medical, agricultural, and industrial purposes.

    Watch this short video explaining how scientists create a transgenic animal.

    Although the classic methods of studying gene function began with a given phenotype and determined the genetic basis of that phenotype, modern techniques allow researchers to start at the DNA sequence level and ask: “What does this gene or DNA element do?” This technique, called reverse genetics, reverses the classical genetic methodology. One example of this method is analogous to damaging a body part to determine its function. An insect that loses a wing cannot fly, meaning the wing’s function is flight. The classic genetic method compares flying and non-flying insects and observes that the non-flying insects have lost their wings. Similarly, in a reverse genetics approach, mutating or deleting genes can provide clues about gene function. Alternatively, reverse genetics can be used to overexpress a gene to determine which phenotypic effects may occur.

    CRISPR Technology

    CRISPR stands for clustered regularly interspaced short palindromic repeats and represents a family of DNA sequences found within the genomes of prokaryotic organisms such as bacteria and archaea. These sequences are derived from DNA fragments of bacteriophages that have previously infected the prokaryote and are used to detect and destroy DNA from similar phages during subsequent infections. Hence, these sequences play a key role in the prokaryotes' antiviral defense system. Figure \(\PageIndex{9}\) shows the crystal structure of a CRISPR RNA-guided surveillance complex, Cascade, bound to a ssDNA target.

    3D molecular structure shown with red, green, and cyan representations of different components.

    Figure \(\PageIndex{9}\): Crystal structure of a CRISPR RNA-guided surveillance complex, Cascade, bound to a ssDNA target. CRISPR system Cascade protein subunits CasA, CasB, CasC, CasD, and CasE (cyan) bound to CRISPR RNA (green) and viral DNA (red) based on PDB 4QYZ and rendered with PyMOL. Image from Boghog

    Cas9 (or "CRISPR-associated protein 9") is an enzyme that uses CRISPR sequences as a guide to recognize and cleave specific strands of DNA that are complementary to the CRISPR sequence. Cas9 enzymes and CRISPR sequences form the basis of a technology known as CRISPR-Cas9 that can be used to edit genes within organisms. This editing process has a wide range of applications, including basic biological research, the development of biotechnology products, and the treatment of diseases. Figure \(\PageIndex{10}\) shows a diagram of the CRISPR prokaryotic antiviral defense mechanism.

    Diagram illustrating a cellular signaling pathway involving GTPases, RNA transcription, and protein synthesis.

    Figure \(\PageIndex{10}\): Diagram of the CRISPR prokaryotic antiviral defense mechanism. Image by James Atmos

    The CRISPR-Cas system is a prokaryotic immune system that confers resistance to foreign genetic elements, such as those carried by plasmids and phages, providing acquired immunity. RNA harboring the spacer sequence helps Cas (CRISPR-associated) proteins recognize and cut foreign pathogenic DNA. Other RNA-guided Cas proteins cut foreign RNA. CRISPR is found in approximately 50% of sequenced bacterial genomes and nearly 90% of sequenced archaea.

    Summary

    (Summary written by Claude, Sonnet 4.6, Anthropic)

    This chapter introduces the molecular tools that allow biochemists to isolate, copy, insert, propagate, and edit specific DNA sequences — the technical foundation of modern recombinant DNA technology, gene expression studies, biotechnology, and genome editing.

    Molecular cloning is the process of inserting a specific DNA fragment into a cloning vector and propagating it in a host organism to produce many identical copies. The fragment of interest is typically prepared by PCR amplification or restriction digestion, and the cloning vector is a small, well-characterized DNA molecule capable of autonomous replication in the host (most commonly E. coli). All cloning vectors share three essential features: (1) an origin of replication (ori) — such as the ColE1 ori found in many plasmids — that allows the vector to replicate independently of the host chromosome; (2) a multiple cloning site (MCS) containing a series of unique restriction endonuclease recognition sequences that allow directional insertion of foreign DNA; and (3) a selectable marker — most commonly a gene encoding antibiotic resistance (β-lactamase for ampicillin, neomycin phosphotransferase for kanamycin) — that allows only cells harboring the plasmid to grow on selective medium. Many vectors additionally contain reporter genes (lacZα for blue-white selection, GFP, or luciferase) to distinguish recombinant (insert-containing) from non-recombinant (self-ligated) vectors.

    Restriction endonucleases are the molecular scissors of cloning. Naturally produced by bacteria as a defense against bacteriophage DNA (while host DNA is protected by methyltransferases that methylate the same sequences), Type II restriction enzymes recognize specific palindromic sequences (reading the same 5'→3' on both strands) and hydrolyze a phosphodiester bond in each strand within or adjacent to the recognition site. EcoRI (5'-G↓AATTC-3') generates 5' four-nucleotide overhangs ("sticky" or "cohesive" ends) that can anneal to compatible ends on any DNA cut with the same enzyme; SmaI (5'-CCC↓GGG-3') generates blunt ends. The complementary single-stranded overhangs generated by the same enzyme spontaneously anneal through Watson-Crick hydrogen bonding and are covalently joined by DNA ligase (which forms phosphodiester bonds between adjacent 3'-OH and 5'-phosphate termini). Sticky-end ligations are efficient and specific to matching overhangs, and they are directional when the two ends of the insert are cut with different enzymes; blunt-end ligations are non-directional and less efficient but allow combination of fragments from incompatible enzyme digests. After ligation, the recombinant plasmid is introduced into competent E. coli cells by transformation (chemical treatment or electroporation), and transformed cells are selected on antibiotic-containing agar plates. Blue-white screening identifies recombinant clones: insertion of foreign DNA into the lacZα coding sequence within the MCS disrupts α-complementation with the host's ω fragment of β-galactosidase, so non-recombinant colonies (intact lacZα, active β-galactosidase, blue X-gal hydrolysis product) are blue, while recombinant colonies (disrupted lacZα, no active β-galactosidase) are white. Recombinant clones are verified by restriction mapping and DNA sequencing.
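
    The recognition and cutting rules described above can be sketched in a few lines of Python. This is a minimal illustration, not a full restriction-mapping tool: it works on the top strand of a linear sequence only, and the function names (reverse_complement, is_palindromic, overhang, digest) and the cut_offset parameter are hypothetical helpers introduced here. The EcoRI and SmaI sites and cut positions are the ones given in the text.

```python
# Minimal sketch of Type II restriction digestion on the top strand of a
# linear DNA sequence. All helper names are illustrative assumptions.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


def overhang(site: str, cut_offset: int) -> str:
    """Single-stranded overhang left by a symmetric cut.

    cut_offset is the number of bases of the site lying 5' of the cut on
    each strand. Assumes cut_offset <= len(site) // 2; an empty string
    means the enzyme leaves blunt ends.
    """
    return site[cut_offset:len(site) - cut_offset]


def digest(seq: str, site: str, cut_offset: int) -> list:
    """Split the top strand at every occurrence of the recognition site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Recognition site and top-strand cut position, as given in the text.
ECORI = ("GAATTC", 1)  # 5'-G^AATTC-3'
SMAI = ("CCCGGG", 3)   # 5'-CCC^GGG-3'

if __name__ == "__main__":
    print(is_palindromic(ECORI[0]))       # both strands read GAATTC
    print(overhang(*ECORI))               # four-nucleotide sticky end
    print(overhang(*SMAI) == "")          # blunt end
    print(digest("AAGAATTCTT", *ECORI))
```

    Two fragments can be joined by sticky-end ligation only when their overhangs are complementary, which is why comparing the value returned by overhang for two enzymes is a quick compatibility check in this sketch.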

    The choice of cloning vector depends on the insert size, the required copy number, and the downstream application. Standard plasmids (up to ~15 kb inserts, 10–700 copies/cell) are used for routine gene cloning and expression. Bacteriophage λ vectors (up to 53 kb total, requiring deletion of non-essential phage genes to accommodate the insert) are used for genomic library construction. Cosmids (28–45 kb inserts, containing phage cos sites for packaging but propagated as plasmids) bridge plasmid and phage properties. Bacterial artificial chromosomes (BACs, up to 350 kb, 1 copy/cell, very stable) were central to the Human Genome Project's clone-by-clone sequencing strategy. Yeast artificial chromosomes (YACs, >1 Mb, containing telomeres, centromeres, and autonomously replicating sequences required for linear chromosome maintenance in yeast) allow cloning of very large genomic regions. Human artificial chromosomes have no practical size limit and avoid insertional mutagenesis by remaining episomal, making them potentially useful for gene therapy. Viral vectors (adenoviruses, retroviruses, and lentiviruses for mammalian cells; cauliflower mosaic virus, tobacco mosaic virus, and geminiviruses for plant cells) exploit the natural viral infection machinery to deliver foreign genes into eukaryotic cells.
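
    As a rough illustration of how insert size narrows the choice of vector, the capacities quoted above can be placed in a small lookup table. This is only a sketch: the function name suggest_vectors and the table layout are hypothetical, the YAC entry is given no upper bound because the text states only ">1 Mb", and λ, viral, and human artificial chromosome vectors are omitted. A real choice would also weigh copy number and downstream application, as noted above.

```python
# Sketch: which vector classes from the paragraph above could hold an insert
# of a given size. Capacities (in kb) are taken from the text; None means the
# text gives no bound on that side.

VECTOR_CAPACITY_KB = [
    # (name, minimum insert, maximum insert)
    ("plasmid", None, 15),
    ("cosmid", 28, 45),
    ("BAC", None, 350),
    ("YAC", None, None),  # assumption: no upper bound stated in the text
]


def suggest_vectors(insert_kb: float) -> list:
    """Return the vector classes whose stated size range covers the insert."""
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    matches = []
    for name, low, high in VECTOR_CAPACITY_KB:
        if low is not None and insert_kb < low:
            continue
        if high is not None and insert_kb > high:
            continue
        matches.append(name)
    return matches


if __name__ == "__main__":
    for size in (10, 40, 200, 500):
        print(size, "kb:", suggest_vectors(size))
```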

    Reproductive cloning by somatic cell nuclear transfer (SCNT) demonstrated that the nucleus of a somatic cell retains the full developmental program of the organism. The procedure involves removing the haploid nucleus from a recipient egg cell (enucleation), inserting a diploid somatic nucleus from the donor individual into the enucleated egg, and stimulating division of the reconstructed cell by electrical or chemical treatment. The resulting embryo is implanted into a surrogate mother and develops into an organism genetically identical to the nuclear donor. Dolly the sheep (1996, the first cloned mammal from an adult somatic cell) was a Finn-Dorset sheep because her genetic complement came entirely from the Finn-Dorset mammary cell donor nucleus — the Scottish Blackface egg donor contributed only cytoplasm, and the Scottish Blackface surrogate contributed only the gestational environment. The egg cytoplasm is essential: it contains reprogramming factors (transcription factors, chromatin remodeling enzymes) that reset the somatic nucleus to a totipotent state capable of directing embryogenesis. Dolly's early death (6 years, due to a lung tumor) raised questions about whether telomere shortening in older somatic cell DNA compromises the health and lifespan of cloned animals. Therapeutic cloning — producing cloned embryos as sources of patient-matched embryonic stem cells for tissue regeneration, without implantation — offers potential medical applications while avoiding immune rejection.

    CRISPR-Cas9 genome editing, arguably the most transformative molecular biology technology of the past decade, is based on a prokaryotic adaptive immune system. Bacteria and archaea (~50% and ~90% of sequenced genomes, respectively) incorporate short sequences (spacers) derived from previously encountered phage DNA into their chromosomes between palindromic repeat sequences (the CRISPRs). These spacers are transcribed into CRISPR RNA (crRNA), which associates with Cas proteins (the Cascade surveillance complex in type I systems, or Cas9 in type II) to form a crRNA-guided nuclease complex. When a matching phage infects again, the complex recognizes the complementary foreign DNA sequence (adjacent to a short protospacer adjacent motif, PAM, in type II systems) and Cas9 introduces a double-strand break, destroying the phage DNA. For genome-editing applications, researchers design a synthetic single-guide RNA (sgRNA), a chimeric RNA that combines the crRNA spacer sequence with the trans-activating crRNA (tracrRNA), which directs Cas9 to any genomic target complementary to the 20-nucleotide guide sequence. Cas9 introduces a double-strand break at the target site, which can be repaired by error-prone non-homologous end joining (NHEJ, causing insertions/deletions that disrupt gene function) or by homology-directed repair (HDR, enabling precise sequence replacement if a homologous donor template is provided). CRISPR-Cas9 is being applied in basic research (gene knockouts to probe function, "reverse genetics"), agricultural biotechnology (disease-resistant crops, improved livestock), and clinical medicine (treatment of sickle cell disease and β-thalassemia by reactivating fetal hemoglobin expression, as in Casgevy), representing an unprecedented precision tool for reading, writing, and editing the genome.
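
    The targeting rule described above (a 20-nucleotide sequence lying next to a PAM) can be sketched as a simple sequence scan. The sketch below assumes the 5'-NGG-3' PAM of the commonly used Streptococcus pyogenes Cas9, which is not stated in the text, and it scans only the top strand. The function name find_cas9_targets is a hypothetical helper, and no scoring of off-target matches is attempted.

```python
import re

# Sketch: list candidate Cas9 target sites on the top strand of a sequence.
# Assumption (not from the text): the PAM is 5'-NGG-3', as for S. pyogenes
# Cas9, and lies immediately 3' of the 20-nt protospacer.


def find_cas9_targets(seq: str, guide_len: int = 20) -> list:
    """Return (start, protospacer, pam) for each protospacer followed by NGG."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    example = "ACGT" * 5 + "CGG" + "TT"
    for start, protospacer, pam in find_cas9_targets(example):
        print(start, protospacer, pam)
```

    A complete design tool would also scan the reverse complement and check the rest of the genome for near matches; both steps are left out here to keep the example short.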


    This page titled 9.3: Cloning and Recombinant Expression is shared under a not declared license and was authored, remixed, and/or curated by Henry Jakubowski and Patricia Flatt.