15.4: Genomic Libraries
A tube full of recombinant bacteriophage is basically a genomic library. Each phage DNA molecule should contain a fragment of foreign cellular DNA. A good genomic library will contain a representation of all possible fragments of an organism’s genome. Bacteriophage are often used to clone genomic DNA fragments because phage genomes are bigger than plasmids and can be engineered to remove large amounts of DNA that are not needed for infection and replication in host cells. The missing DNA can then be replaced by large foreign DNA inserts: fragments as long as 18-20 Kbp, nearly twenty times longer than cDNA inserts in plasmids. Purified phage coat proteins can then be mixed with the recombined phage DNA to make infectious phage particles (i.e., recombinant phage). Infection of host bacteria by these “particles” leads to replication of the recombinant phage DNA, new phage production, cell lysis, and the release of lots of new recombinant phage.
Consider the following bit of math: A typical mammalian genome consists of more than 2,000,000,000 bp (3,200,000,000 in humans!). Inserts in plasmids are very short, rarely exceeding 1,000 bp. If you divide two billion by one thousand, you get two million, the minimum number of recombinant plasmid clones that must be screened to find a sequence of interest. In truth, you would need many more clones than this to find the smaller parts of a gene that add up to a whole gene! Of course, part of the solution to this “needle in a haystack” dilemma is to clone larger DNA inserts in more accommodating vectors. That is the value of a bacteriophage vector, and of even more accommodating vectors, including whole chromosomes! Consider the Yeast Artificial Chromosome (YAC), hosted by (replicated in) yeast cells.
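The arithmetic above can be written out as a short worked estimate. Here \(G\) is the genome size and \(I\) the average insert size (both symbols are introduced only for this illustration), and the minimum assumes, unrealistically, that the inserts tile the genome end to end with no overlaps or gaps:

\[ N_{\min} = \frac{G}{I} = \frac{2 \times 10^{9}\ \text{bp}}{1 \times 10^{3}\ \text{bp}} = 2 \times 10^{6} \]

Because clones are picked at random, a commonly used estimate (usually attributed to Clarke and Carbon, and not part of the discussion above) for the number of clones needed to include any given sequence with probability \(P\) is:

\[ N = \frac{\ln(1 - P)}{\ln\left(1 - \frac{I}{G}\right)} \]

For \(P = 0.99\) and the same plasmid numbers, this gives roughly \(N \approx 4.6 \times N_{\min} \approx 9.2 \times 10^{6}\) clones, which is one way to see why the minimum is an underestimate. With 20 Kbp phage inserts, \(N_{\min}\) for the same genome drops to \(2 \times 10^{9} / (2 \times 10^{4}) = 1 \times 10^{5}\).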
YACs can accept humongous foreign DNA inserts! To replicate in a yeast cell, a chromosome need have only one centromere and two telomeres, and little else! Recall that telomeres are needed in replication to keep the chromosome from shortening during replication of the DNA. The centromere is needed to attach chromatids to spindle fibers so that they can separate during anaphase in mitosis (and meiosis).
From this brief description, you may recognize a common strategy for engineering a cloning vector: determine the minimum properties that your vector must have and remove nonessential DNA sequences. Of course, for cloning, you must include some useful restriction sites. Then you have enabled recombination with inserts as long as 2,000 Kbp. That’s a YAC… The tough part, of course, is keeping a 2,000 Kbp-long DNA fragment intact long enough to get it into the YAC!
There is another “minimum” requirement for replicating chromosomal DNA (in a YAC vector or a cell). What might that be, and why do you think it’s not mentioned here?
Whatever the vector of choice, sequencing its insert can tell us many things. It can show us how a gene is regulated by confirming known and/or revealing new regulatory DNA sequences. It can show us neighboring genes, helping us to map them on chromosomes and reveal evolutionary genetic relationships. Genomic DNA sequences from one species can probe for similar sequences in other species, allowing comparative sequence analysis that can tell us a great deal about gene evolution and the evolution of species. One early surprise from gene sequencing studies was that we share many common genes and DNA sequences with other species, like yeast, worms, flies, and (of course) vertebrates, including our more closely related mammalian friends. You may already know that chimpanzee and human genomes are 99% similar. And we have already seen comparative sequence analysis showing how proteins with different functions in the same (and even across) species nevertheless share structural domains.
Let’s look at how we would make a phage genomic library that targets a specific gene of interest. As you’ll see, the approach is like cloning foreign DNA into a plasmid or, in fact, into any other vector, but the numbers and details used here exemplify cloning in phage.
15.4.1 Preparing Specific-Length Genomic DNA for Cloning: The Southern Blot
To begin with, high molecular weight genomic DNA (i.e., long DNA molecules) is isolated, purified, and then digested with a restriction enzyme. Usually, the digest is partial, aiming to generate overlapping DNA fragments of random length. The digested DNA is mixed with ethidium bromide, a fluorescent dye that binds to DNA. After electrophoresis on an agarose gel and exposure to UV light, the DNA appears as a bright fluorescent smear. If we wanted to clone the complete genome of an organism, we could recombine all of this DNA with suitably digested vector DNA. But since we are after only one gene, we can reduce the number of clones that must be screened to find the sequence of interest.
In the early days of cloning we would do this by creating a Southern blot (named after Edward Southern, the inventor of the technique). This technique lets us identify the size of genomic DNA fragments most likely to contain a desired gene. A summary of the steps to make a Southern blot follows:
- Digest genomic DNA with one or more restriction endonucleases.
- Run the digest products on an agarose gel to separate fragments by size (length). The DNA appears as a smear when stained with a fluorescent dye.
- Place a filter on the gel. The DNA transfers (blots) to the filter over about 24 hours (or more).
- Remove the blotted filter and place it in a bag containing a solution that can denature the DNA.
- Add a radioactive probe (e.g., cDNA) containing the gene or sequence of interest. The probe hybridizes (binds) to complementary genomic sequences on the filter.
- Prepare an autoradiograph of the filter and see a “band” representing the size of genomic fragments of DNA that include the sequence of interest. The Southern-blot protocol is illustrated in Figure 15.21 (below).
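The logic of these steps can be mimicked in a few lines of Python. This is only a toy sketch: the 35-base “genome”, the probe, and the helper names (`digest`, `reverse_complement`, `probe_hits`) are all made up for illustration, and the only biological detail assumed is that EcoRI recognizes GAATTC and cuts after the G. The sketch digests the sequence completely, then reports the lengths of the fragments that the probe would “light up”.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    # Drop empty pieces (possible if a cut falls at the very start or end).
    return [f for f in fragments if f]


def reverse_complement(sequence: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence))


def probe_hits(fragments: list[str], probe: str) -> list[int]:
    """Lengths of fragments that contain the probe sequence on either strand."""
    targets = (probe, reverse_complement(probe))
    return [len(f) for f in fragments if any(t in f for t in targets)]


# Hypothetical toy data, not a real genomic sequence.
toy_genome = "AAAGAATTCCCCGGGTTTACGTACGTGAATTCTTT"
fragments = digest(toy_genome, site="GAATTC", cut_offset=1)

print([len(f) for f in fragments])        # fragment sizes separated on the "gel"
print(probe_hits(fragments, "CCGGGTTT"))  # sizes that hybridize to the probe
```

In a real blot the fragment sizes are read from the position of the band relative to size markers; here the lengths are simply counted.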
Why do the genomic DNAs on the gel (third panel, Figure 15.21) appear as a smear instead of as bands of discrete size/length?
Once you know the size (or size range) of restriction-digest fragments that contain the DNA you want to study, you are ready to run another gel of digested genomic DNA, and then:
- Cut out the piece of gel containing fragments of the size that “lit up” with your probe in the autoradiograph.
- Remove (elute) the DNA from the gel piece into a suitable buffer.
- Prepare the DNA for insertion into (i.e., recombination with) a vector for genomic cloning.
15.4.2 Recombining Size-Restricted Genomic DNA with Phage DNA
After eluting the restriction-digested DNA fragments of the right size-range from the gels, we mix the DNA with compatibly digested phage DNA at concentrations that favor the formation of H-bonds between the ends of the phage DNA and the genomic fragments (rather than with each other!). The addition of DNA ligase covalently links the recombined DNA molecules. These steps are abbreviated in the illustration in Figure 15.22 (below).
The recombinant phage to be made next will contain sequences that will become the genomic library.
15.4.3 Creating Infectious Viral Particles with Recombinant phage DNA
The next step is to package the recombined phage DNA by adding purified viral coat proteins to make infectious phage particles (Figure 15.23).
Packaged phages are added to a culture tube full of host bacteria (e.g., E. coli). After infection, the recombinant DNA enters the cells, where it replicates and directs the production of new phage that eventually lyse the host cell (Figure 15.24).
The recombined vector can also be introduced directly into the host cells by transfection, which is to phage DNA what transformation is to plasmid DNA. Whether by infection or transfection, the recombinant phage DNA ends up in host cells, which produce new phage that eventually lyse the host cell. The released phages go on to infect more host cells until all cells have lysed. What remains is a tube full of lysate, containing cell debris and lots of recombinant phage particles.
Just a note on some other vectors for genomic DNA cloning: For large genomes, the goal is to choose a vector able to house larger fragments of “foreign” DNA, so that you end up screening fewer clones. We’ve seen that phage vectors accommodate larger foreign DNA inserts than plasmid vectors, and YACs even more. For a very large eukaryotic genome, it may be necessary to screen more than a hundred thousand clones in a phage-based genomic library. Apart from the size-selection of genomic fragments before you insert them into a vector, your selection of the appropriate vector is just as important. The following table lists commonly used vectors and the sizes of inserts they will accept.
Table 15.1
Cloning vector type | Insert size (Kbp) |
---|---|
Plasmids | up to 15 |
Phage lambda | up to 25 |
Cosmids | up to 45 |
Bacteriophage P1 | 70 to 100 |
P1 artificial chromosomes (PACs) | 130 to 150 |
Bacterial artificial chromosomes (BACs) | 120 to 300 |
Yeast artificial chromosomes (YACs) | 250 to 2000 |
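To make the effect of vector choice concrete, the sketch below divides the human genome size quoted earlier in this section by the upper insert limit of several vectors in Table 15.1. It is a rough lower bound only: it assumes every insert is at the maximum size and that inserts never overlap, and the names `MAX_INSERT_KBP` and `min_clones` are invented for this illustration.

```python
HUMAN_GENOME_BP = 3_200_000_000  # figure quoted earlier in this section

# Upper insert limits (in Kbp) taken from Table 15.1.
MAX_INSERT_KBP = {
    "Plasmids": 15,
    "Bacteriophage P1": 100,
    "PACs": 150,
    "BACs": 300,
    "YACs": 2000,
}


def min_clones(genome_bp: int, insert_kbp: int) -> int:
    """Fewest clones that could cover the genome if inserts never overlapped."""
    insert_bp = insert_kbp * 1000
    return -(-genome_bp // insert_bp)  # ceiling division with integers


for vector, insert_kbp in MAX_INSERT_KBP.items():
    print(f"{vector:18s} {min_clones(HUMAN_GENOME_BP, insert_kbp):>9,d}")
```

Real libraries need several times these numbers, because clones are sampled at random and inserts are usually smaller than the vector's upper limit.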
Open the links in the table or use the QR codes (at the end of the chapter) to learn more about these cloning vectors. We will continue this example by screening a phage-lysate genomic library for a recombinant phage with a genomic sequence of interest.
15.4.4 Screening a Genomic Library; Titering Recombinant Phage Clones
A bacterial lawn is made by plating so many bacteria on the agar plate that they simply grow together rather than as separate colonies. If a small number of phages are evenly plated over the bacterial lawn, each virus will infect one cell. Subsequent lysis of this cell releases many phage, each infecting a neighboring cell. After a day or so of repeated lysis and infection, plaques (clearings) appear in the lawn at the site of the first infection. Each plaque shown in the illustration in Figure 15.25 is a clone of the original phage particle.
To screen a phage genomic library, the phage lysate is titered on bacterial lawns. Titration of a phage lysate typically consists of serial 10-fold dilutions (10×, 100×, 1,000×, etc.) with a suitable medium. Each serial 10× dilution is then spread on a bacterial (e.g., E. coli) lawn, and the plaques formed on each lawn are counted. Remember, a plaque is a clone of a single virus. Let’s say that you spread 10 \(\mu\)l of an undiluted lysate on a bacterial lawn. After a day or so, you see lots of tiny plaques, and after staring at the plate, you figure that there are thousands of them on the lawn (i.e., too many to count!). But among the serial 10× dilutions of the lysate on bacterial lawns, you count 176 large, well-separated plaques on a lawn that was plated with 10 \(\mu\)l of the eighth serial dilution of the lysate. Because the eighth serial 10× dilution is a \(10^{8}\)-fold dilution, there must have been about \(176 \times 10^{8}\) (that is, \(1.76 \times 10^{10}\)) phage particles in 10 \(\mu\)l of the original undiluted lysate.
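The titer arithmetic is easy to get wrong, so here is a minimal sketch of it. Each serial 10-fold dilution reduces the phage concentration tenfold, so a plaque count from the nth dilution is multiplied by 10 to the power n to get back to the undiluted lysate. The function names are invented for this illustration, and “plaque-forming units (PFU) per ml” is a conventional way of reporting titers that is not used in the text above.

```python
def phage_in_plated_volume(plaques: int, serial_dilutions: int, fold: int = 10) -> int:
    """Phage in the same volume of UNDILUTED lysate as the volume that was plated."""
    return plaques * fold ** serial_dilutions


def titer_pfu_per_ml(plaques: int, serial_dilutions: int, plated_ul: float) -> float:
    """Plaque-forming units per ml of undiluted lysate (1 ml = 1000 microliters)."""
    return phage_in_plated_volume(plaques, serial_dilutions) * 1000 / plated_ul


# Numbers from the example: 176 plaques from 10 microliters of the eighth 10x dilution.
print(phage_in_plated_volume(176, 8))          # 17600000000 phage per 10 microliters
print(f"{titer_pfu_per_ml(176, 8, 10):.2e}")   # phage per ml of undiluted lysate
```

Counting a plate with well-separated plaques, rather than guessing at a crowded one, is what makes the back-calculation reliable.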
15.4.5 Screening and Probing a Genomic Library
To represent a complete genomic library, you will need many plates of a serial 10× dilution containing, say, about five hundred to a thousand plaques per plate. If size-selected fragments (e.g., identified by Southern blotting) were cloned to make a partial genomic library, then fewer plaques must be screened to find a sequence of interest. Since plaques contain a lot of phage DNA not yet packaged at the time of lysis, it is possible to transfer the unpackaged viral DNA directly to filters by replica plating (similar to replica plating of bacterial colonies). The replica filters are treated to denature the DNA and then hybridized to a probe with a known sequence. In the early days of cloning, the probes were often a sequenced cDNA previously isolated from libraries of the same or different (usually related) species. After soaking the filters in a radioactively labeled probe, you place an X-ray film over the filter to be exposed and developed. Black spots will form where the film is lying over a plaque containing genomic DNA complementary to the probe. In the example illustrated in Figure 15.26, a globin cDNA might have been used to probe our partial genomic library. (Globin genes were, in fact, among the first to be cloned!)
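As a rough sense of scale for the plating described above, the number of plates follows from simple division. The sketch below assumes the figure of about a hundred thousand clones mentioned elsewhere in this section for a phage library of a large genome; the function name `plates_needed` is made up for illustration.

```python
def plates_needed(clones_to_screen: int, plaques_per_plate: int) -> int:
    """Number of plates required, rounding up to a whole plate."""
    return -(-clones_to_screen // plaques_per_plate)  # ceiling division


# About 100,000 plaques at 500 or 1,000 plaques per plate.
for per_plate in (500, 1000):
    print(per_plate, plates_needed(100_000, per_plate))
```

A size-selected (partial) library shrinks the first argument, which is why it reduces the screening work.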
15.4.6 Isolating the Gene
Cloned genomic DNA fragments are much longer than any gene of interest, and always longer than any cDNA from a cDNA library. They are also embedded in a genome that is thousands of times as long as the gene itself, making the selection of an appropriate vector necessary. If the genome can be screened from a reasonable number of cloned phage (about a hundred thousand plaques, for instance), the one plaque producing a positive signal on the autoradiograph would be further studied. This plaque should contain the gene of interest. At this point, we seem to have identified a clone containing a globin gene sequence. We can use this clone to infect yet more host cells and to grow up much more of the globin-gene-containing DNA for further study.
Given that a gene of interest might be a short sequence embedded in a large genomic insert that is as long as 20 Kbp, we can further isolate the gene from neighboring DNA. The traditional strategy again involves Southern blotting. The cloned DNA is purified and digested with restriction endonucleases, and the digest fragments are separated by agarose gel electrophoresis. A Southern blot is made on a filter (e.g., a nylon filter), which is soaked in a solution that denatures the DNA on the blot. The filter is then probed with the same tagged probe used to find the positive clone (plaque). The smallest DNA fragment containing the gene of interest is typically subcloned and grown in a suitable vector to provide enough target-sequence DNA for further study. That suitable vector is often a plasmid easily grown in E. coli.