3.E: Isolating and Analyzing Genes (Exercises)
3.2 Altering the ends of DNA fragments for ligation into vectors.
(Adapted from POB)
a) Draw the structure of the end of a linear DNA fragment that was generated by digesting with the restriction endonuclease EcoRI. Include those sequences remaining from the EcoRI recognition sequence.
b) Draw the structure resulting from the reaction of this end sequence with DNA polymerase I and the four deoxynucleoside triphosphates.
c) Draw the sequence produced at the junction if two ends with the structure derived in (b) are ligated.
d) Design two different short synthetic DNA fragments that would permit ligation of structure (a) with a DNA fragment produced by a PstI restriction digest. In one of these synthetic fragments, design the sequence so that the final junction contains the recognition sequences for both EcoRI and PstI. Design the sequence of the other fragment so that neither the EcoRI nor the PstI sequence appears in the junction.
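The end structures asked for in (a) through (c) can be checked mechanically. The following Python sketch assumes the standard EcoRI cut position (between G and A of 5'-GAATTC-3' on each strand) and uses a made-up 10 bp duplex. The function names and the example sequence are illustrative and are not part of the original problem.

```python
# Hypothetical helper names; the example duplex is invented for illustration.
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence, read 5'->3'
TOP_CUT = 1             # the top strand is cut after the G
BOTTOM_CUT = 5          # the bottom strand is cut 4 nt further right (top-strand coordinates)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement (no reversal), so it aligns under the top strand."""
    return seq.translate(_COMPLEMENT)


def ecori_digest(top: str):
    """Cut a linear duplex at its first EcoRI site.

    Each fragment is returned as (top strand 5'->3', bottom strand 3'->5').
    """
    i = top.find(ECORI_SITE)
    if i < 0:
        raise ValueError("no EcoRI site found")
    bottom = complement(top)
    left = (top[:i + TOP_CUT], bottom[:i + BOTTOM_CUT])
    right = (top[i + TOP_CUT:], bottom[i + BOTTOM_CUT:])
    return left, right


def fill_in(fragment, side: str):
    """Extend the recessed 3' end to copy the 5' overhang (DNA polymerase I plus dNTPs)."""
    top, bottom = fragment
    if side == "left":               # overhang is on the bottom strand; extend the top strand
        return complement(bottom), bottom
    return top, complement(top)      # right-hand fragment: extend the bottom strand


left, right = ecori_digest("CCGAATTCGG")
left_blunt = fill_in(left, "left")
right_blunt = fill_in(right, "right")
junction = left_blunt[0] + right_blunt[0]   # top strand after blunt-end ligation

print(left)       # ('CCG', 'GGCTTAA')
print(right)      # ('AATTCGG', 'GCC')
print(junction)   # CCGAATTAATTCGG
```

In this sketch the ligated junction no longer contains GAATTC, which is worth keeping in mind when designing the synthetic fragments in (d).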
3.3. What properties are required of vectors used in molecular cloning of DNA?
3.4. A student ligated a BamHI fragment containing a gene of interest to a pUC vector digested with BamHI, transformed E. coli with the mixture of ligation products, and plated the cells on plates containing the antibiotic ampicillin and the chromogenic substrate X-gal. Which colonies should the student pick to find the ones containing the recombinant plasmid (with the gene of interest in pUC)?
3.5. Starting with an isolated mRNA, one wishes to make a double-stranded copy of the mRNA and insert it at the PstI site of pBR322 via G-C homopolymer tailing. One then transforms E. coli with this recombinant plasmid, selecting for tetracycline resistance. What are the four enzymatic steps used in preparing the cDNA insert? Name the enzymes and describe the intermediates.
3.6 A researcher needs to isolate a cDNA clone of giraffe actin mRNA, and she knows the size (Mr = 42,000) and partial amino acid sequence of giraffe actin protein and has specific antibodies against giraffe actin. After constructing a bank of cDNA plasmids from total mRNA of giraffe fibroblasts (dG-dC tailed into the PstI site of pBR322), what methods of screening the bank could be used to identify the actin cDNA clone?
3.7 The restriction map of pBR322 is
The distance in base pairs between restriction sites is as follows:
PstI to EcoRI 750 bp
EcoRI to HindIII 50 bp
HindIII to BamHI 260 bp
BamHI to PstI 3300 bp
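The four distances above fix the relative positions of the sites on the circular plasmid, so the fragment sizes expected from any single or double digest of pBR322 can be computed. The following is a minimal Python sketch, assuming the sites occur in the listed order around the circle and placing PstI arbitrarily at position 0. The name `circular_digest` is illustrative.

```python
# Site positions (bp) derived from the listed distances, with PstI placed at 0.
SITES = {"PstI": 0, "EcoRI": 750, "HindIII": 800, "BamHI": 1060}
PLASMID_SIZE = 750 + 50 + 260 + 3300   # 4360 bp implied by the listed distances


def circular_digest(enzymes, sites=SITES, size=PLASMID_SIZE):
    """Return fragment sizes (largest first) after cutting a circle with the given enzymes."""
    cuts = sorted(sites[name] for name in enzymes)
    if not cuts:
        return [size]                      # uncut circle
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]    # next site, wrapping around the circle
        length = (end - start) % size
        fragments.append(length or size)   # a single cut linearizes the whole plasmid
    return sorted(fragments, reverse=True)


print(circular_digest(["PstI"]))            # [4360]
print(circular_digest(["PstI", "EcoRI"]))   # [3610, 750]
print(circular_digest(["PstI", "BamHI"]))   # [3300, 1060]
```

Comparing such predictions for the vector alone with the observed pAlc-1 digests is one way to see which fragments have gained insert DNA.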
A recombinant cDNA plasmid, pAlc-1, has double-stranded cDNA inserted at the PstI site of pBR322, using a technique that retains this cleavage site at both ends of the insert. Digestion of pBR322 and pAlc-1 with restriction endonucleases gives the following pattern after gel electrophoresis (left). The sizes of the fragments are given in base pairs. The DNA fragments were transferred out of the gel onto nitrocellulose and hybridized with radiolabeled cDNA from wild-type A. latrobus (a Southern blot hybridization). Hybridizing fragments are shown in the autoradiogram diagram on the right.
a) What is the size of the cDNA insert?
b) What two restriction endonucleases cleave within the cDNA insert?
c) For those two restriction endonucleases, each DNA fragment in the single digest is cut by PstI into two DNA fragments in the double digest (i.e. the restriction endonuclease plus PstI). Determine which fragments each single digest fragment is cut into, and use this information to construct a map.
d) Draw a restriction map for pAlc-1, showing sites for PstI, EcoRI, BamHI and HindIII. Indicate the distance between sites and show the cDNA insert clearly.
3.8. You isolate and clone a KpnI fragment from A. latrobus genomic DNA that encodes the mRNA cloned in pAlc-1 (as analyzed in question 3.7). The restriction map of the genomic fragment is
Each fragment that hybridizes to pAlc-1 is indicated by an asterisk. What does this map, especially when compared to that in problem 3.7, tell you about the structure of the gene? Be as quantitative as possible.
3.9. Some particular enzyme is composed of a polypeptide chain of 192 amino acids. The gene that encodes it has 1,440 nucleotide pairs. Explain the relationship between the number of amino acids in this polypeptide and the number of nucleotide pairs in its gene.
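The comparison in problem 3.9 starts from simple arithmetic on the triplet code. The small Python sketch below does that bookkeeping. Variable names are illustrative, and how the remaining base pairs are accounted for is left to the reader.

```python
AMINO_ACIDS = 192        # length of the polypeptide, from the problem
GENE_LENGTH_BP = 1440    # length of the gene, from the problem
CODON_SIZE = 3           # nucleotides per codon

coding_bp = AMINO_ACIDS * CODON_SIZE     # base pairs that specify the 192 amino acids
other_bp = GENE_LENGTH_BP - coding_bp    # base pairs that do not encode amino acids of this chain

print(coding_bp)                         # 576
print(other_bp)                          # 864
```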
3.10. When viewed in the electron microscope, a hybrid between a cloned giraffe actin gene (genomic DNA) and mature actin mRNA looks like this:
What can you conclude about actin gene structure in the giraffe?
3.11. DNA complementary to pepper mRNA was synthesized using oligo(dT) as a primer for first-strand synthesis. The second strand (synonymous with the mRNA) was then synthesized, and the population of double-stranded cDNAs was ligated into a plasmid vector using a procedure that leaves PstI sites flanking the cDNA insert (i.e. the terminal PstI sites for each clone are not part of the cDNA). This cDNA library was screened for clones made from the mRNA from the pepper yellow gene. One clone was isolated, and subsequent analysis of its restriction endonuclease cleavage patterns showed it had the following structure:
The map shows the positions of restriction endonuclease cleavage sites and the distance between them in kilobases (kb). The map of the cDNA insert is shown with solid lines, and plasmid vector DNA flanking the cDNA is shown as dotted lines. The top strand is oriented 5' to 3' from left to right, and the bottom strand is oriented 5' to 3' from right to left. The positions and orientations of two oligonucleotides to prime synthesis for sequence determination are shown, and are placed adjacent to the strand that will be synthesized in the sequencing reaction.
Oligonucleotides that anneal to the plasmid vector sequences that flank the duplex cDNA insert were used to prime synthesis of DNA for sequencing by the Sanger dideoxynucleotide procedure. A primer that annealed to the vector sequences to the left of the map shown above generated the sequencing gel pattern shown below on the left. A primer that annealed to the vector sequences to the right of the map shown above generated the sequencing gel pattern shown below on the right. The gels were run from the negative electrode at the top to the positive electrode at the bottom, and the segment presented is past the PstI site (i.e. do not look for a PstI recognition site).
[Figure: sequencing gel patterns for the left primer and the right primer, each with lanes G, A, T and C. The band positions are not reproduced here.]
a) What is the DNA sequence of the left and right ends of the insert in the cDNA clone? Be sure to specify the 5' to 3' orientation, and the strand (top or bottom) whose sequence is reported. The terms left, right, top and bottom all refer to the map shown above for the cDNA clone.
b) Which end of the cDNA clone (left or right in the map above) is most likely to include the sequence synonymous with the 3' end of the mRNA?
c) What restriction endonuclease cleavage sites do you see in the sequencing data given?
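Reading a dideoxy sequencing gel follows a fixed rule: the shortest products run farthest toward the positive electrode, so reading the bands from the bottom of the gel to the top gives the newly synthesized strand 5' to 3'. The Python sketch below applies that rule to an invented band pattern (not the gel from this problem). The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_gel(bands_top_to_bottom):
    """Return the synthesized strand 5'->3' from the lane of each band, listed top to bottom."""
    return "".join(reversed(bands_top_to_bottom))


def reverse_complement(seq):
    """Sequence of the opposite (template) strand, also written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Invented example: one band per row, labelled by the lane (G, A, T or C) it appears in.
example_bands = ["C", "T", "T", "A", "G", "G"]   # top of gel ... bottom of gel

synthesized = read_gel(example_bands)
template = reverse_complement(synthesized)

print(synthesized)   # GGATTC
print(template)      # GAATCC
```

Whether a read corresponds to the top or bottom strand of the map depends on the primer, since synthesis always proceeds 5' to 3' from the primer into the insert.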
3.12. Genomic DNA from the pepper plant was ligated into EcoRI sites in a λ phage vector to construct a genomic DNA library. This library was screened by hybridization to the yellow cDNA clone. The pattern of EcoRI cleavage sites for one clone that hybridized to the yellow cDNA clone was analyzed in two experiments.
In the first experiment, the genomic DNA clone was digested to completion with EcoRI, the fragments separated on an agarose gel, transferred to a nylon filter, and hybridized with the radioactive yellow cDNA clone. The digest pattern (observed on the agarose gel) is shown in lane 1, and the pattern of hybridizing fragments (observed on an autoradiogram after hybridization) is shown in lane 2. Sizes of the EcoRI fragments are indicated in kb. The right arm of this λ vector is 6 kb long, and the left arm is 30 kb.
In the second experiment, the genomic DNA clone was digested with a range of concentrations of EcoRI, so that the products ranged from a partial digest to a complete digest. The cleavage products were annealed to a radioactive oligonucleotide that hybridized only to the right cohesive end (cos site) of the λ vector DNA. This simply places a radioactive tag at the right end of all the products of the reaction that extend to the right end of the λ clone (partial or complete); digestion products that do not include the right end of the λ clone will not be seen. The results of the digestion are shown above, on the right. Lane 1 is the clone of genomic DNA in λ that has not been digested, lane 5 is the complete digest with EcoRI, and lanes 2, 3 and 4 are partial digests using increasing amounts of EcoRI. The sizes of the radioactive DNA fragments (in kb) are given, and the density of the fill in the boxes is proportional to the intensity of the signal on the autoradiogram.
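Because every labeled product in this experiment extends leftward from the right end of the clone, the sorted sizes of the labeled fragments are the distances from the right end to successive EcoRI sites, and the differences between neighbors give the fragment order. The Python sketch below shows that logic with invented sizes (not the data from the figure). The name `order_from_end_label` is hypothetical.

```python
def order_from_end_label(labeled_sizes_kb):
    """Return restriction fragment sizes ordered from the labeled end inward."""
    ladder = sorted(set(labeled_sizes_kb))     # distances from the labeled end to each site
    previous = 0
    ordered = []
    for distance in ladder:
        ordered.append(distance - previous)    # fragment lying between two adjacent sites
        previous = distance
    return ordered


# Invented example: labeled products seen across partial and complete digests (kb).
# The largest value is the undigested clone; the smallest is the complete-digest product.
example_sizes = [45, 6, 15, 8, 6, 15]

print(order_from_end_label(example_sizes))   # [6, 2, 7, 30]
```

Read from right to left on a map, this invented clone would consist of a 6 kb fragment at the labeled end, then fragments of 2 kb and 7 kb, then a 30 kb fragment at the far end.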
a) What is the map of the EcoRI fragments in the genomic DNA clone, and which fragments encode mRNA for the yellow gene? You may wish to fill in the figure below; the left and right arms of the λ vector are given. Show positions of the EcoRI cleavage sites, distances between them (in kb) and indicate the fragments that hybridize to the cDNA clone.
EcoRI EcoRI
Left arm ___| |_ Right arm
(30 kb) (6 kb)
In a third experiment, the pepper DNA from the genomic DNA clone was excised, hybridized with yellow mRNA under conditions that favor RNA-DNA duplexes and examined in the electron microscope to visualize R-loops. A pattern like the following was observed. The lines in the figure can be duplex DNA, RNA-DNA duplexes and single-stranded DNA.
b) What do the R-loop data indicate? Please draw an interpretation of the R-loops, showing clearly the two DNA strands and the mRNA and distinguishing between the template (bottom, or message complementary) and nontemplate (top, or message synonymous) strands.
The EcoRI fragments that hybridize to the yellow cDNA clone were isolated and digested with SalI (S in the figure below), HindIII (H), and the combination of SalI plus HindIII (S+H). The resulting patterns of DNA fragments are shown below; all will hybridize to the yellow cDNA clone. Cleavage of the 5 kb EcoRI fragment with SalI generates two fragments of 2.5 kb.
c) What are the maps of the SalI and HindIII site(s) in each of the EcoRI fragments? Show positions of the cleavage sites and distances between them on the diagram below.
5 kb EcoRI fragment: 4 kb EcoRI fragment:
EcoRI EcoRI EcoRI EcoRI
|___________________________| |___________________________|
d) Compare these restriction maps with that of the cDNA clone (problem 3.11) and the R-loops shown above. Assuming that the SalI and HindIII sites in the genomic DNA correspond to those in the cDNA clone, what can you deduce about the intron/exon structure of the yellow gene(s) contained within the 5 kb and 4 kb EcoRI fragments? Please diagram the exon-intron structure in as much detail as the data permit (i.e. show the size of the intron(s) and positions of intron/exon junctions as precisely as possible).
5 kb EcoRI fragment: 4 kb EcoRI fragment:
EcoRI EcoRI EcoRI EcoRI
|___________________________| |___________________________|
e) Considering all the data (maps of cDNA and genomic clones and R-loop analysis), what can you conclude about the number and location(s) of yellow gene(s) in this genomic clone?
3.13 You have isolated an 1100 base pair (bp) cDNA clone for a gene called azure that, when mutated, causes blue eyes in frogs. You also isolate a 3000 bp SalI genomic DNA fragment that hybridizes to the azure cDNA. The map of the azure cDNA is as follows, with sizes of fragments given in bp.
Digestion of the 3000 bp SalI fragment of genomic DNA with the indicated restriction endonucleases yields the following pattern of fragments, all of which hybridize to the azure cDNA. Remember that the starting fragment has SalI sites at each end. Sizes of fragments are in bp.
Restriction enzymes
BamHI Bam+Pst PstI Pst+Eco EcoRI Bam+Eco
2700
2300
2000
1900 1900
1200
1100 1100
800
700 700 700
300 300 300
The SalI to SalI (3000 bp) genomic fragment was hybridized to the 1100 bp cDNA fragment, and the heteroduplexes were examined in the electron microscope. Measurements on a large number of molecules resulted in the determination of the sizes indicated in the structure on the left, i.e. duplex regions of 400 and 600 bp are interrupted by a single-stranded loop of 1500 nucleotides and are flanked by single-stranded regions of 500 and 100 nucleotides. When the same experiment is carried out with the 2700 bp SalI to EcoRI genomic DNA fragment hybridized to the cDNA fragment, the structure on the right is observed.
a) What is the restriction map of the 3000 bp SalI to SalI genomic DNA fragment from the azure gene? Specify distances between sites in base pairs.
b) How many introns are present in the azure genomic DNA fragment?
c) Where are the exons in the azure genomic DNA fragment? Draw the exons as boxes on the restriction map of the 3000 bp SalI to SalI genomic DNA fragment. Specify (in base pairs) the distances between restriction sites and the intron/exon boundaries.
3.14 The T-cell receptor is present only on T-lymphocytes, not on B-lymphocytes or other cells. Describe a strategy to isolate the T-cell receptor by subtractive hybridization, using RNA from T-lymphocytes and from B-lymphocytes.
3.15. How many exons are in the human insulin (INS) gene, how big are they, and how large are the introns that separate them? Use three different bioinformatic approaches to answer this.
a. Align the available genomic sequence containing INS (encoding insulin) with the sequence of the mRNA to find exons and introns in the INS gene. The sequence files are:
INS mRNA: accession number NM_000207
INS gene (includes part of TH and IGF2 in addition to INS): accession number L15440
Files can be obtained from NCBI (http://www.ncbi.nlm.nih.gov), or from the course web site (www.bmb.psu.edu/Courses/bmb400/default.htm)
Align the mRNA (cDNA) and genomic sequence using the BLAST 2 Sequences server at
http://www.ncbi.nlm.nih.gov/blast/
and the sim4 server at
pbil.univ-lyon1.fr/sim4.html
Sim4 is designed to take into account terminal redundancy at the exon/intron junctions, whereas BLAST2 does not. Do you see this effect in the output?
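Both alignment programs report the matching blocks as coordinate ranges, from which exon and intron sizes follow directly. The Python sketch below shows that calculation, assuming 1-based inclusive genomic coordinates. The coordinates are invented placeholders, not the actual INS result, and `exon_intron_sizes` is a hypothetical name.

```python
def exon_intron_sizes(blocks):
    """Compute exon and intron lengths from aligned genomic blocks.

    blocks: list of (start, end) genomic coordinates, 1-based and inclusive.
    """
    blocks = sorted(blocks)
    exons = [end - start + 1 for start, end in blocks]
    introns = [
        next_start - prev_end - 1    # bases strictly between two adjacent exons
        for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:])
    ]
    return exons, introns


# Placeholder blocks for illustration only.
example_blocks = [(101, 150), (351, 550), (1001, 1200)]

exons, introns = exon_intron_sizes(example_blocks)
print(exons)     # [50, 200, 200]
print(introns)   # [200, 450]
```

If two programs place a block boundary a few bases apart, the computed exon and intron sizes shift by the same amount, which is one way to look for the junction effect asked about above.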
b. Use the ab initio exon finding program Genscan, available at
genes.mit.edu/GENSCAN.html
to predict exons in the INS genomic sequence (L15440).
How does this compare with the results of aligning the mRNA with the genomic sequence in part (a)?
c. What do you see for INS at the Human Genome Browser and Ensembl? They are accessed at: