The availability of cloned DNA probes for many genes has greatly facilitated the analysis of amounts of RNAs in different cells or under different conditions. For instance, it is very common to label a DNA probe that will hybridize to mRNA; the DNA comes from either a cDNA clone or a genomic clone containing an exon. The labeled probe is then hybridized to total or polyA-containing RNA (the latter is called polyA+ RNA, and is roughly equivalent to mRNA) from a cell. The concentration of the probe is much greater than the concentration of the target mRNA for the specific gene; thus the probe is in vast excess and all mRNA from the gene of interest should be driven into a duplex with the probe. The amount of probe protected from digestion by a single-strand specific nuclease such as nuclease S1 gives a measure of the amount of the specific mRNA that is in the cell. (This situation differs in some important aspects from the material on estimating numbers of genes expressed and abundance from the kinetics of RNA-driven reactions. In that material, one was looking at entire populations of mRNAs, whereas in this situation, one is looking at only one mRNA - the one complementary to the labeled probe.)
[Two technical notes: (1) The diagnostic assay here measures the amount of labeled DNA in duplex; the unhybridized DNA is digested. If the DNA probe is originally double-stranded, it is denatured prior to hybridization, but then how do you distinguish between nuclease protection arising from DNA-mRNA duplexes and protection arising from the two strands of DNA reannealing? The cleanest approach is to synthesize and label only the strand of DNA complementary to the mRNA; this can be done by appropriate choices of primers for synthesis of DNA from plasmids carrying the DNA used as a probe. Alternatively, a labeled duplex DNA probe can be prepared that extends past the mRNA coding portion of a gene, so that the DNA-DNA duplex resulting from reannealing is larger than the DNA-RNA duplex resulting from hybridization to mRNA. Also, hybridization conditions with high concentrations of salt and formamide are used that favor DNA-RNA duplexes over DNA-DNA duplexes. (2) An equivalent approach is to synthesize an RNA probe derived from the cloned DNA; this "complementary RNA" forms a stronger duplex with the mRNA than does cDNA, because RNA-RNA duplexes are stronger than RNA-DNA duplexes under conditions of high salt and formamide concentrations. The fragments protected from digestion by RNases are then detected.]
a) Murine erythroleukemia (MEL) cells are equivalent to proerythroblasts, immortalized by the Friend virus complex so that they can grow continuously in culture. Treatment with small organic compounds like dimethylsulfoxide (DMSO) will induce them to mature on to erythroblasts, with a substantial increase in the expression of erythroid specific genes (the mechanism for this induction is still unknown). Let's say that you isolated total RNA from both uninduced (untreated) cells and an equal number of DMSO-induced cells. The RNA samples were hybridized to an excess of a radiolabeled DNA probe from a mouse β-globin gene, and the amount of probe hybridized to the mRNA was determined by treatment of the samples with nuclease S1, electrophoresis on a denaturing polyacrylamide gel, and measuring the amount of radioactivity in the fragment resulting from the mRNA-DNA duplex. An illustration of the heteroduplex, the nuclease S1 treatment, and the resultant autoradiograph of the gel are shown below. The protected fragment from uninduced cells had 10,000 cpm, and the protected fragment from induced cells had 500,000 cpm. A negative control with RNA from a T-lymphocytic cell line, which produces no globin mRNA, gave no protection, i.e. 0 cpm for the diagnostic fragment. By what factor is the expression of this β-globin gene induced in MEL cells treated with DMSO?
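The arithmetic behind part (a) can be sketched in a few lines of Python. This is only an illustration of the ratio calculation; the function name `fold_induction` and the background-subtraction argument are assumptions made for this sketch, not part of the assay protocol. It assumes equal numbers of cells per sample and probe in excess, as stated in the problem.

```python
def fold_induction(cpm_induced, cpm_uninduced, cpm_background=0.0):
    """Ratio of background-corrected protected-probe signals.

    Hypothetical helper: assumes equal cell numbers in each sample and
    probe excess, so protected cpm is proportional to mRNA amount.
    """
    induced = cpm_induced - cpm_background
    uninduced = cpm_uninduced - cpm_background
    if uninduced <= 0:
        raise ValueError("uninduced signal must exceed background")
    return induced / uninduced


# Values from the problem: 500,000 cpm induced, 10,000 cpm uninduced,
# and 0 cpm in the T-lymphocyte negative control (used as background).
fold = fold_induction(500_000, 10_000, cpm_background=0)
print(f"{fold:.0f}-fold induction")
```

Subtracting the negative-control signal makes no difference here because it is 0 cpm, but it matters whenever the control lane shows some residual protection.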
b) The previous assay gives the relative amounts of the mRNA under the two conditions, and this is an extremely powerful and widely used assay. But what does this mean in terms of mRNA molecules per cell, i.e. how does the abundance change upon induction? One can alter this assay somewhat to get a measure of abundance, similar in principle to the calculations in Section VIIF. First, one needs a measure of the number of mRNA molecules per cell. Let's say that you harvested 10^7 MEL cells and isolated 3 µg of polyA+ RNA (essentially mRNA). What is the total number of mRNA molecules per MEL cell, assuming an average length of mRNA of 2000 nucleotides?
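One way to set up the calculation in part (b) is sketched below. It assumes a yield of 3 µg (3 × 10^-6 g) of polyA+ RNA from 10^7 cells and an average mass of about 330 g/mol per RNA nucleotide; the nucleotide mass is an assumed round number, so treat the result as an order-of-magnitude estimate. The function name is hypothetical.

```python
AVOGADRO = 6.022e23   # molecules per mole
NT_MASS = 330.0       # g/mol per RNA nucleotide (assumed average value)


def mrna_molecules_per_cell(rna_grams, n_cells, mean_length_nt,
                            nt_mass=NT_MASS):
    """Estimate mRNA molecules per cell from a bulk polyA+ RNA yield."""
    molar_mass = mean_length_nt * nt_mass   # g/mol for an average mRNA
    moles = rna_grams / molar_mass          # mol of mRNA in the whole prep
    return moles * AVOGADRO / n_cells       # molecules per cell


# Assumed inputs: 3 micrograms of polyA+ RNA, 1e7 cells, 2000-nt average mRNA.
per_cell = mrna_molecules_per_cell(3e-6, 1e7, 2000)
print(f"about {per_cell:.1e} mRNA molecules per cell")
```

With these assumptions the estimate comes out at a few hundred thousand mRNA molecules per cell.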
c) If one labels the RNA in the MEL cells, e.g. by growing the cells in the presence of [3H] uridine, which is incorporated only into RNA, then the isolated, labeled polyA+ RNA can be hybridized to an excess of the (now unlabeled) DNA complementary to the mRNA of interest. RNA in duplex with DNA can be detected by its protection from digestion by nucleases such as RNase A and RNase T1; the resulting autoradiograph would look something like that shown below, with bands containing more radioactivity represented as a darker fill. Since the DNA is still in excess, all the mRNA complementary to the probe should be driven into duplex, and one can readily measure the fraction of polyA+ RNA complementary to each probe. The following table provides some representative, idealized data for polyA+ RNA from uninduced and induced MEL cells, including the total input RNA (not treated with nucleases) and the amount protected from nuclease digestion by hybridization with an excess of β-globin gene DNA, DNA encoding the erythroid transcription factor GATA1, and DNA encoding ovalbumin (which is not expressed in MEL cells, i.e. it is a negative control). What fraction of the mRNA (or polyA+ RNA) is composed of mRNA from these three genes, and what is their abundance in uninduced and induced cells?
| DNA probe | cpm protected, uninduced MEL cells | cpm protected, induced MEL cells |
| [input labeled RNA] | [1,000,000] | [1,000,000] |
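The calculation asked for in part (c) takes the same form for every probe: divide the protected cpm by the input cpm to get the fraction of polyA+ RNA, then multiply by the total number of mRNA molecules per cell to get the abundance. The sketch below uses placeholder values only. The protected cpm values and the figure of 270,000 mRNAs per cell are hypothetical stand-ins chosen for illustration, not data from the table.

```python
def fraction_and_abundance(cpm_protected, cpm_input, mrnas_per_cell):
    """Return (fraction of polyA+ RNA, approximate molecules per cell).

    Assumes uniform labeling and that the mRNA of interest is close to
    the average mRNA length, so a mass fraction approximates a number
    fraction.
    """
    fraction = cpm_protected / cpm_input
    return fraction, fraction * mrnas_per_cell


INPUT_CPM = 1_000_000       # input labeled RNA, as in the table
MRNAS_PER_CELL = 270_000    # assumed total mRNA molecules per cell

# Hypothetical protected cpm values, for illustration only.
hypothetical_protected = {
    "probe A": 10_000,
    "probe B": 100,
    "negative control": 0,
}

for probe, cpm in hypothetical_protected.items():
    frac, abundance = fraction_and_abundance(cpm, INPUT_CPM, MRNAS_PER_CELL)
    print(f"{probe}: fraction = {frac:.4%}, about {abundance:.0f} per cell")
```

Running the same function on the uninduced and induced columns for each probe gives the change in abundance upon induction.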
d) In general, what is the distribution of mRNAs in a particular type of differentiated cell, i.e. how abundant are the different complexity classes of mRNA?
Use of databases of sequences, mutations, and functional data
4.6 We used arginine biosynthesis to illustrate complementation analysis and construction of a pathway. The steps involved in arginine synthesis are also part of the urea cycle. One of the enzymes catalyzes the formation of citrulline from carbamoyl phosphate and ornithine. Let's find out more about this enzyme, called ornithine transcarbamoylase, or OTC.
Use your favorite Web browser to go to the URL for NCBI (National Center for Biotechnology Information).
- Click on the Entrez button. Entrez provides a portal to many types of information at this server. Let's start with DNA and protein sequences.
- Click on the Nucleotides button.
- Enter "X00210" and press the Search button. Do not enter the quotation marks, and those are zeros and a one, not O or l.
- You should get a report on the gene for OTC in E. coli, called argI.
- How large is the protein-coding region, from translation initiation codon to the termination codon? How big is the encoded protein?
- Where is the argI gene on the E. coli chromosome? Go back to the Entrez server (where you clicked on Nucleotides before). Click on Genomes, and then select Escherichia coli. Enter "argI" in the Search window (don't enter the quotes, and that is the letter I "eye" not a "one").
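For the question above about the size of the protein-coding region and the encoded protein, the conversion from nucleotide coordinates to protein length can be sketched as follows. The coordinates used here are hypothetical placeholders; read the actual start and end of the coding region from the feature table of the GenBank report. The average residue mass of about 110 Da is a common rule of thumb and is an assumption of this sketch.

```python
AVG_RESIDUE_MASS_DA = 110.0   # assumed average mass per amino acid residue


def protein_length_from_cds(start, end):
    """Amino acids encoded by a CDS spanning start..end (1-based, inclusive).

    The span runs from the first base of the initiation codon to the last
    base of the termination codon, so one codon (the stop) is not translated.
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return length_nt // 3 - 1


# Hypothetical coordinates, for illustration only.
n_residues = protein_length_from_cds(1, 900)
mass_kda = n_residues * AVG_RESIDUE_MASS_DA / 1000
print(f"{n_residues} amino acids, roughly {mass_kda:.1f} kDa")
```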
4.7 Is the E. coli OTC protein related to any other proteins in the sequence databases? You need to get the protein sequence, which you can do by clicking on argI while you are at the genome map, or you can go back to the entry for the gene (accession number X00210). If you are at the GenBank Report for entry X00210, you need to click on the Protein button at the top of the page, and then select FastA Report from the next page. (If you take the default path to the GenPept Report, that is OK; you can get the FastA Report from there as well.) Make a copy of this OTC sequence in FastA format (you may want to save it in another program, e.g. your favorite word processor, for convenience).
Now click on the Blast button at the top of the page, and at the next page select Basic Blast search. At the Blast server, select blastp from the pull-down menu next to Program (this aligns protein sequences; the default blastn aligns nucleotide sequences), and paste the E. coli OTC sequence in FastA format into the input window. Note that the pull-down menu gives you the option of entering the accession number (40962) instead of the sequence. The default sequence database is nr, the non-redundant compilation of databases from the US, Europe and Japan. We'll use that, but note that a pull-down menu allows you to select other databases.
a) Click on the Submit Query button. When the job finally runs (this can take a minute or more when the Server is busy) what do you see?
b) Is the E. coli OTC protein related to any human protein? Scroll down the table of hits, past many bacterial OTCs (Neisseria, Pyrococcus...) until you run into some mammalian hits. With a score of 172, you should find a hyperlink to sp|P00480|OTC_HUMAN ORNITHINE CARBAMOYLTRANSFERASE PRECURSOR. Click on this hyperlink.
4.8 The entry for human OTC (P00480, which is the same as 400687) is quite long.
a) What occupies much of the feature table? What does this tell you about the OTC gene in humans?
b) Use either the feature table for the GenBank entry 400687 (or P00480) or, better yet, go back to the home page for NCBI and click on the OMIM button to go to Online Mendelian Inheritance in Man (from Victor McKusick, M.D.). Where is the gene? What happens in OTC deficiency?
4.9 What do the aligned amino acid sequences of the bacterial and human proteins tell you? Do conserved regions correlate with functional regions? For instance, does mutation of any amino acids in the conserved regions lead to a phenotype in humans?
Since the Blast search generated so many hits with higher scores than the E. coli-human pair, we will have to use a different tool to see the alignment. At the Blast server top page (where you selected Basic Blast search before), select Blast 2 sequences. This utility allows you to enter any two sequences and generate a pairwise alignment by the program Blast2. You should use the human and E. coli OTC protein sequences or their accession numbers, and be sure to choose blastp as the program. When doing this in July of 1998, I ran into a problem with the utility making a duplicate of each sequence I entered (I don't know if that was a problem at my end or theirs); this is likely a temporary condition. If you encounter a problem, try a different server, such as the Sequence Analysis Server at http://genome.cs.mtu.edu/sas.html. Choose Pairwise Sequence Alignment, enter your sequences and run GAP or SIM on protein sequences.
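To get a feel for what a pairwise alignment program does, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). It is not the algorithm used by Blast2, GAP, or SIM. Those programs use amino acid substitution matrices and more refined gap penalties, whereas this sketch assumes a simple score of +1 for a match, -1 for a mismatch, and -2 per gap position. The two short peptides are toy sequences, not the OTC proteins.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Toy peptides, for illustration only.
top, bottom, total = global_align("MKVLA", "MKLA")
print(top)
print(bottom)
print(f"score = {total}, identity = {percent_identity(top, bottom):.0f}%")
```

Stretches of the real alignment with many identical positions are the conserved regions that problem 4.9 asks you to compare with the positions of mutations in human OTC.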
4.10 One of the important early pieces of evidence that helped define the structure of the nucleosome was the pattern of nuclease cleavage in chromatin. In this experiment, chromatin was treated briefly with an enzyme, micrococcal nuclease, that degrades DNA, then all protein was removed and the purified DNA resolved by electrophoresis. A regular pattern of broad bands was seen; the average sizes of the DNA fragments were multiples of 200 bp, i.e. 200, 400, 600, 800 bp, etc. What does this result tell you about chromatin structure? The bands of DNA were thick and spread out rather than sharp; what does this tell you about the positions of cleavage by micrococcal nuclease?
4.11 Which histones are in the core of the nucleosome? What are the protein-protein interactions in the core? What protein domains mediate these interactions?
4.12 The mammalian virus SV40 has minichromosomes in which the circular duplex DNA is packaged into nucleosomes. When histones are removed from the minichromosomes, the resulting DNA is found to be negatively supercoiled. What does this tell you about the state of the DNA in the minichromosomes and the path of the DNA around the nucleosome?
4.13 Are the following statements true or false?
- The DNA coils around the histones about 1.65 turns per nucleosomal core.
- The DNA in chromatin containing actively transcribed genes is usually more sensitive to DNases than is the DNA in nontranscribed chromatin.
4.14 The packing ratio of a nucleic acid-protein complex is the ratio of the length of the naked DNA in normal B form to the length of the protein-DNA structure. For instance, if a set of proteins folded a DNA molecule of 100 Å into a structure that is 25 Å long, this structure has a packing ratio of 4.
a) Given the dimensions of the nucleosome structure, what is the packing ratio for the DNA in the nucleosome core? Note that the pitch is the distance between the midpoints of the DNA duplex as it turns around the histones in the core.
b) If the nucleosomes are tight-packed into a solenoid with 6 nucleosomes per turn, what is the packing ratio now? Assume that each turn of the solenoid translates 110 Å, i.e. the distance between the midpoints of nucleosomes in successive turns of the solenoid is 110 Å.
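The packing ratio calculations in 4.14 follow one pattern, sketched here in Python. Several numbers are assumptions that are not given in this problem: a rise of 3.4 Å per bp for B-form DNA, 146 bp of DNA in the nucleosome core, a length of about 57 Å for the core along its axis, and 200 bp of DNA per nucleosome in the solenoid (taken from the repeat length in problem 4.10). Substitute the dimensions from your own text where they differ.

```python
RISE_PER_BP = 3.4   # Angstroms per bp in B-form DNA (assumed)


def packing_ratio(extended_length, compacted_length):
    """Length of naked B-form DNA divided by length of the packaged structure."""
    return extended_length / compacted_length


# The example from the definition: 100 A of DNA folded into 25 A.
example = packing_ratio(100, 25)

# a) Nucleosome core: 146 bp (assumed) packaged into a core taken to be
#    about 57 A long along its axis (assumed dimension).
core_dna = 146 * RISE_PER_BP
core = packing_ratio(core_dna, 57.0)

# b) Solenoid: 6 nucleosomes per turn, 200 bp per nucleosome (assumed),
#    and 110 A of axial translation per turn (from the problem).
solenoid_dna = 6 * 200 * RISE_PER_BP
solenoid = packing_ratio(solenoid_dna, 110.0)

print(f"example: {example:.0f}")
print(f"nucleosome core: {core:.1f}")
print(f"solenoid: {solenoid:.1f}")
```

The solenoid value depends strongly on how much linker DNA is counted per nucleosome, so state that assumption when you report your answer.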
4.15 How close are the edges of the DNA as it curves around the surface of the nucleosomal core?
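One way to approach 4.15 is to note that the pitch measures the distance between the midpoints of adjacent turns of the duplex, so the gap between the edges is the pitch minus one duplex diameter. The sketch below assumes a pitch of 28 Å and a duplex diameter of 20 Å; both values are assumptions here, so use the dimensions given in your text.

```python
def edge_to_edge_gap(pitch, duplex_diameter):
    """Gap between the surfaces of adjacent DNA turns on the core.

    The pitch is a midpoint-to-midpoint distance, so half a duplex
    diameter is subtracted for each of the two neighboring turns.
    """
    return pitch - duplex_diameter


# Assumed dimensions in Angstroms.
gap = edge_to_edge_gap(pitch=28.0, duplex_diameter=20.0)
print(f"gap between adjacent turns: {gap:.0f} A")
```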