6.6: Expressed Sequence Tags


Only a very small percentage (1.2% in humans) of the DNA in vertebrate genomes encodes proteins (the "proteome"). This is because the exons of most genes are separated by much longer introns, and between our genes lie vast amounts of DNA, much of which appears to regulate the expression of our genes but is not transcribed and translated into a protein product. So even when the complete sequence of a genome is known, it is often difficult to spot particular genes (open reading frames or ORFs).

One approach to solving the problem is to examine a transcriptome of the organism, most commonly defined as all the messenger RNA (mRNA) molecules transcribed from the genome. It is "a" transcriptome, not "the" transcriptome, because which genes are transcribed in a cell depends on the kind of cell (e.g., liver cell vs. lymphocyte) and on what the cell is doing at that time, e.g.,

getting ready to divide by mitosis;
responding to the arrival of a hormone or cytokine;
getting ready to secrete a protein product.

Expressed Sequence Tags (ESTs)

ESTs are short (200–500 nucleotides) DNA sequences that can be used to identify a gene that is being expressed in a cell at a particular time.

The Procedure:

Isolate the messenger RNA (mRNA) from a particular tissue (e.g., liver)
Treat it with reverse transcriptase. Reverse transcriptase is a DNA polymerase that uses RNA as its template. Thus it makes genetic information flow in the reverse direction (RNA -> DNA) of its normal direction (DNA -> RNA).
This produces complementary DNA (cDNA). Note that cDNA differs from the normal gene in lacking the intron sequences.
Sequence 200–500 nucleotides at both the 5′ and 3′ ends of each cDNA.
Examine the database of the organism's genome to find a matching sequence.
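
The steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a real EST pipeline: the sequences are invented, the tags are 8 nucleotides long rather than 200–500, matching is exact (real searches use alignment tools that tolerate sequencing errors), and the function names (`reverse_transcribe`, `reverse_complement`, `end_tags`, `locate`) are hypothetical helpers made up for this example.

```python
# Toy sketch of the EST procedure. All sequences and helper names are
# illustrative assumptions, not data or code from a real pipeline.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna: str) -> str:
    """First-strand cDNA (written 5'->3') complementary to an mRNA."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def reverse_complement(dna: str) -> str:
    """Opposite DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(dna))


def end_tags(cdna: str, length: int) -> tuple[str, str]:
    """Short reads from the 5' and 3' ends of a cDNA."""
    return cdna[:length], cdna[-length:]


def locate(genome: str, tag: str):
    """Exact-match search on both strands; returns (position, strand) or None."""
    pos = genome.find(tag)
    if pos != -1:
        return pos, "+"
    pos = genome.find(reverse_complement(tag))
    if pos != -1:
        return pos, "-"
    return None


# A made-up gene: two exons separated by an intron, with flanking DNA.
exon1, intron, exon2 = "ATGGCCATTGTA", "GTAAGTCCCTTTCAG", "GGTGCCCGCTGA"
genome = "TTTT" + exon1 + intron + exon2 + "AAAA"

# Step 1: the mature mRNA contains the exons only (T is replaced by U).
mrna = (exon1 + exon2).replace("T", "U")

# Steps 2-3: reverse transcription gives cDNA; its second strand
# has the same sequence as the mRNA, with T in place of U.
first_strand = reverse_transcribe(mrna)
cdna = reverse_complement(first_strand)

# Step 4: sequence a short stretch at each end (8 nt here, for brevity).
tag5, tag3 = end_tags(cdna, 8)

# Step 5: look the tags up in the genome.
hit5 = locate(genome, tag5)
hit3 = locate(genome, tag3)
print("5' tag", tag5, "->", hit5)
print("3' tag", tag3, "->", hit3)

# The full-length cDNA does not match the genome as one continuous
# stretch, because the genomic copy of the gene is interrupted by the intron.
print("whole cDNA found intact in genome:", genome.find(cdna) != -1)
```

In this sketch each tag lies entirely within one exon, so an exact search is enough. A tag that spanned an exon-exon junction would not match the genome as a continuous stretch, which is one reason real searches rely on alignment programs that allow for gaps.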