Biology LibreTexts

10.3: Whole Genome Methods and Industrial Applications


    Learning Objectives

    • Explain the uses of genome-wide comparative analyses
    • Summarize the advantages of genetically engineered pharmaceutical products

    Advances in molecular biology have led to the creation of entirely new fields of science. Among these are fields that study aspects of whole genomes, collectively referred to as whole-genome methods. In this section, we’ll provide a brief overview of the whole-genome fields of genomics, transcriptomics, and proteomics.

    Genomics, Transcriptomics, and Proteomics

    The study and comparison of entire genomes, including the complete set of genes and their nucleotide sequence and organization, is called genomics. This field has great potential for future medical advances through the study of the human genome as well as the genomes of infectious organisms. Analysis of microbial genomes has contributed to the development of new antibiotics, diagnostic tools, vaccines, medical treatments, and environmental cleanup techniques.

    The field of transcriptomics is the science of the entire collection of mRNA molecules produced by cells. Scientists compare gene expression patterns between infected and uninfected host cells, gaining important information about the cellular responses to infectious disease. Additionally, transcriptomics can be used to monitor the gene expression of virulence factors in microorganisms, aiding scientists in better understanding pathogenic processes from this viewpoint.
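    As a rough illustration of comparing expression patterns between infected and uninfected cells, the sketch below computes a log2 fold change per gene from mRNA read counts. The gene names, the counts, and the `pseudocount` parameter are all hypothetical choices made for this example; real transcriptomic analyses also normalize the data and apply statistical tests, which this sketch omits.

```python
import math


def log2_fold_change(infected, uninfected, pseudocount=1.0):
    """Return log2(infected / uninfected) for each gene found in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene has no reads in one of the samples.
    """
    changes = {}
    for gene in infected.keys() & uninfected.keys():
        ratio = (infected[gene] + pseudocount) / (uninfected[gene] + pseudocount)
        changes[gene] = math.log2(ratio)
    return changes


# Made-up read counts for three hypothetical genes (not real data).
uninfected_counts = {"geneA": 99, "geneB": 399, "geneC": 49}
infected_counts = {"geneA": 399, "geneB": 99, "geneC": 49}

changes = log2_fold_change(infected_counts, uninfected_counts)
for gene in sorted(changes):
    # Positive values: more mRNA in infected cells; negative values: less.
    print(gene, round(changes[gene], 2))
```

    In this toy data set, geneA has a positive value (more mRNA in infected cells), geneB has a negative value, and geneC is unchanged at zero.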

    When genomics and transcriptomics are applied to entire microbial communities, we use the terms metagenomics and metatranscriptomics, respectively. Metagenomics and metatranscriptomics allow researchers to study genes and gene expression from a collection of multiple species, many of which are difficult or impossible to culture in the laboratory. A DNA microarray (discussed in the previous section) can be used in metagenomics studies.

    Another up-and-coming clinical application of genomics and transcriptomics is pharmacogenomics, also called toxicogenomics, which involves evaluating the effectiveness and safety of drugs on the basis of information from an individual’s genomic sequence. Genomic responses to drugs can be studied using experimental animals (such as laboratory rats or mice) or live cells in the laboratory before embarking on studies with humans. Changes in gene expression in the presence of a drug can sometimes be an early indicator of the potential for toxic effects. Personal genome sequence information may someday be used to prescribe medications that will be most effective and least toxic on the basis of the individual patient’s genotype.

    The study of proteomics is an extension of genomics that allows scientists to study the entire complement of proteins in an organism, called the proteome. Even though all cells of a multicellular organism have the same set of genes, cells in various tissues produce different sets of proteins. Thus, the genome is constant, but the proteome varies and is dynamic within an organism. Proteomics may be used to study which proteins are expressed under various conditions within a single cell type or to compare protein expression patterns between different organisms.

    The most prominent disease being studied with proteomic approaches is cancer, but this area of study is also being applied to infectious diseases. Research is currently underway to examine the feasibility of using proteomic approaches to diagnose various types of hepatitis, tuberculosis, and HIV infection, which are rather difficult to diagnose using currently available techniques.1

    A recent and developing proteomic analysis relies on identifying proteins called biomarkers, whose expression is affected by the disease process. Biomarkers are currently being used to detect various forms of cancer as well as infections caused by pathogens such as Yersinia pestis and Vaccinia virus.2

    Other “-omic” sciences related to genomics and proteomics include metabolomics, glycomics, and lipidomics, which focus on the complete set of small-molecule metabolites, sugars, and lipids, respectively, found within a cell. Through these various global approaches, scientists continue to collect, compile, and analyze large amounts of genetic information. This emerging field of bioinformatics can be used, among many other applications, for clues to treating diseases and understanding the workings of cells.

    Additionally, researchers can use reverse genetics, a technique related to classic mutational analysis, to determine the function of specific genes. Classic methods of studying gene function involved searching for the genes responsible for a given phenotype. Reverse genetics uses the opposite approach, starting with a specific DNA sequence and attempting to determine what phenotype it produces. Alternatively, scientists can attach known genes (called reporter genes) that encode easily observable characteristics to genes of interest, and the location of expression of such genes of interest can be easily monitored. This gives the researcher important information about what the gene product might be doing or where it is located in the organism. Common reporter genes include bacterial lacZ, which encodes beta-galactosidase and whose activity can be monitored by changes in colony color in the presence of X-gal as previously described, and the gene encoding the jellyfish protein green fluorescent protein (GFP) whose activity can be visualized in colonies under ultraviolet light exposure (Figure \(\PageIndex{1}\)).

    (a) A photograph of mice with green fluorescent regions. (b) A photograph of an agar plate with green fluorescent colonies. (c) A photograph of blue and white colonies on an agar plate.
    Figure \(\PageIndex{1}\): (a) The gene encoding green fluorescent protein is a commonly used reporter gene for monitoring gene expression patterns in organisms. Under ultraviolet light, GFP fluoresces. Here, two mice are expressing GFP, while the middle mouse is not. (b) GFP can be used as a reporter gene in bacteria as well. Here, a plate containing bacterial colonies expressing GFP is shown. (c) Blue-white screening in bacteria is accomplished through the use of the lacZ reporter gene, followed by plating of bacteria onto medium containing X-gal. Cleavage of X-gal by the LacZ enzyme results in the formation of blue colonies. (credit a: modification of work by Ingrid Moen, Charlotte Jevne, Jian Wang, Karl-Henning Kalland, Martha Chekenya, Lars A Akslen, Linda Sleire, Per Ø Enger, Rolf K Reed, Anne M Øyan, Linda EB Stuhr; credit b: modification of work by “”/Flickr; credit c: modification of work by American Society for Microbiology)

    Exercise \(\PageIndex{1}\)

    1. How is genomics different from traditional genetics?
    2. If you wanted to study how two different cells in the body respond to an infection, what –omics field would you apply?
    3. What are the biomarkers uncovered in proteomics used for?

    Use and Abuse of Genome Data

    Why can some humans harbor opportunistic pathogens like Haemophilus influenzae, Staphylococcus aureus, or Streptococcus pyogenes in their upper respiratory tracts but remain asymptomatic carriers, while other individuals become seriously ill when infected? There is evidence suggesting that differences in susceptibility to infection between patients may be a result, at least in part, of genetic differences between human hosts. For example, genetic differences in human leukocyte antigens (HLAs) and red blood cell antigens among hosts have been implicated in different immune responses and resulting disease progression from infection with H. influenzae.

    Because the genetic interplay between pathogen and host may contribute to disease outcomes, understanding differences in genetic makeup between individuals may be an important clinical tool. Ecological genomics is a relatively new field that seeks to understand how the genotypes of different organisms interact with each other in nature. The field answers questions about how gene expression of one organism affects gene expression of another. Medical applications of ecological genomics will focus on how pathogens interact with specific individuals, as opposed to humans in general. Such analyses would allow medical professionals to use knowledge of an individual’s genotype to apply more individualized plans for treatment and prevention of disease.

    With the advent of next-generation sequencing, it is relatively easy to obtain the entire genomic sequences of pathogens; a bacterial genome can be sequenced in as little as a day.3 The time and cost required to sequence the human genome have also been greatly reduced and, already, individuals can submit samples to receive extensive reports on their personal genetic traits, including ancestry and carrier status for various genetic diseases. As sequencing technologies progress further, such services will continue to become less expensive, more extensive, and quicker.

    However, as this day quickly approaches, there are many ethical concerns with which society must grapple. For example, should genome sequencing be a standard practice for everybody? Should it be required by law or by employers if it will lower health-care costs? If one refuses genome sequencing, does he or she forfeit his or her right to health insurance coverage? For what purposes should the data be used? Who should oversee proper use of these data? If genome sequencing reveals predisposition to a particular disease, do insurance companies have the right to increase rates? Will employers treat an employee differently? Knowing that environmental influences also affect disease development, how should the data on the presence of a particular disease-causing allele in an individual be used ethically? The Genetic Information Nondiscrimination Act of 2008 (GINA) currently prohibits discriminatory practices based on genetic information by both health insurance companies and employers. However, GINA does not cover life, disability, or long-term care insurance policies. Clearly, all members of society must continue to engage in conversations about these issues so that such genomic data can be used to improve health care while simultaneously protecting an individual’s rights.

    Clinical Focus: part 3

    When Kayla described her symptoms, her physician at first suspected bacterial meningitis, which is consistent with her headaches and stiff neck. However, she soon ruled this out as a possibility because meningitis typically progresses more quickly than what Kayla was experiencing. Many of her symptoms still paralleled those of amyotrophic lateral sclerosis (ALS) and systemic lupus erythematosus (SLE), and the physician also considered Lyme disease a possibility given how much time Kayla spends in the woods. Kayla did not recall any recent tick bites (the typical means by which Lyme disease is transmitted) and she did not have the typical bull’s-eye rash associated with Lyme disease (Figure \(\PageIndex{2}\)). However, 20–30% of patients with Lyme disease never develop this rash, so the physician did not want to rule it out.

    Kayla’s doctor ordered an MRI of her brain, a complete blood count to test for anemia, blood tests assessing liver and kidney function, and additional tests to confirm or rule out SLE or Lyme disease. Her test results were inconsistent with both SLE and ALS, and the result of the test looking for Lyme disease antibodies was “equivocal,” meaning inconclusive. Having ruled out ALS and SLE, Kayla’s doctor decided to run additional tests for Lyme disease.

    Exercise \(\PageIndex{2}\)

    1. Why would Kayla’s doctor still suspect Lyme disease even if the test results did not detect Lyme antibodies in the blood?
    2. What type of molecular test might be used for the detection of blood antibodies to Lyme disease?

    A photo of a bull’s-eye rash: a red spot in the center and a red ring around it.
    Figure \(\PageIndex{2}\): A bull’s-eye rash is one of the common symptoms of Lyme disease, but up to 30% of infected individuals never develop a rash. (credit: Centers for Disease Control and Prevention)

    Recombinant DNA Technology and Pharmaceutical Production

    Genetic engineering has provided a way to create new pharmaceutical products called recombinant DNA pharmaceuticals. Such products include antibiotic drugs, vaccines, and hormones used to treat various diseases. Table \(\PageIndex{1}\) lists examples of recombinant DNA products and their uses.

    For example, the naturally occurring antibiotic synthesis pathways of various Streptomyces spp., long known for their antibiotic production capabilities, can be modified to improve yields or to create new antibiotics through the introduction of genes encoding additional enzymes. More than 200 new antibiotics have been generated through the targeted inactivation of genes and the novel combination of antibiotic synthesis genes in antibiotic-producing Streptomyces hosts.4

    Genetic engineering is also used to manufacture subunit vaccines, which are safer than other vaccines because they contain only a single antigenic molecule and lack any part of the genome of the pathogen. For example, a vaccine for hepatitis B is created by inserting a gene encoding a hepatitis B surface protein into a yeast; the yeast then produces this protein, which the human immune system recognizes as an antigen. The hepatitis B antigen is purified from yeast cultures and administered to patients as a vaccine. Even though the vaccine does not contain the hepatitis B virus, the presence of the antigenic protein stimulates the immune system to produce antibodies that will protect the patient against the virus in the event of exposure.5 6

    Genetic engineering has also been important in the production of other therapeutic proteins, such as insulin, interferons, and human growth hormone, to treat a variety of human medical conditions. For example, at one time, it was possible to treat diabetes only by giving patients pig insulin, which caused allergic reactions because of small differences between the human and pig insulin proteins. However, since 1978, recombinant DNA technology has been used to produce large-scale quantities of human insulin using E. coli in a relatively inexpensive process that yields a more consistently effective pharmaceutical product. Scientists have also genetically engineered E. coli capable of producing human growth hormone (HGH), which is used to treat growth disorders in children and certain other disorders in adults. The HGH gene was cloned from a cDNA library and inserted into E. coli cells by cloning it into a bacterial vector. Eventually, genetic engineering will be used to produce DNA vaccines and various gene therapies, as well as customized medicines for fighting cancer and other diseases.

    Table \(\PageIndex{1}\): Some Genetically Engineered Pharmaceutical Products and Applications
    Recombinant DNA Product | Application
    Atrial natriuretic peptide | Treatment of heart disease (e.g., congestive heart failure), kidney disease, high blood pressure
    DNase | Treatment of viscous lung secretions in cystic fibrosis
    Erythropoietin | Treatment of severe anemia with kidney damage
    Factor VIII | Treatment of hemophilia
    Hepatitis B vaccine | Prevention of hepatitis B infection
    Human growth hormone | Treatment of growth hormone deficiency, Turner’s syndrome, burns
    Human insulin | Treatment of diabetes
    Interferons | Treatment of multiple sclerosis, various cancers (e.g., melanoma), viral infections (e.g., hepatitis B and C)
    Tetracenomycins | Used as antibiotics
    Tissue plasminogen activator | Treatment of pulmonary embolism, ischemic stroke, myocardial infarction

    Exercise \(\PageIndex{3}\)

    1. What bacterium has been genetically engineered to produce human insulin for the treatment of diabetes?
    2. Explain how microorganisms can be engineered to produce vaccines.



    RNA Interference Technology

    In chapter 10, we described the function of mRNA, rRNA, and tRNA. In addition to these types of RNA, cells also produce several types of small noncoding RNA molecules that are involved in the regulation of gene expression. These include antisense RNA molecules, which are complementary to regions of specific mRNA molecules found in both prokaryotic and eukaryotic cells. Noncoding RNA molecules play a major role in RNA interference (RNAi), a natural regulatory mechanism by which mRNA molecules are prevented from guiding the synthesis of proteins. RNA interference of specific genes results from the base pairing of short, single-stranded antisense RNA molecules to regions within complementary mRNA molecules, preventing protein synthesis. Cells use RNA interference to protect themselves from viral invasion, which may introduce double-stranded RNA molecules as part of the viral replication process (Figure \(\PageIndex{3}\)).

    A eukaryotic cell transcribes a region of DNA into mRNA. Antisense RNA then binds to this mRNA to produce a double-stranded region. This region is not translated (which means that ribosomes do not bind to the mRNA to produce proteins).
    Figure \(\PageIndex{3}\): Cells like the eukaryotic cell shown in this diagram commonly make small antisense RNA molecules with sequences complementary to specific mRNA molecules. When an antisense RNA molecule is bound to an mRNA molecule, the mRNA can no longer be used to direct protein synthesis. (credit: modification of work by Robinson R)
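    To make the idea of complementarity concrete, the following sketch derives the antisense RNA sequence that would base-pair with a stretch of mRNA. The sequence is a made-up 10-nucleotide example and the function names are hypothetical; the sketch considers only standard A-U and G-C pairing and writes both strands in the 5′ to 3′ direction, so the antisense strand is the reverse complement of the mRNA region.

```python
# Standard Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna_region):
    """Return the antisense strand (written 5'->3') for an mRNA region (5'->3').

    The two strands pair in an antiparallel orientation, so the mRNA is
    read backward while each base is swapped for its complement.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna_region))


def is_antisense_of(candidate, mrna_region):
    """Check whether a candidate RNA is fully complementary to the mRNA region."""
    return candidate == antisense_rna(mrna_region)


mrna_region = "AUGGCCUUAG"  # hypothetical sequence, for illustration only
antisense = antisense_rna(mrna_region)
print(antisense)  # CUAAGGCCAU
print(is_antisense_of(antisense, mrna_region))  # True
```

    Applying `antisense_rna` twice returns the original sequence, which reflects the fact that complementarity between the two strands is mutual.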

    Researchers are currently developing techniques to mimic the natural process of RNA interference as a way to treat viral infections in eukaryotic cells. RNA interference technology involves using small interfering RNAs (siRNAs) or microRNAs (miRNAs) (Figure \(\PageIndex{4}\)). siRNAs are completely complementary to the mRNA transcript of a specific gene of interest while miRNAs are mostly complementary. These double-stranded RNAs are bound to DICER, an endonuclease that cleaves the RNA into short molecules (approximately 20 nucleotides long). The RNAs are then bound to RNA-induced silencing complex (RISC), a ribonucleoprotein. The siRNA-RISC complex binds to mRNA and cleaves it. For miRNA, only one of the two strands binds to RISC. The miRNA-RISC complex then binds to mRNA, inhibiting translation. If the miRNA is completely complementary to the target gene, then the mRNA can be cleaved. Taken together, these mechanisms are known as gene silencing.

    Double-stranded RNA can be produced from DNA in the nucleus. Dicer then cuts this dsRNA into either miRNA or siRNA. miRNA is an imperfect match and only one strand is usually incorporated into RISC. This blocks translation but the mRNA is stable. The RISC is stuck on the target. The siRNA has a perfect match and is incorporated into RISC. This triggers mRNA cleavage.
    Figure \(\PageIndex{4}\): This diagram illustrates the process of using siRNA or miRNA in a eukaryotic cell to silence genes involved in the pathogenesis of various diseases. (credit: modification of work by National Center for Biotechnology Information)
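    The distinction described above, where a fully complementary guide RNA leads to mRNA cleavage while a mostly complementary one blocks translation, can be sketched as a simple mismatch count. Everything here is a simplification for illustration: the sequences are made up and shorter than the roughly 20 nucleotides of a real guide RNA, the `max_mismatches` threshold for "mostly complementary" is an arbitrary assumption, the "no silencing predicted" category is added by this sketch, and G-U wobble pairing is ignored.

```python
# Standard Watson-Crick pairs for RNA (G-U wobble pairs are ignored here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_mismatches(guide, target_site):
    """Count positions where the guide strand fails to pair with the target.

    Both sequences are written 5'->3'; because the strands pair in an
    antiparallel orientation, the guide is compared with the reversed target.
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    return sum((g, t) not in PAIRS for g, t in zip(guide, reversed(target_site)))


def predicted_outcome(guide, target_site, max_mismatches=3):
    """Map complementarity to the outcome described in the text.

    max_mismatches is a hypothetical cutoff chosen for this sketch.
    """
    mismatches = count_mismatches(guide, target_site)
    if mismatches == 0:
        return "mRNA cleavage"  # complete complementarity
    if mismatches <= max_mismatches:
        return "translation inhibited"  # mostly complementary
    return "no silencing predicted"  # category assumed by this sketch


target_site = "AUGGCCUUAG"    # hypothetical mRNA target site
perfect_guide = "CUAAGGCCAU"  # fully complementary to the target site
partial_guide = "CUAAGGCGAU"  # one mismatched position

print(predicted_outcome(perfect_guide, target_site))  # mRNA cleavage
print(predicted_outcome(partial_guide, target_site))  # translation inhibited
```

    The first branch covers both siRNA and the case where an miRNA happens to match its target completely, since the text notes that either can result in cleavage.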

    Key Concepts and Summary

    • The science of genomics allows researchers to study organisms on a holistic level and has many applications of medical relevance.
    • Transcriptomics and proteomics allow researchers to compare gene expression patterns between different cells and show great promise in better understanding global responses to various conditions.
    • The various –omics technologies complement each other and together provide a more complete picture of an organism’s or microbial community’s (metagenomics) state.
    • The analysis required for large data sets produced through genomics, transcriptomics, and proteomics has led to the emergence of bioinformatics.
    • Reporter genes encoding easily observable characteristics are commonly used to track gene expression patterns of genes of unknown function.
    • The use of recombinant DNA technology has revolutionized the pharmaceutical industry, allowing for the rapid production of high-quality recombinant DNA pharmaceuticals used to treat a wide variety of human conditions.
    • RNA interference technology has great promise as a method of treating viral infections by silencing the expression of specific genes.


    1. E.O. List, D.E. Berryman, B. Bower, L. Sackmann-Sala, E. Gosney, J. Ding, S. Okada, and J.J. Kopchick. “The Use of Proteomics to Study Infectious Diseases.” Infectious Disorders-Drug Targets (Formerly Current Drug Targets-Infectious Disorders) 8 no. 1 (2008): 31–45.
    2. Mohan Natesan, and Robert G. Ulrich. “Protein Microarrays and Biomarkers of Infectious Disease.” International Journal of Molecular Sciences 11 no. 12 (2010): 5165–5183.
    3. D.J. Edwards and K.E. Holt. “Beginner’s Guide to Comparative Bacterial Genome Analysis Using Next-Generation Sequence Data.” Microbial Informatics and Experimentation 3 no. 1 (2013): 2.
    4. Jose-Luis Adrio and Arnold L. Demain. “Recombinant Organisms for Production of Industrial Products.” Bioengineered Bugs 1 no. 2 (2010): 116–131.
    5. U.S. Department of Health and Human Services. “Types of Vaccines.” 2013. Accessed May 27, 2016.
    6. The Internet Drug List. Recombivax. 2015. Accessed May 27, 2016.

    Contributors and Attributions

    • Nina Parker, (Shenandoah University), Mark Schneegurt (Wichita State University), Anh-Hue Thi Tu (Georgia Southwestern State University), Philip Lister (Central New Mexico Community College), and Brian M. Forster (Saint Joseph’s University) with many contributing authors. Original content via Openstax (CC BY 4.0; Access for free at

    This page titled 10.3: Whole Genome Methods and Industrial Applications is shared under a CC BY license and was authored, remixed, and/or curated by OpenStax.