Biology LibreTexts

5.9: Genome Sizes

    The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, diploid organisms (like ourselves) contain two genomes, one inherited from the mother and the other from the father. The table below presents a selection of representative genome sizes from the rapidly growing list of organisms whose genomes have been sequenced.

    Table of Genome Sizes (haploid)
    Organism | Base pairs | Genes | Notes
    φX174 | 5,386 | 11 | virus of E. coli
    Human mitochondrion | 16,569 | 37 |
    Nasuia deltocephalinicola | 112,091 | 137 | smallest genome yet found in a bacterium. This β-proteobacterium lives in a mutualistic relationship within a special organ of an insect (a leaf hopper) which it supplies with essential amino acids.
    Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis
    nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
    Mycoplasma genitalium | 580,073 | 525 | one of the smallest true organisms
    Mycoplasma pneumoniae | 816,394 | 679 | also one of the smallest true organisms
    Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus
    Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis
    Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium)
    Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet)
    Methanocaldococcus jannaschii | 1,664,970 | 1,783 | These unicellular microbes (this entry and the two below) look like typical bacteria but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea.
    Aeropyrum pernix | 1,669,695 | 1,885 | an archaeon (see Methanocaldococcus jannaschii above)
    Methanothermobacter thermoautotrophicus | 1,751,377 | 2,008 | an archaeon (see Methanocaldococcus jannaschii above)
    Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus
    Pandoravirus | 2,473,870 | 2,556 | A virus (of an amoeba) with a genome larger than that of the bacteria and archaea above and about the same as that of some parasitic eukaryotes.
    Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs
    Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga")
    E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs
    E. coli O157:H7 | 5.44 × 10^6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
    Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the three bacteria below.
    Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
    Pseudomonas aeruginosa | 6.3 × 10^6 | 5,570 | Increasingly common cause of opportunistic infections in humans.
    Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
    Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote.
    Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes.
    Thalassiosira pseudonana | 34.5 × 10^6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
    Naegleria gruberi | 41 × 10^6 | 15,727 | This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes, suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members.
    Drosophila melanogaster | 122,653,977 | ~17,000 | the "fruit fly"
    Caenorhabditis elegans | 100,258,171 | 21,733 |
    Humans | 3.3 × 10^9 | ~21,000 |
    Tetraodon nigroviridis (a pufferfish) | 3.42 × 10^8 | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less non-coding DNA so its total genome is about a tenth the size of ours.
    Mouse | 2.8 × 10^9 | ~23,000 |
    Amphibians | 10^9–10^11 | ? |
    Arabidopsis thaliana | 0.135 × 10^9 | 27,407 | a flowering plant (angiosperm) with one of the smallest genomes known in the plant kingdom.
    Picea abies | 19.6 × 10^9 | 28,354 | the Norway spruce, a conifer (gymnosperm). Even though it has only ~900 more genes than Arabidopsis, it has 145 times as much DNA. Most of this appears to be derived from transposons.
    Psilotum nudum | 2.5 × 10^11 | ? |

    Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has roughly 1,850 times as much DNA by the figures in the table (2.5 × 10^11 versus 0.135 × 10^9 base pairs). No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex. The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
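    The comparisons behind the C value paradox are simple ratios, so they can be checked directly. The sketch below copies a handful of genome sizes and gene counts from the table and computes size ratios and a crude gene density (base pairs per gene). The names GENOMES, size_ratio, and bp_per_gene are illustrative choices rather than part of any standard library, and the human gene count is the approximate figure given in the table.

```python
# Haploid genome sizes (base pairs) and gene counts, copied from the table above.
# A gene count of None means the table lists it as unknown ("?").
GENOMES = {
    "Mycoplasma genitalium": (580_073, 525),
    "E. coli K-12": (4_639_221, 4_377),
    "Arabidopsis thaliana": (0.135e9, 27_407),
    "Tetraodon nigroviridis": (3.42e8, 27_918),
    "Human": (3.3e9, 21_000),  # gene count is approximate (~21,000)
    "Picea abies": (19.6e9, 28_354),
    "Psilotum nudum": (2.5e11, None),
}


def size_ratio(larger, smaller):
    """Return how many times more DNA `larger` has than `smaller`."""
    return GENOMES[larger][0] / GENOMES[smaller][0]


def bp_per_gene(name):
    """Return average base pairs per gene, or None if the gene count is unknown."""
    size, genes = GENOMES[name]
    return None if genes is None else size / genes


if __name__ == "__main__":
    ratio = size_ratio("Picea abies", "Arabidopsis thaliana")
    print(f"Picea abies vs. Arabidopsis: {ratio:.0f}x the DNA")
    ratio = size_ratio("Tetraodon nigroviridis", "Human")
    print(f"Tetraodon vs. human: {ratio:.2f}x the DNA")
    for name in GENOMES:
        density = bp_per_gene(name)
        if density is not None:
            print(f"{name}: about {density:,.0f} bp per gene")
```

    With these figures, the two bacteria come out at roughly one gene per thousand base pairs, while the human genome has well over a hundred thousand base pairs per gene. Genome size and gene count clearly do not scale together.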

    Not All Genes Are Indispensable

    The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence followed up this work by systematically disrupting its genes (mutating them with insertions) to see which ones are essential to life and which are dispensable. Of the 485 protein-encoding genes, they concluded that only 381 are essential to life. In other words, the loss of any one of the 381 is lethal; the loss of any one of the others is not. (This is not to say that those 381 are all the organism needs; see "A Minimal Genome?" below.)

    Using similar techniques, three groups have recently found that only about 10% of the genes in the human genome (~2,000 of them) must be present for human cells to grow successfully in culture. These genes encode proteins for such essential functions as controlling the cell cycle, DNA replication, transcription of DNA, and translation of RNA. The cells can tolerate the loss of any one of the other ~18,000 genes. Thus the human genome appears to have redundant pathways that can often compensate for the loss of a single gene, at least for cells growing in culture. Some of these other genes will probably turn out to be essential for the development and functioning of the various types of differentiated cells in the intact body.
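    The two essential-gene figures quoted in this section can be put side by side with a few lines of arithmetic. This is a minimal sketch using only the numbers in the text: 381 of 485 protein-encoding genes for Mycoplasma genitalium, and about 2,000 essential versus about 18,000 dispensable genes for human cells in culture. The function name essential_fraction is an illustrative choice, and the human figures are rounded estimates.

```python
def essential_fraction(essential, total):
    """Return the fraction of genes whose individual loss is lethal."""
    return essential / total


# Mycoplasma genitalium: 381 essential out of 485 protein-encoding genes.
M_GEN_ESSENTIAL = 381
M_GEN_TOTAL = 485
m_gen_fraction = essential_fraction(M_GEN_ESSENTIAL, M_GEN_TOTAL)
m_gen_dispensable = M_GEN_TOTAL - M_GEN_ESSENTIAL

# Human cells in culture: ~2,000 essential and ~18,000 dispensable genes
# (both approximate, as quoted in the text).
HUMAN_ESSENTIAL = 2_000
HUMAN_DISPENSABLE = 18_000
human_fraction = essential_fraction(
    HUMAN_ESSENTIAL, HUMAN_ESSENTIAL + HUMAN_DISPENSABLE
)

if __name__ == "__main__":
    print(f"M. genitalium: {m_gen_fraction:.1%} essential, "
          f"{m_gen_dispensable} individually dispensable")
    print(f"Human cells in culture: {human_fraction:.1%} essential")
```

    The contrast is the point: a stripped-down bacterial genome has little redundancy, so most single-gene losses are lethal, whereas the much larger human genome tolerates the loss of most individual genes in cultured cells.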

    A Minimal Genome?

    In March of 2016, workers at the J. Craig Venter Institute reported that they had created a strain of mycoplasma containing only 473 genes. This synthetic organism, which grows vigorously in culture, now holds the record for the smallest genome of a free-living organism.


    This page titled 5.9: Genome Sizes is shared under a CC BY 3.0 license and was authored, remixed, and/or curated by John W. Kimball via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.