24.2: Genome Size
-
- Last updated
- Save as PDF
The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, then, diploid organisms (like ourselves) contain two genomes, one inherited from our mother, the other from our father. The table below presents a selection of representative genome sizes from the rapidly-growing list of organisms whose genomes have been sequenced.
| Base pairs | Genes | Notes | |
|---|---|---|---|
| φX174 | 5,386 | 11 | virus of E. coli |
| Human mitochondrion | 16,569 | 37 | |
| Nasuia deltocephalinicola | 112,091 | 137 | smallest genome yet found in a bacterium. This β-proteobacterium lives in a mutualistic relationship within a special organ of an insect (a leaf hopper) which it supplies with essential amino acids. |
| Epstein-Barr virus (EBV) | 172,282 | 80 | causes mononucleosis |
| nucleomorph of Guillardia theta | 551,264 | 511 | all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote |
| Mycoplasma genitalium | 580,073 | 525 | two of the smallest true organisms |
| Mycoplasma pneumoniae | 816,394 | 679 | |
| Rickettsia prowazekii | 1,111,523 | 834 | bacterium that causes epidemic typhus |
| Treponema pallidum | 1,138,011 | 1,039 | bacterium that causes syphilis |
| Pelagibacter ubique | 1,308,759 | 1,354 | smallest genome yet found in a free-living organism (marine α-proteobacterium) |
| Helicobacter pylori | 1,667,867 | 1,589 | chief cause of stomach ulcers (not stress and diet) |
| Methanocaldococcus jannaschii | 1,664,970 | 1,783 | These unicellular microbes look like typical bacteria but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea . |
| Aeropyrum pernix | 1,669,695 | 1,885 | |
| Methanothermobacter thermoautotrophicus | 1,751,377 | 2,008 | |
| Streptococcus pneumoniae | 2,160,837 | 2,236 | the pneumococcus |
| Pandoravirus | 2,473,870 | 2556 | A virus (of an amoeba) with a genome larger than that of the bacteria and archaea above and about the same as that of some parasitic eukaryotes. |
| Listeria monocytogenes | 2,944,528 | 2,926 | 2,853 of these encode proteins; the rest RNAs |
| Synechocystis | 3,573,470 | 4,003 | a marine cyanobacterium ("blue-green alga") |
| E. coli K-12 | 4,639,221 | 4,377 | 4,290 of these genes encode proteins; the rest RNAs |
| E. coli O157:H7 | 5.44 x 10 6 | 5,416 | strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12 |
| Schizosaccharomyces pombe | 12,462,637 | 4,929 | Fission yeast. A eukaryote with fewer genes than the three bacteria below. |
| Agrobacterium tumefaciens | 4,674,062 | 5,419 | Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti |
| Pseudomonas aeruginosa | 6.3 x 10 6 | 5,570 | Increasingly common cause of opportunistic infections in humans. |
| Sinorhizobium meliloti | 6,691,694 | 6,204 | The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids. |
| Saccharomyces cerevisiae | 12,495,682 | 5,770 | Budding yeast. A eukaryote. |
| Neurospora crassa | 38,639,769 | 10,082 | Plus 498 RNA genes. |
| Thalassiosira pseudonana | 34.5 x 10 6 | 11,242 | A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins |
| Naegleria gruberi | 41 x 10 6 | 15,727 | This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members. |
| Drosophila melanogaster | 122,653,977 | ~17,000 | the "fruit fly" |
| Caenorhabditis elegans | 100,258,171 | 21,733 | |
| Humans | 3.3 x 10 9 | ~21,000 | |
| Tetraodon nigroviridis (a pufferfish) | 3.42 x 10 8 | 27,918 | Although Tetraodon seems to have more protein-encoding genes than we do, it has much less non-coding DNA so its total genome is about a tenth the size of ours. |
| Mouse | 2.8 x 10 9 | ~23,000 | |
| Amphibians | 10 9 –10 11 | ? | |
| Arabidopsis thaliana | 0.135 x 10 9 | 27,407 | a flowering plant (angiosperm) with one of the smallest genomes known in the plant kingdom. |
| Picea abies | 19.6 x 10 9 | 28,354 | the Norway spruce, a conifer (gymnosperm). Even though it has only ~900 more genes than Arabidopsis, it has 145 times as much DNA. Most of this appears to be derived from transposons. |
| Psilotum nudum | 2.5 x 10 11 | ? |
Even though Psilotum nudum (sometimes called the "whisk fern") is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has 3000 times as much DNA. No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do, but certainly are not 30 times as complex. The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
Not all genes are Indispensable
The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence have followed this work by systematically destroying its genes (by mutating them with insertions) to see which ones are essential to life and which are dispensable. Of the 485 protein-encoding genes, they have concluded that only 381 of them are essential to life. In other words, the loss of any one of the 381 is lethal; the loss of any one of the others is not. (This is not to say that all the organism needs are those 381 — see "A Minimal Genome?" below.)
Using similar techniques, three groups have recently found that only about 10% of the genes in the human genome (~2000 of them) must be present for human cells to grow successfully in culture. These genes encode proteins for such essential functions as controlling the cell cycle, DNA replication, DNA transcription and RNA translation. The cells can tolerate the loss of any one of the other ~18,000 genes. Thus the human genome appears to have redundant pathways that can often compensate for the loss of a single gene at least for cells growing in culture. Probably others will turn out to be essential for the development and functioning of the various types of differentiated cells in the intact body.
A Minimal Genome?
In March of 2016, workers at the J. Craig Venter Institute reported that they had created a strain of mycoplasma containing only 473 genes. This synthetic organism, which grows vigorously in culture, now holds the record for the smallest genome of a free-living organism.