3.20: 2020_Winter_Bis2a_Facciotti_Lecture_19
This page is a draft and is under active development.
Learning objectives associated with 2020_Winter_Bis2a_Facciotti_Lecture_19
Genomes as organismal blueprints
A genome is an organism's complete collection of heritable information stored in DNA. Differences in information content help to explain the diversity of life we see all around us. Changes to the information encoded in the genome are the primary drivers of the phenotypic diversity we see (and some we can't) around us that
Determining a genome sequence
The information encoded in genomes provides important data for understanding life, its functions, its diversity, and its evolution. Therefore, a reasonable place to begin studies in biology would be to read the information content encoded in the genome
One of the very exciting elements of the DNA sequencing revolution is that it has required and continues to require contributions from biologists, chemists, materials scientists, electrical engineers, mechanical engineers, computer scientists and programmers, mathematicians and statisticians, product developers, and many other technical experts. The potential applications and implications of unlocking barriers to DNA sequencing have also engaged investors, business people, product developers, entrepreneurs, ethicists, policy makers, and many others to pursue new opportunities and to think about how to best and most responsibly use this growing technology.
The technological advances in genome sequencing have produced a flood of complete genome sequences, which are deposited into publicly available databases. You can find many of them at the National Center for Biotechnology Information. Completely sequenced genomes now number in the tens of thousands: over 2,000 eukaryotic genomes, over 600 archaeal genomes, and nearly 12,000 bacterial genomes at the time of this writing. Tens of thousands of additional genome sequencing projects are in progress. With this many genome sequences available, or soon to be available, we can start asking many questions about what we see in these genomes. What patterns are common to all genomes?
Diversity of genomes
Diversity of sizes, number of genes, and chromosomes
Let's start by examining the range of genome sizes. In the table below, we see a sampling of genomes from the database. We can see that the genomes of free-living organisms range tremendously in size. The smallest known genome
Table 1. This table shows some genome data for various organisms. 2n = diploid number.

Examining Table 1 also reveals that some organisms carry with them
Structure of genomes
Table 1 also provides clues to other points of interest. For instance, if we compare the pufferfish genome to the chimpanzee genome, we note that they encode roughly the same number of genes (about 19,000), but they do so on genomes of dramatically different sizes: 400 million base pairs versus 3.3 billion base pairs, respectively. That implies that the pufferfish genome must have much less space between its genes than the chimpanzee genome does. This is the case, and the difference in gene density is not unique to these two genomes. If we look at Figure 1, which attempts to represent a 50-kb part of the human genome, we notice that besides the protein-coding regions (

Figure 1. This figure shows a 50-kb segment of the human
If we now look at what fraction of the whole human

Figure 2. This graph depicts how the many base pairs of DNA in the human haploid genome
When we compare the frequency of repeat regions and protein-coding regions across species, we find large differences in the proportion of protein-coding to noncoding sequence.

Figure 3. This figure shows 50-kb segments of different genomes, illustrating the highly variable frequency of repeat versus protein-coding elements in different species.
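To make the pufferfish and chimpanzee comparison from Table 1 concrete, the following minimal sketch computes an average gene density for each genome. It uses only the rounded figures quoted in the text (about 19,000 genes, 400 million base pairs, and 3.3 billion base pairs). The function and variable names are illustrative, and the calculation assumes genes are spread evenly along the genome, which real genomes are not.

```python
# Rounded figures quoted in the text for the Table 1 comparison.
GENES = 19_000                    # approximate gene count for both genomes
PUFFERFISH_BP = 400_000_000       # about 400 million base pairs
CHIMPANZEE_BP = 3_300_000_000     # about 3.3 billion base pairs


def genes_per_megabase(gene_count: int, genome_bp: int) -> float:
    """Return the average number of genes per million base pairs."""
    return gene_count / (genome_bp / 1_000_000)


pufferfish_density = genes_per_megabase(GENES, PUFFERFISH_BP)
chimpanzee_density = genes_per_megabase(GENES, CHIMPANZEE_BP)
density_ratio = pufferfish_density / chimpanzee_density

print(f"Pufferfish: {pufferfish_density:.1f} genes per Mb")
print(f"Chimpanzee: {chimpanzee_density:.1f} genes per Mb")
print(f"Pufferfish is about {density_ratio:.1f} times as gene-dense")
```

With these rounded inputs, the pufferfish genome comes out roughly eight times as gene-dense as the chimpanzee genome, because the gene counts are equal and the genome sizes differ by a factor of about eight.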
Possible NB Discussion Point
Propose a hypothesis for why you think some genomes might have more or fewer noncoding sequences.
Dynamics of genome structure
Genomes change over time, and many events can change their sequence.
1. Mutations
2. Genome rearrangements describe a class of large-scale changes that can occur, and they include: (a) deletions—where segments of the chromosome
These changes happen at different rates, and some
Possible NB Discussion Point
How might mutations and genome rearrangements complicate studying/analyzing genomes? Conversely, can you think of interesting questions we can ask by comparing variation between genomes that occur because of events like mutation and genome rearrangements?
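As a toy illustration of two of the events described in this section, the sketch below applies a single-base substitution (one simple kind of mutation) and a deletion to a short, made-up DNA string. The sequence, the positions, and the function names are hypothetical, and real events act on far longer molecules.

```python
def substitute(seq: str, position: int, new_base: str) -> str:
    """Return a copy of seq with the base at position replaced (a point mutation)."""
    return seq[:position] + new_base + seq[position + 1:]


def delete_segment(seq: str, start: int, end: int) -> str:
    """Return a copy of seq with the bases from start up to (not including) end removed."""
    return seq[:start] + seq[end:]


original = "ATGGCCTTAGGA"                      # hypothetical 12-base sequence
point_mutant = substitute(original, 3, "T")    # G at index 3 becomes T
deletion_mutant = delete_segment(original, 4, 8)  # removes the four bases CCTT

print(original)
print(point_mutant)
print(deletion_mutant)
```

The substitution leaves the sequence length unchanged, while the deletion shortens it, so bases downstream of the deletion no longer sit at the same positions as in the original.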
The study of genomes
Comparative genomics
One of the most common things to do with a collection of genome sequences is to compare the sequences of multiple genomes to one another.
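One simple way to quantify such a comparison is percent identity: the fraction of positions at which two sequences carry the same base. The sketch below assumes the two sequences are already aligned and of equal length. The sequences and the function name are made up for illustration, and real genome comparisons rely on dedicated alignment software rather than a position-by-position check like this one.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (already aligned)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


reference = "ATGGCCTTAGGA"   # hypothetical sequence from one genome
variant = "ATGTCCTTAGGC"     # hypothetical sequence from another genome

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identical")
```

Here the two toy sequences differ at two of twelve positions, so they match at ten of twelve.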
Comparing the genomes of people who suffer from an inheritable disease to the genomes of people who
Finally, some people are comparing genome sequences to understand the evolutionary history of the organisms. Typically, these types of comparisons result in a graph known as a phylogenetic tree, which is a graphical model of the evolutionary relationships between the various species being compared. This field
Metagenomics: Who is living somewhere and what are they doing?
Besides studying the genomes of individual species, the increasingly powerful DNA-sequencing technologies are making it possible to simultaneously sequence the genomes of the many species that inhabit an environmental sample. This field
Besides discovering "who lives where," the sequencing of microbial populations in different environments can also reveal what protein-coding genes are present in an environment. This can give investigators clues into what metabolic activities might occur in that environment. Besides providing important information about what kind of chemistry might happen in a specific environment, the catalog of genes that

