24.2 Chromosome Structure and Packaging
Recall from Chapter 8 that within eukaryotic cells, DNA is organized into long linear structures called chromosomes (Figure 4.8). A chromosome is a deoxyribonucleic acid (DNA) molecule with part or all of the genetic material (genome) of an organism. Most eukaryotic chromosomes include packaging proteins which, aided by chaperone proteins, bind to and condense the DNA molecule to prevent it from becoming an unmanageable tangle. Before typical cell division, these chromosomes are duplicated in the process of DNA replication, providing a complete set of chromosomes for each daughter cell. The replicated arms of a chromosome are called chromatids. Before being separated into the daughter cells during mitosis, replicated chromatids are held together by a chromosomal structure called the centromere.
Figure 4.8 Diagram of Replicated and Condensed Eukaryotic Chromosome. (1) Chromatid – one of the two identical parts of the chromosome after S phase. (2) Centromere – the point where the two chromatids are joined together. (3) Short arm is termed p; long arm is termed q.
Image by: Magnus Manske, Dietzel65, and Tryphon
Eukaryotic organisms (animals, plants, fungi and protists) store most of their DNA inside the cell nucleus as linear nuclear DNA, and some in the mitochondria as circular mitochondrial DNA or in chloroplasts as circular chloroplast DNA. In contrast, prokaryotes (bacteria and archaea) do not have organelle structures and thus store their DNA only in a region of the cytoplasm known as the nucleoid region. Prokaryotic chromosomes consist of double-stranded circular DNA.
When stretched out, the genome of a cell is often far longer than the cell itself. For example, if the DNA from a human cell containing 46 chromosomes were stretched out in a line, it would extend more than 6 feet (2 meters)! How is it possible that the genetic information not only fits into the cell, but fits into the cell nucleus? Eukaryotes solve this problem by a combination of supercoiling and packaging DNA around the histone family of proteins (described below). Prokaryotes, with a few exceptions, do not contain histones; they tend to compress their DNA using nucleoid-associated proteins (NAPs) and supercoiling (Figure 4.9).
DNA supercoiling refers to the over- or under-winding of a DNA strand, and is an expression of the strain on that strand (Figure 4.9). Supercoiling is important in a number of biological processes, such as compacting DNA and regulating access to the genetic code. DNA supercoiling strongly affects DNA metabolism and possibly gene expression. Additionally, certain enzymes such as topoisomerases are able to change DNA topology to facilitate functions such as DNA replication or transcription.
In a “relaxed” double-helical segment of B-DNA, the two strands twist around the helical axis once every 10.4–10.5 base pairs of sequence. Adding or subtracting twists, as some enzymes can do, imposes strain. If a DNA segment under twist strain were closed into a circle by joining its two ends and then allowed to move freely, the circular DNA would contort into a new shape, such as a simple figure-eight (Figure 4.9). Such a contortion is a supercoil. The noun form “supercoil” is often used in the context of DNA topology.
Figure 4.9 DNA Supercoiling. The supercoiled structure of linear DNA molecules with constrained ends. The helical nature of the DNA duplex is omitted for clarity.
Image by: Richard Wheeler
Positively supercoiled (overwound) DNA is transiently generated during DNA replication and transcription, and, if not promptly relaxed, inhibits (regulates) these processes. The simple figure eight is the simplest supercoil, and is the shape a circular DNA assumes to accommodate one too many or one too few helical twists. The two lobes of the figure eight will appear rotated either clockwise or counterclockwise with respect to one another, depending on whether the helix is over- or underwound. For each additional helical twist being accommodated, the lobes will show one more rotation about their axis. As a general rule, the DNA of most organisms is negatively supercoiled.
Lobal contortions of a circular DNA, such as the rotation of the figure-eight lobes above, are referred to as writhe. The above example illustrates that twist and writhe are interconvertible. Supercoiling can be represented mathematically by the sum of twist and writhe (Figure 4.9). The twist is the number of helical turns in the DNA and the writhe is the number of times the double helix crosses over on itself (these are the supercoils). Extra helical twists are positive and lead to positive supercoiling, while subtractive twisting causes negative supercoiling. Many topoisomerase enzymes sense supercoiling and either generate or dissipate it as they change DNA topology.
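The sum of twist and writhe described above is conventionally called the linking number (Lk = Tw + Wr). The following is a minimal sketch of that bookkeeping, assuming a relaxed helical repeat of 10.5 bp per turn (the value quoted earlier in this section). The function names, the superhelical density measure σ = (Lk − Lk0)/Lk0, and the 4,200 bp plasmid with Lk = 376 are illustrative assumptions and are not taken from the text.

```python
HELICAL_REPEAT_BP = 10.5  # bp per helical turn of relaxed B-DNA (from the text)


def relaxed_linking_number(n_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Lk0: number of helical turns in a relaxed closed circle of n_bp base pairs."""
    return n_bp / helical_repeat


def writhe(linking_number, twist):
    """Rearrangement of Lk = Tw + Wr: the writhe left over once twist is known."""
    return linking_number - twist


def superhelical_density(linking_number, n_bp, helical_repeat=HELICAL_REPEAT_BP):
    """sigma = (Lk - Lk0) / Lk0; negative values mean the circle is underwound."""
    lk0 = relaxed_linking_number(n_bp, helical_repeat)
    return (linking_number - lk0) / lk0


# Hypothetical example: a 4,200 bp plasmid closed with 24 turns fewer than relaxed.
n_bp = 4200
lk0 = relaxed_linking_number(n_bp)       # 400 turns when relaxed
lk = 376                                 # assumed linking number of the closed circle
sigma = superhelical_density(lk, n_bp)   # negative: negatively supercoiled
# If the helix keeps its relaxed twist, the whole deficit shows up as writhe.
wr = writhe(lk, twist=lk0)
print(f"Lk0 = {lk0:.0f}, sigma = {sigma:.3f}, Wr = {wr:.0f}")
```

Because Lk can only change when a strand is cut and rejoined, any change in twist in a closed circle must be offset by an equal and opposite change in writhe, which is the interconversion described above.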
In part because chromosomes may be very large, segments in the middle may act as if their ends are anchored. As a result, they may be unable to distribute excess twist to the rest of the chromosome or to absorb twist to recover from underwinding—the segments may become supercoiled, in other words. In response to supercoiling, they will assume an amount of writhe, just as if their ends were joined.
Supercoiled circular DNA forms two major structures: a plectoneme or a toroid, or a combination of both (Figure 4.9). A negatively supercoiled DNA molecule will produce either a one-start left-handed helix, the toroid, or a two-start right-handed helix with terminal loops, the plectoneme. Plectonemes are typically more common in nature, and this is the shape most bacterial plasmids will take (Figure 4.10). For larger molecules it is common for hybrid structures to form – a loop on a toroid can extend into a plectoneme (Figure 4.10). DNA supercoiling is important for DNA packaging within all cells, and also seems to play a role in gene expression.
Figure 4.10 Bacterial DNA Supercoiling. Atomic force microscopy (AFM) visualization of torsionally relaxed (A), and negatively supercoiled (B) bacterial plasmids pBR322. (C) Electron microscopy image of the E. coli chromosomal DNA displaying a hybrid toroidal-plectoneme structure.
Image A and B from: Witz, G. and Stasiak, A. (2009) Nucleic Acids Research 38(7):2119-2133.
Image C from: Prokaryotic Chromosomes
In addition to forming supercoiled structures, circular chromosomes from bacteria have been shown to undergo catenation and knotting upon the inhibition of topoisomerase enzymes. Catenation is the process by which two circular DNA molecules are linked together like chain links, whereas DNA knotting is the formation of interlooped structures within a single circular DNA molecule. In vivo, the action of topoisomerase enzymes is critical to keep knots and catenanes from tangling the DNA structure.
Figure 4.11 DNA Catenation and Knotting. Upper structure shows the negative supercoiled form of bacterial DNA. The inhibition of topoisomerase enzyme activity leads to the relaxation, catenation and knotting of the chromosomal structure.
Image from: Harms, A. et al. (2015) Cell Reports 12(9):1497-1507.
Note the circular nature of chloroplast and mitochondrial DNA, suggesting a bacterial origin for both of these organelle structures. Sequence alignments further lend support for the endosymbiotic theory, which proposes that bacteria were engulfed by early eukaryotic organisms and subsequently became symbiotic to their eukaryotic counterpart, rather than being digested.
In the cells of extant organisms, the vast majority of the proteins present in the mitochondria (numbering approximately 1500 different types in mammals) are coded for by nuclear DNA. However, sequencing of the human mitochondrial genome has revealed 16,569 base pairs encoding for 13 proteins (Figure 4.12). Many of the mitochondrially produced proteins are required for electron transport during the production of ATP (Figure 4.12).
Figure 4.12 Mitochondrial Genome. Mitochondria are organelle structures containing a double membrane, thought to have originated as an independent prokaryotic organism that was originally engulfed by a eukaryotic organism, where it became a symbiotic counterpart. Mitochondria contain circular chromosomal DNA that shares high sequence similarity with alphaproteobacteria. The human mitochondrial genome contains 16,569 base pairs encoding 13 proteins and ribosomal RNA (rRNA) components.
Within eukaryotic chromosomes, chromatin proteins, known as histones, compact and organize DNA. These compacting structures guide the interactions between DNA and other proteins, helping control which parts of the DNA are transcribed.
Histones are highly alkaline proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. They are the chief protein components of chromatin, acting as spools around which DNA winds, and playing a role in gene regulation. Without histones, the unwound DNA in chromosomes would be very long (a length to width ratio of more than 10 million to 1 in human DNA). For example, each human diploid cell (containing 23 pairs of chromosomes) has about 1.8 meters of DNA; wound on the histones, the diploid cell has about 90 micrometers (0.09 mm) of chromatin.
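As a quick check on the figures quoted above, the degree of compaction can be expressed as a simple ratio of extended DNA length to packaged chromatin length. This sketch uses only the two lengths given in the text; the function and constant names are illustrative.

```python
DNA_LENGTH_M = 1.8           # extended DNA per human diploid cell (from the text)
CHROMATIN_LENGTH_M = 90e-6   # about 90 micrometers of chromatin (from the text)


def compaction_ratio(extended_length_m, packaged_length_m):
    """How many times shorter the packaged form is than the extended DNA."""
    return extended_length_m / packaged_length_m


ratio = compaction_ratio(DNA_LENGTH_M, CHROMATIN_LENGTH_M)
print(f"Approximate length compaction: {ratio:,.0f}-fold")
```

Taken at face value, the two lengths in the text correspond to a roughly 20,000-fold reduction in length.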
Five major families of histones exist: H1/H5, H2A, H2B, H3, and H4. Histones H2A, H2B, H3 and H4 are known as the core histones, while histones H1/H5 are known as the linker histones.
The core histones all exist as dimers, which are similar in that they all possess the histone fold domain: three alpha helices linked by two loops (Figure 4.13). It is this helical structure that allows for interaction between distinct dimers, particularly in a head-tail fashion (also called the handshake motif). The resulting four distinct dimers then come together to form one octameric nucleosome core, approximately 63 Angstroms in diameter. Around 146 base pairs (bp) of DNA wrap around this core particle 1.65 times in a left-handed super-helical turn to give a particle of around 100 Angstroms across, called a nucleosome.
Figure 4.13 Nucleosome Core Structure. Histones H2A and H2B dimerize, and histones H3 and H4 dimerize. Two dimers of each join to form a histone core octamer. The DNA double helix winds 1.65 times around the octamer core, forming the nucleosome structure.
Image adapted from: Nucleosome Structure
The linker histone H1 binds the nucleosome at the entry and exit sites of the DNA, thus locking the DNA into place and allowing the formation of higher order structure (Figure 4.14). The most basic such formation is the 10 nm fiber or beads on a string conformation. This involves the wrapping of DNA around nucleosomes with approximately 50 base pairs of DNA separating each pair of nucleosomes (also referred to as linker DNA).
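The numbers above imply a repeating unit of roughly 146 bp of wrapped DNA plus about 50 bp of linker. The sketch below estimates how many nucleosomes would fit along a stretch of DNA under the simplifying assumption of perfectly regular spacing, which real chromatin does not have; the function names and the 1 Mb example region are illustrative.

```python
CORE_DNA_BP = 146    # DNA wrapped around the histone octamer (from the text)
LINKER_DNA_BP = 50   # approximate linker DNA between nucleosomes (from the text)


def nucleosome_repeat_length(core_bp=CORE_DNA_BP, linker_bp=LINKER_DNA_BP):
    """Base pairs occupied by one nucleosome plus its linker."""
    return core_bp + linker_bp


def nucleosome_count(region_bp, repeat_bp=None):
    """Whole nucleosomes that fit in region_bp, assuming uniform spacing."""
    if repeat_bp is None:
        repeat_bp = nucleosome_repeat_length()
    return region_bp // repeat_bp


repeat = nucleosome_repeat_length()     # 196 bp per repeat
per_mb = nucleosome_count(1_000_000)    # nucleosomes in a hypothetical 1 Mb region
print(f"Repeat length: {repeat} bp; about {per_mb} nucleosomes per Mb")
```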
The nucleosome contains over 120 direct protein-DNA interactions and several hundred water-mediated ones. Direct protein-DNA interactions are not spread evenly about the octamer surface but rather located at discrete sites. These are due to the formation of two types of DNA binding sites within the octamer: the α1α1 site, which uses the α1 helix from two adjacent histones, and the L1L2 site formed by the L1 and L2 loops. Salt links and hydrogen bonding between both side-chain basic and hydroxyl groups and main-chain amides with the DNA backbone phosphates form the bulk of interactions with the DNA. This is important, given that the ubiquitous distribution of nucleosomes along genomes requires the nucleosome to be a non-sequence-specific DNA-binding factor. Although nucleosomes tend to prefer some DNA sequences over others, they are capable of binding to practically any sequence, which is thought to be due to the flexibility in the formation of these water-mediated interactions. In addition, non-polar interactions are made between protein side-chains and the deoxyribose groups, and an arginine side-chain intercalates into the DNA minor groove at all 14 sites where it faces the octamer surface. The distribution and strength of DNA-binding sites about the octamer surface distorts the DNA within the nucleosome core. The DNA is non-uniformly bent and also contains twist defects. The twist of free B-form DNA in solution is 10.5 bp per turn. However, the overall twist of nucleosomal DNA is only 10.2 bp per turn, varying from a value of 9.4 to 10.9 bp per turn.
The histone tail extensions constitute up to 30% by mass of histones, but are not visible in the crystal structures of nucleosomes due to their high intrinsic flexibility, and have been thought to be largely unstructured (Figure 4.14). The N-terminal tails of histones H3 and H2B pass through a channel formed by the minor grooves of the two DNA strands, protruding from the DNA every 20 bp. The N-terminal tail of histone H4, on the other hand, has a region of highly basic amino acids (16-25), which, in the crystal structure, forms an interaction with the highly acidic surface region of an H2A-H2B dimer of another nucleosome, being potentially relevant for the higher-order structure of nucleosomes. This interaction is also thought to occur under physiological conditions, and suggests that acetylation of the H4 tail distorts the higher-order structure of chromatin.
Figure 4.14 Overall Nucleosome Structure. (A) Side view diagram of the nucleosome structure with the histone octamer shown in blue, the DNA double helix in red, and the histone H1 linker in green. (B) Shows a top view rendering of the histone octamer with the associated DNA helix. Note that the histone tails from H3 and H2B protrude from the DNA.
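The difference between 10.5 and 10.2 bp per turn is easier to appreciate as a number of helical turns over the wrapped DNA. The sketch below applies both helical repeats to the 146 bp of core DNA mentioned earlier; it treats the twist as uniform, which the text notes is only an average, and the function name is illustrative.

```python
CORE_DNA_BP = 146            # DNA wrapped around the octamer (from the text)
FREE_BP_PER_TURN = 10.5      # free B-form DNA in solution (from the text)
NUCLEOSOMAL_BP_PER_TURN = 10.2  # overall average in the nucleosome (from the text)


def helical_turns(n_bp, bp_per_turn):
    """Number of helical turns in n_bp of DNA with a uniform helical repeat."""
    return n_bp / bp_per_turn


turns_free = helical_turns(CORE_DNA_BP, FREE_BP_PER_TURN)
turns_nucleosomal = helical_turns(CORE_DNA_BP, NUCLEOSOMAL_BP_PER_TURN)
extra_turns = turns_nucleosomal - turns_free

print(f"Free DNA:        {turns_free:.2f} turns")
print(f"Nucleosomal DNA: {turns_nucleosomal:.2f} turns")
print(f"Difference:      {extra_turns:.2f} turns")
```

Under these assumptions the wrapped DNA makes roughly 14 helical turns, which is in line with the 14 sites at which the minor groove faces the octamer surface.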
The formation of the DNA double helix represents the first order of packaging of the chromosome structure (Figure 4.15). The formation of nucleosomes represents the second level of packaging for eukaryotic chromosomes. In vitro data suggest that nucleosomes are then arranged into either a solenoid structure, which consists of 6 nucleosomes linked together by the histone H1 linker proteins, or a zigzag structure that is similar to the solenoid construct (Figure 4.15). Both the solenoid and zigzag structures are approximately 30 nm in diameter. The solenoid and zigzag structures reported from in vitro data have not yet been confirmed to occur in vivo.
During interphase, each chromosome occupies a spatially limited, roughly elliptical domain which is known as a chromosome territory (CT). Each chromosome territory is composed of higher-order chromatin units of ~1 Mb each. These units are likely built up from smaller loop domains that contain the solenoid/zigzag structural motifs. On the other hand, 1 Mb domains can themselves serve as smaller units in higher-order chromatin structures.
Chromosome territories are known to be arranged radially around the nucleus. This arrangement is both cell- and tissue-type specific and is also evolutionarily conserved. The radial organization of chromosome territories was shown to correlate with their gene density and size. In this case, the gene-rich chromosomes occupy interior positions, whereas larger, gene-poor chromosomes tend to be located around the periphery. Chromosome territories are also dynamic structures, with genes able to relocate from the periphery towards the interior once they have been ‘switched on’. In other cases, genes may move in the opposite direction, or simply maintain their position. The eviction of genes from their chromosome territories into the interchromatin compartment or a neighboring chromosome territory is often accompanied by the formation of large decondensed chromatin loops.
Figure 4.15 Chromosome Structure. (1) DNA double helix is approximately 2 nm in diameter. (2) The nucleosome core structure is approximately 11 nm in diameter. (3) The solenoid/zigzag structure is approximately 30 nm in diameter and is proposed to form chromosome loops (4) during cellular interphase and more condensed chromosome territories (5) during mitosis.
Image by: MBInfo
Models describing chromosome territory arrangement
With the development of high-throughput biochemical techniques, such as 3C (‘chromosome conformation capture’) and 4C (‘chromosome conformation capture-on-chip’ and ‘circular chromosome conformation capture’), numerous spatial interactions between neighbouring chromatin territories have been described (Figure 4.16). These descriptions have been supplemented with the construction of spatial proximity maps for the entire genome (e.g., for a human lymphoblastoid cell line). Together, these observations and physical simulations have led to the proposal of various models that aim to define the structural organization of chromosome territories:
1. The chromosome territory-interchromatin compartment (CT-IC) model describes two principal compartments: chromosome territories (CTs) and an interchromatin compartment (IC). In this model, chromosome territories build up an interconnected chromatin network that is associated with an adjacent 3D space called the interchromatin compartment. The latter can be observed using both light and electron microscopy.
Within a single chromosome territory, the interphase chromosome is divided into defined regions based on the level of chromosome condensation. Here, the inner part of the interphase chromosome is composed of more condensed chromatin domains or higher-order chromatin fibers, while a thin (<200 nm) layer of more decondensed chromatin, known as the perichromatin region, can be found around the chromosomal periphery. Functionally, the perichromatin region represents the major transcriptional compartment, and is also the region where most co-transcriptional RNA splicing takes place. DNA replication and DNA repair are also predominantly carried out within the perichromatin region. Finally, nascent RNA transcripts, referred to as perichromatin fibrils, are also generated in the perichromatin region. Perichromatin fibrils are then spliced by factors supplied from the interchromatin compartment.
The lattice model, proposed by Dehgani et al., is based on reports that transcription also occurs within the inner, more condensed chromosome territories and not only at the interface between the interchromatin compartment and the perichromatin region. Using ESI (electron spectroscopic imaging), Dehgani et al. showed that chromatin was organized as an array of deoxyribonucleoprotein fibers of 10–30 nm in diameter. In this study, the interchromatin compartments, which are described in the CT-IC model as large channels between chromosome territories, were not apparent. Instead, chromatin fibers created a loose meshwork of chromatin throughout the nucleus that intermingled at the periphery of chromosome territories. Thus, inter- and intra-chromosomal spaces within this meshwork are essentially contiguous and together form the intra-nuclear space.
2. The interchromatin network (ICN) model predicts that intermingling chromatin fibers/loops can make both cis- (within the same chromosome) and trans- (between different chromosomes) contacts. This intermingling is uniform and makes distinction between the chromosome territory and interchromatin compartment functionally meaningless. The advantage of the ICN model is that it permits high chromatin dynamics and diffusion-like movements. The authors propose that ongoing transcription influences the degree of intermingling between specific chromosomes by stabilizing associations between particular loci. Such interactions are likely to depend on the transcriptional activity of the loci, and are therefore cell-type specific.
The cell type-specific organization of chromosome territories has been studied by measuring the volume and frequency of intermingling between heterologous chromosomes. By using 3C (chromosome conformation capture) and FISH (fluorescence in situ hybridization) to map the regions of chromosome intermingling, it was revealed that these regions contain a higher density of active genes and are enriched with markers of transcriptional activation and repression, such as activated RNAPII. By comparing the positions of the CTs in undifferentiated mouse embryonic stem (ES) cells, ES cells in early stages of differentiation, and terminally differentiated NIH3T3 cells, it was shown that fully differentiated cells had a higher enrichment of RNAPII, compared to undifferentiated or less-differentiated cells. The findings support the notion that the intermingling regions have functional significance in the nucleus and provide a basis for understanding how the radial and relative positions of chromosomal territories evolve during the process of differentiation, explaining their organization in a cell type-dependent manner.
3. The Fraser and Bickmore model emphasizes the functional importance of giant chromatin loops, which originate from chromosome territories and expand across the nuclear space in order to share transcription factories. In this case, both cis- and trans- loops of decondensed chromatin can be co-expressed and co-regulated by the same transcription factory.
4. The chromatin polymer models assume a broad range of chromatin loop sizes and predict the observed distances between genomic loci and chromosome territories, as well as the probabilities of contacts being formed between given loci. These models apply physics-based approaches that highlight the importance of entropy for understanding nuclear organization. By proposing the existence of conformational chromatin ensembles with structures based on three possible homopolymer states, these models also provide alternative structures to the traditional 30 nm chromatin fiber, which has been brought into question following recent studies.
Because experimental evidence supporting these models is lacking, it must be remembered that they serve only to hypothesize the structural and chemical properties of intermediate chromatin structures, and to highlight unanswered questions. For example, the mechanisms that control the rate and the extent of chromatin movement remain to be defined.
At the ends of the linear chromosomes are specialized regions of DNA called telomeres (Figure 4.17). The main function of these regions is to allow the cell to replicate chromosome ends using the enzyme telomerase, as the enzymes that normally replicate DNA cannot copy the extreme 3′ ends of chromosomes. These specialized chromosome caps also help protect the DNA ends, and stop the DNA repair systems in the cell from treating them as damage to be corrected. In human cells, telomeres are usually lengths of DNA containing several thousand repeats of a simple TTAGGG sequence, ending in a single-stranded overhang.
During DNA replication, the double-stranded DNA is unwound and DNA polymerase synthesizes new strands. However, as DNA polymerase moves in a unidirectional manner (from 5′ to 3′), only the leading strand can be replicated continuously. In the case of the lagging strand, DNA replication is discontinuous. In humans, small RNA primers attach to the lagging strand DNA, and the DNA is synthesized in short stretches of about 100-200 nucleotides, which are termed Okazaki fragments. The RNA primers are removed and replaced with DNA, and the Okazaki fragments are ligated together. At the very end of the lagging strand, it is impossible to attach an RNA primer, meaning that a small amount of DNA is lost each time the cell divides. This ‘end replication problem’ has serious consequences for the cell, as it means the DNA sequence cannot be replicated completely, resulting in the loss of genetic information.
To guard against this, telomeric sequences are repeated hundreds to thousands of times at the ends of the chromosomes. Each time cell division occurs, a small section of telomeric sequence is lost to the end replication problem instead of coding DNA, thereby protecting the genetic information. At some point, the telomeres become critically short. This attrition leads to cell senescence, where the cell is unable to divide, or to apoptotic cell death. Telomeres are the basis for the Hayflick limit, the number of times a cell is able to divide before reaching senescence.
Telomeres can be restored by the enzyme telomerase, which extends telomere length (Figure 4.17). Telomerase activity is found in cells that undergo regular division, such as stem cells and lymphocytes of the immune system. Telomeres can also be extended through the Alternative Lengthening of Telomeres (ALT) pathway. In this case, rather than being extended, telomeres are switched between chromosomes by homologous recombination. As a result of the telomere swap, one set of daughter cells will have shorter telomeres, and the other set will have longer telomeres.
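The link between telomere attrition and a finite number of divisions can be illustrated with a toy model. The sketch below assumes a fixed loss per division and a fixed critical length, with no telomerase activity; the starting length, loss per division, and critical threshold are hypothetical round numbers chosen for illustration, not measured values from the text.

```python
def divisions_until_senescence(initial_bp, loss_per_division_bp, critical_bp):
    """Count divisions possible while the telomere stays at or above critical_bp.

    Toy model: every division removes the same number of base pairs and
    nothing (e.g. telomerase) ever adds sequence back.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    length = initial_bp
    # Divide only if the shortened telomere would still be long enough.
    while length - loss_per_division_bp >= critical_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


# Hypothetical parameters: 10 kb telomere, 100 bp lost per division,
# senescence once the telomere would drop below 5 kb.
limit = divisions_until_senescence(
    initial_bp=10_000, loss_per_division_bp=100, critical_bp=5_000
)
print(f"Divisions before senescence in this toy model: {limit}")
```

In this simplified picture the division limit scales directly with the length of the telomeric buffer and inversely with the loss per division, which is why extending telomeres raises the number of divisions a cell can undergo.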
A downside to telomere extension is the potential for uncontrolled cell division and cancer. Abnormally high telomerase activity has been found in the majority of cancer cells, and non-telomerase tumors often exhibit ALT pathway activation. As well as the potential for losing genetic information, cells with short telomeres are at a high risk for improper chromosome recombination, which can lead to genetic instability and aneuploidy (an abnormal number of chromosomes).
These guanine-rich telomere sequences may also stabilize chromosome ends by forming structures of stacked sets of four-base units, rather than the usual base pairs found in other DNA molecules (Figure 4.17). Here, four guanine bases form a flat plate and these flat four-base units then stack on top of each other, to form a stable G-quadruplex structure. These structures are stabilized by hydrogen bonding between the edges of the bases and chelation of a metal ion in the centre of each four-base unit. Other structures can also be formed, with the central set of four bases coming from either a single strand folded around the bases, or several different parallel strands, each contributing one base to the central structure.
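Since each stacked unit needs four guanines, a single strand that folds into a G-quadruplex typically contains four runs of guanines separated by short loops. The sketch below scans a sequence for that arrangement using a commonly used heuristic pattern (four runs of at least three G, with loops of 1 to 7 bases). The pattern, its thresholds, and the function name are assumptions introduced here for illustration and are not taken from the text; a match only flags a candidate sequence and does not show that a quadruplex actually forms.

```python
import re

# Heuristic motif: G-run, loop, G-run, loop, G-run, loop, G-run.
# Run length (>= 3) and loop length (1-7) are assumed thresholds.
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_putative_g_quadruplex(sequence):
    """Return the first candidate G-quadruplex motif in sequence, or None."""
    match = G4_MOTIF.search(sequence.upper())
    return match.group(0) if match else None


# Four human telomeric repeats supply the four guanine runs required.
four_repeats = "TTAGGG" * 4
three_repeats = "TTAGGG" * 3

print(find_putative_g_quadruplex(four_repeats))
print(find_putative_g_quadruplex(three_repeats))
```

With the TTAGGG repeat, four consecutive repeats are the minimum that satisfies this pattern on a single strand, while three repeats provide only three guanine runs and are not matched.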
In addition to these stacked structures, telomeres also form large loop structures called telomere loops, or T-loops. Here, the single-stranded DNA curls around in a long circle stabilized by telomere-binding proteins. At the very end of the T-loop, the single-stranded telomere DNA is held onto a region of double-stranded DNA by the telomere strand disrupting the double-helical DNA and base pairing to one of the two strands. This triple-stranded structure is called a displacement loop or D-loop.