2.10: Regulation of Gene Expression

Introduction to gene regulation

Regulation is all about decision making. Gene regulation is, therefore, all about understanding how cells make decisions about which genes to turn on or off and which to tune up or down. In the following section we discuss some of the fundamental mechanisms and principles used by cells to regulate gene expression in response to changes in cellular or external factors. This biology is important for understanding how cells adjust to changing environments, including how some cells in multicellular organisms decide to become specialized for certain functions (e.g. tissues).

Since regulation is both a very deep and a very broad topic of study in biology, in Bis2A we don't try to cover every detail - there are simply too many. Rather, as we have done for all other topics, we try to focus on (a) outlining some of the core logical constructs and questions that you must have in mind when you approach ANY scenario involving regulation, (b) learning some common vocabulary and ubiquitous mechanisms and (c) examining a few concrete examples that illustrate the points made in (a) and (b).

Gene Expression

Introduction

Every cell controls when and how much each of its genes is expressed. This simple statement - one that could be derived simply from observing cellular behavior - brings up many questions that we can begin to lay out using the Design Challenge.

Trying to define "gene expression"

The first thing we need to do, however, is to define what it means when we say that a gene is "expressed". The ultimate "expression" of a gene is its effect on phenotype. If the gene encodes a protein, one might reasonably propose that "expression" of a gene means how much functional protein is made, and that measuring the amount of that protein might be a good measure of "gene expression". Many molecular biologists refer to the level of that gene's transcript as an easily measured proxy for its expression. By that definition, one might want to count how many full-length transcripts are present in each cell. In practice we often find that the definition depends on the context of the discussion. Keep that in mind. In Bis2A we'll try to use the term "expression" primarily to describe the creation of the final functional product(s).

The design challenge of regulating gene expression

To drive this discussion from a design challenge perspective, we can formally stipulate that the "big problem" we are interested in is that of regulating protein abundance in a cell. Problem: The abundance of each functional protein must be regulated. We can then start by posing subproblems.

Let's take a moment, though, first to reload a couple of ideas. The process of gene expression requires multiple steps, depending on what the fate of the final product will be. In the case of structural and regulatory RNAs (e.g. tRNA, rRNA) the process requires that a gene be transcribed and that any needed post-transcriptional processing take place. In the case of a protein-coding gene, the transcript must also be translated into protein and, if required, modifications to the protein must also be made. Of course, both transcription and translation are multi-step processes, and most of those sub-steps are also potential sites of control.

Some of the subproblems might therefore be:

There must be some mechanism(s) to regulate the first step of this multi-step process, the initiation of transcription (just getting things started). So, we could state, "we need a mechanism to regulate the initiation of transcription, in a particular gene or group of genes." We could also turn this into a question and ask, "how can the initiation of transcription be accomplished?"
We can use similar thinking to state, "we need a mechanism for stopping transcription" or to ask, "how do we switch off transcription?"
Using this convention we can state, "we need to switch translation of a particular type of transcript on or off."
We've talked only about synthesis of protein and RNA. It is quite reasonable to also state, "we need mechanisms to regulate the degradation of particular RNAs and particular proteins."

Focusing on transcription

In this course we begin by focusing primarily on examining the first couple of problems/questions, the regulation of transcription initiation and termination - from genomic information to a functional RNA, either ready as is (e.g. in the case of a functional RNA) or ready for translation. This allows us to examine some fundamental concepts regarding the regulation of gene expression and to examine a few real examples of those concepts in action.

Suggested discussion

Why is it important to regulate gene expression? Why not just express all genes all of the time? Why have them if you don't want to express them?

Create a list of hypotheses with your classmates of reasons why the regulation of gene expression is important for prokaryotes and for eukaryotes. You may also want to consider contrasting reasons gene regulation is important for unicellular organisms versus multi-cellular organisms or communities of unicellular organisms (like colonies of bacteria).

Activation and Repression of Transcription

Some basics

Let us consider a protein-coding gene and work through some logic. We know that to transcribe this gene an RNA polymerase will need to be recruited to the start of the coding region. There needs to be some mechanism, based on chemical logic, to help recruit the RNA polymerase to the start of the protein-coding gene. Likewise, if this process is to be regulated, there needs to be some mechanism, or mechanisms, to dictate when an RNA polymerase should be recruited to the start of a gene, when it should not, and, if it is recruited to the DNA, whether or not it should actually begin transcription and how many times this process should happen. Note that the previous sentence contains several distinct subproblems/questions (e.g. when is the polymerase recruited? if recruited, should it start transcription? how often should this happen?). We can also reasonably infer that there will need to be some mechanisms to "instruct" (more anthropomorphisms) the polymerase to stop (stop initiating!) transcription. Finally, since the role of transcription is to create RNA copies of genome segments, we should also consider problems/questions related to other factors that influence the abundance of RNA, like mechanisms of degradation. There must be some mechanism for each of these steps, and any of these may be involved in the regulation of this process.

A schematic showing a protein-coding gene and some of the questions we need to ask, or problems we need solutions for, if we are to understand how the transcriptional portion of the gene's expression is regulated. Attribution: Marc T. Facciotti (own work)

Recruiting RNA polymerase to specific sites

To initiate transcription, the RNA polymerase must be recruited to a segment of DNA near the start of a region of DNA encoding a functional transcript. The function of the RNA polymerase, as a polymerase, is to move along any segment of DNA, making an RNA transcript, guided by the template strand. Finding a way to recruit this "sequence agnostic" polymerase to a specific site therefore seems contradictory to its usual behavior, which displays no preference for any particular sequence. Explaining this contradiction requires us to invoke something new. Either transcription can start anywhere, and only those events that lead to a full productive transcript do anything useful, or something other than the RNA polymerase itself helps to recruit the enzyme to the beginning of a gene. The latter, we now take for granted, is indeed the case, and this is true for both prokaryotes and eukaryotes.

The recruitment of the RNA polymerase is mediated by proteins called general transcription factors. In bacteria, these are called sigma factors.

In eukaryotes, important general transcription initiation factors include TATA binding protein (TBP) and TFIIB, which function in conjunction with numerous other protein complexes (for a total of nearly 100 proteins) to recruit RNA polymerase II. You'll recall that the single archaeal RNA polymerase is more similar to (all three) eukaryotic RNA polymerases than to the bacterial polymerase. Archaea employ a stripped-down version of this eukaryotic preinitiation complex to recognize promoters.

The general transcription factors have at least two basic functions: (1) they (in eukaryotes, as a multi-protein complex) are able to chemically recognize a specific sequence of DNA and (2) they are able to load RNA polymerase at that site. Together these two functions of general transcription factors solve the problem of recruiting an enzyme that is otherwise not capable of binding a specific DNA site. In some texts, the general transcription factors (and particularly the sigma factor varieties) are said to be part of the RNA polymerase. While they are certainly part of the complex when they help to target the RNA polymerase, they do not (usually) continue with the RNA polymerase after it starts transcription. Each bacterial RNA polymerase is loaded onto a promoter by a sigma factor. A bacterial genome may encode several sigma factors, expressing them differentially under different conditions and, as a result, selecting a different range of promoters to help the bacterium adjust to those conditions.
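The idea that the set of expressed sigma factors determines which promoters the polymerase can be loaded onto can be sketched as a simple lookup. This is only an illustration: the sigma factor and promoter names below are hypothetical placeholders, not real annotations from any genome.

```python
# Hypothetical mapping from each sigma factor to the promoters it recognizes.
# None of these names refer to real genes or promoters.
SIGMA_TO_PROMOTERS = {
    "sigma_housekeeping": {"P_growth_1", "P_growth_2"},
    "sigma_heat_stress": {"P_chaperone"},
}


def accessible_promoters(expressed_sigmas):
    """Return the promoters onto which RNA polymerase can be loaded,
    given the set of sigma factors currently expressed in the cell."""
    promoters = set()
    for sigma in expressed_sigmas:
        # A sigma factor that is not in the table recognizes nothing here.
        promoters |= SIGMA_TO_PROMOTERS.get(sigma, set())
    return promoters
```

Changing which sigma factors are in `expressed_sigmas` changes the set of promoters returned, which mirrors how a bacterium could shift its transcriptional program by expressing a different sigma factor.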

The DNA site to which an RNA polymerase is recruited is called a promoter. While the DNA sequences of different promoters need not be exactly the same, different promoter sequences typically do have some special chemical properties in common. Obviously, one property is that they are able to associate with the general transcription factors mentioned above. In addition, the promoter usually has a DNA sequence that facilitates the dissociation of the double-stranded DNA such that the polymerase can begin transcribing the coding region. (Note: technically we could have broken down the properties of the promoter into design challenge subproblems. In this case we skipped it, but you should still be able to step backwards and create the problem statements and/or relevant questions once you find out about promoters.)

Particularly in eukaryotic systems, the complex of proteins that assembles with the RNA polymerase at promoters (typically called the pre-initiation complex) can number in the tens of proteins. Each of these other proteins has a specific function, but this is far too much detail to dive into for Bis2A.

A model of the E. coli pre-initiation complex. The sigma factor is colored red. The DNA is depicted as orange tubes and opposing blue:green bases. The rest of the pre-initiation complex is colored pink. Note that the DNA has regions of double helix and an open structure inside the PIC. Attribution: Structure derived from PDB coordinates (4YLN) Marc T. Facciotti (own work)

States of a regulated promoter

Since promoters recruit an RNA polymerase, these sites and the assembly of the pre-initiation complex are obvious choices for regulating the first steps of gene expression. At the level of transcription initiation, we often classify promoters into one of three classes. The first is called constitutive. Constitutive promoters are generally not regulated very strongly. Their base state is "on". When the constitutive transcription from a promoter is very high (relative to most other promoters), we will colloquially call that promoter a "strong constitutive" promoter. By contrast, if the amount of transcription from a constitutive promoter is low (relative to most other promoters) we will call that promoter a "weak constitutive" promoter.

A second way to classify promoters is by the use of the term activated or, equivalently, induced. These interchangeable terms are used to describe promoters that are sensitive to some external stimulus and respond to said stimulus by increasing transcription. Activated promoters have a base state that exhibits little to no transcription. Transcription is then "activated" in response to a stimulus - the stimulus turns the promoter "on". This regulation requires the activity of regulatory proteins - the sequence of the promoter itself does not change!

Finally, the third class of promoters is described by the term repressed. These promoters also respond to stimuli but do so by decreasing transcription. The base state for these promoters is some basal level of transcription, and the stimulus acts to turn down or repress transcription. Transcription is "repressed" in response to a stimulus - the stimulus turns the promoter "off". Again, this will require the activity of some protein that recognizes both the stimulus and the DNA sequence of the specific promoter(s) it needs to regulate.

The examples given above assumed that a single stimulus acts to regulate promoters. While this is the simplest case, many promoters may integrate different types of information and may be alternately activated by some stimuli and repressed by other stimuli.
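The three promoter classes described above can be summarized as a small decision function. This is a minimal sketch that assumes a single stimulus and reduces transcription to the qualitative labels "on" and "off"; real promoters produce a continuous range of outputs, and the function name is our own invention.

```python
def transcription_state(promoter_class, stimulus_present):
    """Return a qualitative transcription state ("on" or "off") for a promoter.

    promoter_class: "constitutive", "activated" (induced) or "repressed".
    stimulus_present: True if the single regulating stimulus is present.
    """
    if promoter_class == "constitutive":
        # Base state is "on" and is largely insensitive to the stimulus.
        return "on"
    if promoter_class == "activated":
        # Base state is little to no transcription; the stimulus turns it on.
        return "on" if stimulus_present else "off"
    if promoter_class == "repressed":
        # Base state is some basal transcription; the stimulus turns it off.
        return "off" if stimulus_present else "on"
    raise ValueError(f"unknown promoter class: {promoter_class}")
```

A promoter that integrates several stimuli, as described in the paragraph above, would need more than one input and is deliberately left out of this sketch.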

Transcription factors help to regulate the behavior of a promoter

How are promoters sensitive to external stimuli? In both activation and repression, gene regulation requires specialized proteins to change the transcriptional output of the gene being observed. The proteins responsible for helping to regulate expression are called transcription factors. The specific DNA sequences bound by transcription factors are often, in bacteria, called operators, and in many cases the operators are very close to the promoter sequences. This can result in some ambiguity in the definition of the term "promoter". In some cases scientists are referring to the specific location at which RNA polymerase will bind to initiate transcription. In other cases, scientists will be referring to ALL of the regulatory sequences near the promoter (including, for example, an operator sequence) that result in the regulatory qualities characteristic of that promoter. For example, the "lac promoter", as we'll see below, is positively regulated by lactose. A genetic engineer could place the lac promoter 5' of any coding region of interest, and the lac promoter would confer lactose-inducibility on that coding region.

Suggestion: describe the difference between a "transcription factor", as described immediately above, and the "general transcription factors" described previously.

In bacterial research, if the transcription factor acts by binding DNA and the RNA polymerase in a way that increases transcription, then it is typically called an activator. If, by contrast, the transcription factor acts by binding DNA to repress or decrease transcription of the gene then it is called a repressor.

Why are the classifications of activator and repressor potentially problematic? These terms describe idealized single functions. While this may be true in the case of some transcription factors, in reality other transcription factors may act to activate gene expression in some conditions while repressing it in other conditions. Some transcription factors will simply act to modulate expression either up or down depending on context rather than shutting transcription "off" or turning it completely "on". To circumvent some of this possible confusion, some of your instructors prefer to avoid using the terms activator and repressor and instead prefer to simply discuss the activity of various transcription factors as either a positive or a negative influence on gene expression in specific cases. If these terms are used, you might hear your instructor saying that the transcription factor in question ACTS LIKE/AS a repressor or that it ACTS LIKE/AS an activator, taking care not to call it simply an activator or repressor. It is more likely, however, that you will hear them say that a transcription factor is acting to positively or negatively influence transcription.

Suggested discussion

What types of interactions do you think happen between the amino acids of the transcription factor and the double helix of the DNA? How do transcription factors recognize their binding site on the DNA?

Allosteric Modulators of Regulatory Proteins

The activity of many proteins, including regulatory proteins and various transcription factors, can be allosterically modulated by various factors, including the relative abundance of small molecules in the cell. These small molecules are often referred to as inducers, co-repressors or co-activators and are often metabolites, such as lactose or tryptophan, or small regulatory molecules, such as cAMP or GTP. These interactions allow the TF to be responsive to environmental conditions and to modulate its function accordingly. In this way the TF helps the cell make a decision about whether or not to transcribe a gene, depending on the abundance of the environmental signal.

Let us imagine a negative transcriptional regulator. In the simplest case we've considered so far, transcription of a gene with a binding site for this transcription factor would be low when the TF is present and high when the TF is absent. We can now add a small molecule to this model. In this case the small molecule is able to bind the negative transcriptional regulator through sets of complementary hydrogen and ionic bonds. In this first example we will consider the case where the binding of the small molecule to the TF induces a conformational change in the TF that severely reduces its ability to bind DNA. If this is the case, the negative regulator - once bound by its small molecule - would release from the DNA. This would thereby relieve the negative influence and lead to increased transcription. This regulatory logic might be expected to evolve in the following scenario: a small-molecule food-stuff is typically absent from the environment. Therefore, genes encoding enzymes that will degrade/use that food should be kept "off" most of the time to preserve the cellular energy that their synthesis would use. This could be accomplished by the action of a negative transcriptional regulator. When the food-stuff appears in the environment it would be appropriate for the enzymes responsible for its processing to be expressed. The food-stuff could then act by binding to the negative regulator, changing the TF's conformation, causing its release from the DNA and thereby turning on transcription of the processing enzymes.

An abstract model of a generic transcriptional unit regulated by a negative regulator whose activity is modulated by a small molecule (depicted by a star). In this case, binding of the small molecule causes the TF to release from the DNA. Attribution: Marc T. Facciotti (own work)

We can consider a second model for how a negatively acting TF might interact with a small molecule. In this case, the TF alone is unable to bind its regulatory site on the DNA. However, when a small molecule binds to the TF a conformational change occurs that reorients DNA binding amino-acids into the "correct" orientation for DNA binding. The TF-small molecule complex now binds to the DNA and acts to negatively influence transcription.

An abstract model of a generic transcriptional unit regulated by a negative regulator whose activity is modulated by a small molecule (depicted by a star). In this case, binding of the small molecule causes the TF to bind to the DNA. Attribution: Marc T. Facciotti (own work)

Note how the activity of the TF can be modulated in distinctly different ways by a small molecule. Depending on the logic of the regulatory system, the binding of this external signal can either cause binding of the TF-small molecule complex to DNA OR binding of the small molecule can cause the release of the TF-small molecule complex from the DNA. The same types of examples can be worked up for a positive regulator (try making one up, and draw the components).
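The two models above, and the positive-regulator variants suggested as an exercise, can be laid out as a small truth table in code. This is a minimal sketch under strong simplifying assumptions: binding is treated as all-or-nothing, transcription is either on or off, and the mode names are hypothetical labels introduced only for this example.

```python
def tf_bound_to_dna(mode, small_molecule_present):
    """Return True if the TF occupies its DNA site.

    mode (hypothetical labels):
      "ligand_releases_tf"     - the small molecule makes the TF let go of DNA
      "ligand_enables_binding" - the TF binds DNA only with the small molecule
    """
    if mode == "ligand_releases_tf":
        return not small_molecule_present
    if mode == "ligand_enables_binding":
        return small_molecule_present
    raise ValueError(f"unknown mode: {mode}")


def transcription_on(tf_effect, mode, small_molecule_present):
    """Return True if the regulated gene is transcribed.

    tf_effect: "negative" if DNA-bound TF blocks transcription,
               "positive" if DNA-bound TF is needed for transcription.
    """
    bound = tf_bound_to_dna(mode, small_molecule_present)
    if tf_effect == "negative":
        return not bound
    if tf_effect == "positive":
        return bound
    raise ValueError(f"unknown TF effect: {tf_effect}")
```

Note that `tf_effect` is decided only by what the DNA-bound TF does, while `mode` describes how the small molecule changes DNA binding; keeping those two questions separate is the main point of the sketch.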

In both cases proposed above, the binding of a small molecule to a TF will be dependent on how strongly the TF interacts with the small molecule. This will depend on the types and spatial orientation of the protein's chemical functional groups and the complementary functional groups on the small molecule. It should not be surprising, therefore, to learn that the binding of the small molecule to the TF will be dependent on various factors, including but not limited to the concentration of the small-molecule and the TF.
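To make the concentration dependence concrete, here is a minimal sketch of the fraction of TF molecules that carry a bound small molecule. It assumes simple one-to-one binding at equilibrium with the small molecule in large excess over the TF, and it introduces a dissociation constant `kd` (the concentration at which half of the TF is bound) that is our assumption, not a value from the text.

```python
def fraction_tf_bound(small_molecule_conc, kd):
    """Fraction of TF molecules bound by the small molecule.

    Assumes 1:1 equilibrium binding with the small molecule in excess.
    small_molecule_conc and kd must be in the same concentration units.
    """
    if small_molecule_conc < 0:
        raise ValueError("concentration cannot be negative")
    if kd <= 0:
        raise ValueError("kd must be positive")
    return small_molecule_conc / (small_molecule_conc + kd)
```

A smaller `kd` represents a tighter interaction: the same concentration of small molecule then occupies a larger fraction of the TF.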

Is a transcription factor a positive or negative regulator?

Resolving a common point of confusion

At this point, it is not uncommon for many Bis2A students to be slightly confused about how to determine if a transcription factor is acting as a positive or negative regulator. This confusion often comes after a discussion of the possible ways in which a stimulus (e.g. a small molecule) can influence the activity of a transcription factor. This is not too surprising. In the examples above, the binding of an effector molecule to a transcription factor could have one of two different effects: (1) binding of the effector molecule could induce a DNA-bound transcription factor to release from its binding site, derepressing a promoter and turning on gene expression; (2) binding of the effector molecule to the transcription factor could cause the TF to bind to its DNA binding site, repressing a promoter and therefore turning off gene expression. In the first case the small molecule is acting to positively regulate expression because it inhibits the biochemical activity of the TF (its ability to bind a specific sequence and thereby block polymerase loading), while in the second case the small molecule is acting to negatively regulate gene expression because it activates the TF's biochemical activity (again, sequence-specific DNA binding that blocks polymerase loading).

In both examples above, the TF itself is acting as a negative regulator. To determine this we look at what happens when the TF binds DNA (whether a small molecule is bound to the TF or not). In both cases, binding of the TF to DNA represses transcription. The TF is therefore acting as a negative regulator. A similar analysis can be done with positively acting TFs, that is, TFs that help promote polymerase loading at the promoter and/or initiation of transcription.

Note that in some cases a TF may act as a positive regulator at one promoter and a negative regulator at a different promoter, so describing the behavior of the TF on a per-case basis is often important (reading too much from the name it has been assigned can sometimes be misleading). Other TF proteins can act alternately as positive or negative regulators of the same promoter depending on conditions. Again, describing the behavior of the TF specifically for each case is advised. In this class we try to avoid these more complex examples!

A genetic test for positive or negative regulatory function of a TF

How does one determine if a regulatory protein functions in a positive or negative way? A simple genetic test is to ask "what happens to expression if the regulatory protein is absent?" This can be accomplished by removing the gene encoding the transcription factor from the genome. If a transcription factor acts positively, then its presence is required to activate transcription. In its absence, there is no regulatory protein, therefore no activation, and the outcome is lower transcription of the target gene. The opposite is true for a transcription factor acting negatively. In its absence expression should be increased, because the protein keeping expression low is no longer around.
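The logic of this genetic test can be written down as a short comparison. This is a sketch only: the function name, the use of a single expression number per strain, and the 10% `tolerance` used to decide what counts as "no change" are all assumptions made for illustration.

```python
def classify_regulator(expression_wild_type, expression_knockout, tolerance=0.1):
    """Infer how a TF acts on a target gene from a knockout experiment.

    expression_wild_type: target expression with the TF gene present.
    expression_knockout:  target expression with the TF gene removed.
    tolerance: fractional change treated as no detectable effect (assumed).
    """
    change = expression_knockout - expression_wild_type
    threshold = tolerance * max(expression_wild_type, expression_knockout)
    if abs(change) <= threshold:
        return "no detectable effect"
    if change < 0:
        # Expression falls without the TF, so the TF was helping.
        return "positive regulator"
    # Expression rises without the TF, so the TF was holding it down.
    return "negative regulator"
```

The comparison only makes sense under conditions where the TF would normally be active at the promoter; a TF that is not bound in the tested condition would show no effect even if it is an important regulator elsewhere.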

Termination of Transcription and RNA degradation

Degradation of RNA

The lifetimes of different RNA species in the cell can vary dramatically, from seconds to hours. The typical lifetime of mRNA can also vary dramatically depending on the organism. For instance, the median lifetime for mRNA in E. coli is ~5 minutes. The half-life of mRNA is ~20 minutes in yeast and ~600 minutes in human cells. Some of the degradation is "targeted". That is, some transcripts include a short sequence that targets them for RNA-degrading enzymes, speeding the degradation rate. It doesn't take too much imagination to infer that this process might also be evolutionarily tuned for different genes. Simply realizing that degradation - and the tuning of degradation - can also be a factor in controlling the expression of a gene is sufficient for Bis2A.
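To get a feel for what these numbers imply, here is a minimal sketch that assumes transcripts decay by simple first-order (exponential) kinetics, which is an assumption of this example rather than a claim from the text. Under that assumption the median lifetime equals the half-life, and the mean lifetime is the half-life divided by ln 2.

```python
import math


def fraction_remaining(minutes_elapsed, half_life_minutes):
    """Fraction of an initial transcript pool left after some time,
    assuming first-order decay and no new synthesis."""
    return 0.5 ** (minutes_elapsed / half_life_minutes)


def mean_lifetime(half_life_minutes):
    """Mean lifetime of a transcript under first-order decay."""
    return half_life_minutes / math.log(2)
```

Using the ~5 minute figure quoted above for E. coli, `fraction_remaining(10, 5)` gives 0.25: after two half-lives only a quarter of the original transcripts remain unless more are made.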

Summary of gene regulation

In the preceding text we have examined several ways to start solving some of the design challenges associated with regulating the amount of transcript that is created for a single coding region of the genome. We have looked in abstract terms at some of the processes responsible for controlling the initiation of transcription, how these may be made sensitive to environmental factors, and very briefly at the processes that terminate transcription and handle the active degradation of RNA. Each of these processes can be quantitatively tuned by nature to be "stronger" or "weaker". It is important to realize that the real values of "strength" (e.g. promoter strength, degradation rates, etc.) influence the behavior of the overall process in potentially functionally important ways.

Examples of Bacterial Gene Regulation

This section describes two examples of transcriptional regulation in bacteria. These are presented as illustrative examples. Use these examples to learn some basic principles about mechanisms of transcriptional regulation. Be on the lookout in class, in discussion, and in the study-guides for extensions of these ideas and use these to explain the regulatory mechanisms used for regulating other genes.

Gene Regulation Examples in E. coli

The DNA of bacteria and archaea is usually organized into one or more circular chromosomes in the cytoplasm. The dense aggregate of DNA that can be seen in electron micrographs is called the nucleoid. In bacteria and archaea, genes whose expression needs to be tightly coordinated (e.g. genes encoding proteins that are involved in the same biochemical pathway) are often grouped closely together in the genome (this, as we will see, is a good idea if genes, aka replicators, are transferred from one species to another). When the expression of multiple genes is controlled by the same promoter and a single transcript is produced, these expression units are called operons. For example, in the bacterium Escherichia coli all of the genes needed to utilize lactose are encoded next to one another in the genome. This arrangement is called the lactose (or lac) operon. It is often the case in bacteria and archaea that nearly 50% of all genes are encoded in operons of two or more genes.

The Role of the Promoter

The first level of control of gene expression is at the promoter itself. Some promoters recruit RNA polymerase and turn those DNA-protein binding events into transcripts more efficiently than other promoters. This intrinsic property of a promoter, its ability to produce transcript at a particular rate, is referred to as promoter strength. The stronger the promoter, the more RNA is made in any given time period. Promoter strength can be "tuned" by Nature in very small or very large steps by changing the nucleotide sequence of the promoter (e.g. mutating the promoter). This results in families of promoters with different strengths that can be used to control the maximum rate of gene expression for certain genes.
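One way to see why promoter strength matters quantitatively is to combine it with transcript degradation. The sketch below assumes a constant synthesis rate set by the promoter and first-order decay of the transcript, so that the steady-state level is the synthesis rate divided by the decay rate constant. The promoter names and rates are hypothetical values chosen only for illustration.

```python
import math


def steady_state_transcripts(synthesis_per_min, half_life_min):
    """Steady-state transcript number per cell, assuming constant synthesis
    and first-order decay (both simplifying assumptions)."""
    decay_rate = math.log(2) / half_life_min  # per-minute decay constant
    return synthesis_per_min / decay_rate


# Hypothetical promoter family: transcripts initiated per minute.
PROMOTER_FAMILY = {"weak": 0.1, "medium": 1.0, "strong": 10.0}

# Same transcript half-life (5 minutes, assumed) behind each promoter.
levels = {
    name: steady_state_transcripts(rate, half_life_min=5.0)
    for name, rate in PROMOTER_FAMILY.items()
}
```

In this model the steady-state level scales in direct proportion to promoter strength, so the hypothetical "strong" promoter maintains one hundred times as many transcripts as the "weak" one when the half-life is the same.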

UC Davis Undergraduate Connection

A group of UC Davis students interested in synthetic biology used this idea to create synthetic promoter libraries for engineering microbes as part of their design project for the 2011 iGEM competition.

Example #1: Trp Operon

Logic for regulating tryptophan biosynthesis

E. coli, like all organisms, needs to either synthesize or consume amino acids to survive. Tryptophan is one such amino acid. E. coli can either import tryptophan from the environment (eating what it can scavenge from the world around it) or synthesize tryptophan de novo using enzymes that are encoded by five genes. These five genes are encoded next to each other in the E. coli genome in what is called the tryptophan (trp) operon (Figure below). If tryptophan is present in the environment, then E. coli does not need to synthesize it and the switch controlling the activation of the genes in the trp operon is switched off. However, when environmental tryptophan availability is low, the switch controlling the operon is turned on, transcription is initiated, the genes are expressed, and tryptophan is synthesized. See the figure and paragraphs below for a mechanistic explanation.

**Organization of the trp operon**

Five genomic regions encoding tryptophan biosynthesis enzymes are arranged sequentially on the chromosome and are under the control of a single promoter. All five enzymes are encoded by a single transcript; that is, they are organized into an operon. Just before the coding region is the transcriptional start site. This is, as the name implies, the location where the RNA polymerase starts a new transcript. The promoter sequence is further upstream of the transcriptional start site.

A DNA sequence called an "operator" is also encoded between the promoter and the first trp coding gene. This operator is the DNA sequence to which the regulatory transcription factor protein will bind.

A few more details regarding TF binding sites

It should be noted that the use of the term "operator" is limited to just a few regulatory systems and almost always refers to the binding site for a negatively acting transcription factor. Conceptually what you need to remember is that there are sites on the DNA that interact with regulatory proteins allowing them to perform their appropriate function (e.g. repress or activate transcription). This theme will be repeated universally across biology whether the "operator" term is used or not.

Moreover, while the specific examples you will be shown depict TF binding sites in their known locations, these locations are not universal to all systems. Transcription factor binding sites can vary in location relative to the promoter. There are some patterns (e.g. positive regulators often bind upstream of the promoter and negative regulators often bind downstream), but these generalizations are not true for all cases. Again, the key thing to remember is that transcription factors (both positively and negatively acting) have binding sites with which they interact to help regulate the initiation of transcription by RNA polymerase.

The five genes that are needed to synthesize tryptophan in E. coli are located next to each other in the trp operon. When tryptophan is plentiful, two tryptophan molecules bind to the transcription factor and allow the TF-tryptophan complex to bind at the operator sequence. This physically blocks the RNA polymerase from transcribing the tryptophan biosynthesis genes. When tryptophan is absent, the transcription factor does not bind to the operator and the genes are transcribed. Attribution: Marc T. Facciotti (own work).

**Regulation of the trp operon**

When tryptophan is present in the cell, it binds to the trp repressor protein. This binding causes a conformational change in the protein that allows the TF-tryptophan complex to bind to the trp operator sequence. Binding of the tryptophan–repressor complex at the operator physically prevents the RNA polymerase from binding and transcribing the downstream genes. When tryptophan is not present in the cell, the transcription factor does not bind to the operator; therefore, transcription proceeds, the tryptophan biosynthesis genes are transcribed and translated, and tryptophan is thus synthesized.

Since the transcription factor actively binds to the operator to keep the genes turned off, the trp operon is said to be "negatively regulated". The proteins that bind to the operator to silence trp expression are negative regulators.
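The on/off logic described above can be sketched as a pair of Boolean functions. This is a minimal model that treats tryptophan as simply present or absent; the function names are hypothetical and the real system responds to concentrations rather than a binary signal.

```python
def trp_repressor_bound(tryptophan_present: bool) -> bool:
    """The trp repressor binds the operator only when complexed with tryptophan."""
    return tryptophan_present


def trp_operon_transcribed(tryptophan_present: bool) -> bool:
    """Negative regulation: a repressor bound at the operator blocks RNA polymerase."""
    return not trp_repressor_bound(tryptophan_present)


for tryptophan in (True, False):
    state = "ON" if trp_operon_transcribed(tryptophan) else "OFF"
    print(f"tryptophan present = {tryptophan}: trp operon {state}")
```

Note that the regulator acts only by blocking transcription; the operon is on by default whenever the repressor is not bound.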

Suggested discussion

Do you think that the trp repressor protein's expression is regulated by trp, or is the protein constitutively expressed?

Suggested discussion

Suppose nature took a different approach to regulating the trp operon. Design a method for regulating the expression of the trp operon with a positive regulator instead of a negative regulator. (motivator: professors ask this kind of question all of the time on exams)

External link

Watch this video to learn more about the trp operon.

**Example #2: The lac operon**

**Rationale for studying the lac operon**

In this example, we examine the lac operon, which regulates genes encoding proteins whose physiological role is to import and assimilate the disaccharide lactose. The story of the regulation of the lac operon is a common example used in many introductory biology classes to illustrate basic principles of inducible gene regulation. We choose to describe this example second because it is, in our estimation, more complicated than the previous example involving the activity of a single negatively acting transcription factor. By contrast, the regulation of the lac operon is, in our opinion, a wonderful example of how the coordinated activity of both positive and negative regulators around the same promoter can be used to integrate multiple different sources of cellular information to regulate the expression of genes.

As you go through this example, keep in mind the last point. For most Bis2a instructors it is more important for you to understand the logic of the lac operon than it is to memorize the input/output table presented below. The benefit of understanding the logic of gene regulation is that the concepts can be applied to many different regulatory systems. This goal may be reflected on exams.

The utilization of lactose

Lactose is a disaccharide composed of the hexoses glucose and galactose. It is commonly found in high abundance in milk and some milk products. As one can imagine, the disaccharide can be an important foodstuff for microbes that are able to utilize its two hexoses. E. coli is able to use multiple different sugars as energy and carbon sources, including lactose, and the lac operon encodes the genes necessary to acquire and process lactose from the local environment. Lactose, however, has not been frequently encountered by E. coli during its evolution, and therefore the genes of the lac operon must typically be repressed (i.e. "turned off") when lactose is absent. Driving transcription of these genes when lactose is absent would waste precious cellular energy. By contrast, when lactose is present, it would make logical sense for the genes responsible for the utilization of the sugar to be expressed (i.e. "turned on"). So far the story is similar to that of the tryptophan operon described above, with one difference. Here the cell must recognize the presence of a small molecule (lactose) so that it can switch on production of an enzyme to degrade it (and another to transport it into the cell). In the trp operon, the cell must recognize the presence of a small molecule (trp) so that it can switch off production of the enzymes that produce it.

Question: In both cases a repressor protein is employed. How is this possible, when opposing results are achieved?

However, there is a catch. Experiments conducted in the 1950s by Jacob and Monod clearly demonstrated that E. coli prefers to utilize all the glucose present in the environment before it begins to utilize lactose. This means that the mechanism used to decide whether or not to express the lactose utilization genes must be able to integrate two types of information: (1) the concentration of glucose and (2) the concentration of lactose. While this could theoretically be accomplished in multiple ways, we will examine how the lac operon accomplishes this by using multiple transcription factors.

**The transcriptional regulators of the lac operon**

**The lac repressor - a direct sensor of lactose**

As noted, the lac operon normally has very low to no transcriptional output in the absence of lactose. This is due to two factors: (1) the constitutive promoter strength for the operon is relatively low and (2) the constant presence of the LacI repressor protein negatively influences transcription. This protein binds to the operator site near the promoter and blocks RNA polymerase from transcribing the lac operon genes. By contrast, if lactose is present, lactose will bind to the LacI protein, inducing a conformational change that prevents the LacI-lactose complex from binding to its binding sites. Therefore, when lactose is present, the negative regulator LacI is not bound to its binding site and transcription of the lactose utilization genes can proceed.

CAP protein - an indirect sensor of glucose

In E. coli, when glucose levels drop, the small molecule cyclic AMP (cAMP) begins to accumulate in the cell. cAMP is a common signaling molecule that is involved in glucose and energy metabolism in many organisms. When glucose levels decline in the cell, the increasing concentrations of cAMP allow this compound to bind to the positive transcriptional regulator called catabolite activator protein (CAP) - also referred to as CRP. The cAMP-CAP complex has many binding sites located throughout the E. coli genome, and many of these sites are located near the promoters of operons that control the processing of various sugars.

In the lac operon, the cAMP-CAP binding site is located upstream of the promoter. Binding of cAMP-CAP to the DNA helps to recruit RNA polymerase to the promoter and retain it there. The increased occupancy of RNA polymerase at its promoter, in turn, results in increased transcriptional output. In this case the CAP protein is acting as a positive regulator.

Note that the CAP-cAMP complex can, in other operons, also act as a negative regulator depending upon where the binding site for CAP-cAMP complex is located relative to the RNA polymerase binding site.

Putting it all together: Inducing expression of the lac operon

For the lac operon to be activated, two conditions must be met. First, the level of glucose must be very low or non-existent. Second, lactose must be present. Only when glucose is absent and lactose is present will the lac operon be transcribed at a high level. When these conditions are met, lactose binds to LacI and the LacI-lactose complex dissociates from its binding site near the promoter, freeing the RNA polymerase to transcribe the operon's genes. Moreover, high cAMP levels (indirectly indicative of low glucose) trigger the formation of the CAP-cAMP complex. This TF-inducer pair now binds near the promoter and acts to positively recruit the RNA polymerase. This added positive influence boosts transcriptional output and lactose can be efficiently utilized. The mechanistic outputs of the other combinations of binary glucose and lactose conditions are described in the table below and in the figure that follows.

*Truth Table for Lac Operon: Signals that Induce or Repress Transcription of the lac Operon*

| Glucose | CAP binds | Lactose | Repressor binds | Transcription |
| --- | --- | --- | --- | --- |
| + | - | - | + | No |
| + | - | + | - | Some, not much |
| - | + | - | + | No |
| - | + | + | - | Yes, lots |

Transcription of the lac operon is carefully regulated so that its expression only occurs when glucose is limited and lactose is present to serve as an alternative fuel source.
Attribution: Marc T. Facciotti (own work)
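
The logic of the lac operon can be sketched as a small program that takes two binary inputs, glucose and lactose, and reports the transcriptional output for each combination. This is a simplified Boolean model with hypothetical function names; it assumes each signal is strictly present or absent and ignores concentrations.

```python
from itertools import product


def cap_binds(glucose_present: bool) -> bool:
    """Low glucose raises cAMP, so the cAMP-CAP complex binds near the promoter."""
    return not glucose_present


def repressor_binds(lactose_present: bool) -> bool:
    """LacI binds the operator unless lactose is bound to it."""
    return not lactose_present


def lac_transcription(glucose_present: bool, lactose_present: bool) -> str:
    """Combine the negative (LacI) and positive (CAP) regulators."""
    if repressor_binds(lactose_present):
        return "No"  # the repressor blocks RNA polymerase regardless of CAP
    if cap_binds(glucose_present):
        return "Yes, lots"  # repressor released and CAP recruits RNA polymerase
    return "Some, not much"  # repressor released, but no help from CAP


for glucose, lactose in product((True, False), repeat=2):
    print(f"glucose={glucose}, lactose={lactose}: "
          f"{lac_transcription(glucose, lactose)}")
```

Writing the logic this way highlights that the repressor acts as a gate while CAP acts as a booster: lactose decides whether transcription can occur at all, and glucose decides how strongly.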

