Biology LibreTexts

19: pET expression system

    Summary

    The pET system is a protein expression system developed by Novagen Inc. to express a protein of interest in special strains of E. coli upon treatment with the chemical IPTG (isopropyl-β-D-thiogalactoside).

    Also known as 

    The pET system can be used as a heterologous expression system, in which a protein from one organism is expressed in a different organism.

    Samples needed 

    Expression systems require two components:

    1. A plasmid vector carrying the gene (or cDNA) of interest. In this case, that is a pET vector.
    2. The cells to express the gene/cDNA. In this case, those are E. coli BL21 (DE3) cells.

    Method 

    Heterologous expression systems using bacteria

    One of the best reasons to transfer foreign DNA into another organism is to have the gene product expressed. Heterologous expression systems offer several advantages:

    • High yields of normally scarce proteins can be obtained through over-expression.

    • Proteins derived from organisms that are difficult to obtain, slow to grow, or dangerous to handle can be expressed in high yield in simple, safe, and cheap culture systems.

    • Proteins for which no known purification schemes exist can be obtained as preponderant components of cultures or extracts.

    In fact, even if the gene product of a particular gene is unknown, the gene can be expressed and its product studied by expression in a heterologous system.

    E. coli has several advantages as an expression host. It is cheap and easy to grow in large quantity. Also, many well-studied bacterial promoters can be included in this system to drive expression of the cloned genes by known inducers. Expression of eukaryotic genes in a prokaryote, however, has some major disadvantages. These include the failure of prokaryotic cells to make appropriate post-translational modifications for eukaryotic proteins. This means that the induced proteins may not function correctly if their biological activity depends on correct glycosylation, disulfide bond formation, phosphorylation, or proteolytic processing.

    Another disadvantage of prokaryotic expression systems is the inability of bacteria to transcribe correct mRNAs from cloned eukaryotic genes containing introns. Message processing systems that will correctly splice out intervening sequences are usually restricted to eukaryotes.

    Bacterial expression systems can be used to produce either intact native proteins or fusion proteins. Native proteins require that the foreign gene be cloned downstream of a strong bacterial promoter and ribosome-binding site. Fusion proteins are exactly what their name implies: they are proteins that contain some amino acid sequence from two different proteins. Often, fusion proteins include the entire protein of interest and a short sequence from another protein, called an affinity tag, an epitope tag, or simply a tag. Proteins can be tagged on either the N- or C-terminus depending on the relative positions of the transcription start site, the coding sequences for the protein of interest and tag, and the stop codon. When cloning to create a tagged protein, it is imperative that the tag is in frame with the protein of interest. 
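The in-frame requirement can be illustrated with a short Python sketch. The sequences are invented toy examples, not the actual sequence of any pET vector, and `translate` is a hypothetical helper written for this illustration. It assumes the standard genetic code and reads from the first base of the sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate three bases at a time from the first base; stop at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy N-terminal tag: start codon followed by six histidine codons.
tag = "ATG" + "CAT" * 6
# Toy coding sequence for the protein of interest (K-E-G, then a stop codon).
gene = "AAAGAAGGCTAA"

in_frame = translate(tag + gene)            # tag and gene share a reading frame
frameshifted = translate(tag + "G" + gene)  # one extra base between tag and gene

print(in_frame)      # MHHHHHHKEG
print(frameshifted)  # MHHHHHHERRL
```

A single extra base between the tag and the gene leaves the tag intact but changes every downstream codon, so the protein of interest is no longer made.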

    Both types of expression system have advantages. Tagged proteins can be isolated easily even if you know nothing about the product of the cloned gene. Since part of the fusion protein consists of a known peptide tag, you can isolate the fusion protein using, for instance, antibodies to the tag. Also, fusion proteins are frequently more stable (less readily degraded) than native proteins expressed in the same host. 

    On the other hand, native proteins may fold in a conformation that more closely resembles the physiological protein structure. This would be important for enzymes or proteins with specific binding properties. Tags can also be problematic if the purified protein is being used to raise antibodies against the protein of interest. To raise polyclonal antibodies for lab use, the purified protein is injected into an animal (such as a mouse, rat, or rabbit), and the animal’s immune system produces the antibodies. However, the tag is often more immunogenic than the protein of interest itself, which would result in too many anti-tag antibodies and not enough antibodies specific to the protein of interest. Therefore, a tag is often cleaved from the purified protein before using the protein to generate antibodies.

    The pET expression system

    The pET system was published by Studier and Moffatt in 1986 [1] and further developed by Novagen [2]. Expression systems have two components: the DNA vector and the host cells.

    Vector

    Here, we will discuss the specific vector pET28b as an example. Some important features of the vector (opens PDF of vector map from Millipore Sigma↗):

    • A multiple cloning site (MCS) that allows the researcher to easily insert a gene or cDNA of interest using one of several restriction enzyme cut sites.
       
    • Regions encoding epitope tags. The pET28b vector allows the researcher to incorporate N-terminal polyhistidine (His) and T7 tags and/or a C-terminal His tag.

      The pET28b plasmid has been engineered such that foreign genes are cloned into a site just downstream of an ATG initiation codon. Shortly after the initiation codon (but still 5′ to the cloned gene) comes a series of six triplets in the DNA sequence that code for histidine residues. Upon translation, this polyhistidine tag will form part of the amino terminus of the protein expressed from this cloned gene. The expressed protein is, therefore, a kind of fusion protein, with a short peptide fused to the protein of interest.

      The polyhistidine tag is useful for purification of the fusion protein. Polyhistidine binds with relatively high affinity to divalent cations, particularly Ni2+, Co2+, and Zn2+. These cations can be "immobilized" on a resin, and the complex can be used to purify the recombinant protein based on the affinity of the polyhistidine tag for the divalent cations. This is one form of affinity chromatography, a standard technique useful for protein purification.

      T7 is an epitope tag derived from the T7 major capsid protein with the amino acid sequence MASMTGGQQMG. The T7 tag can be recognized by a specific antibody, allowing detection and purification of pET28b expressed products by a second method (unrelated to the polyhistidine tag).

      Lastly, there is a second polyhistidine tag in pET28b downstream of the cloning site. This can be used to create a C-terminally His-tagged fusion protein if the stop codon is left out of the coding sequence for the cloned gene and reading frame is maintained. If the stop codon is left in the coding sequence of the cloned gene, the C-terminal His tag is not translated. 
       
    • A thrombin cleavage site. The amino acid sequence following the tag contains a 6-residue stretch (L-V-P-R-G-S) that is the recognition site for a protease known as thrombin. Thrombin can be used to enzymatically cleave off the N-terminal His tag after purification if the tag’s presence would impede downstream applications.
       
    • A promoter derived from bacteriophage T7. Genes or cDNAs are cloned into the plasmid next to this promoter and therefore are under its control. The result is that cloned genes cannot be transcribed by the normal bacterial RNA polymerase, but only by T7 RNA polymerase.
       
    • An origin of replication. This allows the plasmid to be replicated by the E. coli host machinery, independently of the chromosomal DNA.
       
    • A kanamycin resistance gene (KanR). This encodes the enzyme NPTII (neomycin phosphotransferase II), which inactivates kanamycin [3], making any cells expressing the gene resistant to this antibiotic.
       
    • A lacI gene, encoding the lac repressor protein. The lac repressor is able to bind to a sequence from the lac operon, repressing its transcription. In the pET system, inclusion of lacI allows the researcher to control when the E. coli express the protein of interest (see below).
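The N-terminal His tag and thrombin site described in the list above can be sketched at the protein level. This is a simplified toy model: the fusion sequence is invented for illustration and omits the linker residues and T7 tag present in the real vector, and `thrombin_cleave` is a hypothetical helper. It assumes thrombin cuts between the arginine and glycine of L-V-P-R-G-S, which is the commonly described cleavage position.

```python
THROMBIN_SITE = "LVPRGS"


def thrombin_cleave(protein):
    """Split a protein at the first thrombin site (cut between R and G).

    Returns (tag_fragment, product). If no site is found, the tag fragment
    is None and the protein is returned unchanged.
    """
    index = protein.find(THROMBIN_SITE)
    if index == -1:
        return None, protein
    cut = index + 4  # position just after the R in LVPR|GS
    return protein[:cut], protein[cut:]


# Toy fusion: Met, six histidines, thrombin site, then an invented protein.
protein_of_interest = "MKTAYIAK"
fusion = "M" + "H" * 6 + THROMBIN_SITE + protein_of_interest

tag_fragment, product = thrombin_cleave(fusion)
print(tag_fragment)  # MHHHHHHLVPR
print(product)       # GSMKTAYIAK
```

Under this assumption, the cleaved product keeps two extra residues (G-S) at its N-terminus, while the His tag leaves with the other fragment.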
       

    Cells & protein expression

    Molecular biologists can transform a plasmid into E. coli to achieve different goals, and they need to use the right E. coli strain for the intended job. For instance, strains like JM109 E. coli have good characteristics for cloning. However, to use E. coli to produce protein with the pET system, BL21(DE3) cells are used, which have been genetically modified for this purpose.

    These cells have had the T7 RNA polymerase gene cloned into their chromosomal DNA (signified by the (DE3) designation) (Figure 1). This chromosomal copy of the T7 RNA polymerase gene has been placed under the control of a promoter known as lacUV5. This promoter, like the normal lac operon promoter in wild type E. coli, is repressed by the lac repressor protein and induced (released from repression) by lactose. Thus, the T7 RNA polymerase gene is not transcribed in most situations, and therefore, the gene introduced in the pET28b plasmid is also not transcribed. Administering lactose (or its analogues) can "turn on" the expression of T7 RNA polymerase by causing a conformational change that renders the lac repressor protein inactive. Traditionally, isopropyl-β-D-thiogalactoside (IPTG) is used to induce expression because it binds to the repressor protein as lactose does, but cannot be hydrolyzed by the lacZ product β-galactosidase.

    Once T7 RNA polymerase expression has been induced and T7 RNA polymerase synthesized by the bacterial cells, the gene or cDNA that was cloned into the plasmid vector (under control of the T7 promoter) can be transcribed. This allows the researcher to control timing of expression of the cloned gene. 

    When BL21(DE3) E. coli cells transformed with a pET plasmid are treated with IPTG, we call it an induction, because the researcher is inducing expression of the gene of interest.
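The induction logic can be summarized as a small Boolean model. This is an idealized sketch, not a quantitative description: it ignores low-level ("leaky") expression in the absence of IPTG, and the function and parameter names are hypothetical, chosen only for this illustration.

```python
def pet_expression(strain_has_de3, plasmid_has_target_gene, iptg_present):
    """Idealized on/off model of the pET induction cascade."""
    # IPTG binds the lac repressor and renders it inactive.
    repressor_active = not iptg_present
    # T7 RNA polymerase is made only from the chromosomal (DE3) copy,
    # and only when the lacUV5 promoter is not repressed.
    t7_polymerase_made = strain_has_de3 and not repressor_active
    # The cloned gene sits behind a T7 promoter, so it needs T7 RNA polymerase.
    target_made = plasmid_has_target_gene and t7_polymerase_made
    return {"t7_rna_polymerase": t7_polymerase_made, "target_protein": target_made}


# Induced BL21(DE3) carrying a pET plasmid with a cloned gene.
print(pet_expression(True, True, True))
# Same cells without IPTG: no induction.
print(pet_expression(True, True, False))
# Empty-vector control: polymerase is made, but there is no target gene.
print(pet_expression(True, False, True))
# A strain lacking the (DE3) polymerase gene, such as a cloning strain.
print(pet_expression(False, True, True))
```

In this model the target protein appears only when all three conditions hold: a (DE3) host, a plasmid carrying the cloned gene, and IPTG.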

    Figure 1. Overview of the pET expression system. Steps to express the cloned gene: (1) IPTG is administered and binds to the lac repressor protein, rendering it inactive. In the absence of IPTG, the lac repressor protein prevents expression of the T7 RNA polymerase. (2) BL21(DE3) E. coli contain a chromosomal copy of the T7 RNA polymerase gene under the control of the lacUV5 promoter. (3) With the lac repressor protein no longer bound to the lacUV5 promoter, the E. coli cells’ usual transcriptional and translational machinery produces T7 RNA polymerase mRNA and protein (orange/left). (4) T7 RNA polymerase binds to the T7 lac promoter in the expression plasmid. (5) With the help of T7 RNA polymerase, E. coli produce the target gene mRNA and target protein (blue/right). Notes: IPTG is the chemical isopropyl β-D-1-thiogalactopyranoside, which is similar in structure to a naturally occurring lactose metabolite. The promoter upstream of the T7 RNA polymerase gene is the lacUV5 promoter, a mutated version of the natural lac operon promoter from E. coli. Illustration made with BioRender [4].

    Controls 

    An empty pET vector can be used as a negative induction control, and a validated pET vector containing a gene/cDNA can be used as a positive induction control.


    Image Descriptions 

    Figure 1 image description:  

    A diagram of an E. coli cell labeled "E. coli BL21 (DE3)." Various parts of the diagram are numbered to show the steps in the cell resulting in expression of the cloned gene. (1) shows the lac repressor protein bound to IPTG, not associated with any DNA. (2) shows the E. coli chromosome with an inset showing the T7 RNA pol gene under the control of the lacUV5 promoter. (3) shows mRNA and protein for T7 RNA pol being expressed from the gene. (4) shows the expression plasmid with T7 RNA pol bound. The inset shows the polymerase bound at the T7 promoter, which controls the expression of the target gene. (5) shows mRNA and protein for the target being expressed from the target gene.

    Thumbnail 

    "Laboratory bacterial culture.jpg"↗ by Soledad Mirand-Rottmann is licensed under CC BY-SA 4.0↗.

    Image description: Bacterial cultures of transgenic E. coli grown in glass flasks overnight. 

    Author 

    Juliet Fuhrman, Erica Polleys, and Katherine Mattaini, Tufts University


    1. Studier, F. W., and B. A. Moffatt. 1986. Use of bacteriophage T7 RNA polymerase to direct selective high-level expression of cloned genes. Journal of Molecular Biology 189:113–130. 
    2. Novagen. 2011. pET System Manual, 11th Edition (User Protocol TB055 Rev. C 0611JN).
    3. Gallego, Adriana. “19 Common Questions about Kanamycin.” GoldBio, goldbio.com/articles/article/19-Common-Questions-About-Kanamycin↗. Accessed 23 Sept. 2024. 

    4. Created in BioRender. Mattaini, K. (2023) BioRender.com/n55c975↗.

    19: pET expression system is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by LibreTexts.
