Biology LibreTexts

7.3: Glycoconjugates - Proteoglycans, Glycoproteins, and Glycolipids


    Many proteins, especially those destined for secretion or insertion into membranes, are post-translationally modified by attachment of carbohydrates. They are usually attached through either Asn or Ser side chains. Carbohydrate modifications on the protein appear to be involved in recognition of other binding molecules, prevention of aggregation during protein folding, protection from proteolysis, and an increased half-life of the protein. In contrast to a protein sequence, which is determined by a DNA template, sugars are attached to proteins by enzymes that recognize appropriate sites on the proteins. Since there are many sugars, each containing many functional groups that can serve as potential attachment sites, the structures of the oligosaccharides attached to proteins are enormously varied, complex, and hence "information rich" compared to linear or folded polymers like DNA and proteins.

    N-linked Glycoproteins

    These contain carbohydrates attached through either a GlcNAc or GalNAc to an Asn in an Asn-X-Ser/Thr sequence of the protein, where X is any amino acid except Pro. There are three types of N-linked glycoproteins: high mannose, complex, and hybrid. They all contain the same core oligosaccharide, (Man)3(GlcNAc)2, attached to Asn.
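    Candidate N-glycosylation sites can be located by scanning a protein sequence for the attachment motif. The following is a minimal sketch, assuming the commonly cited consensus Asn-X-Ser/Thr in which X is any residue except Pro; the function name and the example sequence are hypothetical. A match marks only a potential site, since not every such Asn is actually glycosylated.

```python
import re

# Assumed consensus: N, then any residue except P, then S or T.
# The lookahead lets overlapping motifs (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, matched tripeptide) for each candidate site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical one-letter sequence used only for illustration.
    for position, motif in find_sequons("MNGTNPSANAS"):
        print(position, motif)
```

    In the example, the Asn followed by Pro is skipped, which reflects the assumed rule that Pro in the X position blocks glycosylation.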

    Figure: N-linked Glycoproteins

    N linked glycoprotein


    Here is the SNFG diagram for the core glycan in N-linked glycoproteins.

    N-linked Glycoproteins – Core.svg


    The figures below give the symbolic structures of the three main types.  Note that the designation of α2 implies an α1→2 linkage.  Unless otherwise stated the linkage is presumed to start from carbon 1.

    N-linked high mannose glycoproteins

    N-linked Glycoproteins – High Man.svg


    N-linked complex glycoproteins

    N-linked Glycoproteins - Complex.svg

     N-linked hybrid glycoproteins

    N-linked Glycoproteins – Mixed Hybrid.svg

    Complex N-linked glycans don't contain mannose outside of the core glycan and have GlcNAc attached to the branching mannoses in the core structure.   The complex glycan shown above has a Gal(β1,4)GlcNAc sequence, which can be named as the disaccharide lactosamine.  Often lactosamines repeat in the sequence. 

    Hybrid glycans have both unsubstituted terminal mannoses (as in the high-mannose type) and substituted mannoses with an N-acetylglucosamine attached (as in the complex type).  GlcNAc residues added to the core in the hybrid and complex N-glycoproteins are called antennae.  The figure below shows an example of a biantennary N-linked glycan with two GlcNAc branches linked to the core. The core is outlined in red and the two GlcNAcs are labeled 1 and 2.  

    N-linked Glycoproteins – biantennary.svg
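    The distinctions among the three types described above can be written as a simple decision rule. This is only an illustrative sketch: the function name and its two inputs (the number of mannose residues beyond the (Man)3(GlcNAc)2 core and the number of GlcNAc antennae attached to core mannoses) are assumptions chosen for the example, not a standard glycan representation.

```python
def classify_n_glycan(extra_mannose, antenna_glcnac):
    """Classify an N-glycan from two counts taken relative to the shared core.

    extra_mannose:  mannose residues outside the (Man)3(GlcNAc)2 core
    antenna_glcnac: GlcNAc residues (antennae) attached to core mannoses
    """
    if extra_mannose < 0 or antenna_glcnac < 0:
        raise ValueError("residue counts cannot be negative")
    if extra_mannose == 0 and antenna_glcnac == 0:
        return "core only"
    if antenna_glcnac == 0:
        return "high mannose"  # only mannose added beyond the core
    if extra_mannose == 0:
        return "complex"       # antennae present, no mannose outside the core
    return "hybrid"            # both extra mannose and GlcNAc antennae


if __name__ == "__main__":
    print(classify_n_glycan(6, 0))  # a Man9-like glycan
    print(classify_n_glycan(0, 2))  # a biantennary glycan
    print(classify_n_glycan(2, 1))
```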

    Complex glycans also have bi-, tri-, and tetraantennary forms and comprise most N-glycans.  As shown above, they usually end with sialic acid residues.   About 50% of the surface area of the SARS-CoV-2 spike protein is covered with glycans, as shown in one model structure below. The protein surface is in gray and the glycans (biantennary LacNAc N-glycans) are shown in spacefill CPK with carbon in cyan  (PDB file coordinates 5.Swiss.3.M3F1.CYX.TER from Analysis of the SARS-CoV-2 spike protein glycan shield: implications for immune recognition. Oliver C. Grant, David Montgomery, Keigo Ito, Robert J. Woods.  doi:



    In the hybrid oligosaccharide shown above, one terminus contains Gal(β1,4)GlcNAc. However, in mammals other than humans, apes, and Old World monkeys, an additional Gal is often connected in an α1,3 link to the Gal to give a terminus of Gal(α1,3)Gal(β1,4)GlcNAc. These animals have an additional enzyme, an α1,3 Gal transferase. Bacteria also have this enzyme, and since we have been exposed to this link through bacterial infection, we mount an immune response against it. Why is this important? Pig hearts turn out to be similar to human hearts, so they might be good candidates for transplantation into humans (xenotransplants). However, the Gal-α1,3-Gal link is recognized as foreign, and we mount a significant immune response against it. Several biotech firms are trying to delete the pig α1,3 Gal transferase, which would prevent the addition of the terminal Gal and make pigs good donors for transplanted hearts.


    Here is an example of an N-linked glycoprotein, human beta-2-glycoprotein-I (Apolipoprotein-H).

    Exercise \(\PageIndex{1}\)

    The glycan structures for the beta-2-glycoprotein-I are shown below.  Identify the monosaccharides in each and specify the asparagine to which each glycan is linked.

    human beta-2-glycoprotein-I_GlycanSymbols.svg




    Here is a model of the GP120 HIV protein that contains high mannose, complex, and hybrid N-linked glycans.  Most glycoproteins in the Protein Data Bank do not contain attached glycans.  The glycans here were added with the program GlyProt at 3 of 17 possible Asn residues that would presumably have attached glycans.  Use your mouse or keypad to hover over the monomers in the attached glycans.  Abbreviations for the given residue will appear (adm = alpha-D-Man, bdg = beta-D-Glc or Gal, adn = alpha-D-neuraminic acid). 

    Exercise \(\PageIndex{1}\)

    Which Asn chains contain the high Man, complex, and hybrid glycans?



    The coronavirus pandemic of 2020-21 has been deadly (almost 600,000 deaths in the USA alone). However, the 1918 influenza pandemic was far worse.  An avian version of the influenza virus could jump to humans, creating a far more lethal pandemic than COVID-19.

    Influenza and the Avian Flu

    The influenza virus is a simple yet deadly virus (shown below). It interacts with human cells through a surface protein, hemagglutinin (HA).

    credit: © Paul Digard, Dept Pathology, University of Cambridge

    The virus binds to host cells through interaction of HA with cell surface carbohydrates. Once bound the virus internalizes, ultimately leading to release of the RNA genome of the virus into the host cell.

    Animation: Influenza entering cell

    The hemagglutinin protein is the most abundant protein on the viral surface (as surmised from antibody formation). Fifteen avian and mammalian variants have been identified (based on antibody studies). Only three have adapted to humans in the last 100 years, giving the pandemic strains H1 (1918), H2 (1957), and H3 (1968). Three recent avian variants (H5, H7, and H9) have jumped directly to humans but have low human-to-human transmissibility.

    The influenza hemagglutinin protein has the following characteristics:

    • mature form is homotrimer (3 identical protein subunits), MW 220,000 with multiple sites for covalent attachment of sugars. Hemagglutinin is a glycoprotein.
    • each monomer synthesized as single polypeptide chain precursor (HA0) that is cleaved into HA1 and HA2 subunits by the protease trypsin in epithelial cells of lung.
    • structure known for human (H3), swine (H9), avian (H5) subtypes.

    Hemagglutinin binds to sialic acid (Sia), which is covalently attached to many cell membrane glycoproteins. The sialic acid is usually connected through an α(2,3) or α(2,6) link to galactose on N-linked glycoproteins. The subtypes found in avian (and equine) influenza isolates bind preferentially to Sia (α2,3) Gal, which predominates in the avian GI tract where the viruses replicate. Human influenza isolates prefer Sia α(2,6) Gal. Human viruses of the H1, H2, and H3 subtypes (the causes of the 1918, 1957, and 1968 pandemics) recognize Sia α(2,6) Gal, the major form in the human respiratory tract. The swine influenza HA binds to Sia α(2,6) Gal and some Sia (α2,3) Gal, both of which are found in swine.

    Sia α(2,6) Gal (Human)

    Sia α(2,3) Gal (Avian and some Swine)

    Sialic 26 Gal

    Sialic 23 Gal

    (both made with Sweet, with an OH, not AcNH, on C5 of sialic acid)

    Structures from:

    The present avian flu (H5N1) is deadly but lacks human to human transmissibility at the moment. Why? One reason is that it appears to bind deep in the lungs and is not released easily on coughing or sneezing. It appears that cell surface glycoproteins deeper in the respiratory tract have Sia (α2,3) Gal which accounts for this pathology.

    The virus, before it leaves the cell, forms a bud on the intracellular side of the cell with the HA and NA in the cell membrane of the host cell. The virus in this state would not leave the cell since its HA molecules would interact with sialic acid residues in the host cell membrane, holding the virus in the membrane. Neuraminidase hydrolyzes sialic acid from cell surface glycoproteins, allowing the virus to complete the budding process and be released from the cell as new viruses. The drugs oseltamivir (Tamiflu) and zanamivir (Relenza) bind to and inhibit neuraminidase, whose activity is necessary for viral release from infected cells. Tamiflu appears to work against N1 of the present H5N1 avian influenza viruses. Governments across the world are stockpiling this drug in case of a pandemic caused by the avian virus jumping directly to humans and becoming transmissible from human to human.


    O-linked Glycoproteins

    The carbohydrates (CHOs) are usually attached through a Gal (β 1→3) GalNAc to a Ser or Thr of a protein.

    Figure: O-linked Glycoproteins

    O linked glycoprotein

    The blood group antigens (CHOs on cells attached to either proteins or lipids) are examples. The sugars shown as chairs (in contrast to structures found in many texts) in the figure below are the blood group antigens. They are attached to a core heterosaccharide (shown as a red ellipse below) which is connected to either a membrane glycoprotein or glycolipid.

    Blood Group Antigens

    Here is a symbolic diagram of the A antigen in the glycolipid form.


    The trimeric branched residues on the left hand side represent the A antigen shown above.  The red triangle is L-fucose.  Yellow represents galactose or GalNAc, while blue is glucose or GlcNAc.


    Some proteins are so modified with CHOs that they contain more CHOs than amino acids. Proteins linked to glycosaminoglycans (GAGs) are together called proteoglycans (PGs).  They consist of a core protein linked to one or more glycosaminoglycans. GAGs are linear sulfated glycans which we described in Chapter 7.2.  The structures of a few proteoglycans are known. The GAGs are O-linked to the protein, typically to a Ser of a Ser-Gly dipeptide often repeated in the protein. Some of the proteoglycans also contain N-linked oligosaccharide groups.

    PGs can be soluble and found in the extracellular matrix, or they can be integral membrane proteins. There are about 43 genes for proteoglycans.  Some, through differential splicing of the RNA transcript, give soluble or transmembrane forms. Given the diversity of sugars and the varying extent of sulfation, the CHO part of PGs provides an incredible variety of binding structures at or near the cell surface. The figure below shows the variety of proteoglycans found in mammalian cells. PGs help form the extracellular matrix, which provides the rich and supportive environment between cells.


    One PG, syndecan, binds through its intracellular domain to the internal cytoskeleton of the cell, while interacting with another protein, fibronectin, in the extracellular matrix. Fibronectin also binds other molecules which can regulate cellular growth and other interactions. PGs act like glue in connecting the extracellular and intracellular functions of the cell.  There are four different core syndecan proteins (SDCs 1–4), with SDC4 lacking the cytoplasmic and transmembrane tether to the membrane and thus existing in soluble form in the extracellular matrix. The glycan components of syndecans are mostly heparan sulfate, while SDC 1 and 3 also have two chondroitin sulfate chains. 

    Most proteins bind PGs through a PG binding motif of BBXB or BBBXXB where B is a basic amino acid. Some proteins bind to specific sequences in specific GAGs. For instance, antithrombin III, an inhibitor of blood clotting, binds specifically to heparin, which enhances its interaction with the clotting proteins thrombin and Factor Xa.  The models below show a five-residue fragment of heparin interacting with the key amino acid side chains of Factor Xa.
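    The BBXB and BBBXXB binding motifs mentioned above lend themselves to a simple sequence search. The sketch below is illustrative only and rests on two assumptions not stated in the text: B is taken to be Lys or Arg (His can be added through the `basic` argument), and X is allowed to be any residue. The function name and test sequences are hypothetical.

```python
import re

BASIC = "KR"  # assumption: B = Lys or Arg; pass basic="KRH" to include His


def find_pg_binding_motifs(sequence, basic=BASIC):
    """Return sorted (motif name, 1-based start, matched text) for BBXB and BBBXXB."""
    b = f"[{basic}]"
    patterns = {
        "BBXB": f"{b}{b}.{b}",
        "BBBXXB": f"{b}{b}{b}..{b}",
    }
    seq = sequence.upper()
    hits = []
    for name, pattern in patterns.items():
        # A lookahead is used so overlapping occurrences are all reported.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, m.start() + 1, m.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical peptide used only for illustration.
    for hit in find_pg_binding_motifs("AAKKAKAA"):
        print(hit)
```

    A sequence match only flags a candidate site; whether it actually binds a GAG depends on the folded structure and on the specific GAG.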

    For those more chemically oriented, the extracellular matrix (ECM) might appear to be a nondescript mess, since chemists are used to defined structures.  The figure below shows a cartoon of the ECM and may help clarify the components.  Few structure files exist for them given the inherent flexibility of the glycan components.



    Frontiers in Neuroscience 9(50):98 (2015).  DOI: 10.3389/fnins.2015.00098.  CC BY 4.0

    Cell Walls and Glycolipids

    In contrast to animal cells, bacteria and plant cells have a cell wall in addition to a lipid bilayer membrane. These walls are essentially carbohydrate polymers that determine cell shape and offer protection from exterior pathogens, hypotonic conditions, and high internal osmotic pressures, preventing swelling and bursting of the cells.  This is especially important in plants, which need strength and rigidity against the "turgor" pressure of the aqueous cytoplasm against the cell membrane.  This prevents wilting in plants.  The cell walls in plants and probably bacteria are involved in cell signaling across the cell membrane.

    Bacteria Cell Walls

    Two types of cell walls exist.

    a. Gram positive bacteria

    These bacteria can be stained with Gram stain. The wall consists of a GlcNAc (β 1→4) MurNAc repeat. (GlcNAc is often abbreviated as NAG while MurNAc is abbreviated as NAM.) This is similar to the GlcNAc (β 1→4) GlcNAc homopolymer chitin, except that every other GlcNAc contains a lactate molecule covalently attached in an ether linkage to the C3 hydroxyl to form the monomer N-acetylmuramic acid. A pentapeptide (Ala-D-isoGlu-Lys-D-Ala-D-Ala) is attached in amide link to the carboxyl group of the lactate in MurNAc. The GlcNAc (β 1→4) MurNAc strands are covalently connected by a pentaglycine bridge between the epsilon amino group of the pentapeptide Lys on one strand and the terminal D-Ala of a pentapeptide on another strand. A small part of the structure of a gram positive bacterial cell wall is shown in the figure below.  It shows one repeating GlcNAc-MurNAc disaccharide unit in front (darker) and one in back (lighter) connected through the peptides shown.



    A symbolic structure of a larger section of the gram positive cell wall is shown below, using the correct glycan symbolic notation.


    One final structure is found in Gram + peptidoglycan cell walls. Teichoic acids are often attached to carbon 6 of MurNAc. Teichoic acid is a polymer of glycerol or ribitol in which alternating GlcNAc and D-Ala are linked to the middle C of the glycerol. Multiple glycerols are linked through phosphodiester bonds. These teichoic acids often make up 50% of the dry weight of the cell wall, and present a foreign (or antigenic) surface to infected hosts. They often serve as receptors for viruses that infect bacteria (called bacteriophages).

    Notice that all monomeric units of peptidoglycan and attached teichoic acid derivatives are covalently attached to form one large molecule comprising the entire cell wall!  These structures (along with the gram negative cell wall structures) are the largest single macromolecules in nature.

    Figure: Teichoic Acid

    Teichoic Acid


    b. Gram negative bacteria

    These bacteria can NOT be stained with Gram stain. The wall consists of the same structure as in Gram positive bacteria, but the GlcNAc (β 1→4) MurNAc strands are covalently connected through a direct amide bond between a derivative of Lys, meso-diaminopimelic acid (m-A2pm), on one peptide strand and the last D-Ala of a pentapeptide on another strand (i.e. there is no pentaGly spacer).  The connector peptide is Ala-D-isoGlu-m-A2pm-D-Ala-D-Ala.

    m-A2pm replaces Lys 3 of the peptide in most Gram-negative species and in Gram-positive bacteria of the genus Bacillus and mycobacteria. The stereochemistries at its two chiral centers are different (R and S), but because the molecule has a plane of symmetry, it is a meso compound: a diastereomer that does not have a distinct enantiomer. 



    A small part of the structure of a gram negative bacterial cell wall is shown in the figure below.



    Here is an iCn3D model (not a crystal or NMR structure) of the Gram negative peptidoglycan of E. coli.  The PDB coordinates were kindly provided by Jame Gumbart.  The peptide part of the peptidoglycan is represented in spacefill. The repeating (GlcNAc-MurNAc)n and pentapeptide 

    In addition, Gram negative bacteria don't have teichoic acid polymers. Rather, they have a second, outer lipid bilayer. The cell wall peptidoglycan (PG) is sandwiched between the inner and outer bilayers. The space between the lipid bilayers is called the periplasm. The outer membrane is coated with a lipopolysaccharide (LPS) of varying composition. The LPS determines the antigenicity of the bacteria. The different LPS are called the O-antigens. The figure below shows the overall structure of the gram negative bacterial membrane organization.  (PS is LPS, PG is peptidoglycan)



    Gram negative wall labelled.  Can be used for non-commercial purposes, when acknowledged.

    A detailed view of the structure of the lipopolysaccharide (LPS) from Salmonella typhimurium is shown below. 





    Exercise \(\PageIndex{1}\)

    Describe the differences between the cell walls of Gram positive and Gram negative bacteria.



    c.  Archaeal Cell Membranes and Walls

    We have already discussed that the lipids in Archaeal cell membranes contain L (instead of D) glycerol derivatives and that ether links (more stable in reactive environments) replace ester links, with isoprenoid (sometimes branched) chains replacing fatty acid chains.  The cell wall is quite different as well, and some Archaea don't have one.  The type of cell wall depends on the environmental needs for stability. Archaeal cell walls don't contain peptidoglycans. The figure below shows four different types.


    • pseudomurein - This is the closest to the peptidoglycans presented above.  Instead of repeating disaccharide units of (NAM-NAG)n, they have a repeating disaccharide unit of N-acetyltalosaminuronic acid (NAT)-NAG.  The structure of NAT is shown below.


    • methanochondroitin - This is similar to the glycosaminoglycan chondroitin sulfate.
    • S-Layer
    • Sheath/S-Layer

    d. Plant Cell Wall

    If you thought bacterial cell walls were complicated, wait until you see plant cell walls!  There are about 35 different types of plant cells and each may have a different cell wall depending on the local needs of a given cell.  Cells synthesize a thin cell wall that extends and stays thin as the cell grows.  

    The figure below shows the primary cell wall of plants.  The primary cell wall contains cellulose microfibrils (no surprise) and two other polymers, pectin and hemicellulose.   The middle lamella consisting of pectins is somewhat analogous to the extracellular matrix discussed above.


    After cell growth, the cell often synthesizes a secondary cell wall that is thicker than the first for extra rigidity.  Since the enzymatic machinery to synthesize it is in the cytoplasm and in the cell membrane, it is deposited between the cell membrane and the primary cell wall, as shown in the animated gif below.



    Here is a representation that shows both the primary and secondary cell wall.


    NAC-MYB-based transcriptional regulation of secondary cell wall biosyn in land plantsIMAGE-01.svg

    Nakano Yoshimi et al. Frontiers in Plant Science (6), 288 (2015)  Creative Commons Attribution License (CC BY).

    The middle lamella, which contains pectins, lignins and some proteins, helps "glue together" the primary cell walls of neighboring plant cells.

    Primary Cell Wall:

    The main component of the primary plant wall is the homopolymer cellulose (40%-60% of the mass), in which the glucose monomers are β(1→4)-linked into strands that collect into microfibrils through hydrogen bond interactions.  Two other groups of polymers, hemicellulose and pectin, make up the rest of the plant cell wall.

    Hemicellulose can make up 20-40% of the mass.  These polymers have β(1→4) backbones of glucose, mannose, or xylose (called xyloglucans, xylans, mannans, galactomannans, glucomannans, and galactoglucomannans), along with some β(1→3 and 1→4)-glucans.  The most abundant hemicelluloses in higher plants are the xyloglucans, which have a cellulose backbone linked at O6 to α-D-xylose. Pectin consists of linked galacturonic acids forming homogalacturonans, rhamnogalacturonans, and rhamnogalacturonans II (RGII) [12] [13]. Homogalacturonans, made of (α1→4)-linked D-GalA, make up more than 50% of the pectin.

    The figure below shows some of the cell wall components of a plant.


    Costa and Plazanet.  Advances in Biological Chemistry 06(03):70-105.  DOI: 10.4236/abc.2016.63008.  License CC BY 4.0

    Secondary Cell Wall

    The structure of the secondary cell wall depends on the function and environment of the cell. It contains cellulose fibers, hemicellulose, and in addition a new polymer, lignin.  It is abundant in xylem vessels and fiber cells of woody plants.  It gives the plant extra stability and new functions, such as transport of fluids within the plant through channels.

    Lignins, which can make up to 25% of the biomass weight, are made from derivatives of phenylalanine, more directly from cinnamic acid.  Cinnamic acid, which is made from phenylalanine, is hydroxylated and converted through other steps to hydroxycinnamyl alcohols called monolignols, as shown in the figure below.  Three common monomer (M) derivatives, p-coumaryl, coniferyl, and sinapyl alcohols, can polymerize into lignins, with the units in the polymer (P) named hydroxyphenyl, guaiacyl, and syringyl, respectively. 


    Lignols are activated phenolic compounds that form phenoxide free radicals (catalyzed by enzymes called peroxidases), which can attack other lignols to form covalent dimers.  Reaction mechanisms for the dimerization of the MS sinapyl alcohol free radical are shown as an example in the figure below.


    Now imagine this polymerization continuing through formation of additional phenolic free radicals and coupling at a myriad of sites to form a huge covalent lignin polymer.  Here is one example of a larger lignin.


    Lignin.png   Creative Commons Attribution-Share Alike 3.0 Unported license.

    Biofuels and Climate Change  

    "Plant cell walls represent an enormous biomass resource for the generation of biofuels and chemicals. As lignocellulose property principally determines biomass recalcitrance, the genetic modification of plant cell walls has been posed as a powerful solution. Here, we review recent progress in understanding the effects of distinct cell wall polymers (cellulose, hemicelluloses, lignin, pectin, wall proteins"  continue

    Finally, here is an image of a poplar (a tree) cell wall, made using stimulated Raman scattering (SRS), showing lignin, cellulose, and lipids in secondary xylem cell walls.  



    SRS images of lignin, cellulose, and lipids in xylem cell walls. d-e are SRS images of lignin, cellulose, and lipids in the secondary xylem cell walls of poplar, respectively.  The 3D surface plots are shown in gi

    Xu, H., et al. A label-free, fast and high-specificity technique for plant cell wall imaging and composition analysis. Plant Methods 17, 29 (2021).


    The Extracellular Matrix (ECM) and Basement Membranes

    We won't formally discuss cell membranes until Chapter 11, but since anyone reading this book has previously seen biological membranes (including the Gram negative and positive bilayers discussed above), let's explore a term that most chemistry students, but perhaps not biology students, will find very confusing: the basement membrane.  The basement membrane is encountered so often that we will explore its overall structure here even though it is not a lipid bilayer.  It fits well here since it is a complex structure consisting of proteins and proteoglycans.  It is very amorphous, which makes its structure difficult for those hoping for crystal structures or even complex bilayers.  It is appropriate to discuss it after the plant cell wall since they are somewhat similar.  We will offer a cursory explanation.  For a great overall introduction, please visit Introduction to Extracellular Matrix and Cell Adhesion in Biology LibreTexts.  Some of the images (when noted) below come from that Cell Biology book chapter.

    From Introduction to Extracellular Matrix and Cell Adhesion

    The extracellular matrix (ECM) is a general term for the extremely large proteins and polysaccharides that are secreted by some cells in a multicellular organism, and which act as connective material to hold cells in a defined space. Cell density can vary greatly between different tissues of an animal, from tightly-packed muscle cells with many direct cell-to-cell contacts to liver tissue, in which some of the cells are only loosely organized, suspended in a web of extracellular matrix, shown in the figure below.  

    Screen Shot 2019-01-07 at 8.16.56 PM.png

    Extracellular matrix (ECM). Typical components include collagen, proteoglycans (with hydration shell depicted around sugars), fibronectin, and laminin. The cellular receptors for a number of these ECM components are integrins, although the exact integrin αβ pair may differ.

    The ECM is a generic term encompassing mixtures of polysaccharides and proteins, including collagens, fibronectins, laminins, and proteoglycans, all secreted by the cell. The proportions of these components can vary greatly depending on tissue type. Two, quite different, examples of ECM are the basement membrane underlying the epidermis of the skin, a thin, almost two-dimensional layer that helps to organize the skin cells into a nearly-impenetrable barrier to most simple biological insults, and the massive three-dimensional matrix surrounding each chondrocyte in cartilaginous tissue. The ability of the cartilage in your knee to withstand the repeated shock of your footsteps is due to the ECM proteins in which the cells are embedded, not to the cells that are actually rather few in number and sparsely distributed. Although both types of ECM share some components in common, they are clearly distinguishable not just in function or appearance, but in the proportions and identity of the constituent molecules.

    Here is the general structure of the basement membrane. Think of it as an amorphous polymer mixture (somewhat similar to a polyacrylamide gel).  


    Mentor and DuBois. Journal of Cell Biology, February 2012.  DOI: 10.1155/2012/723419. Source: PubMed. Creative Commons License. 

