Biology LibreTexts

3.3: APPENDIX B- Protein and Nucleotide Sequences

    B1: Protein sequence of c-Abl(229-511) (283 AA; MW 32,730) from ABL1_HUMAN, Swiss-Prot accession number P00519 (http://www.expasy.org/uniprot/P00519). Numbering is for isoform 1A. For isoform 1B numbering, add 19, giving Abl(248-530). We will use 1A numbering throughout this course.
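
The conversion between isoform 1A and 1B numbering is a fixed offset, so it can be sketched in a few lines of Python. The function and constant names below are illustrative and not part of the course materials; only the offset of 19 and the residue range 229-511 come from the text.

```python
# Offset between isoform 1A and isoform 1B residue numbering (from the text).
ISOFORM_1B_OFFSET = 19


def to_isoform_1b(residue_1a: int) -> int:
    """Convert an isoform 1A residue number to isoform 1B numbering."""
    return residue_1a + ISOFORM_1B_OFFSET


def to_isoform_1a(residue_1b: int) -> int:
    """Convert an isoform 1B residue number back to isoform 1A numbering."""
    return residue_1b - ISOFORM_1B_OFFSET


# Construct boundaries expressed in isoform 1B numbering.
print(to_isoform_1b(229), to_isoform_1b(511))  # 248 530
# Construct length: both end residues are included in the count.
print(511 - 229 + 1)  # 283
```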

    [Image: protein sequence of c-Abl(229-511)]

    B2: Nucleotide sequence encoding Abl(229-511)

    The nucleotide sequence of the Abl(229-511) construct that we are working with is identical to the kinase domain of the Bcr-Abl protein. The DNA sequence for the kinase domain is shown below.

    GenBank accession number for the full human Bcr-Abl protein: NM_005157

    (http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=NM_005157)

    Note: amino acid 229 corresponds to nucleotides 688-690 (not 685-687 as might be expected) because there are 3 bases before the start of the open reading frame (ORF). Before working with any nucleotide sequence, you should confirm that the DNA is in frame and encodes the expected peptide or protein sequence using a DNA-to-protein translation tool (e.g., http://www.expasy.ch/tools/dna.html).
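
The numbering arithmetic in this note can be written out explicitly. The sketch below assumes only what the note states: each residue is encoded by three consecutive nucleotides, and three bases precede the first codon of the ORF. The function names are illustrative.

```python
from typing import Tuple

# Number of bases before the first codon of the ORF (from the note above).
LEADING_BASES = 3


def codon_range(residue: int, leading_bases: int = LEADING_BASES) -> Tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of a residue's codon."""
    first = leading_bases + 3 * (residue - 1) + 1
    return first, first + 2


def residue_at(position: int, leading_bases: int = LEADING_BASES) -> int:
    """Return the residue whose codon contains a 1-based nucleotide position."""
    if position <= leading_bases:
        raise ValueError("position lies before the start of the ORF")
    return (position - leading_bases - 1) // 3 + 1


print(codon_range(229))                   # (688, 690)
print(codon_range(229, leading_bases=0))  # (685, 687), the naive expectation
print(codon_range(511))                   # (1534, 1536)
print(residue_at(688), residue_at(690))   # 229 229
```

With these coordinates, the listing below should start at nucleotide 688 and end at nucleotide 1536.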

    (688) tcc cccaactacg acaagtggga gatggaacgc

    721 acggacatca ccatgaagca caagctgggc gggggccagt acggggaggt gtacgagggc

    781 gtgtggaaga aatacagcct gacggtggcc gtgaagacct tgaaggagga caccatggag

    841 gtggaagagt tcttgaaaga agctgcagtc atgaaagaga tcaaacaccc taacctggtg

    901 cagctccttg gggtctgcac ccgggagccc ccgttctata tcatcactga gttcatgacc

    961 tacgggaacc tcctggacta cctgagggag tgcaaccggc aggaggtgaa cgccgtggtg

    1021 ctgctgtaca tggccactca gatctcgtca gccatggagt acctggagaa gaaaaacttc

    1081 atccacagag atcttgctgc ccgaaactgc ctggtagggg agaaccactt ggtgaaggta

    1141 gctgattttg gcctgagcag gttgatgaca ggggacacct acacagccca tgctggagcc

    1201 aagttcccca tcaaatggac tgcacccgag agcctggcct acaacaagtt ctccatcaag

    1261 tccgacgtct gggcatttgg agtattgctt tgggaaattg ctacctatgg catgtcccct

    1321 tacccgggaa ttgacctgtc ccaggtgtat gagctgctag agaaggacta ccgcatggag

    1381 cgcccagaag gctgcccaga gaaggtctat gaactcatgc gagcatgttg gcagtggaat

    1441 ccctctgacc ggccctcctt tgctgaaatc caccaagcct ttgaaacaat gttccaggaa

    1501 tccagtatct cagacgaagt ggaaaaggag ctgggg
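
The in-frame check recommended in the note can also be done locally. The sketch below is one possible approach, not the course's tool: it builds the standard genetic code table, removes the whitespace and position labels from the listing above, and translates the result. All variable and function names are illustrative.

```python
from itertools import product

# Abl(229-511) coding sequence from the listing above (position labels removed).
ABL_KINASE_DNA = """
tcc cccaactacg acaagtggga gatggaacgc
acggacatca ccatgaagca caagctgggc gggggccagt acggggaggt gtacgagggc
gtgtggaaga aatacagcct gacggtggcc gtgaagacct tgaaggagga caccatggag
gtggaagagt tcttgaaaga agctgcagtc atgaaagaga tcaaacaccc taacctggtg
cagctccttg gggtctgcac ccgggagccc ccgttctata tcatcactga gttcatgacc
tacgggaacc tcctggacta cctgagggag tgcaaccggc aggaggtgaa cgccgtggtg
ctgctgtaca tggccactca gatctcgtca gccatggagt acctggagaa gaaaaacttc
atccacagag atcttgctgc ccgaaactgc ctggtagggg agaaccactt ggtgaaggta
gctgattttg gcctgagcag gttgatgaca ggggacacct acacagccca tgctggagcc
aagttcccca tcaaatggac tgcacccgag agcctggcct acaacaagtt ctccatcaag
tccgacgtct gggcatttgg agtattgctt tgggaaattg ctacctatgg catgtcccct
tacccgggaa ttgacctgtc ccaggtgtat gagctgctag agaaggacta ccgcatggag
cgcccagaag gctgcccaga gaaggtctat gaactcatgc gagcatgttg gcagtggaat
ccctctgacc ggccctcctt tgctgaaatc caccaagcct ttgaaacaat gttccaggaa
tccagtatct cagacgaagt ggaaaaggag ctgggg
"""

# Standard genetic code, with bases ordered t, c, a, g at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("tcag", repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna = "".join(ABL_KINASE_DNA.split()).lower()
protein = translate(dna)

print(len(dna), len(protein))  # 849 283
print("*" in protein)          # False: no stop codon in this reading frame
print(protein[:11])            # SPNYDKWEMER
```

A length of 283 residues with no internal stop codon is consistent with the construct described in B1; the translated string can then be compared residue by residue against the protein sequence.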

    B3: Point mutations in the kinase domain of Abl detected in leukemia patients

    Amino acid substitution locations in mutant Bcr-Abl are indicated in red with the amino acid substitution(s) in bold directly above the wild type residue:

    [Image: Abl kinase domain sequence with substitution sites in red and the substituted residue(s) in bold above the wild-type residue]

    List of mutations:

    The amino acid substitutions are indicated in bold, followed by the corresponding nucleotide substitution* in parentheses, and a fraction (Y/Z), where Y = the number of patient cases in which the given mutation was detected and Z = the number of cases tested for the given mutation.

    * This nucleotide numbering has been converted from GenBank entry M1472 numbering to the nucleotide numbering found in the GenBank entry in Appendix B2 (entry NM_005157) and used throughout Modules 4 and 5.

    M244V (A733G) in 3/125, L248V in 2/29, G250E (G752A) in 6/87, G250R (G751A) in 1/117, Q252R in 1/32, Q252H (G759C/T) in 12/125, Y253H (T760C) in 9/154, Y253F (A761T) in 6/125, E255K (G766A) in 28/182, E255V (A767T) in 3/101, D276G (A830G) in 1/33, T277A in 1/117, V289A, F311L in 1/24, T315I (C947T) in 27/194, T315N (C947A) in 1/33, F317L (C954G) in 4/60, M343T (T1031C) in 1/32, M351T (T1055C) in 24/204, E355G in 4/25, F359V (T1078G) in 4/59, V379I (G1138A) in 1/32, F382L (T1147C) in 1/32, L387M (T1162A) in 2/149, L387F in 3/117, H396P (A1190C), H396R (A1190G) in 5/12, A397P in 1/117, S417Y (C1253A) in 1/27, E459K (G1378A) in 1/27, F486S (T1460C) in 1/27
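
Each entry that lists a nucleotide change can be checked against the sequence in Appendix B2. The sketch below is a hypothetical helper, not part of the course materials: it assumes the B2 numbering (first listed base is nucleotide 688, three bases precede the ORF) and the standard genetic code, and it handles one substitution at a time, so an entry such as G759C/T has to be split into two calls.

```python
import re
from itertools import product

# Abl(229-511) coding sequence from Appendix B2; its first base is nucleotide 688.
SEQ_START = 688
DNA = "".join("""
tcc cccaactacg acaagtggga gatggaacgc
acggacatca ccatgaagca caagctgggc gggggccagt acggggaggt gtacgagggc
gtgtggaaga aatacagcct gacggtggcc gtgaagacct tgaaggagga caccatggag
gtggaagagt tcttgaaaga agctgcagtc atgaaagaga tcaaacaccc taacctggtg
cagctccttg gggtctgcac ccgggagccc ccgttctata tcatcactga gttcatgacc
tacgggaacc tcctggacta cctgagggag tgcaaccggc aggaggtgaa cgccgtggtg
ctgctgtaca tggccactca gatctcgtca gccatggagt acctggagaa gaaaaacttc
atccacagag atcttgctgc ccgaaactgc ctggtagggg agaaccactt ggtgaaggta
gctgattttg gcctgagcag gttgatgaca ggggacacct acacagccca tgctggagcc
aagttcccca tcaaatggac tgcacccgag agcctggcct acaacaagtt ctccatcaag
tccgacgtct gggcatttgg agtattgctt tgggaaattg ctacctatgg catgtcccct
tacccgggaa ttgacctgtc ccaggtgtat gagctgctag agaaggacta ccgcatggag
cgcccagaag gctgcccaga gaaggtctat gaactcatgc gagcatgttg gcagtggaat
ccctctgacc ggccctcctt tgctgaaatc caccaagcct ttgaaacaat gttccaggaa
tccagtatct cagacgaagt ggaaaaggag ctgggg
""".split())

# Standard genetic code, with bases ordered t, c, a, g at each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("tcag", repeat=3), AMINO)}


def check_mutation(aa_change: str, nt_change: str) -> bool:
    """Return True if nt_change (e.g. 'C947T') produces aa_change (e.g. 'T315I')."""
    wt_aa, residue, mut_aa = re.fullmatch(r"([A-Z])(\d+)([A-Z])", aa_change).groups()
    wt_nt, position, mut_nt = re.fullmatch(r"([ACGT])(\d+)([ACGT])", nt_change).groups()
    # The codon of residue n starts at nucleotide 3n + 1 (three bases precede the ORF).
    codon_index = 3 * int(residue) + 1 - SEQ_START
    offset = int(position) - SEQ_START - codon_index
    if not 0 <= codon_index <= len(DNA) - 3 or not 0 <= offset <= 2:
        return False  # residue outside the construct, or base outside its codon
    codon = DNA[codon_index:codon_index + 3]
    if codon[offset] != wt_nt.lower():
        return False  # the stated wild-type base does not match the sequence
    mutant = codon[:offset] + mut_nt.lower() + codon[offset + 1:]
    return CODON_TABLE[codon] == wt_aa and CODON_TABLE[mutant] == mut_aa


for aa, nt in [("M244V", "A733G"), ("E255K", "G766A"), ("T315I", "C947T")]:
    print(aa, nt, check_mutation(aa, nt))  # each line ends with True
```

A False result would point to a mismatch between the numbering schemes, a transcription error in the entry, or a change that does not produce the stated substitution.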


    This page titled 3.3: APPENDIX B- Protein and Nucleotide Sequences is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Elizabeth Vogel Taylor (MIT OpenCourseWare) .