Biology LibreTexts

9.2: Bioinformatics

    5.2 Bioinformatics

    Recent technological advances have set off an unprecedented revolution in science by providing large amounts of “omic” data. The growing generation of this information and its availability in public databases were, and still are, a challenge for professionals from different areas. What exactly is the challenge? In biology, it is to make sense of the enormous amount of structural and sequence data generated at multiple levels of biological systems. In bioinformatics, it is to develop the statistical and computational tools that help reveal the mechanisms underlying the biological questions under study. Given the complexity of science, even this is a highly reductionist view. The era of a “new biology” has emerged alongside the birth and development of other sciences, such as bioinformatics and computational biology, which interface closely with molecular biology. Although both are recent fields, bioinformatics and genomics have evolved interdependently and have had a historic impact on the available knowledge.

    Bioinformatics is a hybrid science that links biological data with techniques for information storage, distribution, and analysis to support multiple areas of scientific research, including biomedicine. It mainly involves molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. Bioinformatics is fed by high-throughput data-generating experiments, including genomic sequence determinations and measurements of gene expression patterns. Database projects curate and annotate the data and then distribute it via the World Wide Web. Mining these data leads to scientific discoveries and to the identification of new clinical applications.

    A bioinformatics solution usually involves the following steps:

    • Collect statistics from biological data
    • Build a computational model
    • Solve a computational modeling problem
    • Test and evaluate a computational algorithm
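
    As a minimal sketch of the first step listed above (collecting statistics from biological data), the Python snippet below counts the bases in a DNA string and computes its GC content. The function names and the example sequence are invented for illustration and are not taken from any particular tool.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Count how often each base occurs in a DNA sequence (case-insensitive)."""
    return dict(Counter(seq.upper()))


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    example = "ATGCGCGATATTAGC"  # made-up sequence, for illustration only
    print(base_composition(example))
    print(round(gc_content(example), 3))
```

    Summary statistics of this kind are typically the input to the later steps, for example as parameters of a simple sequence model.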

    It also addresses the following aspects:

    • Types of biological information and databases
    • Sequence analysis and molecular modeling
    • Genomic analysis
    • Systems biology

    In the field of medicine in particular, a number of important applications for bioinformatics have been discovered. For example, it is used to identify correlations between gene sequences and diseases, to predict protein structures from amino acid sequences, to aid in the design of novel drugs, and to tailor treatments to individual patients based on their DNA sequences (pharmacogenomics). In bioinformatics, we can now conduct global analyses of all the available data with the aim of uncovering common principles that apply across many systems and highlight novel features.

    Some applications of bioinformatics in biotechnology are given below:

    • Genomics

    To manage an escalating amount of genomic information, bioinformatics tools are required to maintain and analyze DNA sequences from different organisms. Determination of sequence homology, gene finding, identification of coding regions, and structural and functional analyses of genomic sequences are all made possible by different bioinformatics tools and software packages.
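
    Coding region identification can be illustrated with a deliberately simplified open reading frame (ORF) scan. The sketch below assumes the forward strand only, ATG as the sole start codon, and the three standard stop codons; `find_orfs` and `min_codons` are hypothetical names chosen for this example, and real gene finders use far richer models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list:
    """Return (start, end) index pairs of ORFs on the forward strand.

    start is the index of the ATG; end is one past the stop codon,
    so seq[start:end] is the full ORF including the stop codon.
    min_codons counts codons before the stop codon (ATG included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # scan the three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    dna = "AAATGGCCTAAGG"  # made-up sequence
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

    A fuller version would also scan the reverse complement, since genes can lie on either strand.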

    Listed below are a few bioinformatics tools used in genomics (Table 5.1).

    Table 5.1 Bioinformatics Tools/Databases used in Genomics

    | Bioinformatics tool | Purpose |
    |---|---|
    | Carrie | Transcriptional regulatory networks database |
    | CisML | Motif detection tool |
    | ICSF | Identification of conserved structural features in TF binding sites |
    | oPossum | Tool for motif searching |
    | Promoser | Promoter extraction tool for eukaryotic organisms |
    | REPFIND | Determines clustered repeats in a DNA fragment |
    | Cluster-Buster | Tool for predicting motif clusters in DNA sequences |
    | Cister | Finds regulatory regions in DNA fragments |
    | Clover | Finds overrepresented motifs in DNA sequences |
    | GLAM | Tool for predicting functional motifs |
    | MotifViz | Identification of overrepresented motifs |
    | NECorr | Tool for analysing gene expression data |
    | ROVER | Predicts overrepresented motifs in DNA fragments |
    | SeqVISTA | Sequence viewer tool |
    | DNADynamo | Tool to find transcription factors with over-represented binding sites in the upstream regions of co-expressed human genes |

    Table from Kahn, N.T. (2018)

    • Comparative Genomics

    Bioinformatics plays an important role in comparative genomics by determining the structural and functional genomic relationships between different biological species.
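
    Comparisons between species usually start from sequence alignment. The sketch below scores a global alignment of two sequences with the classic Needleman-Wunsch dynamic programming recurrence. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions for illustration, and this is not the heuristic local search that tools such as BLAST perform.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of a and b (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            gap_in_b = score[i - 1][j] + gap  # a[i-1] aligned to a gap
            gap_in_a = score[i][j - 1] + gap  # b[j-1] aligned to a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))  # 7: seven matches
    print(global_alignment_score("ACGT", "ACT"))         # 2: three matches, one gap
```

    Only the score is returned here; recovering the alignment itself would require a traceback through the same table.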

    Listed below are a few bioinformatics tools used in comparative genomics (Table 5.2).

    Table 5.2 Bioinformatics Tools/Databases used in Comparative Genomics

    | Bioinformatics tool | Purpose |
    |---|---|
    | BLAST | DNA or protein sequence alignment tool |
    | HMMER | Homologous protein sequence search tool |
    | Clustal Omega | Multiple sequence alignment tool |
    | Sequerome | Sequence profiling tool |
    | ProtParam | Predicts the physico-chemical properties of proteins |
    | novoSNP | Predicts single point mutations in DNA sequences |
    | ORF Finder | Finds open reading frames in putative genes |
    | Virtual Footprint | Analysis of whole prokaryotic genomes |
    | WebGeSTer | Predicts gene termination sites during transcription |
    | Genscan | Finds exon-intron sites in DNA sequences |
    | Softberry Tools | Genome annotation tools, along with structure and function prediction of biological molecules |
    | MEGA | Studies evolutionary relationships |
    | MOLPHY | Maximum likelihood based phylogenetic analysis tool |
    | PHYLIP | Tool for phylogenetic studies |
    | JStree | Tool for viewing and editing phylogenetic trees |
    | Jalview | Alignment editing tool |
    | The DNA Data Bank of Japan | Resources for nucleotide sequences |
    | Rfam | Database containing a collection of RNA families |
    | Uniprot | Protein sequence database |
    | Protein Data Bank | Database providing data on structures of nucleic acids, proteins etc. |
    | SWISS PROT | Database containing manually annotated protein sequences |
    | InterPro | Provides information on protein families, their conserved domains and active sites |
    | Proteomics Identifications Database | Contains data on functional characterization and post-translational modification of proteins and peptides |
    | Ensembl | Database containing annotated genomes of eukaryotes including human, mouse and other vertebrates |
    | Medherb | Database for medicinal herbs |

    Table from Kahn, N.T. (2018)

    • Proteomics

    Advanced molecular techniques have led to the accumulation of huge amounts of proteomic data on protein activity patterns, interactions, profiling, composition, structural information, image analysis, peptide mass fingerprinting, peptide fragmentation fingerprinting, etc. These data can be managed using different bioinformatics tools.
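
    To make peptide mass fingerprinting more concrete, the sketch below performs an in silico tryptic digest and computes the monoisotopic mass of each resulting peptide. It assumes the common simplified cleavage rule (cut after K or R unless the next residue is P) with no missed cleavages and no modifications. The residue masses are standard monoisotopic values rounded to five decimals, and the function names and example protein are hypothetical.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(protein: str) -> list:
    """Split a protein after K or R, except when the next residue is P."""
    peptides, current = [], ""
    for i, aa in enumerate(protein):
        current += aa
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(current)
            current = ""
    if current:  # trailing peptide without a C-terminal K/R
        peptides.append(current)
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    protein = "MKWVTFRPSKAA"  # made-up sequence
    for pep in tryptic_digest(protein):
        print(pep, round(peptide_mass(pep), 4))
```

    In an actual fingerprinting workflow, a list of theoretical masses like this would be compared against the peaks observed in a mass spectrum.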

    Listed below are a few bioinformatics tools used in proteomics (Table 5.3).

    Table 5.3 Bioinformatics Tools/Databases used in Proteomics.

    | Bioinformatics tool | Purpose |
    |---|---|
    | K2 / FAST | Protein structure alignment tool |
    | SMM | Tool for determining peptides binding to major histocompatibility complex |
    | ZDOCK | Protein-protein docking tool |
    | Docking Benchmark | Tool to evaluate docking algorithms performance |
    | ZDOCK Server | An automated server for running ZDOCK |
    | MELANIE | Proteomic analysis for analysing 2D-Gel images |

    Table from Kahn, N.T. (2018)

    • Drug Discovery

    Clinical bioinformatics is an emerging field of bioinformatics that employs tools such as computer-aided drug design, DNA drug modelling, and in silico drug testing to produce novel and effective drugs and vaccines in a shorter time frame and with lower risk.

    • Cancer Research and Analysis

    Bioinformatics resources from organizations such as NCI, NCIP (part of NCI), and CBIIT have played an important role in genomics, proteomics, imaging, and metabolomics, increasing our knowledge of the molecular basis of cancer.

    • Phylogenetic Studies

    Using numerous bioinformatics tools, phylogenetic analysis of molecular data can be carried out quickly by constructing phylogenetic trees from sequence alignments to study the evolutionary relationships among the sequences.
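
    Many tree-building methods start from a matrix of pairwise distances computed on aligned sequences. The sketch below computes the proportion of differing sites (p-distance), applies the Jukes-Cantor correction, and reports the closest pair, which is the first pair a clustering method such as UPGMA would join. The taxon names and sequences are invented, and this is a simplified illustration rather than the procedure used by any specific package.

```python
import math
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; infinite when p reaches 0.75."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned: dict) -> dict:
    """Map each unordered pair of taxon names to its corrected distance."""
    return {
        (m, n): jukes_cantor(p_distance(aligned[m], aligned[n]))
        for m, n in combinations(sorted(aligned), 2)
    }


def closest_pair(aligned: dict) -> tuple:
    """Return the pair of taxa with the smallest corrected distance."""
    distances = distance_matrix(aligned)
    return min(distances, key=distances.get)


if __name__ == "__main__":
    alignment = {  # hypothetical aligned sequences
        "taxonA": "ACGTACGT",
        "taxonB": "ACGTACGA",
        "taxonC": "TCGAACTA",
    }
    print(closest_pair(alignment))  # ('taxonA', 'taxonB')
```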

    • Forensic Science

    A number of databases contain DNA profiles of known offenders. Advances in microarray technology, Bayesian networks, and programming algorithms provide an effective means of organizing and interpreting evidence.

    • Bio-Defense

    Bioinformatics has so far had only a limited impact on forensics, because more advanced algorithms and computational applications are needed before the established databases can interoperate with each other.

    • Nutrigenomics

    Progress in structural and functional genomics and in molecular technologies such as genome sequencing and DNA microarrays generates valuable knowledge that explains nutrition in relation to an individual’s genetics, which directly influences metabolism. With the influx of bioinformatics tools, nutrition-related research has increased tremendously.

    • Gene Expression

    Regulation of gene expression is at the core of functional genomics, which allows researchers to apply genomic data to molecular technologies (e.g. gene expression arrays) that can quantify the actively transcribed genes in any cell at any time.
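
    A common first summary of expression measurements is the log2 fold change between two conditions. The sketch below computes it from replicate values, adding a pseudocount so that zero values do not cause a division by zero. The gene names, numbers, and pseudocount of 1.0 are made up for illustration; real analyses (for example with Bioconductor packages) also normalize the data and test for statistical significance.

```python
import math


def log2_fold_change(control: list, treated: list,
                     pseudocount: float = 1.0) -> float:
    """log2 of (mean treated + pseudocount) / (mean control + pseudocount)."""
    if not control or not treated:
        raise ValueError("each condition needs at least one measurement")
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


if __name__ == "__main__":
    # Hypothetical expression values: gene -> (control replicates, treated replicates)
    expression = {
        "geneA": ([7, 7], [31, 31]),
        "geneB": ([15, 15], [15, 15]),
        "geneC": ([31, 31], [7, 7]),
    }
    for gene, (control, treated) in expression.items():
        print(gene, round(log2_fold_change(control, treated), 2))
```

    Positive values indicate higher expression in the treated condition, negative values lower expression, and zero no change.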

    Listed below are a few bioinformatics tools used in gene expression studies (Table 5.4).

    Table 5.4 Bioinformatics Tools/Databases Used in Gene Expression

    | Bioinformatics tool | Purpose |
    |---|---|
    | GeneChords | Conserved gene retrieval tool |
    | Bioconductor | Provides tools for the analysis of high throughput genomic data |
    | GXD | Gene expression database for the laboratory mouse |
    | Inverted Repeats Finder | Find inverted repeats in genomic DNA |
    | BU ORChID | Database stores hydroxyl radical cleavage data of DNA sequences |
    | ODB | Predicts functional gene clusters |
    | RNA Fold Support | Predicts RNA structure based on mutations in alleles |
    | CellNetVis | Visualizing tool for biological complexes and networks |
    | Tandem Repeat Finder | Finds tandem repeats in genomic DNA |
    | VisANT | Tools for visualizing and analysing many biological interactions |
    | PROMO | Identification of transcription factor binding sites |
    | ConTra V.3 | Detection of transcription factor binding site |

    Table remixed from Kahn, N.T. (2018)

    • Food Quality

    New improvements in computing algorithms and in the available structural simulation databases of recognized structures have brought molecular modeling into conventional food chemistry. Such simulations will make it possible to improve food quality by developing new food additives based on an understanding of taste persistence, antagonism, and complementation.

    • Predicting Protein Structure and Function

    Protein topology prediction is now much easier thanks to bioinformatics, which helps predict the 3D structure of a protein and thereby gives insight into its function.

    Listed below are a few bioinformatics tools used in protein structure and function prediction (Table 5.5).

    Table 5.5 Bioinformatics Tools/Databases Used in Protein Structure and Function Prediction

    | Bioinformatics tool | Purpose |
    |---|---|
    | CATH | Tool for the categorized organization of proteins |
    | Phyre2 | Tool for protein structure prediction |
    | HMMSTR | For the prediction of sequence-structure correlations in proteins |
    | MODELLER | Predicts 3D structure of protein |
    | JPRED/APSSP2 | Predicts secondary structures of proteins |
    | RaptorX | Predicts protein structure |
    | QUARK | Predicts protein structure |

    Table remixed from Kahn, N.T. (2018)

    • Personalized Medicine

    By employing bioinformatics tools, doctors will be able to analyze a patient's genetic profile and prescribe the best available drug therapy and dosage from the beginning.

    • Microbial Genome Applications

    Microbes have been studied at a very basic level with the help of bioinformatics tools that analyze the unique sets of genes enabling them to survive under unfavorable conditions.

    9.2: Bioinformatics is shared under a not declared license and was authored, remixed, and/or curated by Henry Jakubowski and Patricia Flatt.
