9.2: Bioinformatics
5.2 Bioinformatics
Recent technological advances have driven an unprecedented revolution in science by producing large amounts of “omic” data. The growing generation of this information and its availability in public databases were, and still are, a challenge for professionals from different areas. What, then, is the challenge? In biology, the main challenge is to make sense of the enormous amount of structural and sequence data generated at multiple levels of biological systems. In bioinformatics, the challenge is to develop statistical and computational tools capable of helping researchers understand the mechanisms underlying the biological questions under study. Given the complexity of science, even this is a highly reductionist view. The era of a “new biology” has emerged alongside the birth and development of other sciences, such as bioinformatics and computational biology, which interface closely with molecular biology. Although both are considered recent, bioinformatics and genomics have evolved interdependently and have had a historic impact on the available knowledge.
Bioinformatics is a hybrid science that links biological data with techniques for information storage, distribution, and analysis to support multiple areas of scientific research, including biomedicine. It mainly involves molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. Bioinformatics is fed by high-throughput data-generating experiments, including genomic sequence determinations and measurements of gene expression patterns. Database projects curate and annotate the data and then distribute it via the World Wide Web. Mining these data leads to scientific discoveries and to the identification of new clinical applications.
A bioinformatics solution usually involves the following steps:
- Collect statistics from biological data
- Build a computational model
- Solve a computational modeling problem
- Test and evaluate a computational algorithm
Bioinformatics also addresses the following aspects:
- Types of biological information and databases
- Sequence analysis and molecular modeling
- Genomic analysis
- Systems biology
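The four solution steps listed above can be sketched on a deliberately small problem. The example below is only an illustration under stated assumptions: the sequences, the class labels `gc_rich` and `at_rich`, and the single-threshold "model" are all invented for this sketch and are not taken from any real dataset or tool.

```python
from collections import Counter


def gc_content(seq):
    """Step 1: collect a statistic, here the fraction of G and C bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / total


def fit_threshold(training):
    """Step 2: build a model, here a single GC-content cut-off.

    The cut-off is the midpoint between the mean GC content of the two classes.
    """
    by_label = {}
    for seq, label in training:
        by_label.setdefault(label, []).append(gc_content(seq))
    mean_gc_rich = sum(by_label["gc_rich"]) / len(by_label["gc_rich"])
    mean_at_rich = sum(by_label["at_rich"]) / len(by_label["at_rich"])
    return (mean_gc_rich + mean_at_rich) / 2


def classify(seq, threshold):
    """Step 3: solve the modeling problem for a new sequence."""
    return "gc_rich" if gc_content(seq) > threshold else "at_rich"


def accuracy(test_set, threshold):
    """Step 4: test and evaluate on held-out labelled sequences."""
    correct = sum(classify(seq, threshold) == label for seq, label in test_set)
    return correct / len(test_set)


# Hypothetical toy data, invented for illustration only.
TRAINING = [
    ("GCGCGGCCGC", "gc_rich"),
    ("GGCCGCGCAT", "gc_rich"),
    ("ATATTAATGC", "at_rich"),
    ("TTAATATACA", "at_rich"),
]
TEST = [("GCGGCCATGC", "gc_rich"), ("ATTATAGCAT", "at_rich")]

threshold = fit_threshold(TRAINING)
print(f"threshold = {threshold:.3f}")
print(f"accuracy  = {accuracy(TEST, threshold):.2f}")
```

Real projects replace each step with something far richer (for example, probabilistic models trained on curated databases), but the collect, model, solve, and evaluate sequence stays the same.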
In the field of medicine in particular, a number of important applications for bioinformatics have been discovered. For example, it is used to identify correlations between gene sequences and diseases, to predict protein structures from amino acid sequences, to aid in the design of novel drugs, and to tailor treatments to individual patients based on their DNA sequences (pharmacogenomics). In bioinformatics, we can now conduct global analyses of all the available data with the aim of uncovering common principles that apply across many systems and highlighting novel features.
Some applications of bioinformatics in biotechnology are given below:
-
Genomics
To manage an escalating amount of genomic information, bioinformatics tools are required to maintain and analyze DNA sequences from different organisms. Determining sequence homology, finding genes, identifying coding regions, and carrying out structural and functional analyses of genomic sequences are all possible with different bioinformatics tools and software packages.
Given below is a list of a few bioinformatics tools used in genomics (Table 5.1).
Table 5.1 Bioinformatics Tools/Databases used in Genomics
Bioinformatics tools | Purpose |
---|---|
Carrie | Transcriptional regulatory networks database |
CisML | Motif detection tool |
ICSF | Identification of conserved structural features in TF binding sites |
oPossum | Tool for motif searching |
Promoser | Tool for extracting promoters from eukaryotic organisms |
REPFIND | Finds clustered repeats in a DNA fragment |
Cluster‐Buster | Tool for predicting motif clusters in DNA sequences |
Cister | Finds regulatory regions in DNA fragments |
Clover | Finds overrepresented motifs in DNA sequences |
GLAM | Tool for predicting functional motifs |
MotifViz | Identification of overrepresented motifs |
NECorr | Tool for analysing gene expression data |
ROVER | Predicts overrepresented motifs in DNA fragments |
SeqVISTA | Sequences viewer tool |
DNADynamo | Tool to find transcription factors with over‐represented binding sites in the upstream regions of co‐expressed human genes |
Table from Kahn, N.T. (2018)
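Coding region identification, mentioned above, can be illustrated with a very small open reading frame (ORF) scan. This is a minimal sketch, not how any tool in Table 5.1 works: the function name `find_orfs` and the `min_codons` parameter are hypothetical, only the three forward reading frames are scanned, and the standard stop codons TAA, TAG, and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, sequence) for ORFs on the forward strand.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon. `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop, including ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ATG in this frame
    return orfs


# Hypothetical toy sequence: one ORF (ATG AAA TTT TAG) in the third frame.
for start, end, orf in find_orfs("CCATGAAATTTTAGGG"):
    print(start, end, orf)
```

A production gene finder also scans the reverse complement, handles introns in eukaryotes, and scores candidates statistically, so this sketch only conveys the basic idea.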
-
Comparative Genomics
Bioinformatics plays an important role in comparative genomics by determining the structural and functional genomic relationships between different biological species.
Given below is a list of a few bioinformatics tools used in comparative genomics (Table 5.2).
Table 5.2 Bioinformatics Tools/Databases used in Comparative Genomics
Bioinformatics tools | Purpose |
---|---|
BLAST | DNA or protein sequence alignment tool |
HMMER | Homologous protein sequences searching tool |
Clustal Omega | Multiple sequence alignment tool |
Sequerome | Sequence profiling tool |
ProtParam | Predicts the physico-chemical properties of proteins |
novoSNP | Predicts single point mutation in DNA sequences |
ORF Finder | Finds open reading frames in putative genes |
Virtual Footprint | Analysis of whole prokaryotic genome |
WebGeSTer | Predicts gene termination sites during transcription |
Genscan | Find exon-intron sites in DNA sequences |
Softberry Tools | Genomes annotation tool along with the structure and function prediction of biological molecules |
MEGA | Study evolutionary relationship |
MOLPHY | Maximum likelihood based phylogenetic analysis tool |
PHYLIP | Tool for phylogenetic studies |
JStree | Tool for viewing and editing phylogenetic trees |
Jalview | Alignment editing tool |
The DNA Data Bank of Japan | Resources for nucleotide sequences |
Rfam | Database containing a collection of RNA families |
Uniprot | Protein sequence database |
Protein Data Bank | Database providing data on structures of nucleic acids, proteins etc |
SWISS PROT | Database containing manually annotated protein sequences |
InterPro | Provides information on protein families, their conserved domains and active sites |
Proteomics Identifications Database | Contains data on functional characterization and post-translational modification of proteins and peptides |
Ensembl | Database containing annotated genomes of eukaryotes including human, mouse and other vertebrates |
Medherb | Database for medicinal herbs |
Table from Kahn, N.T. (2018)
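Several tools in Table 5.2 rest on sequence alignment. As a rough illustration of the underlying idea, the sketch below computes a global alignment score with the classic Needleman-Wunsch dynamic programming recurrence. It is not the algorithm used by BLAST or Clustal Omega, and the scoring values (`match=1`, `mismatch=-1`, `gap=-1`) are arbitrary assumptions chosen for readability.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


# Toy sequences: the best alignment is ACGT / A-GT (3 matches, 1 gap).
print(global_alignment_score("ACGT", "AGT"))
```

Comparing such scores across genes from two species is one simple way to quantify how closely related the sequences are; real tools use substitution matrices and affine gap penalties instead of these flat scores.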
-
Proteomics
Advanced molecular techniques have led to the accumulation of huge amounts of proteomic data on protein activity patterns, interactions, profiling, composition, structural information, image analysis, peptide mass fingerprinting, peptide fragmentation fingerprinting, and so on. This enormous volume of data can be managed with different bioinformatics tools.
Given below is a list of a few bioinformatics tools used in proteomics (Table 5.3).
Table 5.3 Bioinformatics Tools/Databases used in Proteomics.
Bioinformatics tools | Purpose |
---|---|
K2 / FAST | Protein structure alignment tool |
SMM | Tool for determining peptides binding to major histocompatibility complex |
ZDOCK | Protein‐protein docking tool |
Docking Benchmark | Tool to evaluate docking algorithms performance |
ZDOCK Server | An automated server for running ZDOCK |
MELANIE | Proteomic tool for analysing 2D gel images |
Table from Kahn, N.T. (2018)
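Peptide mass fingerprinting, mentioned above, compares measured peptide masses with masses predicted from candidate protein sequences. The sketch below shows only the prediction half, under simplifying assumptions: trypsin is modelled as cutting after K or R unless the next residue is P, missed cleavages and modifications are ignored, and the residue masses are standard monoisotopic values rounded to five decimals. The function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses in daltons (rounded), plus one water per peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, current = [], ""
    for i, residue in enumerate(protein):
        current += residue
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


# Hypothetical toy sequence, not a real protein.
for peptide in tryptic_digest("AGKRPSLKGG"):
    print(f"{peptide:6s} {peptide_mass(peptide):.4f}")
```

A fingerprinting search would then look for database proteins whose predicted mass list best matches the observed spectrum within some tolerance.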
-
Drug Discovery
Clinical bioinformatics is an emerging field of bioinformatics that employs various bioinformatics tools, such as computer-aided drug design for novel drugs and vaccines, DNA drug modelling, and in silico drug testing, to produce new and effective drugs in a shorter time frame and with lower risk.
-
Cancer Research and Analysis
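Bioinformatics resources such as those of the National Cancer Institute (NCI), its National Cancer Informatics Program (NCIP), and its Center for Biomedical Informatics and Information Technology (CBIIT) have played an important role in genomics, proteomics, imaging, and metabolomics, increasing our knowledge of the molecular basis of cancer.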
-
Phylogenetic Studies
Using numerous bioinformatics tools, phylogenetic analysis of molecular data can be carried out in a short period of time: phylogenetic trees built from sequence alignments are used to study the evolutionary relationships among the sequences.
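To make the step from aligned sequences to a tree concrete, the following sketch builds a tree with UPGMA, a simple distance-based clustering method, using the proportion of differing sites (p-distance) between aligned sequences. It is one possible approach among many and is not claimed to be what MEGA, PHYLIP, or any other listed tool does by default. The four sequences and their labels are invented toy data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA and return a Newick string.

    `seqs` maps a label to an aligned sequence. Branch lengths are omitted.
    """
    sizes = {name: 1 for name in seqs}  # cluster label -> number of leaves
    dist = {
        frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(seqs, 2)
    }
    while len(sizes) > 1:
        closest = min(dist, key=dist.get)
        a, b = sorted(closest)
        merged = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for other in sizes:
            if other in (a, b):
                continue
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            dist[frozenset((merged, other))] = (
                sizes[a] * d_a + sizes[b] * d_b
            ) / (sizes[a] + sizes[b])
        del dist[closest]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Hypothetical aligned toy sequences.
SEQS = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAATTA", "D": "ACGAATCT"}
print(upgma(SEQS))
```

UPGMA assumes a constant rate of change along all lineages, which is often unrealistic; maximum likelihood and neighbor-joining methods relax that assumption.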
-
Forensic Science
A number of databases contain DNA profiles of known offenders. Advances in microarray technology, Bayesian networks, and programming algorithms provide an effective method of organizing and interpreting evidence.
-
Bio-Defense
Bioinformatics has so far had limited impact on forensic work, because more advanced algorithms and computational applications are still needed before the established databases can interoperate with one another.
-
Nutrigenomics
Progress in structural and functional genomics and in molecular technologies such as genome sequencing and DNA microarrays generates valuable knowledge that explains nutrition in relation to an individual’s genetics, which directly influences that individual’s metabolism. Because of the influx of bioinformatics tools, nutrition-related research has increased tremendously.
-
Gene Expression
Regulation of gene expression is at the core of functional genomics, allowing researchers to apply genomic data to molecular technologies that can quantify the actively transcribed genes in any cell at any time (e.g. gene expression arrays).
Given below is a list of a few bioinformatics tools used in gene expression studies (Table 5.4).
Table 5.4 Bioinformatics Tools/Databases Used in Gene Expression
Bioinformatics tools | Purpose |
---|---|
GeneChords | Conserved gene retrieval tool |
Bioconductor | Provides tools for the analysis of high throughput genomic data |
GXD | Gene expression database for the laboratory mouse |
Inverted Repeats Finder | Find inverted repeats in genomic DNA |
BU ORChID | Database of hydroxyl radical cleavage data for DNA sequences |
ODB | Predicts functional gene clusters |
RNA Fold Support | Predicts RNA structure based on mutations in alleles |
CellNetVis | Visualizing tool for biological complexes and networks |
Tandem Repeat Finder | Finds tandem repeats in genomic DNA |
VisANT | Tool for visualizing and analysing many biological interactions |
PROMO | Identification of transcription factor binding sites |
ConTra V.3 | Detection of transcription factor binding sites |
Table remixed from Kahn, N.T. (2018)
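A common first calculation on expression data is the log2 fold change of each gene between two conditions. The sketch below is a simplified illustration: the gene names and counts are invented, the pseudocount of 1 and the cut-off of 1 (a two-fold change) are arbitrary assumptions, and no statistical testing is performed, unlike in packages such as those distributed through Bioconductor.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean treated to mean control expression.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def call_changes(expression, cutoff=1.0):
    """Label each gene as 'up', 'down', or 'unchanged'."""
    calls = {}
    for gene, (control, treated) in expression.items():
        lfc = log2_fold_change(control, treated)
        if lfc >= cutoff:
            calls[gene] = "up"
        elif lfc <= -cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical counts: gene -> (control replicates, treated replicates).
EXPRESSION = {
    "geneA": ([3, 3], [15, 15]),
    "geneB": ([7, 7], [7, 7]),
    "geneC": ([31, 31], [7, 7]),
}
print(call_changes(EXPRESSION))
```

In practice, replicate variability matters as much as the size of the change, so fold changes are normally reported together with adjusted p-values.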
-
Food Quality
New improvements in computing algorithms and the available databases of structural simulations of recognized structures have brought molecular modeling into conventional food chemistry. Such simulations will make it possible to improve food quality by developing new food additives based on an understanding of taste persistence, antagonism, and complementation.
-
Predicting Protein Structure and Function
Protein topology prediction is now much easier thanks to bioinformatics, which helps predict the 3D structure of a protein and thereby gives insight into its function.
Given below is a list of a few bioinformatics tools used in protein structure and function prediction (Table 5.5).
Table 5.5 Bioinformatics Tools/Databases Used in Protein Structure and Function Prediction
Bioinformatics tools | Purpose |
---|---|
CATH | Tool for the categorized organization of proteins |
Phyre2 | Tool for protein structure prediction |
HMMSTR | For the prediction of sequence-structure correlations in proteins |
MODELLER | Predicts 3D structure of protein |
JPRED/APSSP2 | Predicts secondary structures of proteins |
RaptorX | Predicts protein structure |
QUARK | Predicts protein structure |
Table remixed from Kahn, N.T. (2018)
-
Personalized Medicine
By employing bioinformatics tools, doctors will be able to analyze a patient's genetic profile and prescribe the best available drug therapy and dosage from the beginning.
-
Microbial Genome Applications
Microbes have been studied at a very basic level with the help of bioinformatics tools, which are needed to analyze the unique sets of genes that enable them to survive under unfavorable conditions.