BMIF 310: Foundations of Bioinformatics
Instructor: David L. Tabb, PhD
In this course, students will be introduced to the algorithms and concepts fundamental to the field of bioinformatics. Each piece of software will be examined together with the experimental problems its algorithms were designed to address.
Prerequisites
Ideally, students will have prior exposure to computer programming, though software development is not a requirement of the class. Students who expect to develop software tools (ranging from Perl scripts to number-crunching code) in support of their research will benefit most from this class, though users of publicly available web utilities will also find it useful.
Graded Elements
Students will be evaluated on two scored elements, each worth 50% of the final grade:
- A brief quiz at the start of each class will test each student’s understanding of material presented in the previous class and any assigned readings.
- Students will create a written report for a project and present their work to the class at the close of the semester. Example projects include a review of literature on a bioinformatics topic or a newly developed algorithm from one of the areas described in the course. Project plans must be approved by the course director no later than one month before the final class.
Overview of Topics
Introduction
- Biochemistry basics: nucleic acids, proteins, lipids, carbohydrates
- Molecular biology basics: cells and organelles, transcription and translation, mutation and damage repair, cellular signaling, etc.
- Molecular underpinnings of example diseases
- Types of data in molecular biology: DNA electropherograms, sequences, microarrays, gels, mass spectrometry, NMR, X-ray crystallography, etc.
- Defining bioinformatics and differentiating it from computational biology
Sequence Analysis
- Sequence alignment: Dot plots, Needleman-Wunsch, Smith-Waterman, Lipman-Pearson, BLAST
- Multiple sequence alignment: ClustalW / phylograms / cladograms
- Hidden Markov Models (HMMs) for motif detection
- Protein families and domains: InterPro and Blocks
- PAM and BLOSUM substitution matrices
Genome Bioinformatics
- Phred: assessing error rates from sequencing electropherograms
- Phrap: building sequence contigs from sequencing reads
- History of NCBI
- Polymorphism detection
Microarray Bioinformatics
- Fundamentals of cDNA arrays
- Clustering genes: Quality Threshold Clustering
- MIAME: standards for communication of microarray data
- LIMS development
Proteome Bioinformatics
- Protein structure inference
- Predicting migration in 2D gel electrophoresis
- Finding peaks in MALDI-TOF profiles
- Statistical models for MS/MS peptide identification
- MIAPE: standards for communication of proteomics data
- Searching for biomarkers
Systems Bioinformatics
- Genetic regulatory networks
- Functional annotation of genes
- Gene Ontology (GO) terms
- ANNs, SVMs, and CART decision trees