Reading material for Essentials of Bioinformatics
To download the following course materials, right-click the link and select "Save Target As..."
Lecture 1: History and Introduction
-
Christos A. Ouzounis and Alfonso Valencia (2003).
Early bioinformatics: the birth of a discipline - a personal view.
Bioinformatics 19, 2176-2190.
PMID: 14630646
-
Denis Noble (2003).
Will genomics revolutionise pharmaceutical R&D?
Trends in Biotechnology 21, 333-337.
PMID: 12902169
-
Hannon GJ (2002).
RNA interference.
Nature Jul 11; 418:244-51.
PMID: 12110901
-
Lai EC et al (2003).
Computational identification of Drosophila microRNA genes.
Genome Biology 4:R42.
PMID: 12844358
-
Enright AJ et al (2003).
MicroRNA targets in Drosophila.
Genome Biology 4:P8.
PMID: 14709173
-
Lewis BP et al (2003).
Prediction of Mammalian MicroRNA Targets.
Cell 115:787-798.
PMID: 14697198
-
Lewis BP et al (2005).
Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets.
Cell 120:15-20.
PMID: 15652477
-
Bartel DP (2004).
MicroRNAs: Genomics, Biogenesis, Mechanism, and Function.
Cell 116:281-297.
PMID: 14744438
-
Zhou J et al (2006).
Composite microRNA target predictions and comparisons of several prediction algorithms.
MBI Technical Report No. 51.
Lecture 2: Dynamic Programming Algorithms
- David Mount 'Bioinformatics' - chapter 3
- Durbin et al. 'Biological Sequence Analysis' - chapter 2
-
Reif JH and Tate SR (1997).
On Dynamic Algorithms for Algebraic Problems.
Journal of Algorithms 22:347-371.
DOI: 10.1006/jagm.1995.0807
- [PDF]
Dayhoff MO, Schwartz RM and Orcutt BC (1978).
A Model of Evolutionary Change in Proteins.
In: Atlas of Protein Sequence and Structure.
-
Eddy S (2004).
Where did the BLOSUM62 alignment score matrix come from?
Nature Biotechnology 22(8):1035-1036.
PMID: 15286655
Lecture 3: Markov Chains and HMMs
- David Mount 'Bioinformatics' - chapter 4
- Durbin et al. 'Biological Sequence Analysis' - chapters 3-5
- [PDF]
Hull TE and Dobell AR (1962).
Random Number Generators.
SIAM Review 4:230-254.
- [PDF]
Doob JL (1942).
What is a Stochastic Process?
American Mathematical Monthly 49:648-653.
- [PDF]
Kac M (1954).
Signal and Noise Problems.
American Mathematical Monthly 61:23-26.
Lecture 4: Statistics #1
- A.M. Campbell, L.J. Heyer 'Discovering Genomics, Proteomics, and Bioinformatics' (2002)
-
[website]
Berkeley Course: Statistics for Bioinformatics (2003) - Julia Brettschneider
-
Rosner B 'Fundamentals of Biostatistics' (2002)
-
[PDF]
Lee WW (2003).
Core Statistics for Bioinformatics.
-
[website]
Neal RM, Professor, Dept. of Statistics and Dept. of Computer Science, University of Toronto.
-
A.C. Wardlaw 'Practical Statistics for Experimental Biologists' (2001) 2nd Edition, John Wiley and Sons
-
B. Dawson, R.G. Trapp 'Basic and Clinical Biostatistics' (2004) Fourth Edition, McGraw Hill
-
L. Gonick, W. Smith 'The Cartoon Guide to Statistics' (1993) Harper Resource
- [website] Statistics Decision Tree
- [website] Statistics Selector
- [website] Online statistics tools
Lecture 5: Phylogenetic Models
- Durbin et al. 'Biological Sequence Analysis' - chapters 7 and 8
- Felsenstein J 'Inferring Phylogenies' 2003
- Li and Graur 'Fundamentals of Molecular Evolution' 1991
-
Saitou N and Imanishi T (1989).
Relative Efficiencies of the Fitch-Margoliash, Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree.
Mol Biol Evol 6(5):514-525.
MBE link
-
Li W-H (1989).
A Statistical Test of Phylogenies Estimated from Sequence Data.
Mol Biol Evol 6(4):424-435.
PMID: 2615641
-
Whelan S et al (2001).
Molecular phylogenetics: state-of-the-art methods for looking into the past.
Trends in Genetics 17(5):262-272.
PMID: 11335036
Lecture 6: Protein Functional Annotation
-
Tatusov RL et al (2003).
The COG database: an updated version includes eukaryotes.
BMC Bioinformatics. Sep 11;4(1):41.
PMID: 12969510
-
Harris MA et al (2004).
Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource.
Nucleic Acids Res. Jan 1;32 Database issue:D258-61.
PMID: 14681407
-
Eisen MB, Spellman PT, Brown PO, Botstein D (1998).
Cluster analysis and display of genome-wide expression patterns.
Proc Natl Acad Sci U S A. Dec 8;95(25):14863-8.
PMID: 9843981
-
von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B (2003).
STRING: a database of predicted functional associations between proteins.
Nucleic Acids Res. Jan 1;31(1):258-61.
PMID: 12519996
-
Marx JL (1983).
Onc gene related to growth factor gene.
Science Jul 15;221(4607): 248.
PMID: 6304882
-
Vogelstein B and Kinzler KW (2004).
Cancer genes and the pathways they control.
Nature Medicine Aug; 10(8):789-799.
PMID: 15286780
-
Dayhoff MO (1976).
The origin and evolution of protein superfamilies.
Fed Proc Aug; 35(10):2132-2138.
PMID: 181273
Lecture 7: Statistics #2
- David Mount 'Bioinformatics' - p121 onwards
- A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin (1996). Bayesian Data Analysis. Chapman and Hall, New York.
- N. Breslow (1990). Biostatistics and Bayes (with discussion). Statistical Science, 5: 269-298.
-
R.E. Tarone (1982). The use of historical control information in testing for a trend in proportions. Biometrics, 38: 215-220.
PMID: 7115873
-
C.M. Kendziorski, M.A. Newton, H. Lan, M.N. Gould (2003).
On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles.
Statistics in Medicine, 22: 3899-3914.
PMID: 14673946
Lecture 8: Protein Secondary and Tertiary Structure
-
Baker D (2006).
Prediction and design of macromolecular structures and interactions.
Philos Trans R Soc Lond, B, Biol Sci 361(1467):459-463.
PMID: 16524834
-
Dahiyat BI and Mayo SL (1997).
De novo protein design: fully automated sequence selection.
Science 278(5335):82-87.
PMID: 9367772
-
Ginalski K et al (2005).
Practical lessons from protein structure prediction.
Nucl Acids Res 33:1874-1891.
PMID: 15805122
-
Madhusudhan MS et al (2005).
Comparative Protein Structure Modeling.
In: The Proteomics Protocols Handbook. Ed: JM Walker. Humana Press Inc., Totowa, NJ, 831-860.
DOI: 10.1385/1-59259-890-0:831
-
Poole AM and Ranganathan R (2006).
Knowledge-based potentials in protein design.
Current Opinion in Structural Biology 16(4):508-513.
PMID: 16843652
Lecture 9: Next Gen Sequencing and Analysis
Lecture 10: Comparative Genomics
-
Clayton RA et al (1998).
Findings emerging from complete microbial genome sequences.
Curr Opin Microbiol 1:562-566.
PMID: 10755848
-
Osterman A and Overbeek R (2003).
Missing genes in metabolic pathways: a comparative genomics approach.
Curr Opin Chem Biol 7:238-251.
PMID: 12714058
-
Carroll SB (2003).
Genetics and the making of Homo sapiens.
Nature 422:849-857.
PMID: 12712196
-
Wolfe K (2001).
Yesterday's polyploids and the mystery of diploidization.
Nature Reviews Genetics 2:333-341.
PMID: 11331899
Lecture 11: Regulation at the RNA and DNA Levels
-
Qiu P (2003).
Recent advances in computational promoter analysis in understanding the transcriptional regulatory network.
Biochem Biophys Res Commun 309:495-501.
PMID: 12963016
-
Choo Y and Klug A (1997).
Physical basis of a protein-DNA recognition code.
Curr Opin Struct Biol 7:117-125.
PMID: 9032060
-
Harrison SC (1991).
A structural taxonomy of DNA-binding domains.
Nature 353:715-719.
PMID: 1944532
- Carey M and Smale ST. Transcriptional Regulation in Eukaryotes. Cold Spring Harbor Laboratory Press (2000)
Book link
-
Black DL (2000).
Protein Diversity from Alternative Splicing: A Challenge for Bioinformatics and Post-Genome Biology.
Cell 103:367-370.
PMID: 11081623
Lecture 12: Protein-Protein Interactions, Networks and Clustering
-
Pavlovic V et al (2002).
A Bayesian framework for combining gene predictions.
Bioinformatics 18:19-27.
PMID: 11836207
-
Jansen R et al (2003).
A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data.
Science 302:449-453.
PMID: 14564010
-
Legrain P et al (2001).
Protein-protein interaction maps: a lead towards cellular functions.
TIG 17:346-352.
PMID: 11377797
-
Chen Y and Xu D (2003).
Computational analyses of high-throughput protein-protein interaction data.
Curr Prot Pept Sci 4:159-181.
PMID: 12769716
-
Edwards AM et al (2002).
Bridging structural biology and genomics: assessing protein interaction data with known complexes.
TIG 18:529-536.
PMID: 12350343
-
Auerbach D et al (2002).
The post-genomic era of interactive proteomics: Facts and perspectives.
Proteomics 2:611-623.
PMID: 12112840
-
Drewes G and Bouwmeester T (2003).
Global approaches to protein-protein interactions.
Curr Opin Cell Biol 15:199-205.
PMID: 12648676
-
Tucker C et al (2001).
Towards an understanding of complex protein networks.
Trends Cell Biol 11:102-106.
PMID: 11306254
-
Valencia A and Pazos F (2002).
Computational methods for the prediction of protein interactions.
Curr Opin Struct Biol 12:368-373.
PMID: 12127457
-
Brun C et al (2003).
Approach of the functional evolution of duplicated genes in Saccharomyces cerevisiae using a new classification method based on protein-protein interaction data.
J Struct Func Genomics 3:213-224.
PMID: 12836700
-
Enright AJ, Skrabanek L, Bader GD (2007).
Computational Prediction of Protein-Protein Interactions.
In: The Proteomics Protocols Handbook. Humana Press.
DOI: 10.1385/1-59259-890-0:629
-
Bader G et al (2003).
Functional genomics and proteomics: charting a multidimensional map of the yeast cell.
Trends Cell Biol 13:344-56.
PMID: 12837605
-
Uetz P et al (2002).
Visualization and Integration of Protein-Protein Interactions.
In: Protein-Protein Interactions: A Molecular Cloning Manual Edited by E Golemis. Cold Spring Harbor, NY: CSHL Press.
Book link
-
Karp P (2001).
Pathway Databases: A Case Study in Computational Symbolic Theories.
Science 293:2040-2044.
PMID: 11557880
-
Phizicky EM and Fields S (1995).
Protein-Protein Interactions: Methods for Detection and Analysis.
Microbiol Rev 59:94-123.
PMID: 7708014
Lecture 13: Dynamic Systems
Lecture 14: Cardiac Modeling
For further information please contact Lucy Skrabanek