Reading material for Essentials of Bioinformatics


To download the following course materials, right-click on the link and select "Save Target As..."

Lecture 1: History and Introduction

  1. Christos A. Ouzounis and Alfonso Valencia (2003). Early bioinformatics: the birth of a discipline - a personal view. Bioinformatics 19, 2176-2190. PMID: 14630646
  2. Denis Noble (2003). Will genomics revolutionise pharmaceutical R&D? Trends in Biotechnology 21, 333-337. PMID: 12902169
  3. Hannon GJ (2002). RNA interference. Nature Jul 11; 418:244-51. PMID: 12110901
  4. Lai EC et al (2003). Computational identification of Drosophila microRNA genes. Genome Biology 4:R42. PMID: 12844358
  5. Enright AJ et al (2003). MicroRNA targets in Drosophila. Genome Biology 4:P8. PMID: 14709173
  6. Lewis BP et al (2003). Prediction of Mammalian MicroRNA Targets. Cell 115:787-798. PMID: 14697198
  7. Lewis BP et al (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120:15-20. PMID: 15652477
  8. Bartel DP (2004). MicroRNAs: Genomics, Biogenesis, Mechanism, and Function. Cell 116:281-297. PMID: 14744438
  9. Zhou J et al (2006). Composite microRNA target predictions and comparisons of several prediction algorithms. MBI Technical Report No. 51.

Lecture 2: Dynamic Programming Algorithms

  1. David Mount 'Bioinformatics' - chapter 3
  2. Durbin et al. 'Biological Sequence Analysis' - chapter 2
  3. Reif JH and Tate SR (1997). On Dynamic Algorithms for Algebraic Problems. Journal of Algorithms 22:347-371. DOI: 10.1006/jagm.1995.0807
  4. [PDF] Dayhoff MO, Schwartz RM and Orcutt BC (1978). A Model of Evolutionary Change in Proteins. In: Atlas of Protein Sequence and Structure.
  5. Eddy S (2004). Where did the BLOSUM62 alignment score matrix come from? Nature Biotechnology 22(8):1035-1036. PMID: 15286655

Lecture 3: Markov Chains and HMMs

  1. David Mount 'Bioinformatics' - chapter 4
  2. Durbin et al. 'Biological Sequence Analysis' - chapters 3-5
  3. [PDF] Hull TE and Dobell AR (1962). Random Number Generators. SIAM Review 4:230-254.
  4. [PDF] Doob JL (1942). What is a Stochastic Process? American Mathematical Monthly 49:648-653.
  5. [PDF] Kac M (1954). Signal and Noise Problems. American Mathematical Monthly 61:23-26.

Lecture 4: Statistics #1

  1. A.M. Campbell, L.J. Heyer 'Discovering Genomics, Proteomics, and Bioinformatics' (2002)
  2. [website] Berkeley Course: Statistics for Bioinformatics (2003) - Julia Brettschneider
  3. Rosner B 'Fundamentals of Biostatistics' (2002)
  4. [PDF] Lee WW (2003). Core Statistics for Bioinformatics.
  5. [website] Neal RM, Professor, Dept. of Statistics and Dept. of Computer Science, University of Toronto.
  6. A.C. Wardlaw 'Practical Statistics for Experimental Biologists' (2001) 2nd Edition, John Wiley and Sons
  7. B. Dawson, R.G. Trapp 'Basic and Clinical Biostatistics' (2004) Fourth Edition, McGraw Hill
  8. L. Gonick, W. Smith 'The Cartoon Guide to Statistics' (1993) Harper Resource
  9. [website] Statistics Decision Tree
  10. [website] Statistics Selector
  11. [website] Online statistics tools

Lecture 5: Phylogenetic Models

  1. Durbin et al. 'Biological Sequence Analysis' - chapters 7 and 8
  2. Felsenstein J 'Inferring Phylogenies' 2003
  3. Li and Graur 'Fundamentals of Molecular Evolution' 1991
  4. Saitou N and Imanishi T (1989). Relative Efficiencies of the Fitch-Margoliash, Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree. Mol Biol Evol 6(5):514-525.
  5. Li W-H (1989). A Statistical Test of Phylogenies Estimated from Sequence Data. Mol Biol Evol 6(4):424-435. PMID: 2615641
  6. Whelan S et al (2001). Molecular phylogenetics: state-of-the-art methods for looking into the past. Trends in Genetics 17(5):262-272. PMID: 11335036

Lecture 6: Protein Functional Annotation

  1. Tatusov RL et al (2003). The COG database: an updated version includes eukaryotes. BMC Bioinformatics. Sep 11;4(1):41. PMID: 12969510
  2. Harris MA et al (2004). Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. Jan 1;32 Database issue:D258-61. PMID: 14681407
  3. Eisen MB, Spellman PT, Brown PO, Botstein D (1998). Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. Dec 8;95(25):14863-8. PMID: 9843981
  4. von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B (2003). STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. Jan 1;31(1):258-61. PMID: 12519996
  5. Marx JL (1983). Onc gene related to growth factor gene. Science Jul 15;221(4607): 248. PMID: 6304882
  6. Vogelstein B and Kinzler KW (2004). Cancer genes and the pathways they control. Nature Medicine Aug; 10(8):789-799. PMID: 15286780
  7. Dayhoff MO (1976). The origin and evolution of protein superfamilies. Fed Proc Aug; 35(10):2132-2138. PMID: 181273

Lecture 7: Statistics #2

  1. David Mount 'Bioinformatics' - p121 onwards
  2. A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin (1996). Bayesian Data Analysis. Chapman and Hall, New York.
  3. N. Breslow (1990). Biostatistics and Bayes (with discussion). Statistical Science, 5: 269-298.
  4. R.E. Tarone (1982). The use of historical control information in testing for a trend in proportions. Biometrics, 38: 215-220. PMID: 7115873
  5. C.M. Kendziorski, M.A. Newton, H. Lan, M.N. Gould (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine, 22: 3899-3914. PMID: 14673946

Lecture 8: Protein Secondary and Tertiary Structure

  1. Baker, D (2006). Prediction and design of macromolecular structures and interactions. Philos Trans R Soc Lond, B, Biol Sci 361(1467):459-463. PMID: 16524834
  2. Dahiyat BI and Mayo SL (1997). De novo protein design: fully automated sequence selection. Science 278(5335):82-87. PMID: 9367772
  3. Ginalski et al (2005). Practical lessons from protein structure prediction. Nucl Acids Res 33:1874-1891. PMID: 15805122
  4. Madhusudhan MS et al (2005). Comparative Protein Structure Modeling. In: The Proteomics Protocols Handbook. Ed: JM Walker. Humana Press Inc., Totowa, NJ, 831-860. DOI: 10.1385/1-59259-890-0:831
  5. Poole AM and Ranganathan R (2006). Knowledge-based potentials in protein design. Current Opinion in Structural Biology 16(4):508-513. PMID: 16843652

Lecture 9: Next Gen Sequencing and Analysis

Lecture 10: Comparative Genomics

  1. Clayton RA et al (1998). Findings emerging from complete microbial genome sequences. Curr Opin Microbiol 1:562-566. PMID: 10755848
  2. Osterman A and Overbeek R (2003). Missing genes in metabolic pathways: a comparative genomics approach. Curr Opin Chem Biol 7:238-251. PMID: 12714058
  3. Carroll SB (2003). Genetics and the making of Homo sapiens. Nature 422:849-857. PMID: 12712196
  4. Wolfe K (2001). Yesterday's polyploids and the mystery of diploidization. Nature Reviews Genetics 2:333-341. PMID: 11331899

Lecture 11: Regulation at the RNA and DNA Levels

  1. Qiu P (2003). Recent advances in computational promoter analysis in understanding the transcriptional regulatory network. Biochem Biophys Res Commun 309:495-501. PMID: 12963016
  2. Choo Y and Klug A (1997). Physical basis of a protein-DNA recognition code. Curr Opin Struct Biol 7:117-125. PMID: 9032060
  3. Harrison SC (1991). A structural taxonomy of DNA-binding domains. Nature 353:715-719. PMID: 1944532
  4. Carey M and Smale ST (2000). Transcriptional Regulation in Eukaryotes. Cold Spring Harbor Laboratory Press.
  5. Black DL (2000). Protein Diversity from Alternative Splicing: A Challenge for Bioinformatics and Post-Genome Biology. Cell 103:367-370. PMID: 11081623

Lecture 12: Protein-Protein Interactions, Networks and Clustering

  1. Pavlovic V et al (2002). A Bayesian framework for combining gene predictions. Bioinformatics 18:19-27. PMID: 11836207
  2. Jansen R et al (2003). A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data. Science 302:449-453. PMID: 14564010
  3. Legrain P et al (2001). Protein-protein interaction maps: a lead towards cellular functions. TIG 17:346-352. PMID: 11377797
  4. Chen Y and Xu D (2003). Computational analyses of high-throughput protein-protein interaction data. Curr Prot Pept Sci 4:159-181. PMID: 12769716
  5. Edwards AM et al (2002). Bridging structural biology and genomics: assessing protein interaction data with known complexes. TIG 18:529-536. PMID: 12350343
  6. Auerbach D et al (2002). The post-genomic era of interactive proteomics: Facts and perspectives. Proteomics 2:611-623. PMID: 12112840
  7. Drewes G and Bouwmeester T (2003). Global approaches to protein-protein interactions. Curr Opin Cell Biol 15:199-205. PMID: 12648676
  8. Tucker C et al (2001). Towards an understanding of complex protein networks. Trends Cell Biol 11:102-106. PMID: 11306254
  9. Valencia A and Pazos F (2002). Computational methods for the prediction of protein interactions. Curr Opin Struct Biol 12:368-373. PMID: 12127457
  10. Brun C et al (2003). Approach of the functional evolution of duplicated genes in Saccharomyces cerevisiae using a new classification method based on protein-protein interaction data. J Struct Func Genomics 3:213-224. PMID: 12836700
  11. Enright AJ, Skrabanek L, Bader GD (2007). Computational Prediction of Protein-Protein Interactions. In: The Proteomics Protocols Handbook. Humana Press. DOI: 10.1385/1-59259-890-0:629
  12. Bader G et al (2003). Functional genomics and proteomics: charting a multidimensional map of the yeast cell. Trends Cell Biol 13:344-56. PMID: 12837605
  13. Uetz P et al (2002). Visualization and Integration of Protein-Protein Interactions. In: Protein-Protein Interactions: A Molecular Cloning Manual. Edited by E Golemis. Cold Spring Harbor, NY: CSHL Press.
  14. Karp P (2001). Pathway Databases: A Case Study in Computational Symbolic Theories. Science 293:2040-2044. PMID: 11557880
  15. Phizicky EM and Fields S (1995). Protein-Protein Interactions: Methods for Detection and Analysis. Microbiol Rev 59:94-123. PMID: 7708014

Lecture 13: Dynamic Systems

Lecture 14: Cardiac Modeling

For further information please contact Lucy Skrabanek

Copyright (c) 2004-2010, Weill Medical College of Cornell University. All Rights Reserved