date |
topic |
notes |
homework |
recommended readings |
Major Unit. Genomics Foundations: Modeling and Measuring Evolution: Linkage Disequilibrium (LD), Urn Models |
Sept. 4 |
The Human Genome and Variation |
CSCI 2951-N Overview
SNPs and the Human Genome
Haplotype Blocks and Tagging SNPs
From Human Genome to Clinical Theraphy |
|
|
Sept. 9 |
|
Introduction to Linkage Disequilibrium
| |
R. Lewontin's seminal paper introducing D
M. Slatkin "Linkage Disequilibrium - understanding evolutionary past .." |
Sept. 11 |
|
Urn Models and the Wright-Fisher Model
The Hardy-Weinberg Model
| |
A comparison paper of most used LD measures
A rigorous review of Statistical Hypothesis Testing
Sir Ronald Fisher's Puzzle "The Lady Tasting Tea"
|
Major Unit. Algorithms: Graph-Theory Algorithms: Haplotype Phasing I. Clark Methods and the EM Algorithm
|
Sept. 16 |
|
Haplotype Phasing Algorithms: The Clark Method
Andy Clark's seminal 1990 paper introducing the Clark Method
|
Homework 1
dataset1-genotypes
|
Dan Gusfield's paper proving the NP-completeness of the problem of minimizing the number of unresolved (un-phased) genotypes
|
Sept. 18 |
|
Urn Models, the Infinite Alleles Models, and the Ewens Sampling Lemma
| |
An article presenting Ewens Sampling Lemma's proof and generalizations
Ewens Sanmpling Lemma - several other applications
|
Sept. 30 |
|
The Expectation-Maximization (EM) Algorithm for Haplotype Phasing
| |
The Excoffier-Slatkin Expectation-Maximization (EM) Algorithm paper for Computing Haplotype Frequencies and Haplotype Phasing
|
Oct. 2 |
|
Theorem (E. Hubbell, 2000) Haplotype Parsimony Phasing is NP-complete, and Haplotype Maximum Likelihood Phasing is NP-complete
| |
|
Major Unit. Algorithms: Graph-Theory Algorithms: Set Cover and Dominating Set: Haplotype Tagging SNPs Algorithms
|
Oct. 7 |
|
Tagging SNPs - The LD-Select Algorithm
| |
|
Oct. 9 |
|
Tagging SNPs - The Informativeness Algorithm
| |
Tagging SNPs and Haplotype Blocks
|
|
|
|
Homework 2 |
|
Oct. 14 |
|
Informativeness Algorithm and Haplotype Blocks
| |
|
Major Unit. Genome-Wide Association Studies: Computational Workflows
|
Oct. 16 |
|
GWAS Issues and Caveats
GWAS: Computational Workflows
|
| GWAS Tutorial on Statistical Methods for Population Genomics Studies
|
Oct. 23 |
|
Computational Workflows for GWAS - Rare variants
Genetic Mapping in Human Disease and GWAS successes
List of GWASs and diseases - to pick from for presentation
|
|
Guilt by association
GWAS and assessment of disease risk - an overview of GWAS methodology
GWAS and human disease - a review
|
Major Unit. Algorithms: Graph-Theory Algorithms: Haplotype Phasing II. Long-Range Phasing and Identical-by-Descent Inference
|
Oct. 28 |
|
deCode Long-Range Haplotype Phasing Algorithm
|
|
|
Oct. 30 |
|
The Transmission Disequilibrium Test
|
|
|
|
|
|
Homework 3 |
|
Major Unit. Algorithms: Markov Chain Monte-Carlo and the Metropolis Algorithm; application: the STRUCTURE Algorithm for Inference of Population Substructure in a sample of genotyped individuals
|
Nov. 4 |
|
The seminal paper by Jonathan Pritchard, Matthew Stephens, and Peter Donnelly "Inference of Population Structure Using Multilocos Genotype Data" Genetics, (2000). It presents the algorithm STRUCTRE based on the Markov Chain Monte Carlo method for infering population substructure in a sample of genotyped individuals such as a GWAS sample.
|
|
The seminal paper by Persi Diaconis "The Markov Chain Monte Carlo Revoultion"; it contains the "Criminal/prison code"and its breaking via an MCMC algorithm as we discussed in class.
"Many basic scientific problems are now routinely solved by simulation: a fancy random walk is performed on the system of interest. Averages computed from the walk give useful answers to formerly intractable problems."
|
Nov. 6 |
|
A tutorial on Markov Chain Monte Carlo Method. In particular it has theory on how long the chains should be run to reach the stationary distribution, the so called "Mixing-Time" theory, which we did not cover in class.
|
|
|
Nov. 11 |
|
The original Metropolis et al paper, i.e., Nicholas Metropolis, Ariana Rosenbluth, Marshall Rosenbluth, Augusta Teller, and Edward Teller "Equation of State Calculations for Fast Computing Machines" The Journal of Chemical Physics, 21, 1087 (1953)
|
|
Understanding the Metropolis-Hastings Algorithm
|
Nov. 13 |
|
A paper by Nick Metropolis, who coined the name "Monte Carlo" for the method: N. Metropolis "The Beginning of the Monte Carlo Method" Los Alamos Science, Special Isssue 1987, pp. 125-130; he gives a historic introduction of the Monte Carlo Method introduced by two famous mathematicians working at Los Alamos on the Manhattan Project: Stanislaw Ulam and John von Neumann; the method was "reinvented" by Enrico Fermi 15 years before but just as a numerical simulation technique; the mathematical development and foundations are due to Ulam and von Neumann.
|
|
Metropilis and MANIAC (one of the first electronic computers)
A perspective on the Metropolis method
|
Major Unit. The Coalescent: Theory and Applications; The Miniciello-Durbin Ancestral Recombination Graph Inference Algorithm; Hudson Simulator
|
Nov. 18 |
|
The seminal Miniciello-Durbin Ancestral Recombination Graph Algorithm; reference: Mark Miniciello and Richard Durbin, " Mapping Trait Loci by use of Ancestral Recombination Graphs", American Journal of Human Genetics, vol 79, pp. 910-922 (2006)
|
|
|
Nov. 20 |
|
A survey by Noah Rosenberg and Magnus Mordborg, "Genealogical Trees, Coalescent Theory and the Analysis of Genetic Polymorphism" Nature Reviews Genetics, 380, vol 3 (2002)
|
|
A Tutorial on the Coalescent Theory, Magnus Norburg, "Coalescent Theory" 37 pp (2000)
|
|
|
|
Homework 4
Stephens et al the PHASE Algorithm paper
The Haplotype Phasing Competition paper
|
|
Major Unit. Genome-Wide Association Studies (GWAS): Statistical associations, the missing heritabililty problem, genetic heterogeneity, genomic privacy. |
Major Unit. The Social Network of Protein Folds: Individual Preferences of Amino Acids and the Thermodynamic "Social Choice" Hypothesis |
Major Unit. Algorithms: Statistical Hypothesis Testing and Knapsack Algorithms: The Neyman-Pearson Lemma, Multiple Testing |
Major Unit. Algorithms: Voting Theory Algorithms and von Neumann-Morgenstern Utility Theory: Protein Folding Energy Function Inference |