RESOURCES
Databases
Supplementary Readings
To be populated as we progress through the course.
Chapter 1: SNPs and Haplotypes: Linkage Disequilibrium and the Haplotype Phasing Problem
- 5 Years of GWAS
- 10 Years of GWAS
- 15 Years of GWAS
- Balding GWAS Tutorial
- GWAS for Common Diseases and Complex Traits
- How to Interpret a GWAS
- Assessment of Risk and Disease Using GWAS
- A Haplotype Map of the Human Genome
- Maximum-Likelihood Estimation of Molecular Haplotype Frequencies in a Diploid Population
Chapter 2: Protein Folding, Misfolding, and Disease
- Miyazawa-Jernigan Amino Acid Contact Potentials
- Highly Accurate Protein Structure Prediction with AlphaFold
- Historical Progress in the Protein Folding Problem
- A Survey of the Protein Folding Problem
- A Survey of Combinatorial Protein Folding
Chapter 3: MCMC and Spectral Graph Theory, with Applications to Population Stratification
- The Original Metropolis Paper
- Perspective on the Initial Metropolis Paper
- The Beginning of the Monte Carlo Method
- The MCMC Revolution
- Rapidly Mixing Markov Chains with Applications in CS and Physics
Chapter 4: Missing Heritability, Genetic Heterogeneity, and Rare and Common Genetic Variants
Chapter 5: Polygenic Risk Scores and GWAS
Final Project Ideas
- Epistatic Interactions in GWAS
- Variable Selection in Deep Learning Models for GWAS
- Bayesian Variable Selection and Regression Analysis in GWAS and Fine Mapping Studies
- Gene Sets and GWAS: From selecting SNPs to selecting Genes and Pathways
- Bayesian large-scale multiple regression with summary statistics from genome-wide association studies (RSS by M Stephens)
- Analysing biological pathways in genome-wide association studies
- Pathway analysis of genomic data: concepts, methods, and prospects for future development
- Leveraging models of cell regulation and GWAS data in integrative network-based association studies
- Random Forests, Decision Trees, and GWAS
- GWAS using Ancestral Recombination Graphs (combinatorial approach to GWAS using haplotypes)
- ARGs and Haplotype Phasing
- Tag SNPs - Unifying LD-select/Tagger and Informativeness - Dominating Set Optimization
- Minimum Informative Subset Selection Algorithm and Transferability
- Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium
- Optimal Haplotype Block-Free Selection of Tagging SNPs for Genome-Wide Association Studies
- Supplemental 1
- Supplemental 2
- The Portability of tagSNPs Across Populations: A Worldwide Survey
- Codon Bias in GWAS
- Identity by Descent in GWAS
- Rigorous Algorithms for Global Maximum Likelihood Phasing, EM, and Generalized Likelihoods
- Long-Range Haplotype Phasing – “The Power of Amnesia” Variable-Length Markov Chain
- Long-Range Haplotype Phasing – The deCODE Algorithm
- Cryptic Population Structure
- Immunogenomics
- Detecting Recombination Rates and the Li-Stephens Framework
- Fast Detection of Identical by Descent Relatedness
- Parent-of-Origin Genetic Variation
- Transferability of Tagging SNPs Across Populations
- Generalized Family and Pedigree-Based Statistical Tests for Association
- Crypto-GWAS and the Maximum Non-Identifiability Problem
- Pedigree Inference from Genotypes/Haplotypes
- Viral Quasispecies Reconstruction and Polyploid Haplotype Assembly
- Haplotype and Genome Assembly of Metagenomes