CSCI2820 Final Projects Tentative List
Each student is required to complete a final research project as part of the requirements of CSCI2820. You should work with Professor Istrail to frame your project. The final hand-in will include a ~20-minute presentation of your project results and your adopted GWAS, a paper describing your project results, and any accompanying source code and documentation. Students who do not implement any code will deliver a more in-depth presentation, paper, and/or analysis.
Below is a set of research projects (and a sample of related papers) aligned with the goals of the class. Only one student may work on each project. Note: this represents only a sample of projects; you are encouraged to define a project aligned with your interests if none of these projects fit.
- Ancestral Recombination Graphs and Ordered Marginal Trees
  - Coalescent-Based Association Mapping and Fine Mapping of Complex Trait Loci
  - Mapping Trait Loci by Use of Inferred Ancestral Recombination Graphs
- Minimum informative subset selection algorithm and transferability
  - Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium
  - Optimal Haplotype Block-Free Selection of Tagging SNPs for Genome-Wide Association Studies
    - Supplemental 1
    - Supplemental 2
  - The portability of tagSNPs across populations: A worldwide survey
- Random forests, decision trees, and GWAS
- Epistatic Interactions in GWAS
  - Ultrafast genome-wide scan for SNP-SNP interactions in common complex disease
  - SIXPAC software (from the Pe'er paper)
- Variable Selection, Regression Analysis and GWAS
- ARGs and Haplotype Phasing
  - Genealogical Trees, Coalescent Theory and the Analysis of Genetic Polymorphisms
  - Mapping Trait Loci by Use of Inferred Ancestral Recombination Graphs
  - Coalescent-Based Association Mapping and Fine Mapping of Complex Trait Loci
- Codon bias in GWAS
  - Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium
  - Genotype Imputation with Thousands of Genomes
  - The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications
- Identity by Descent in GWAS and/or computing identity-by-descent tracts in genotypes
  - A Fast, Powerful Method for Detecting Identity by Descent
  - Whole population, genome-wide mapping of hidden relatedness
- Gene Sets and GWAS: From SNPs to Genes
  - Analysing biological pathways in genome-wide association studies
  - Pathway analysis of genomic data: concepts, methods, and prospects for future development
  - Leveraging models of cell regulation and GWAS data in integrative network-based association studies
- Tag SNPs - unifying LD-select/Tagger and Informativeness - Dominating Set optimization
  - Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium
  - Optimal Haplotype Block-Free Selection of Tagging SNPs for Genome-Wide Association Studies
  - Linkage Disequilibrium in Humans: Models and Data (Jonathan K. Pritchard and Molly Przeworski)
- Rigorous algorithms for Global Maximum Likelihood Phasing, EM and generalized likelihoods
- Long-range haplotype phasing – "the power of amnesia" variable-length Markov Chain and the Browning and Browning Beagle
  - Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering
  - The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length
- Long-range haplotype phasing – the deCODE algorithm; haplotype sharing in closely related populations
- Cryptic population structure
- Immunogenomics
  - Innate Immune and Chemically Triggered Oxidative Stress Modifies Translational Fidelity
  - Comparative immunopeptidomics of humans and their pathogens
- Detecting recombination rates and the Li-Stephens framework
- Fast detection of Identical by Descent relatedness
  - A Fast, Powerful Method for Detecting Identity by Descent
  - Whole population, genome-wide mapping of hidden relatedness
- Parent-of-origin genetic variation
- Transferability of tagging SNPs across populations
- Generalized family and pedigree based statistical tests for association
- Crypto-GWAS and the maximum non-identifiability problem
  - Resolving Individuals Contributing Trace Amounts of DNA to Highly Complex Mixtures Using High-Density SNP Genotyping Microarrays
  - The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis
- Pedigree inference from genotypes/haplotypes
  - Pedigree Reconstruction Using Identity by Descent
  - Efficient maximum likelihood pedigree reconstruction
- Viral quasispecies reconstruction and polyploid haplotype assembly
  - QColors: An algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads
- Haplotype assembly in polyploid genomes and identical by descent shared tracts
- Haplotype and genome assembly of metagenomes