New technological and experimental advances have resulted in the massive accumulation of biological measurements, requiring computational analysis to facilitate data interpretation. My work lies in two distinct areas: computational genomics and computational systems biology. In both sub-fields, I develop algorithms that draw from core areas of computer science and mathematics.

Computational Genomics: Structural Variant Detection

I study structural variants (SVs), large rearrangements of DNA in human genomes that are an important contributor to genetic variation. SVs are also associated with a host of cancers, where somatic SVs -- variants that are acquired within an individual's lifetime -- are important for determining putative driver mutations. I have developed algorithms that identify SVs from different types of experimental data. To accommodate new ``third-generation'' sequencing platforms, I formalized the multi-linked read, which generalizes the concept of paired reads. I first developed an integer linear program, and later a Markov chain Monte Carlo (MCMC) method, for SV detection from multi-linked reads based on the sequencing platforms introduced by Pacific Biosciences. In the context of cancer biology, I developed an algorithm to identify recurrent fusion genes, where an SV combines two genes into a single, ``hybrid'' gene.

Computational Systems Biology: Signaling Pathway Analysis

Signaling pathways describe the series of reactions that occur when a cell receives an external stimulus and elicits a downstream transcriptional response. I develop methods to computationally analyze signaling pathways in manually curated databases such as KEGG, Reactome, NetPath, and SPIKE. One aspect of my work in signaling pathway analysis involves automatically reconstructing human signaling pathways from protein-protein interaction (PPI) data. In this work, I found that complexes, complex rearrangement, and regulatory interactions cannot be accurately described by directed graphs, because graph edges are inherently pairwise. In another line of research, I formalize and design algorithms for signaling hypergraphs, a reaction-centric representation that better captures the complex assembly and regulation found in signaling pathways.
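
To make the contrast between pairwise edges and reaction-centric hyperedges concrete, the following is a minimal sketch and not the representation used in the work itself. The names `Hyperedge`, `pairwise_projection`, `hyper_reachable`, and `graph_reachable` are hypothetical, as are the toy proteins. The rule that a reaction fires only when all of its reactants and regulators are present is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """One reaction: all tail nodes (reactants) jointly produce the head nodes."""
    tail: frozenset
    head: frozenset
    regulators: frozenset = frozenset()  # assumed: must be present for the reaction to fire


def pairwise_projection(edges):
    """Flatten hyperedges into directed graph edges (reactant -> product)."""
    return {(u, v) for e in edges for u in e.tail for v in e.head}


def hyper_reachable(edges, sources):
    """Nodes reachable when a reaction needs all reactants and regulators."""
    known = set(sources)
    changed = True
    while changed:
        changed = False
        for e in edges:
            ready = e.tail <= known and e.regulators <= known
            if ready and not e.head <= known:
                known |= e.head
                changed = True
    return known


def graph_reachable(pairs, sources):
    """Ordinary directed-graph reachability over pairwise edges."""
    known = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for a, b in pairs:
            if a == u and b not in known:
                known.add(b)
                stack.append(b)
    return known


# Toy pathway: A and B assemble into complex AB; AB activates C only if kinase K is present.
reactions = [
    Hyperedge(frozenset({"A", "B"}), frozenset({"AB"})),
    Hyperedge(frozenset({"AB"}), frozenset({"C"}), regulators=frozenset({"K"})),
]
pairs = pairwise_projection(reactions)

from_a_graph = graph_reachable(pairs, {"A"})
from_a_hyper = hyper_reachable(reactions, {"A"})
from_ab_hyper = hyper_reachable(reactions, {"A", "B"})
from_abk_hyper = hyper_reachable(reactions, {"A", "B", "K"})
```

In this toy example the pairwise projection reports that C is reachable from A alone, whereas the hyperedge view requires B for complex assembly and K for the regulated step. This is the kind of information that is lost when reactions are flattened into pairwise edges.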