Genome Assembly and Comparisons, Haplotype Assembly, and Transcriptome Assembly
Research Summary
Transcriptome of American Oysters, Crassostrea virginica, in Response to Bacterial Challenge: Insights into Potential Mechanisms of Disease Resistance
The American oyster Crassostrea virginica, an ecologically and economically important estuarine organism, can suffer high mortalities in areas of the Northeast United States due to Roseovarius Oyster Disease (ROD), caused by the Gram-negative bacterial pathogen Roseovarius crassostreae. The goals of this research were to provide insights into:
A quantitative reference transcriptome for Nematostella vectensis early embryonic development: a pipeline for de novo assembly in emerging model systems
The de novo assembly of transcriptomes from short shotgun sequences raises challenges due to random and non-random sequencing biases and inherent transcript complexity. We sought to define a pipeline for de novo transcriptome assembly to aid researchers working with emerging model systems where well annotated genome assemblies are not available as a reference. To detail this experimental and computational method, we used early embryos of the sea anemone, Nematostella vectensis, an emerging model system for studies of animal body plan evolution. We performed RNA-seq on embryos up to 24 h of development using Illumina HiSeq technology and evaluated independent de novo assembly methods. The resulting reads were assembled using either the Trinity assembler on all quality controlled reads or both the Velvet and Oases assemblers on reads passing a stringent digital normalization filter. A control set of mRNA standards from the National Institute of Standards and Technology (NIST) was included in our experimental pipeline to invest our transcriptome with quantitative information on absolute transcript levels and to provide additional quality control.
We generated >200 million paired-end reads from directional cDNA libraries representing well over 20 Gb of sequence. The Trinity assembler pipeline, including preliminary quality control steps, resulted in more than 86% of reads aligning with the reference transcriptome thus generated. Nevertheless, digital normalization combined with assembly by Velvet and Oases required far less computing power and decreased processing time while still mapping 82% of reads. We have made the raw sequencing reads and assembled transcriptome publicly available.
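To illustrate why digital normalization can reduce the computing load, the following is a minimal sketch of the general idea, not the pipeline used in this study: a read is kept only while the median count of its k-mers, among reads already kept, is below a cutoff. The function names, the exact in-memory counter, and the values of k and cutoff are assumptions made for illustration; production tools typically use a probabilistic counting structure so that memory use stays bounded.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield every k-mer (substring of length k) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def digital_normalize(reads, k=20, cutoff=20):
    """Return the reads whose estimated coverage was below cutoff when seen.

    Coverage of a read is estimated as the median count of its k-mers
    among the reads kept so far (hypothetical helper, illustrative only).
    """
    counts = Counter()  # exact k-mer counts over kept reads
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read shorter than k cannot be assessed; skip it
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads add to the counts
    return kept


if __name__ == "__main__":
    # Toy input: one highly redundant read and one rare read.
    reads = ["ACGTTGCA"] * 10 + ["GGGGCCCCAA"]
    kept = digital_normalize(reads, k=4, cutoff=3)
    print(len(reads), "reads in,", len(kept), "reads kept")
```

In this toy input the redundant read is retained only until its k-mers reach the cutoff, while the rare read is retained in full, so the assembler receives fewer reads without losing low-abundance sequence.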
Nematostella vectensis was chosen for its strategic position in the tree of life for studies into the origins of the animal body plan; however, the challenge of reference-free transcriptome assembly is relevant to all systems for which well annotated gene models and independently verified genome assemblies may not be available. To navigate this new territory, we have constructed a pipeline for library preparation and computational analysis for de novo transcriptome assembly. The gene models derived from this reference transcriptome define the set of genes transcribed in early Nematostella development and will provide a valuable dataset for further gene regulatory network investigations.
HapCompass: A Fast Cycle Basis Algorithm for Accurate Haplotype Assembly of Sequence Data
Genome assembly methods produce haplotype phase ambiguous assemblies due to limitations in current sequencing technologies. Determining the haplotype phase of an individual is computationally challenging and experimentally expensive. However, haplotype phase information is crucial in many bioinformatics workflows such as genetic association studies and genomic imputation.
The Transcriptome of the Sea Urchin Embryo
The sea urchin Strongylocentrotus purpuratus is a model organism for study of the genomic control circuitry underlying embryonic development. We examined the complete repertoire of genes expressed in the S. purpuratus embryo, up to late gastrula stage, by means of high-resolution custom tiling arrays covering the whole genome.
Relevant Papers
-
The Genome of the Sea Urchin Strongylocentrotus purpuratus
2006
Sea Urchin Genome Sequencing Consortium, Erica Sodergren, George M. Weinstock, Eric H. Davidson, R. Andrew Cameron, Richard A. Gibbs, Robert C. Angerer, Lynne M. Angerer, Maria Ina Arnone, David R. Burgess, Robert D. Burke, James A. Coffman, Michael Dean, Maurice R. Elphick, Charles A. Ettensohn, Kathy R. Foltz, Amro Hamdoun, Richard O. Hynes, William H. Klein, William Marzluff, David R. McClay, Robert L. Morris, Arcady Mushegian, Jonathan P. Rast, L. Courtney Smith, Michael C. Thorndyke, Victor D. Vacquier, Gary M. Wessel, Greg Wray, Lan Zhang, Christine G. Elsik, Olga Ermolaeva, Wratko Hlavina, Gretchen Hofmann, Paul Kitts, Melissa J. Landrum, Aaron J. Mackey, Donna Maglott, Georgia Panopoulou, Albert J. Poustka, Kim Pruitt, Victor Sapojnikov, Xingzhi Song, Alexandre Souvorov, Victor Solovyev, Zheng Wei, Charles A. Whittaker, Kim Worley, K. James Durbin, Yufeng Shen, Olivier Fedrigo, David Garfield, Ralph Haygood, Alexander Primus, Rahul Satija, Tonya Severson, Manuel L. Gonzalez-Garay, Andrew R. Jackson, Aleksandar Milosavljevic, Mark Tong, Christopher E. Killian, Brian T. Livingston, Fred H. Wilt, Nikki Adams, Robert Bell, Seth Carbonneau, Rocky Cheung, Patrick Cormier, Bertrand Cosson, Jenifer Croce, Antonio Fernandez-Guerra, Anne-Marie Genevière, Manisha Goel, Hemant Kelkar, Julia Morales, Odile Mulner-Lorillon, Anthony J. Robertson, Jared V. Goldstone, Bryan Cole, David Epel, Bert Gold, Mark E. Hahn, Meredith Howard-Ashby, Mark Scally, John J. Stegeman, Erin L. Allgood, Jonah Cool, Kyle M. Judkins, Shawn S. McCafferty, Ashlan M. Musante, Robert A. Obar, Amanda P. Rawson, Blair J. Rossetti, Ian R. Gibbons, Matthew P. Hoffman, Andrew Leone, Sorin Istrail, Stefan C. Materna, Manoj P. Samanta, Viktor Stolc, et al.
-
Whole Genome Shotgun Assembly and Comparison of Human Genome Assemblies
2004
Sorin Istrail, Granger G. Sutton, Liliana Florea, Aaron L. Halpern, Clark M. Mobarry, Ross Lippert, Brian Walenz, Hagit Shatkay, Ian Dew, Jason R. Miller, Michael J. Flanigan, Nathan J. Edwards, Randall Bolanos, Daniel Fasulo, Bjarni V. Halldorsson, Sridhar Hannenhalli, Russell Turner, Shibu Yooseph, Fu Lu, Deborah R. Nusskern, Bixiong Chris Shue, Xiangqun Holly Zheng, Fei Zhong, Arthur L. Delcher, Daniel H. Huson, Saul A. Kravitz, Laurent Mouchard, Knut Reinert, Karin A. Remington, Andrew G. Clark, Michael S. Waterman, Evan E. Eichler, Mark D. Adams, Michael W. Hunkapiller, Eugene W. Myers, J. Craig Venter
-
The Sequence of the Human Genome
2001
J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton O. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen, Marian Skupski, Gangadharan Subramanian, Paul D. Thomas, Jinghui Zhang, George L. Gabor Miklos, Catherine Nelson, Samuel Broder, Andrew G. Clark, Joe Nadeau, Victor A. McKusick, Norton Zinder, Arnold J. Levine, Richard J. Roberts, Mel Simon, Carolyn Slayman, Michael Hunkapiller, Randall Bolanos, Arthur Delcher, Ian Dew, Daniel Fasulo, Michael Flanigan, Liliana Florea, Aaron Halpern, Sridhar Hannenhalli, Saul Kravitz, Samuel Levy, Clark Mobarry, Knut Reinert, Karin Remington, Jane Abu-Threideh, Ellen Beasley, Kendra Biddick, Vivien Bonazzi, Rhonda Brandon, Michele Cargill, Ishwar Chandramouliswaran, Rosane Charlab, Kabir Chaturvedi, Zuoming Deng, Valentina Di Francesco, Patrick Dunn, Karen Eilbeck, Carlos Evangelista, Andrei E. Gabrielian, Weiniu Gan, Wangmao Ge, Fangcheng Gong, Zhiping Gu, Ping Guan, Thomas J. Heiman, Maureen E. Higgins, Rui-Ru Ji, Zhaoxi Ke, Karen A. Ketchum, Zhongwu Lai, Yiding Lei, Zhenya Li, Jiayin Li, Yong Liang, Xiaoying Lin, Fu Lu, Gennady V. Merkulov, Natalia Milshina, Helen M. Moore, Ashwinikumar K. Naik, Vaibhav A. Narayan, Beena Neelam, Deborah Nusskern, Douglas B. Rusch, Steven Salzberg, Wei Shao, Bixiong Shue, Jingtao Sun, Zhen Yuan Wang, Aihui Wang, Xin Wang, Jian Wang, Ming-Hui Wei, Ron Wides, Chunlin Xiao, Chunhua Yan, et al.
-
Visualization challenges for a new cyberpharmaceutical computing paradigm
2001
Russell J. Turner, Kabir Chaturvedi, Nathan J. Edwards, Daniel Fasulo, Aaron L. Halpern, Daniel H. Huson, Oliver Kohlbacher, Jason R. Miller, Knut Reinert, Karin A. Remington, Russell Schwartz, Brian Walenz, Shibu Yooseph, Sorin Istrail