New England Database Society

Friday, May 18, 2007

sponsored by Sun Microsystems


Querying and Managing Provenance through User Views in Scientific Workflows

                 Susan Davidson
            University of Pennsylvania

          Friday, May 18, 2007, 3:00 PM
            Volen 101, Brandeis University

    (preceded by a wine-and-cheese reception at 2:00)


Workflow systems have become increasingly popular for managing large-scale in-silico experiments in which many bioinformatics tasks are chained together. Because these experiments generate a large number of data products and their results must be reproducible, provenance has become of paramount importance. Several workflow systems are therefore starting to provide support for querying provenance. However, the amount of provenance information produced may be overwhelming, so there is a need for abstraction mechanisms to present the most relevant information.

The technique we pursue is that of "user views." Since bioinformatics tasks may themselves be complex sub-workflows, a user view determines the level of granularity at which the user sees the workflow. For example, biologists may simply want a view in which reformatting tasks are hidden and biologically relevant tasks are visible. The user view thus determines which data products and tasks can be seen and queried when answering questions of provenance. This talk gives an example of a phylogenomic analysis workflow, discusses the notion of user views relative to this workflow, demonstrates how user views can be used in provenance queries, and discusses how a user view is generated based on which tasks the user perceives to be biologically relevant in the workflow specification.
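
As a rough illustration of the idea only (not the speaker's method), the sketch below models a workflow as a small dependency graph and answers a provenance query twice: once over all tasks, and once restricted to the tasks a user has marked as relevant. The task names, the UPSTREAM table, and both functions are hypothetical, and simply filtering out hidden tasks is a simplification; the abstract describes views that set the level of granularity, which this sketch does not attempt.

```python
# Hypothetical toy workflow: each task maps to the tasks whose output it consumes.
UPSTREAM = {
    "fetch_sequences": [],
    "reformat_fasta": ["fetch_sequences"],
    "align": ["reformat_fasta"],
    "reformat_alignment": ["align"],
    "build_tree": ["reformat_alignment"],
}


def provenance(task, upstream):
    """Return every task that contributed to `task`, earliest first."""
    seen = []

    def visit(current):
        for parent in upstream[current]:
            visit(parent)
            if parent not in seen:
                seen.append(parent)

    visit(task)
    return seen


def provenance_in_view(task, upstream, relevant):
    """Same query, but report only tasks the user marked as relevant (assumed view model)."""
    return [t for t in provenance(task, upstream) if t in relevant]


# Assumed user view: reformatting steps are hidden, analysis steps are kept.
RELEVANT = {"fetch_sequences", "align", "build_tree"}

full = provenance("build_tree", UPSTREAM)
viewed = provenance_in_view("build_tree", UPSTREAM, RELEVANT)
print(full)
print(viewed)
```

In this toy setup the full query reports the two reformatting steps along with the analysis steps, while the view-restricted query omits them.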

Speaker Bio:

Susan B. Davidson received the B.A. degree in Mathematics from Cornell University, Ithaca, NY, in 1978, and the M.A. and Ph.D. degrees in Electrical Engineering and Computer Science from Princeton University, Princeton, NJ, in 1980 and 1982, respectively.  Dr. Davidson joined the University of Pennsylvania in 1982, and is now the Weiss Professor of Computer and Information Science and Deputy Dean of the School of Engineering and Applied Science.  She is an ACM Fellow and a Fulbright scholar, and she recently stepped down as founding co-Director of the Center for Bioinformatics at UPenn (PCBI). She has also helped establish undergraduate, Master's, and Ph.D. degree programs in bioinformatics and computational biology.

Dr. Davidson's research interests include database systems, data modeling, information integration, workflow systems, distributed systems, and bioinformatics.