Distinguished Lecture Series


"Information Integration: From Clio to Integration Independence"

Renee J. Miller, University of Toronto

Thursday, March 4, 2010 at 4:00 P.M.

Room 368 (CIT 3rd floor)

Fueled by the growth in information sources and the increasing use of complex data in modern decision making, information integration remains a fundamental challenge for businesses and individuals alike. To integrate information, data in different formats, from different, potentially overlapping sources, must be related and transformed to meet the users' needs. Ten years ago, Clio introduced declarative schema mappings to describe the relationship between data in heterogeneous schemas. This enabled powerful tools for mapping discovery and integration code generation, greatly simplifying the integration process. This work also led to new techniques for managing inconsistent data. In this talk, I take a look at where our field was a decade ago and where it is now in terms of support for information integration. I share a vision for raising the level of abstraction further, to better isolate applications from the details of how the integration is accomplished. Integration independence allows applications to be independent of how, when, and where information integration takes place, making materialization and the timing of transformations an optimization decision that is transparent to applications. I identify a number of research challenges that remain to be addressed in order to ultimately achieve this vision.

Clio was a joint research project between the University of Toronto and IBM Almaden Research Center. Clio technology has been transferred into many IBM products, including the InfoSphere product line.


Renee J. Miller is a professor of computer science and the Bell Canada Chair of Information Systems at the University of Toronto. She received the US Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the United States government on outstanding scientists and engineers beginning their careers. She is a fellow of the ACM. She received an NSF CAREER Award, the Premier's Research Excellence Award, and an IBM Faculty Award. Her research interests are in the efficient, effective use of large volumes of complex, heterogeneous data. This interest spans data integration and exchange, inconsistent and uncertain data management, and data curation and cleaning. She serves on the Board of Trustees of the VLDB Endowment and was elected to serve as VLDB President beginning January 2010. She leads a new Canada-wide Research Network on Business Intelligence. She received her PhD in Computer Science from the University of Wisconsin, Madison and bachelor's degrees in Mathematics and Cognitive Science from MIT.

Host: Stan Zdonik