ACM Computing Surveys 28A(4), December 1996, http://www.acm.org/surveys/1996/UllmanHotTopics/. Copyright © 1996 by the Association for Computing Machinery, Inc. See the permissions statement below.


Some Hot Topics in Database Systems


Jeffrey D. Ullman

Department of Computer Science
Gates Hall 4A Wing
Stanford University
Stanford, CA 94305-9040
ullman@cs.stanford.edu
http://www-db.stanford.edu/~ullman

Deductive databases

While I waver periodically about the importance of logic, datalog, etc., in database research, right now I'm very optimistic. The technology is beginning to see the light of day in commercial ventures, there is a new SQL3 standard for recursion based on the ``datalog'' approach, and there are some new and very exciting applications in the integration/warehousing area that exploit the core datalog technology in surprising and very useful ways. More about that below.
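As a rough illustration of the kind of recursion involved, the sketch below evaluates the classic two-rule datalog program for reachability bottom-up, using semi-naive iteration. The relation names (edge, path), the function name, and the flight data are assumptions chosen for the example; they are not taken from SQL3 or from any particular system.

```python
def transitive_closure(edges):
    """Semi-naive bottom-up evaluation of the datalog program

         path(X, Y) :- edge(X, Y).
         path(X, Y) :- path(X, Z), edge(Z, Y).

    Each round joins only the facts derived in the previous round
    (delta) with edge, instead of re-joining all of path.
    """
    edges = set(edges)
    path = set(edges)    # facts from the non-recursive rule
    delta = set(edges)   # facts that are new in the latest round
    while delta:
        derived = {(x, y) for (x, z) in delta for (z2, y) in edges if z == z2}
        delta = derived - path   # keep only genuinely new facts
        path |= delta
    return path


# Hypothetical data: direct flights between airports.
flights = [("SFO", "DEN"), ("DEN", "ORD"), ("ORD", "JFK")]
reachable = transitive_closure(flights)
print(sorted(reachable))
```

The loop stops when a round derives nothing new, which is the fixpoint that gives a recursive query its meaning; on cyclic data it still terminates because the set of possible pairs is finite.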

Information Integration

Tools for combining various information sources into a useful whole are becoming real. Some of this work is ``warehousing,'' i.e., materialized views of existing information. Other approaches are ``mediated,'' i.e., virtual databases supported by issuing appropriate queries to the actual sources. Integrating sources found on the web, rather than only conventional SQL databases, seems to be very important.
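The difference between the two approaches can be sketched in a few lines. Everything here is hypothetical (the Source, Warehouse, and Mediator classes, the bookstore rows, the price filter); it is a toy model of the distinction, not a description of any real integration system.

```python
class Source:
    """A hypothetical information source holding (title, price) rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def query(self, max_price):
        return [r for r in self.rows if r[1] <= max_price]


class Warehouse:
    """Warehousing: copy (materialize) the source data once, at load time."""

    def __init__(self, sources):
        self.copy = [row for s in sources for row in s.rows]

    def query(self, max_price):
        # Answered from the local copy; the sources are not contacted.
        return sorted(r for r in self.copy if r[1] <= max_price)


class Mediator:
    """Mediation: keep no data; forward each query to the sources."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, max_price):
        return sorted(row for s in self.sources for row in s.query(max_price))


store_a = Source([("databases", 40), ("logic", 25)])
store_b = Source([("compilers", 30)])
warehouse = Warehouse([store_a, store_b])
mediator = Mediator([store_a, store_b])

# A source changes after the warehouse was loaded.
store_b.rows.append(("automata", 20))

stale = warehouse.query(30)   # does not see the new row
fresh = mediator.query(30)    # sees it, at the cost of querying every source
print(stale, fresh)
```

The trade-off shown is the usual one: the warehouse answers locally but must be refreshed, while the mediator is always current but pays for source access on every query.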

Either way, theoretical results about views and constructing queries by using defined views turn out to be central. The Information Manifold project at Bell Labs is a wonderful example of what can be done by applying some database theory deep in the architecture of a system. There are many interesting questions raised about materialized views, their use in query systems, and their maintenance as the sources change. Jennifer will probably say more about this topic.
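To make the maintenance question concrete, here is a minimal sketch of incremental maintenance for one very simple materialized view, a count of orders per customer. The CountView class, its method names, and the order data are assumptions made for illustration; real maintenance algorithms handle joins, duplicates, and concurrent source updates, none of which appear here.

```python
from collections import Counter


class CountView:
    """Materialized view: number of orders per customer.

    The view is updated from individual insertions and deletions at the
    source, rather than being recomputed from scratch after each change.
    """

    def __init__(self, orders):
        self.counts = Counter(customer for customer, _item in orders)

    def on_insert(self, order):
        self.counts[order[0]] += 1

    def on_delete(self, order):
        customer = order[0]
        self.counts[customer] -= 1
        if self.counts[customer] == 0:
            del self.counts[customer]   # drop groups that became empty


# Hypothetical source relation of (customer, item) orders.
orders = [("ann", "disk"), ("bob", "tape"), ("ann", "cable")]
view = CountView(orders)

# Apply the same changes to the source and, incrementally, to the view.
orders.append(("bob", "disk"))
view.on_insert(("bob", "disk"))
orders.remove(("ann", "disk"))
view.on_delete(("ann", "disk"))

recomputed = Counter(customer for customer, _item in orders)
print(dict(view.counts), dict(recomputed))
```

For a count the incremental update is trivially correct; the interesting questions arise for views where a change at one source cannot be applied without consulting other sources.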

Decision Support

There is increasing interest in systems that do well on complex queries: those that touch all or almost all of a database and that usually involve grouping and aggregation. The optimization theory for such queries needs to be worked out.

Especially important are ``data cubes.'' Specialized DBMSs are now available to handle certain decision support queries on single relations; they look something like large-scale spreadsheets. There are challenging questions regarding implementation of data-cube systems and design of data cubes (analogous to design of a relational schema).
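One common reading of a data cube is a single relation aggregated over every subset of its dimension attributes, with a placeholder standing for ``all values'' in the dimensions that were rolled up. The sketch below computes such a cube naively. The function name, the ALL marker, and the sales data are assumptions for the example, and a real system would not enumerate every grouping per row like this.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"   # placeholder meaning "aggregated over every value of this dimension"


def data_cube(rows, dims, measure):
    """Sum `measure` for every combination of kept and rolled-up dimensions."""
    cube = defaultdict(int)
    n = len(dims)
    for row in rows:
        # Each row contributes to 2**n cells: one per subset of kept dimensions.
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                key = tuple(row[dims[i]] if i in kept else ALL for i in range(n))
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical single relation of sales facts.
sales = [
    {"store": "north", "item": "milk", "units": 3},
    {"store": "north", "item": "bread", "units": 2},
    {"store": "south", "item": "milk", "units": 5},
]
cube = data_cube(sales, ["store", "item"], "units")

print(cube[("north", ALL)])   # units sold at the north store, any item
print(cube[(ALL, "milk")])    # milk sold at any store
print(cube[(ALL, ALL)])       # grand total
```

The design question mentioned above shows up even here: with n dimensions there are 2**n groupings, so deciding which ones to store and which to compute on demand matters quickly.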

Also in this area, there is increasing excitement over data mining. There have been recent successes in ``market basket'' mining, where the queries ask for sets of items that are bought together surprisingly frequently. These techniques need to be extended to allow searching for other interesting patterns in all sorts of data.
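A minimal sketch of market-basket counting follows. It finds pairs of items that occur together in at least min_support baskets, pruning with the observation that a frequent pair can only consist of individually frequent items. The function name, the threshold, and the basket data are assumptions for the example. The sketch measures raw frequency only; it does not test whether a pair is surprisingly frequent relative to how common its items are.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {pair: count} for item pairs appearing in >= min_support baskets."""
    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only pairs made of frequent items.
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)   # sorted gives canonical pairs
        pair_counts.update(combinations(kept, 2))

    return {pair: c for pair, c in pair_counts.items() if c >= min_support}


# Hypothetical purchase data, one list of items per basket.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
]
pairs = frequent_pairs(baskets, min_support=3)
print(pairs)
```

Extending this to larger itemsets, or to other kinds of patterns, is where the cost of candidate generation becomes the main concern.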


Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.

ullman@db.stanford.edu