Tech Report CS-93-16

Extensible Query Processing in an Object-Oreinted Database

Gail Anne Mitchell

May 1993


In this thesis we address the problem of providing efficient processing of queries in the extensible environment induced by object-oriented databases. We define a framework for query processing in an object-oriented database and develop designs for major components of this framework. The framework encompasses an object-oriented data model, an algebra to query over that model, transformation rules for the algebra, an internal representation for queries expressed in the algebra, a cost model for analyzing query expressions, and an architecture for an extensible query optimizer. The major contributions of this thesis are an algebra and transformation rules, a representation, and an architecture for extensible query optimization. We show how these components fit into the framework and interact with each other.

The EQUAL query algebra presented in this thesis is the first query algebra for object-oriented database systems to be completely consistent with data abstraction, and one of the few to propose operations for the creation and manipulation of objects with identity. We present transformation rules for this algebra, and a theory of equivalence for query expressions involving object-building operations. We also define an internal representation for query expressions that is annotated with information that can be used during query optimization, and can be extended to represent new operations and annotations.

We present Epoq, a novel approach to extensible query optimization. An Epoq optimizer integrates diverse strategies for controlling the optimization of a query and adapts to the addition of new strategies for control. We give a formal basis for this approach and specify an architecture that embodies the approach. We define the control problem for extensible query optimization and present a solution to this problem that is based on planning the optimization process. The planning-based control can combine the optimization strategies in different ways depending on the characteristics of the query being processed. The architecture and control are illustrated with an extensive example optimizer that integates optimization strategies described in the literature with strategies developed specifically for EQUAL.

(complete text in pdf or gzipped postscript)