CAREER: Generating Provably Correct Query Optimizers

Mitch Cherniack
Department of Computer Science
Brandeis University

Contact Information

415 South St., Mailstop 018

Waltham, MA, 02454

Phone: (781) 736-2738
Fax : (781) 736-2741
Email:

URL:

WWW PAGE

List of Supported Students and Staff

  1. Xiaoyu Wang, 2nd year Ph.D. student,
  2. Antonella Di Lillo, 1st year Masters student (summer support)

Project Award Information

  • Award Number: IIS-9984960
  • Duration: 10/01/2000 -- 09/01/2004 (presently in year 1)
  • Title: CAREER: Generating Provably Correct Query Optimizers

Keywords

Query Optimization, Formal Methods, Rule-Based Optimizers

Project Summary

The goal of this project is to assist researchers and developers in building query optimizers that are "provably correct". Specifically, this research group is building a framework which accepts specifications of optimizer components and their interactions, and generates optimizers that can be shown to satisfy the property that the plans they construct always return the data specified in a user queries. The group's approach separates the components of the optimizer into those that require correctness proofs (the safety critical components) from those that do not. Languages are under design for formally specifying those components, and tools are under construction that both generate these components according to the specifications, and generate proof obligations enabling their verification with an automated theorem prover. The experimental research is linked to the educational goal of training students in the application of formal methods in building large software systems. The results of this project will provide a sandbox for database researchers in both academia and industry, to introduce new optimizer techniques and products while providing tangible guarantees that they are free of errors.

Publications and Products

  • COKO-KOLA Query Rewriter Generator. Available for Linux and Solaris from
  • Project web site:

Several papers based on the work done under this grant are in preparation. The project web site also includes links to papers describing earlier work.

Project Impact

This project is now 7 months old and is therefore in its initial stages. One 2nd year Ph.D. student (Xiaoyu Wang) is being supported by the grant, and one 1st Masters student (Antonella Di Lillo) will be funded by the grant over the summer. A second Ph.D. student who would be paid off of this grant is being recruited for Fall, 2001. The PI is presently teaching a new graduate seminar course in query optimization that has 11 students, and is preparing a new course in formal verification of software systems that will be offered in Spring, 2002.

Goals, Objectives, and Targeted Activities

Our first year goals are as follows:

  1. Complete a port of the query rewriter generator code base (built while the PI was a graduate student at Brown University) to Linux.
  2. Replace the semantic component of the query rewriter code base (presently requiring external calls to a Prolog interpreter) with a self-contained semantic query rewriter. This work is presently being undertaken by an undergraduate student with support from an REU supplement to this grant.
  3. Build a graphical debugger for COKO. This work is presently being undertaken by an undergraduate student with support from an REU supplement to this grant.
  4. Design, formally specify and formally verify theorems about a combinatory-based plan algebra. This work is presently being undertaken by a 2nd year Ph.D. student supported by this grant.
  5. Design and give a new graduate seminar course in query optimization.

Project References

Mitch Cherniack, The COKO-KOLA Approach to Correct Query Rewriting, in preparation, 2001.

Mitch Cherniack and Stan Zdonik, Inferring Function Semantics to Optimize Queries, Proceedings of the 24th International Conference on Very Large Data Bases (VLDB), New York, NY, August, 1998.

Mitch Cherniack and Stan Zdonik, Changing the Rules: Transformations for Rule-Based Optimizers, Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, WA, June, 1998.

Mitch Cherniack and Stanley B. Zdonik, Rule Languages and Internal Algebras for Rule-Based Optimizers, Proceedings of the ACM SIGMOD International Conference on Management of Data, Montreal, Quebec, June, 1996.

Project software:

Area Background

Database systems are amongst the most complex software systems in existence, and query optimizers (optimizers) are amongst their most complex components. Optimizers map declarative descriptions of data (queries) to algorithmic plans that retrieve the data that the queries describe. An optimizer is correct if the plan it returns is guaranteed to produce exactly the data specified in the original query.

Formal methods are mathematically-based languages, techniques and tools for specifying and verifying complex software systems. These tools include: a formal specification language in which the behavior of a system can be formally expressed, and an automated theorem prover that assists developers in proving certain properties about their formally specified system (e.g., that the implementation is faithful to the specification). Formal methods have been successfully applied to the development of such systems as transaction processing systems, compilers, real-time systems and network management systems. Query optimizers represent another natural application of this technology.

Area References

Query Optimization:

Selinger, P.G., Astrahan, M. M., Chamberlin, D.D, Lorie, R.A., Price, T.G.,. Access path selection in a relational database management system. In Proceedings of the SIGMOD International Conference on Management of Data, pages 23-34, 1979.

Ganski, Richard A., and Wong, Harry K. T. Optimization of nested SQL queries revisited. In Uneshwar Dayal and Irv Traiger, editors, Proceedings of the SIGMOD International Conference on Management of Data, pages 173-180, San Francisco, California, May, 1987.

Pirahesh, Hamid, Hellerstein, Joseph M., and Hasan, Waqar. Extensible/rule-based query rewrite optimization in Starburst. In Proceedings of the SIGMOD International Conference on Management of Data, pages 39-48, San Diego, California, June, 1992.

Formal Methods:

Guttag, J.V., Horning, J.J., Garland, S.G., Jones, K.D., Modet, A., and Wing, J.M., Larch: Languages and Tools for Formal Specifications. Springer-Verlag, 1992.

Potential Related Projects

Projects that specifically address the formal verification of query optimizers include Leonidas Fegaras’ Lambda-DB work (also supported by a CAREER award), Grant Weddell’s Universal Data Representation work, and the Andreas Heuer’s work on the CROQUE query optimizer project.

*All award information can be found on the on the NSF

on-line Awards

Abstracts system,

Back to the IDM '01 homepage