New England Database Symposium

neds.gif (1190 bytes)

New England Database Society

Friday, May 3, 2013

sponsored by HP Vertica

NEDS

What compilers can do for databases:
Abstraction without regret in data management systems revisited

Christoph Koch
EPFL, École Polytechnique Fédérale de Lausanne

Friday, May 3, 2013, 4PM
HP/Vertica Computer Science Lounge (Volen 104), Brandeis University

(preceded by a wine and cheese reception at 3:30 pm)

Abstract:

In this talk, I show how ideas from compilers and program synthesis can be used to build better (more efficient and more maintainable) data management systems more quickly.

In systems building, abstraction and avoiding the performance penalty of indirection seem to be two desiderata that are at odds with each other, and cannot be achieved both. One reason for the famed code complexity of relational database systems is that we are traditionally unwilling to sacrifice performance for programmability. Optimizing compilers should enable us to achieve both desiderata. The powerful abstraction and composition features of modern high-level programming languages can improve productivity beyond current standards in data management systems development. Ideally, a compiler would eliminate all inefficiencies and indirections. We do not trust compilers to do this well enough yet, though. Currently we are right in this assessment, but the situation is changing. I describe a number of developments in the compilers space, including extensible compilers as libraries, staging compilers, and frameworks for easily building optimizing compilers for domain-specific languages (DSLs). I describe our own experiences with using and extending these latest-generation compilers for data management systems. I look at this both from the angle of analytics and OLTP systems; both from the viewpoint of secondary-storage and main-memory databases. I give experimental results, some of them preliminary, some mature.

Another benefit of leveraging abstraction, creating clean building blocks, and assembling those into systems is that assembly can frequently be performed automatically. In a CIDR talk earlier this year I proposed to do this for out-of-core algorithms. In that domain, fundamental algorithmic knowledge is orthogonal to data locality principles and assumptions about the system, most importantly the structure of memory hierarchy and storage devices. We have studied this problem, and did not just identify and formalize these building blocks but also built a system, Legorithmics, that automatically synthesizes out-of-core algorithms from these blocks. I will describe our recent work in a number of further contexts where program synthesis can help in systems construction, such as synthesis of concurrency control algorithms and even the synthesis of entire systems in-the-large. This work differs from most traditional research in program synthesis. We are helped by the fact that we work in a limited domain (such as out-of-core algorithms, rather than general programs), by our freedom to start from specifications in DSLs of a level of abstraction and granularity of our choosing to serve our needs and enable a synthesizer to be successful. There seems to be much potential in domain-specific program synthesis. Because of our longstanding experience with DSL design, our research community can make an important contribution in this space.

Speaker's Bio:

Christoph Koch is a professor of Computer Science at EPFL, specializing in data management. Until 2010, he was an Associate Professor in the Department of Computer Science at Cornell University. Previously to this, from 2005 to 2007, he was an Associate Professor of Computer Science at Saarland University. Earlier, he obtained his PhD in Artificial Intelligence from TU Vienna and CERN (2001), was a postdoctoral researcher at TU Vienna and the University of Edinburgh (2001-2003), and an assistant professor at TU Vienna (2003-2005). He obtained his Habilitation degree in 2004. He has won Best Paper Awards at PODS 2002, ICALP 2005, and SIGMOD 2011, an Outrageous Ideas and Vision Paper Award at CIDR 2013, a Google Research Award (in 2009), and an ERC Grant (in 2011). He is a PI of the Billion-Euro EU FET Flagship Human Brain Project. He (co-)chaired the program committees of DBPL 2005, WebDB 2008, and ICDE 2011, and was PC vice-chair of ICDE 2008 and ICDE 2009. He has served on the editorial board of ACM Transactions on Internet Technology as well as in numerous program committees. He currently serves as PC co-chair of VLDB 2013 and Editor-in-Chief of PVLDB.