|
New England Database
Society sponsored by HP Vertica |
|
|
NEDS |
What compilers can do for databases:
Abstraction without regret in data management systems
revisited
Christoph
Koch
EPFL, École Polytechnique Fédérale de Lausanne
Friday, May 3, 2013, 4PM
HP/Vertica Computer Science Lounge (Volen
104), Brandeis University
(preceded by a wine and cheese reception at 3:30 pm)
Abstract:
In this talk, I show
how ideas from compilers and program synthesis can be used to
build better (more efficient and more maintainable) data
management systems more quickly.
In systems building, abstraction and avoiding the performance
penalty of indirection seem to be two desiderata that are at
odds with each other, and cannot be achieved both. One reason
for the famed code complexity of relational database systems
is that we are traditionally unwilling to sacrifice
performance for programmability. Optimizing compilers should
enable us to achieve both desiderata. The powerful abstraction
and composition features of modern high-level programming
languages can improve productivity beyond current standards in
data management systems development. Ideally, a compiler would
eliminate all inefficiencies and indirections. We do not trust
compilers to do this well enough yet, though. Currently we are
right in this assessment, but the situation is changing. I
describe a number of developments in the compilers space,
including extensible compilers as libraries, staging
compilers, and frameworks for easily building optimizing
compilers for domain-specific languages (DSLs). I describe our
own experiences with using and extending these
latest-generation compilers for data management systems. I
look at this both from the angle of analytics and OLTP
systems; both from the viewpoint of secondary-storage and
main-memory databases. I give experimental results, some of
them preliminary, some mature.
Another benefit of leveraging abstraction, creating clean
building blocks, and assembling those into systems is that
assembly can frequently be performed automatically. In a CIDR
talk earlier this year I proposed to do this for out-of-core
algorithms. In that domain, fundamental algorithmic knowledge
is orthogonal to data locality principles and assumptions
about the system, most importantly the structure of memory
hierarchy and storage devices. We have studied this problem,
and did not just identify and formalize these building blocks
but also built a system, Legorithmics, that automatically
synthesizes out-of-core algorithms from these blocks. I will
describe our recent work in a number of further contexts where
program synthesis can help in systems construction, such as
synthesis of concurrency control algorithms and even the
synthesis of entire systems in-the-large. This work differs
from most traditional research in program synthesis. We are
helped by the fact that we work in a limited domain (such as
out-of-core algorithms, rather than general programs), by our
freedom to start from specifications in DSLs of a level of
abstraction and granularity of our choosing to serve our needs
and enable a synthesizer to be successful. There seems to be
much potential in domain-specific program synthesis. Because
of our longstanding experience with DSL design, our research
community can make an important contribution in this space.
Christoph Koch is a professor of Computer Science at EPFL, specializing in data management. Until 2010, he was an Associate Professor in the Department of Computer Science at Cornell University. Previously to this, from 2005 to 2007, he was an Associate Professor of Computer Science at Saarland University. Earlier, he obtained his PhD in Artificial Intelligence from TU Vienna and CERN (2001), was a postdoctoral researcher at TU Vienna and the University of Edinburgh (2001-2003), and an assistant professor at TU Vienna (2003-2005). He obtained his Habilitation degree in 2004. He has won Best Paper Awards at PODS 2002, ICALP 2005, and SIGMOD 2011, an Outrageous Ideas and Vision Paper Award at CIDR 2013, a Google Research Award (in 2009), and an ERC Grant (in 2011). He is a PI of the Billion-Euro EU FET Flagship Human Brain Project. He (co-)chaired the program committees of DBPL 2005, WebDB 2008, and ICDE 2011, and was PC vice-chair of ICDE 2008 and ICDE 2009. He has served on the editorial board of ACM Transactions on Internet Technology as well as in numerous program committees. He currently serves as PC co-chair of VLDB 2013 and Editor-in-Chief of PVLDB.
Maintained by Olga Papaemmanouil olga AT cs.brandeis.edu