New England Database Society
Friday, May 8, 2009
Sponsored by Sun Microsystems
NEDS
Mixed Workload Management for Enterprise-Scale Business Intelligence Systems
Umeshwar Dayal
HP Labs
Friday, May 8, 2009, 4:00 pm
Volen 101, Brandeis University
(preceded by a wine and
cheese reception at 3:00 pm, and followed by dinner at 6:00 pm)
Abstract:
Enterprises rely on
business intelligence technologies (data integration, data warehousing,
data mining, and analytics) to gain an understanding of how their
business is performing. The enterprise BI architecture typically
consists of a data warehouse that consolidates data from several
operational databases, and serves a variety of front-end querying,
reporting, and analytic tools. Increasingly, as enterprises become more
automated, data-driven, and real-time, the BI architecture is evolving
to support operational decision making. Operational BI workloads impose
stringent performance requirements on enterprise data warehouses,
and are notoriously difficult to manage, particularly since queries
exhibit a huge variance in response times, ranging from fractions of a
second to several hours. It is not well understood how effective
existing database workload management policies are in the face of such
complex, mixed workloads. Factors such as inaccurate cardinality
estimates, data skew, and resource contention all make it difficult to
predict how queries will behave. Experience has shown that a few
"problem" queries can have drastic effects on system performance. Our
goal is to automate the adaptive tuning and management of complex,
mixed workloads on enterprise-scale data warehouses. There are many
challenges in doing this. The first challenge is to estimate accurately
how long a query will take, and what resources it will consume. Second,
we must have effective strategies for admission control (which queries
should be allowed to run), scheduling (which queries to run and when),
and execution control (what to do when a problem query is detected).
Third, we must be able to evaluate the performance of these strategies
under different conditions, so we can design a system that
automatically and dynamically selects the best policies to use. This
talk will describe the approaches we are taking at HP Labs to address
these challenges, and the promising results we have obtained. We use
machine learning techniques to predict query execution times and
resource usage, and we are developing an experimental framework to
understand the impact of existing and emerging workload management
policies under different conditions.
Speaker Bio:
Umeshwar Dayal is an HP Fellow in
Intelligent Information Management at Hewlett-Packard Laboratories,
Palo Alto, California. In this role, he leads research programs in
enterprise-scale data warehousing, business intelligence, scalable
analytics, and information visualization. Umesh has 30 years of research
experience in data management, and has made significant contributions
to the field, especially in data integration, federated databases,
active rule-based systems, query optimization, long-running
transactions, business process management, and database design. He has
published over 160 research papers and holds over 25 patents. In 2001,
he received (with two co-authors) the 10-year best paper award from the
International Conference on Very Large Data Bases for his paper on a
transactional model for long-running activities.
Prior to joining HP Labs, Umesh was a senior researcher at DEC's
Cambridge Research Lab, Chief Scientist at Xerox Advanced Information
Technology and Computer Corporation of America, and on the faculty at
the University of Texas-Austin. He received his PhD from Harvard
University.
Umesh has served on the Editorial Boards of several international
journals, and has chaired and served on the Program Committees of
numerous conferences. Currently, he serves as a General Co-Chair of
ICDE 2010 in Long Beach, CA. He has been a member of the Board of the
VLDB Endowment, the Board of the International Foundation for
Cooperative Information Systems, the Executive Committee of the IEEE
Technical Committee on Electronic Commerce, and the Steering Committee
of the SIAM International Conference on Data Mining. He is an ACM
Fellow.
Maintained by Olga Papaemmanouil
olga AT cs.brandeis.edu