Thesis Defense

"Accelerating Interactive Data Exploration"

Alexander Galakatos

Friday, November 2, 2018 at 3:00 P.M.

Room 368 (CIT 3rd Floor)

The widespread popularity of visual data exploration tools has empowered domain experts in a broad range of fields to make data-driven decisions. A key requirement of these tools, however, is the ability to provide query results at "human speed," even over large datasets. To meet this requirement, we propose a complete redesign of the interactive data exploration stack. Rather than replacing or extending existing systems, we introduce an Interactive Data Exploration Accelerator (IDEA) that connects to existing data management infrastructures to speed up query processing for visual data exploration tools.

In this redesigned stack, where an IDEA sits between a visual data exploration tool and a backend data source, several opportunities emerge to improve interactivity for end users. We first propose a novel approximate query processing formulation that better models the conversational interaction paradigm promoted by visual data exploration tools. Since time is a critical dimension of many datasets, we then present a detailed survey of existing backend data sources specifically designed for time-dependent data in the context of interactive data exploration. Finally, based on the results of this study, we propose a new approximate index structure for an interactive data exploration stack that leverages trends in the underlying data to improve interactivity while significantly reducing the storage footprint.

Host: Professor Tim Kraska