Broadly, I am interested in Database Management Systems (DBMSs) and Data Science. Specifically, my interests lie in Big Data Visualization, Query Optimization, Transaction Processing, Stream Processing,
and how the advent of new computer hardware shifts computer system architecture and affects the performance of DBMSs.
S-Store: Real-Time Analytics Meets Transaction Processing
The goal is to build a stream processing system that can simultaneously accommodate OLTP and streaming applications.
- Investigated how nested transactions can help preserve data integrity in a streaming context.
- Developed an efficient nested transaction facility in S-Store to guarantee the consistency of the stored state.
- Collaborated with Carnegie Mellon University, Intel Labs (ISTC Big Data), and Massachusetts Institute of Technology.
DBNav: Query Optimization on Analytical Visualization Systems
The goal is to develop novel query optimization techniques to improve the scalability of visualization systems.
- Developed predictive checkpointing techniques that perform multi-query optimization via automatic view materialization and pre-aggregation for bar chart, pie chart, histogram, and interactive k-means clustering visualizations.
- Developed a visual approximate sampling algorithm for optimizing the scatter plot visualization. The algorithm significantly reduced the back-end data-fetching latency while preserving the visual correctness of the original dataset.
- Conducted a user study with over 200 participants on Amazon Mechanical Turk to compare my algorithm against the stratified sampling algorithm. At the same 10% sampling rate, over 99% of the participants reported that my algorithm offered a much more accurate visual representation of the original dataset.
Seer: Predictive Middleware for Big Data Visualization
The goal is to build a predictive prefetching and caching middleware to aid the exploratory visualization of big data.
- Co-developed a profile-driven hierarchical prediction algorithm using a Markov model and sequential rule mining.
- Co-developed a multidimensional predictive cache that employed a predictive LRU eviction policy.
- Conducted a user study to evaluate the effectiveness of the prediction algorithm on a real-world dataset. The results suggested that our learning-based algorithm significantly outperformed locality-based prefetching techniques.
Publications & Presentations
- Justin DeBrabant, Chenggang Wu, Ugur Cetintemel, and Stan Zdonik. Seer: Profile-Driven Prefetching and Caching for Interactive Visualization of Large Multidimensional Data. Submitted to SIGMOD 2015.
- Justin DeBrabant, Chenggang Wu, Ugur Cetintemel, and Stan Zdonik. Seer: Profile-Driven Prefetching and Caching for Interactive Visualization of Large Multidimensional Data. Poster Presentation at Brown University Industrial Partners Program Networking Reception & Student Research Showcase.
I love teaching. I believe that a successful researcher not only knows how to produce new knowledge, but also knows how to share and propagate the discovery to others.
Below are the courses in which I have served as a teaching assistant:
- Head Teaching Assistant, Introduction to Data Science, Brown University, Spring 2015
- Head Teaching Assistant, Database Management Systems, Brown University, Fall 2014
- Teaching Assistant, Object-Oriented Programming, Illinois Institute of Technology, Fall 2011 and Spring 2012