"Submodular Maximization over Sliding Windows"

Alessandro Epasto, Google NYC

Friday, April 28, 2017 at 11:00 A.M.

Room 316 (CIT 3rd floor)

Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification and summarization. We study this question in the sliding window model of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a good solution considering only the last $W$ items arrived.

In this context, we provide the first non-trivial algorithm that maintains a provable approximation of the optimum using space sublinear in the size of the window. In particular we give a $1/3 - \epsilon$ approximation algorithm that uses space polylogarithmic in the spread of the values of the elements, and linear in the solution size $k$ for any constant $\epsilon> 0$ while requiring polylogarithmic update time. We also show a different algorithm that, at the cost of using more memory, provides a $1/2 - \epsilon$ approximation thus matching the best known approximation guarantees for submodular optimization in insertion-only streams, a less general formulation of the problem.

We demonstrate the efficacy of the algorithms on a number of real world datasets, showing that we can preserve high quality solutions in streams with millions of items, while storing a negligible fraction of them.

Joint work with Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam. Link: In proceeding of the International World Wide Web conference, Perth, Australia, WWW, 2017.

Alessandro Epasto is a research scientist at Google NYC in the graph mining team lead by Vahab Mirrokni. Before joining Google in 2016 he was a postdoc at Brown University advised by prof. Eli Upfal. He received a Ph.D. in computer science from Sapienza University of Rome in 2015 where he was advised by prof. Alessandro Panconesi and supported by the Google European Doctoral Fellowship. His research interests include problems in large-scale data mining and graph analysis.

Host: Professor Eli Upfal