Topics in Data Science

Data is the new soil of business and (soon) at the core of essentially all domains, from materials science to healthcare. Mastering big data requires not only skills in a variety of disciplines, from distributed systems and statistics to machine learning, but also an understanding of a complex ecosystem of tools and platforms. This seminar will try to shed some light on the complex space of data science, covering aspects of data management, distributed algorithms, virtualization, data mining, machine learning, and statistics. We will discuss how these techniques complement each other to make sense of data at massive scale. Prerequisites: CSCI 0320 and 1270, or equivalents, or instructor permission.

