Hands-on Data Science

Not offered this year
Offered every year, last taught: Fall 2022

Develops all aspects of the machine learning pipeline: data acquisition and cleaning, handling missing data, exploratory data analysis, visualization, feature engineering, modeling, interpretation, and presentation, all in the context of real-world datasets. Fundamental considerations for data analysis are emphasized (the bias-variance tradeoff; training, validation, and testing). Classical models and techniques for classification and regression are included (linear and logistic regression with regularization, support vector machines, decision trees, random forests, XGBoost). Uses the Python data science ecosystem (e.g., sklearn, pandas, numpy, matplotlib).

Prerequisites: A course equivalent to CSCI 0050, CSCI 0150, or CSCI 0170 is strongly recommended.
Enrollment is limited to Data Science Master's program students.

  • Andras Zsom