CSCI1390

Systems for Machine Learning

Fall 2026

Many applications, across industries ranging from e-commerce to education, rely on data processing and machine learning systems for data analytics. Because these applications are so widely used, their performance (specifically latency, throughput, and hardware efficiency) is very important. Achieving high performance in these systems is challenging, however. ML systems run on different types of hardware accelerators (GPUs, TPUs), each with its own performance characteristics. Models keep growing, and even with access to the most powerful hardware, systems must carefully manage memory, memory bandwidth, and network resources during training and inference. This class explores the systems challenges of building, training, deploying, and managing large-scale data processing and machine learning systems. Hands-on projects will include implementing efficient training and inference techniques, CUDA programming, and building compound AI systems.

Instructor(s):
Meets: TTh 10:30am-11:50am
Exam Group: TBA
Max Seats: 70 (Full)
CRN: 15945