CSCI 1520: Algorithmic Aspects of Machine Learning (Spring 2025)
Basic Information
- Lectures: Tue/Thu 1:00-2:20 pm in CIT 241.
- Instructor: Yu Cheng.
- Email: yu_cheng@brown.edu
- Office Hours: Tuesday 2:30-3:30 pm in CIT 413, or by appointment.
- TAs:
- Harys Dalvi (Head TA). Office Hours: Thursday 3:00-5:00 pm in CIT 203.
- Chia-Hong Hsu. Office Hours: Wednesday 11:00 am-1:00 pm in CIT 203.
- Linda Xu. Office Hours: Wednesday 1:00-3:00 pm in CIT 203.
- Michael Youssef. Office Hours: Friday 11:00 am-1:00 pm in CIT 205.
- Textbooks:
- [LRU] Mining of Massive Datasets (Third Edition). Jure Leskovec, Anand Rajaraman, Jeff Ullman.
- [M] Algorithmic Aspects of Machine Learning. Ankur Moitra.
- [A] Theory of Deep Learning (book draft). Sanjeev Arora et al.
Course Description
In this course, we will explore the theoretical foundations of machine learning and deep learning, with a focus on the design and analysis of learning algorithms with provable guarantees. Throughout the course, we will (1) discuss data mining and machine learning algorithms for analyzing very large amounts of data, (2) investigate why simple algorithms can solve machine learning problems that are computationally hard in the worst case, and (3) understand the success of deep learning by studying the emerging theory of deep learning. Example topics include locality-sensitive hashing, streaming algorithms, local graph algorithms, non-negative matrix factorization, non-convex optimization, and over-parameterization and implicit bias in deep learning. Prior knowledge of linear algebra, algorithms and data structures, probability, and statistics is recommended.
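As a taste of the first topic above (locality-sensitive hashing, Weeks 2-3), the sketch below estimates Jaccard similarity between two sets with MinHash signatures. This is an illustrative toy, not course code; the hash-family construction and all parameters are assumptions for the example.

```python
import random
import zlib

def minhash_signature(items, hash_funcs):
    # Signature coordinate i = minimum of hash function i over the set.
    return [min(h(x) for x in items) for h in hash_funcs]

def estimate_jaccard(sig_a, sig_b):
    # Fraction of coordinates where the two signatures agree;
    # each coordinate matches with probability equal to the Jaccard similarity.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

# Toy family of hash functions (parameters are illustrative, not from the course).
P = 2**31 - 1
rng = random.Random(0)
hash_funcs = [
    (lambda x, a=rng.randrange(1, P), b=rng.randrange(P):
        (a * zlib.crc32(x.encode()) + b) % P)
    for _ in range(200)
]

A = set("the quick brown fox".split())
B = set("the quick brown dog".split())
true_j = len(A & B) / len(A | B)  # 3 shared words out of 5 total = 0.6
est = estimate_jaccard(minhash_signature(A, hash_funcs),
                       minhash_signature(B, hash_funcs))
```

With 200 hash functions, the estimate concentrates around the true similarity of 0.6; the course develops why, and how LSH banding turns such signatures into a nearest-neighbor index.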
Assignments
Schedule
- Week 1 (Jan 20): Introduction.
- Week 2 (Jan 27): Finding Similar Items I (Chapter 3 of [LRU]).
- Week 3 (Feb 3): Finding Similar Items II (Chapter 3 of [LRU]).
- Week 4 (Feb 10): Algorithms for Data Streams (Chapter 4 of [LRU]).
- Week 5 (Feb 17): Network Analysis I (Chapters 5 and 10 of [LRU]).
- Week 6 (Feb 24): Network Analysis II (Chapters 5 and 10 of [LRU]).
- Week 7 (Mar 3): Nonnegative Matrix Factorization I (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 8 (Mar 10): Nonnegative Matrix Factorization II (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 9 (Mar 17): Non-Convex Optimization I (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Matrix Completion (Lecture 15).
- Ask Me Anything (Lecture 16).
- Week 10 (Mar 24): No classes. Spring Recess.
- Week 11 (Mar 31): Non-Convex Optimization II (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Week 12 (Apr 7): Theory of Deep Learning I (Chapters 9 and 10 of [A]).
- Week 13 (Apr 14): Theory of Deep Learning II (Chapters 9 and 10 of [A]).
- Self-Supervised Learning (Lecture 21).
- Transformers Learn In-Context (Lecture 22).
- Week 14 (Apr 21): Theory of Deep Learning III (Chapters 9 and 10 of [A]).
- TBD (Lecture 23).
- TBD (Lecture 24).
- Additional Readings (optional):
- Locality-Sensitive Hashing:
- A. Gionis, P. Indyk, R. Motwani. Similarity Search in High Dimensions via Hashing. VLDB 1999.
- A. Z. Broder, M. Charikar, A. M. Frieze, M. Mitzenmacher. Min-Wise Independent Permutations. STOC 1998.
- M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. SCG 2004.
- R. O'Donnell, Y. Wu, Y. Zhou. Optimal Lower Bounds for Locality Sensitive Hashing (Except When q Is Tiny). ICS 2011.
- A. Andoni, I. Razenshteyn. Optimal Data-Dependent Hashing for Approximate Near Neighbors. STOC 2015.
- Streaming Algorithms:
- Network Analysis:
- Nonnegative Matrix Factorization:
- Non-Convex Optimization:
- Theoretical Deep Learning:
- S. S. Du, X. Zhai, B. Póczos, A. Singh. Gradient Descent Provably Optimizes Over-parameterized Neural Networks. ICLR 2019.
- Z. Allen-Zhu, Y. Li, Z. Song. A Convergence Theory for Deep Learning via Over-Parameterization. ICML 2019.
- C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals. Understanding Deep Learning (Still) Requires Rethinking Generalization. Commun. ACM 2021.
- R. Balestriero et al. A Cookbook of Self-Supervised Learning. 2023.
- J. D. Lee, Q. Lei, N. Saunshi, J. Zhuo. Predicting What You Already Know Helps: Provable Self-Supervised Learning. NeurIPS 2021.
- M. Chidambaram, C. Wu, Y. Cheng, R. Ge. Hiding Data Helps: On the Benefits of Masking for Sparse Coding. ICML 2023.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin. Attention Is All You Need. NIPS 2017.
- M. Belkin, D. Hsu, S. Ma, S. Mandal. Reconciling Modern Machine Learning and the Bias-Variance Trade-off. PNAS 2019.
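The network-analysis unit (Chapter 5 of [LRU]) centers on PageRank, computed by power iteration. A minimal sketch on a three-node graph, for illustration only (function and variable names are mine, not the course's):

```python
def pagerank(adj, beta=0.85, iters=50):
    """Power iteration for PageRank on a small directed graph.
    adj maps each node to its list of out-neighbors."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Each node keeps a (1 - beta)/n "teleport" share.
        new = {v: (1 - beta) / n for v in nodes}
        for u in nodes:
            out = adj[u]
            if not out:
                # Dead end: spread this node's rank uniformly.
                for v in nodes:
                    new[v] += beta * rank[u] / n
            else:
                for v in out:
                    new[v] += beta * rank[u] / len(out)
        rank = new
    return rank

g = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
r = pagerank(g)
# Ranks form a probability distribution; "c" receives the most mass here
# because both "a" and "b" link to it.
```

Fifty iterations is plenty at beta = 0.85, since the error shrinks geometrically in beta per step.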
Grading
- Homework assignments (70%): The course includes 3 written assignments and 3 coding assignments. Students are encouraged to discuss the problems with others, but each student must write up their solutions independently and acknowledge their collaborators.
- Final exam (30%): May 7th (Wed), 9:00-11:00am.
- Bonus points (up to 18%): Extra-credit questions appear on all assignments and on the final exam.
Academic Integrity
Academic achievement is evaluated on the basis of work that a student produces independently. A student who obtains credit for work, words, or ideas that are not the products of his or her own effort is dishonest. Such dishonesty undermines the integrity of the academic standards of the University. Infringement of the Academic Code entails penalties ranging from reprimand to suspension, dismissal, or expulsion from the University. Students who have questions on any aspect of the Academic Code should consult the instructor or one of the deans of the Graduate School to avoid the serious charge of academic dishonesty.
Disability Policies
Brown University is committed to full inclusion of all students. Any student with a documented disability is welcome to contact the instructor as early in the semester as possible so that reasonable accommodations can be arranged. If you need accommodations for online learning or in the classroom, please reach out to Student Accessibility Services (SAS) (sas@brown.edu, 401-863-9588, Brown SAS Website) to discuss the process for requesting accommodations.
Religious Holidays
Students who wish to observe their religious holidays shall inform the instructor within the first four weeks of the semester, unless the religious holiday is observed in the first four weeks. In such cases, students shall notify the instructor at least five days in advance of the date when they will be absent. The instructor shall make every reasonable effort to honor the request.