CSCI 1952Q: Algorithmic Aspects of Machine Learning (Spring 2024)
Basic Information
- Lectures: Tue-Thu 1:00-2:20 pm in CIT 241.
- Instructor: Yu Cheng.
- Email: yu_cheng@brown.edu
- Office Hours: Tuesday 2:30-3:30 pm in CIT 413, or by appointment.
- TAs:
- Jay Sarva (Head TA). Office Hours: Thursday 6:00-8:00 pm in CIT 102.
- Suraj Anand. Office Hours: Wednesday 8:00-10:00 am in CIT 102.
- Nishka Pant. Office Hours: Friday 10:00 am-12:00 pm in CIT 102.
- Sagar Raichandani. Office Hours: Wednesday 3:00-5:00 pm in CIT 203.
- Zhengyu Zou. Office Hours: Monday 3:00-5:00 pm in CIT 102.
TA hours also support remote attendance over Zoom via hours.cs.brown.edu.
- Textbooks:
- [LRU] Mining of Massive Datasets (Third Edition). Jure Leskovec, Anand Rajaraman, Jeff Ullman.
- [M] Algorithmic Aspects of Machine Learning. Ankur Moitra.
- [A] Theory of Deep Learning (book draft). Sanjeev Arora et al.
Course Description
In this course, we will explore the theoretical foundations of machine learning and deep learning, with a focus on the design and analysis of learning algorithms with provable guarantees. Throughout the course, we will (1) investigate why simple algorithms can solve machine learning problems that are computationally hard in the worst case, (2) discuss data mining and machine learning algorithms for analyzing very large amounts of data, and (3) understand the success of deep learning by studying the emerging theory of deep learning. Example topics include non-convex optimization, non-negative matrix factorization, locality-sensitive hashing, streaming algorithms, local graph algorithms, and over-parameterization and implicit bias in deep learning. Prior knowledge of linear algebra, algorithms and data structures, probability, and statistics is recommended.
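To give a concrete feel for the style of algorithm studied in the course, here is a minimal, illustrative sketch of MinHash signatures for estimating Jaccard similarity, one simple instance of the locality-sensitive hashing ideas mentioned above (covered later under "Finding Similar Items", Chapter 3 of [LRU]). This is not course material; the function names and the simplified affine hash family are choices made for brevity.

```python
# Illustrative sketch only (not course material): MinHash signatures for
# estimating the Jaccard similarity of two sets. The affine hash family
# below is a simplification; the theory assumes (approximately) min-wise
# independent permutations.
import random

PRIME = 2_147_483_647  # large prime used as the hash modulus

def make_hashes(k, seed=0):
    """Return k pseudo-random affine hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]

def minhash_signature(items, hashes):
    """Signature coordinate i = minimum of hash function i over the set."""
    encoded = [hash(x) % PRIME for x in items]
    return [min((a * x + b) % PRIME for x in encoded) for (a, b) in hashes]

def estimated_jaccard(sig1, sig2):
    """The fraction of agreeing coordinates estimates the Jaccard similarity."""
    return sum(s1 == s2 for s1, s2 in zip(sig1, sig2)) / len(sig1)

if __name__ == "__main__":
    A = set(range(0, 80))
    B = set(range(20, 100))  # true Jaccard(A, B) = 60 / 100 = 0.6
    hashes = make_hashes(k=200)
    print(estimated_jaccard(minhash_signature(A, hashes),
                            minhash_signature(B, hashes)))
```

Longer signatures (larger k) reduce the variance of the estimate at the cost of more space per set; banding the signature coordinates then gives the locality-sensitive amplification discussed in Lecture 11.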
Top Projects and Reviewers
The course uses a peer-review process to evaluate all final project submissions. Based on this process, the following projects and students were selected as top projects and top reviewers:
- Top Projects:
- Top Reviewers (excluding instructor and TAs):
- List of all reviewers. The following students and TAs participated in the peer-review process: Michael Fu, Chia-Hong Hsu, Rohit Mohnani, Nishka Pant, Sagar Raichandani, Jay Sarva, Boran Wang, Peter Zhu, Zhengyu Zou, Kai Zuang. Thank you for your time and effort in reviewing the final projects of CSCI 1952Q!
Assignments
Warning: the following assignments and due dates are from the past. If you are taking CSCI 1952Q this semester, please refer to the current course webpage.
Course Project
- Project guidelines.
- Important dates:
- Mar 4: Decide on teams and topics.
- Apr 18 - Apr 25: Presentations.
- May 13: Project report due.
Schedule
- Week 1 (Jan 22): Introduction.
- Week 2 (Jan 29): Non-Convex Optimization I (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Week 3 (Feb 5): Non-Convex Optimization II (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Week 4 (Feb 12): Nonnegative Matrix Factorization I (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 5 (Feb 19): Nonnegative Matrix Factorization II (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 6 (Feb 26): Finding Similar Items I (Chapter 3 of [LRU]).
- Week 7 (Mar 4): Finding Similar Items II (Chapter 3 of [LRU]).
- Locality-Sensitive Functions and Amplification (Lecture 11).
- Sampling Data in a Stream, Bloom Filter (Lecture 12).
- Week 8 (Mar 11): Algorithms for Data Streams (Chapter 4 of [LRU]).
- Counting Distinct Elements in a Stream (Lecture 13).
- PageRank and Computation of PageRank (Lecture 14); see the illustrative power-iteration sketch after the reading list below.
- Week 9 (Mar 18): Network Analysis (Chapters 5 and 10 of [LRU]).
- Personalized PageRank, Graph Partitioning (Lecture 15).
- Ask Me Anything (Lecture 16).
- Week 10 (Mar 25): No classes. Spring Recess.
- Week 11 (Apr 1): Theory of Deep Learning I (Chapters 9 and 10 of [A]).
- Spectral Clustering, Local Graph Algorithms (Lecture 17).
- Implicit Bias of Gradient Descent (Lecture 18).
- Week 12 (Apr 8): Theory of Deep Learning II (Chapters 9 and 10 of [A]).
- Week 13-14 (Apr 15): Project Presentations.
- Transformers Learn In-Context (Lecture 21).
- Project Presentations.
- Additional Readings (optional):
- Non-Convex Optimization:
- Nonnegative Matrix Factorization:
- Locality-Sensitive Hashing:
- A. Gionis, P. Indyk, R. Motwani. Similarity Search in High Dimensions via Hashing. VLDB 1999.
- A. Z. Broder, M. Charikar, A. M. Frieze, M. Mitzenmacher. Min-Wise Independent Permutations. STOC 1998.
- M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. SCG 2004.
- R. O'Donnell, Y. Wu, Y. Zhou. Optimal Lower Bounds for Locality Sensitive Hashing (Except When q Is Tiny). ICS 2011.
- A. Andoni, I. Razenshteyn. Optimal Data-Dependent Hashing for Approximate Near Neighbors. STOC 2015.
- Streaming Algorithms:
- Network Analysis:
- Theoretical Deep Learning:
- S. S. Du, X. Zhai, B. Póczos, A. Singh. Gradient Descent Provably Optimizes Over-parameterized Neural Networks. ICLR 2019.
- Z. Allen-Zhu, Y. Li, Z. Song. A Convergence Theory for Deep Learning via Over-Parameterization. ICML 2019.
- C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals. Understanding Deep Learning (Still) Requires Rethinking Generalization. Commun. ACM. 2021.
- J. D. Lee, Q. Lei, N. Saunshi, J. Zhuo. Predicting What You Already Know Helps: Provable Self-Supervised Learning. NeurIPS 2021.
- M. Chidambaram, C. Wu, Y. Cheng, R. Ge. Hiding Data Helps: On the Benefits of Masking for Sparse Coding. ICML 2023.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin. Attention Is All You Need. NIPS 2017.
- M. Belkin, D. Hsu, S. Ma, S. Mandal. Reconciling Modern Machine-Learning Practice and the Classical Bias-Variance Trade-off. PNAS 2019.
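As a second illustrative sketch, tied to the PageRank computation topic in the schedule above (Lecture 14, Chapter 5 of [LRU]), here is PageRank computed by power iteration with a damping factor. This is not the lecture code; the example graph and the parameter choices are made up for the illustration.

```python
# Illustrative sketch only: PageRank via power iteration with damping.
import numpy as np

def pagerank(adj, beta=0.85, tol=1e-10, max_iter=1000):
    """adj[i][j] = 1 if there is a link from node i to node j (0 otherwise)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Build a column-stochastic transition matrix M. Rows with no outgoing
    # links (dead ends) are treated as linking to every node uniformly.
    row_stochastic = np.where(
        out_deg[:, None] > 0,
        adj / np.maximum(out_deg[:, None], 1.0),
        1.0 / n,
    )
    M = row_stochastic.T
    r = np.full(n, 1.0 / n)                # start from the uniform distribution
    for _ in range(max_iter):
        # With probability beta follow a link, otherwise teleport uniformly.
        r_new = beta * (M @ r) + (1.0 - beta) / n
        if np.abs(r_new - r).sum() < tol:  # L1 convergence check
            return r_new
        r = r_new
    return r

if __name__ == "__main__":
    # Tiny example graph: edges 0->1, 0->2, 1->2, 2->0.
    A = [[0, 1, 1],
         [0, 0, 1],
         [1, 0, 0]]
    print(pagerank(A))  # node 2 receives the largest PageRank
```

With probability beta the random surfer follows a random outgoing link, and with probability 1 - beta it teleports to a uniformly random node; the teleportation step guarantees convergence to a unique distribution even when the link graph has dead ends or spider traps.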
Grading
- Homework assignments (54%): The course includes 4 written assignments and 4 coding assignments. Students are encouraged to discuss with others, but each student must write up their solutions independently and acknowledge who they worked with.
- Course project (46%): Students will form groups to work on research projects or to explore in depth a topic that we did not cover in class. Each group will give a project presentation and write a project report.
- Bonus points (21%): Every assignment includes extra-credit questions, and additional bonus points are awarded for top final projects and top reviewers.
Disability Policies
Brown University is committed to the full inclusion of all students. Any student with a documented disability is welcome to contact the instructor as early in the semester as possible so that reasonable accommodations can be arranged. For accommodations around online learning or in the classroom, please reach out to Student Accessibility Services (SAS) at sas@brown.edu or 401-863-9588 (Brown SAS Website); SAS can also walk you through the process for requesting accommodations.
Religious Holidays
Students who wish to observe religious holidays shall inform the instructor within the first four weeks of the semester; if the holiday falls within those first four weeks, students shall notify the instructor at least five days before the date they will be absent. The instructor shall make every reasonable effort to honor the request.