CSCI 1952Q: Algorithmic Aspects of Machine Learning (Spring 2023)
Basic Information
- Lectures: Tue/Thu 2:30-3:50 pm in CIT 368.
- Instructor: Yu Cheng.
- Email: yu_cheng@brown.edu
- Office Hours: Tuesday 4-5 pm in CIT 413, or by appointment.
- TAs:
- Mattie Ji (Head TA). Office Hours: Friday 4-6 pm.
- Linghai Liu. Office Hours: Wednesday 7-9 pm.
- Rohit Mohanty. Office Hours: Tuesday 7-9 pm.
- Colin Savage. Office Hours: Thursday 8-10 pm.
- Nora Tang. Office Hours: Monday 8-10 pm.
TA office hours will be conducted remotely on https://hours.cs.brown.edu/.
- Textbook:
- [LRU] Mining of Massive Datasets (Third Edition). Jure Leskovec, Anand Rajaraman, Jeff Ullman.
- [M] Algorithmic Aspects of Machine Learning. Ankur Moitra.
- [A] Theory of Deep Learning (book draft). Sanjeev Arora et al.
Course Description
In this course, we will explore the theoretical foundations of machine learning and deep learning, focusing on the design and analysis of machine learning algorithms with provable guarantees. More specifically, we will (1) introduce basic tools in linear algebra and optimization, including the power method, singular value decomposition, matrix calculus, (matrix) concentration inequalities, and (stochastic) gradient descent; (2) cover many examples where one can design algorithms with provable guarantees for fundamental problems in machine learning (under certain assumptions), including topic modeling, tensor decomposition, sparse coding, and matrix completion; and (3) discuss the emerging theory of deep learning, including landscape analysis, generalization and over-parameterization, neural tangent kernels, generalization bounds, and implicit regularization.
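As a small taste of the first tool listed above, here is a minimal NumPy sketch of the power method for estimating the leading eigenvalue and eigenvector of a symmetric matrix. This is illustrative only, not course-provided code; the example matrix, iteration count, and tolerance are arbitrary choices.

```python
# Illustrative sketch of the power method (not course material).
import numpy as np

def power_method(A, num_iters=1000, tol=1e-10, seed=0):
    """Estimate the leading eigenvalue and eigenvector of a symmetric matrix A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v                          # one matrix-vector product per iteration
        norm_w = np.linalg.norm(w)
        if norm_w == 0.0:                  # the iterate landed in the null space of A
            break
        w /= norm_w
        # Converged up to sign (handles a negative leading eigenvalue).
        if min(np.linalg.norm(w - v), np.linalg.norm(w + v)) < tol:
            v = w
            break
        v = w
    return v @ A @ v, v                    # Rayleigh quotient and eigenvector estimate

# Example: the leading eigenvalue of [[2, 1], [1, 2]] is 3, eigenvector ~ [1, 1]/sqrt(2).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
eigval, eigvec = power_method(A)
print(eigval, eigvec)
```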
Top Projects and Reviewers
The course uses a voluntary peer-review process to evaluate all final project submissions. Based on this process, the following projects and students were chosen as top projects and top reviewers:
- Top Projects:
- Top Reviewers:
- Michael Freeman.
- Daniel Li.
- Jay Sarva.
- Yiyang Nan.
- List of all reviewers. The names of the Teaching Assistants (TAs) have been omitted from the list of (top) reviewers. The following students participated in the peer-review process: Jason Eveleth, Michael Freeman, Kalen Frieberg, Brayden Goldstein-Gelb, Klimentina Krstevska, Tanadol Tiger Lamlertprasertkul, Mason Lee, Daniel Li, Yiyang Nan, Mikhail Okunev, Nishka Pant, Jay Sarva, Benjamin Shih, Yixiang Sun, Serdar Sungun, Panos Syrgkanis, Michael Youssef, Jiaqi Zhang, Tony Zhu. Thank you for your time and effort in reviewing the final projects of CSCI 1952Q!
Homeworks
Warning: the following assignments and due dates are from the past. If you are taking CSCI 1952Q this semester, please refer to the current course webpage.
Course Project
- Project guidelines.
- Important dates:
- Mar 9: Decide on teams and topics.
- Apr 13 - Apr 27: Presentations.
- May 15: Project report due.
Schedule
- Week 1 (Jan 23): Introduction.
- Week 2 (Jan 30): Non-Convex Optimization I (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Week 3 (Feb 6): Non-Convex Optimization II (Chapter 7 of [A], Chapter 9 of [LRU], Chapter 8 of [M]).
- Week 4 (Feb 13): Nonnegative Matrix Factorization I (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 5 (Feb 20): Nonnegative Matrix Factorization II (Chapter 1 of [M], Chapter 11 of [LRU]).
- Week 6 (Feb 27): Finding Similar Items I (Chapter 3 of [LRU]).
- Week 7 (Mar 6): Finding Similar Items II (Chapter 3 of [LRU]).
- Locality-Sensitive Hashing (Lecture 11).
- Locality-Sensitive Functions and Amplification (Lecture 12).
- Week 8 (Mar 13): Algorithms for Data Streams (Chapter 4 of [LRU]).
- Sampling Data in a Stream (Lecture 13).
- Bloom Filter, Counting Distinct Elements in a Stream (Lecture 14). A small illustrative Bloom filter sketch appears after the additional readings below.
- Week 9 (Mar 20): Network Analysis (Chapters 5 and 10 of [LRU]).
- PageRank and Computation of PageRank (Lecture 15).
- Personalized PageRank, Graph Partitioning (Lecture 16).
- Week 10 (Mar 27): No classes. Spring Recess.
- Week 11 (Apr 3): Theory of Deep Learning (Chapters 9 and 10 of [A]).
- Spectral Clustering, Local Graph Algorithms (Lecture 17).
- Global Convergence of Gradient Descent (Lecture 18).
- Weeks 12-14 (starting Apr 10):
- Implicit Bias of Gradient Descent (Lecture 19).
- Project Presentations.
- Additional Readings (optional):
- Non-Convex Optimization:
- Nonnegative Matrix Factorization:
- Locality-Sensitive Hashing:
- A. Gionis, P. Indyk, R. Motwani. Similarity Search in High Dimensions via Hashing. VLDB 1999.
- A. Z. Broder, M. Charikar, A. M. Frieze, M. Mitzenmacher. Min-Wise Independent Permutations. STOC 1998.
- M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. SCG 2004.
- R. O'Donnell, Y. Wu, Y. Zhou. Optimal Lower Bounds for Locality Sensitive Hashing (Except When q Is Tiny). ICS 2011.
- A. Andoni, I. Razenshteyn. Optimal Data-Dependent Hashing for Approximate Near Neighbors. STOC 2015.
- Streaming Algorithms:
- Network Analysis:
- Theoretical Deep Learning:
- S. S. Du, X. Zhai, B. Póczos, A. Singh. Gradient Descent Provably Optimizes Over-parameterized Neural Networks. ICLR 2019.
- Z. Allen-Zhu, Y. Li, Z. Song. A Convergence Theory for Deep Learning via Over-Parameterization. ICML 2019.
- J. D. Lee, Q. Lei, N. Saunshi, J. Zhuo. Predicting What You Already Know Helps: Provable Self-Supervised Learning. NeurIPS 2021.
- C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals. Understanding Deep Learning (Still) Requires Rethinking Generalization. Commun. ACM. 2021.
- M. Belkin, D. Hsu, S. Ma, S. Mandal. Reconciling Modern Machine Learning and the Bias-Variance Trade-off. PNAS 2019.
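To illustrate one of the Week 8 topics, here is a minimal sketch of a Bloom filter. It is not the course's reference implementation; the bit-array size, the number of hash functions, and the SHA-256-based index derivation are illustrative assumptions.

```python
# Illustrative Bloom filter sketch (sizes and hashing scheme are arbitrary choices).
import hashlib

class BloomFilter:
    """A tiny Bloom filter over strings; parameters here are illustrative, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _indices(self, item):
        # Derive num_hashes indices from 4-byte chunks of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i : 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = True

    def __contains__(self, item):
        # May report a false positive, but never a false negative.
        return all(self.bits[idx] for idx in self._indices(item))

bf = BloomFilter()
for word in ["apple", "orange"]:
    bf.add(word)
print("apple" in bf)    # True
print("banana" in bf)   # False with high probability
```

Membership queries may return false positives but never false negatives, which is the trade-off that makes Bloom filters useful for filtering data streams.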
Grading
- Homework assignments (50%): We will have 4 homework assignments and 3 coding assignments. You are encouraged to discuss the problems with other students, but you must acknowledge who you worked with and write up your solutions independently. Late submissions are not accepted (exceptions for good reasons may be requested in advance).
- Course project (50%): Students will form teams to work on research projects or explore in depth a topic that we did not cover in class. Each group will do a project presentation and write a project report.
- Bonus points (20%): Bonus points are available for extra-credit homework questions and for top final projects and top reviewers.
Academic Integrity
Academic achievement is evaluated on the basis of work that a student produces independently. A student who obtains credit for work, words, or ideas which are not the products of his or her own effort is dishonest. Such dishonesty undermines the integrity of academic standards of the University. Infringement of the Academic Code entails penalties ranging from reprimand to suspension, dismissal or expulsion from the University. Students who have questions on any aspect of the Academic Code should consult the instructor or one of the deans of the Graduate School to avoid the serious charge of academic dishonesty.
Disability Policies
Brown University is committed to the full inclusion of all students. Any student with a documented disability is welcome to contact the instructor as early in the semester as possible so that reasonable accommodations can be arranged. If you need accommodations for online learning or in the classroom, please reach out to Student Accessibility Services (SAS) for assistance (sas@brown.edu, 401-863-9588, Brown SAS Website). You may also call SAS at 401-863-9588 to discuss the process for requesting accommodations.
Religious Holidays
Students who wish to observe their religious holidays shall inform the instructor within the first four weeks of the semester, unless the religious holiday is observed in the first four weeks. In such cases, the student shall notify the instructor at least five days in advance of the date of absence. The instructor shall make every reasonable effort to honor the request.