As machine learning is deployed more widely, researchers and practitioners keep running into a fundamental problem: how do we get enough labeled data? This seminar course will survey research on learning when only limited labeled data is available. Topics covered include weak supervision, semi-supervised learning, active learning, transfer learning, and few-shot learning. Students will lead discussions on classic and recent research papers, and work in teams on final research projects.
For questions, discussion, and other course-related posts, use Canvas.
If you have an atypical question that you are certain does not belong on Canvas, email the instructor.
- Introduction to Semi-Supervised Learning. Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien. In Semi-Supervised Learning, MIT Press, 2006. [PDF] [Online, requires Brown login]
- Incidental Supervision: Moving beyond Supervised Learning. Dan Roth. AAAI 2017. [PDF]
- How to Read a CS Research Paper? Philip W. L. Wong. [PDF]
- How to Read a Technical Paper. Jason Eisner. [Online]
- How to Read a Paper. S. Keshav. [PDF]
- Risks of Semi-Supervised Learning: How Unlabeled Data can Degrade Performance of Generative Classifiers. Fabio Cozman and Ira Cohen. In Semi-Supervised Learning, MIT Press, 2006. [Online, requires Brown login]
- Input-dependent Regularization of Conditional Density Models. Matthias Seeger. Institute for ANC Technical Report, 2000. [PDF]
- Analyzing the effectiveness and applicability of co-training. Kamal Nigam and Rayid Ghani. International Conference on Information and Knowledge Management (CIKM) 2000. [PDF]
- Semi-Supervised Classification with Graph Convolutional Networks. Thomas N. Kipf and Max Welling. International Conference on Learning Representations (ICLR) 2017. [PDF]
- Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm. A. P. Dawid and A. M. Skene. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):20-28, 1979. [PDF]
- Learning from Labeled Features using Generalized Expectation Criteria. Gregory Druck, Gideon Mann, and Andrew McCallum. Conference on Research and Development in Information Retrieval (SIGIR) 2008.
- How transferable are the datasets collected by active learners? David Lowell, Zachary C. Lipton, and Byron C. Wallace. ArXiv:1807.04801. [PDF]
- Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study. Aditya Siddhant and Zachary C. Lipton. ArXiv:1808.05697. [PDF]
- Universal Language Model Fine-tuning for Text Classification. Jeremy Howard and Sebastian Ruder. Annual Meeting of the Association for Computational Linguistics (ACL) 2018. [PDF]
- Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly. Yongqin Xian, Christoph H. Lampert, Bernt Schiele, and Zeynep Akata. TPAMI 2018. [PDF]
- Apprenticeship learning via inverse reinforcement learning. Pieter Abbeel and Andrew Y. Ng. International Conference on Machine Learning (ICML) 2004.
Students who complete this course will:
- Acquire a working knowledge of the landscape of research on machine learning with limited labeled data.
- Practice identifying and critically assessing the claims, contributions, and supporting evidence in machine learning research papers.
- Develop their ability to share scientific ideas via writing and discussion.
- Gain practical experience with the course's subject matter by applying and extending it to their own research interests through an open-ended project.
The following standards will be used to assign grades.
- Participate actively in class discussions by asking questions, sharing opinions, and listening carefully to others.
- Meet all deadlines in the course schedule related to the research project.
- Submit two discussion questions to Canvas by 6 PM the evening before class for assigned readings, missing no more than 3 readings.
- Attend class meetings, missing no more than 3 meetings.
- Fulfill the requirements below to earn a B.
- Conduct an original research project related to course materials and submit a written report meeting the assignment guidelines.
- Lead the assigned class discussion demonstrating preparation and inclusion.
- Submit two discussion questions to Canvas by 6 PM the evening before class for assigned readings, missing no more than 6 readings.
- Attend class meetings, missing no more than 6 meetings.
|Submitting Discussion Questions||10|
|Preparing to Lead Discussion(s)||2|
|Project Proposal / Status||10|
|Project Final Report||5|
The Brown computer science department has made it its mission to create and sustain a diverse and inclusive environment in which all students, faculty, and staff can thrive. In this course, that responsibility falls on us all, students and teaching staff alike. In particular, Brown's Discrimination and Harassment Policy applies to all participants.
If you feel you have not been treated in an inclusive manner by any of the course members, please contact either me (Stephen) or the department chair (Prof. Cetintemel). Laura Dobler is also available as a resource for members of underrepresented groups. Additional resources are listed on the department's website. We, the computer science department, take all complaints about discrimination, harassment, and other unprofessional behavior seriously.
In addition, Brown welcomes students from all around the country and the world, and their unique perspectives enrich our learning community. To empower students whose first language is not English, an array of support is available on campus, including language and culture workshops and individual appointments. For more information, contact the English Language Learning Specialists at firstname.lastname@example.org.
Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Brown Academic and Student Conduct Codes. For project work, feel free to build on third-party software, datasets, or other resources, as long as you credit them in your report(s) and clearly state what work is solely your own. As a general policy (for this course and for the rest of your academic career): if you use any idea, text, code, or data that you did not create, then cite it.
Brown University is committed to full inclusion of all students. Please inform me if you have a disability or other condition that might require accommodations or modification of any of these course procedures. You may email me, come to office hours, or speak with me after class, and your confidentiality is respected. I will do whatever I can to support accommodations recommended by SEAS. For more information contact Student and Employee Accessibility Services (SEAS) at 401-863-9588 or SEAS@brown.edu.
Being a student can be very stressful. If you feel you are under too much pressure or there are psychological issues that are keeping you from performing well at Brown, I encourage you to contact Brown’s Counseling and Psychological Services (CAPS). They provide confidential counseling and can provide notes supporting accommodations for health reasons.
I expect everyone to complete the course on time. However, I understand that there may be factors beyond your control, such as health problems and family crises, that prevent you from finishing the course on time. If you feel you cannot complete the course on time, please discuss with me (Stephen) the possibility of being given a grade of Incomplete for the course and setting a schedule for completing the course in the upcoming year.
Thanks to Tom Doeppner, Laura Dobler, and Daniel Ritchie for borrowed text.
Updated Oct. 1, 2018: Substituted two papers for project presentations to accommodate larger course enrollment. Estimated time commitment and grading updated.