Learning with Limited Labeled Data (Fall 2022)
Course Description

As machine learning is deployed more widely, researchers and practitioners keep running into a fundamental problem: how do we get enough labeled data? This seminar course will survey research on learning when only limited labeled data is available. Topics covered include semi-supervised learning, transfer learning, weak supervision, few-shot learning, and zero-shot learning. Students will lead discussions on recent research papers and develop final research projects.

Essential Info
Instructor: Stephen Bach a.k.a. Steve
Class Meetings: Tuesdays and Thursdays, 1-2:20 pm, CIT 316
Office Hours: See the Canvas homepage for information.
Textbook: None
Prerequisites: Previous experience with machine learning, through CSCI 1420 or equivalent research experience, is required.
Contact

For questions, discussion, and other course-related posts, use Canvas.

If you have an atypical question that you are certain does not belong on Canvas, email the instructor.

Course Schedule
Introduction
Sep 8
Introductions, an overview of the research topics we will cover during the semester, and how to read a research paper.
Supplemental reading:
  • Introduction to Semi-Supervised Learning. Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien. In Semi-Supervised Learning, MIT Press, 2006. [PDF] [Online, requires Brown login]
  • Incidental Supervision: Moving beyond Supervised Learning. Dan Roth. AAAI 2017. [PDF]
  • How to Read a CS Research Paper? Philip W. L. Wong. [PDF]
  • How to Read a Technical Paper. Jason Eisner. [Online]
  • How to Read a Paper. S. Keshav. [PDF]
Semi-Supervised Learning
Sep 13
MixMatch: A Holistic Approach to Semi-Supervised Learning. David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, and Colin Raffel. Neural Information Processing Systems (NeurIPS) 2019.
[PDF] [Supplemental (Zip)] [Reviews] [Code]
Supplemental reading:
  • FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence. Kihyuk Sohn, David Berthelot, Nicholas Carlini, Zizhao Zhang, Han Zhang, Colin A. Raffel, Ekin Dogus Cubuk, Alexey Kurakin, and Chun-Liang Li. Neural Information Processing Systems (NeurIPS) 2020. [PDF] [Supplemental] [Reviews] [Code]
Sep 15
Semi-supervised Sequence Learning. Andrew M. Dai and Quoc V. Le. Neural Information Processing Systems (NeurIPS) 2015.
[PDF] [Reviews]
Supplemental reading:
  • Unsupervised Data Augmentation for Consistency Training. Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, and Quoc Le. Neural Information Processing Systems (NeurIPS) 2020. [PDF] [Supplemental] [Reviews] [Code]
Transfer Learning
Sep 20
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Journal of Machine Learning Research (JMLR) 21(140):1-67, 2020.
[PDF] [Blog] [Code]
Supplemental reading:
  • XLNet: Generalized Autoregressive Pretraining for Language Understanding. Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. Neural Information Processing Systems (NeurIPS) 2019. [PDF] [Supplemental (Zip)] [Reviews]
Sep 22
Start-of-course survey due
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. International Conference on Learning Representations (ICLR) 2020.
[PDF] [Reviews] [Code]
Supplemental reading:
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2019. [PDF]
Sep 27
Big Transfer (BiT): General Visual Representation Learning. Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, and Neil Houlsby. European Conference on Computer Vision (ECCV) 2020.
[PDF] [Code]
Supplemental reading:
  • TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data. Wasu Piriyakulkij, Cristina Menghini, Ross Briden, Nihal V. Nayak, Jeffrey Zhu, Elaheh Raisi, and Stephen H. Bach. Conference on Machine Learning and Systems (MLSys) 2022. [PDF] [Code]
Sep 29
Learning Transferable Visual Models From Natural Language Supervision. Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. International Conference on Machine Learning (ICML) 2021.
[PDF] [Blog]
Supplemental reading:
  • SLIP: Self-supervision meets Language-Image Pre-training. Norman Mu, Alexander Kirillov, David Wagner, and Saining Xie. ArXiv 2112.12750 2021. [PDF]
Weakly Supervised Learning
Oct 4
Snorkel: Rapid Training Data Creation with Weak Supervision. Alexander Ratner, Stephen H. Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christopher Ré. Proceedings of the VLDB Endowment, 11(3):269-282, 2017.
[PDF] [Code]
Supplemental reading:
  • Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm. A. P. Dawid and A. M. Skene. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1):20-28, 1979. [Online, requires Brown login]
Oct 6
Weakly Supervised Sequence Tagging from Noisy Rules. Esteban Safranchik, Shiying Luo, and Stephen H. Bach. AAAI Conference on Artificial Intelligence (AAAI) 2020.
[PDF] [Code]
Supplemental reading:
  • BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition. Yinghao Li, Pranav Shetty, Lucas Liu, Chao Zhang, and Le Song. Meeting of the Association for Computational Linguistics (ACL) 2021. [PDF] [Code] [Video]
Oct 11
Self-Training with Weak Supervision. Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, and Ahmed Hassan Awadallah. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2021.
[PDF] [Code] [Video]
Supplemental reading:
  • WRENCH: A Comprehensive Benchmark for Weak Supervision. Jieyu Zhang, Yue Yu, Yinghao Li, Yujing Wang, Yaming Yang, Mao Yang, and Alexander Ratner. Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track 2021. [PDF] [Supplemental] [Code] [Reviews]
Oct 13
Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision. Mayee F Chen, Daniel Yang Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, and Christopher Ré. Uncertainty in Artificial Intelligence (UAI) 2022.
[PDF] [Supplemental (Zip)] [Reviews]
Supplemental reading:
  • Language Models in the Loop: Incorporating Prompting into Weak Supervision. Ryan Smith, Jason A. Fries, Braden Hancock, and Stephen H. Bach. ArXiv 2205.02318 2022. [PDF]
Few-Shot Learning
Oct 18
Prototypical Networks for Few-shot Learning. Jake Snell, Kevin Swersky, and Richard Zemel. Neural Information Processing Systems (NeurIPS) 2017.
[PDF] [Supplemental (Zip)] [Reviews]
Supplemental reading:
  • Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Chelsea Finn, Pieter Abbeel, and Sergey Levine. International Conference on Machine Learning (ICML) 2017. [PDF]
Oct 20
Project proposal due
Language Models are Few-Shot Learners. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Neural Information Processing Systems (NeurIPS) 2020.
[PDF]
Supplemental reading:
  • On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2021. [PDF]
Oct 25
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts. Guanghui Qin and Jason Eisner. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2021.
[PDF] [Supplemental] [Video]
Supplemental reading:
  • The Power of Scale for Parameter-Efficient Prompt Tuning. Brian Lester, Rami Al-Rfou, and Noah Constant. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021. [PDF] [Code] [Video]
Oct 27
How many data points is a prompt worth? Teven Le Scao and Alexander Rush. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2021.
[PDF] [Supplemental] [Code] [Video]
Supplemental reading:
  • Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models. Robert Logan IV, Ivana Balazevic, Eric Wallace, Fabio Petroni, Sameer Singh, and Sebastian Riedel. Findings of the Association for Computational Linguistics 2022. [PDF] [Code]
Nov 1
Conditional Prompt Learning for Vision-Language Models. Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
[PDF] [Code]
Supplemental reading:
  • DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations. Ximeng Sun, Ping Hu, and Kate Saenko. ArXiv 2206.09541 2022. [PDF]
Nov 3
Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. ArXiv 2203.02155 2022.
[PDF]
Supplemental reading:
  • None
Nov 8
Election Day
(No class)
Zero-Shot Learning
Nov 10
DeViSE: A Deep Visual-Semantic Embedding Model. Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, and Tomas Mikolov. In Neural Information Processing Systems (NeurIPS) 2013.
[PDF] [Supplemental (Zip)] [Reviews]
Supplemental reading:
  • Zero-Shot Learning through Cross-Modal Transfer. Richard Socher, Milind Ganjoo, Christopher D. Manning, and Andrew Y. Ng. In Neural Information Processing Systems (NeurIPS) 2013. [PDF] [Reviews]
Nov 15
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs. Xiaolong Wang, Yufei Ye, and Abhinav Gupta. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018.
[PDF]
Supplemental reading:
  • Rethinking Knowledge Graph Propagation for Zero-Shot Learning. Michael Kampffmeyer, Yinbo Chen, Xiaodan Liang, Hao Wang, Yujia Zhang, and Eric P. Xing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019. [PDF]
Nov 17
Project status report due
Multitask Prompted Training Enables Zero-Shot Task Generalization. Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, and Alexander M. Rush. International Conference on Learning Representations (ICLR) 2022.
[PDF] [Code] [Data] [Reviews]
Supplemental reading:
  • Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks. Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, and Daniel Khashabi. ArXiv 2204.07705 2022. [PDF] [Data]
Nov 22
Do Prompt-Based Models Really Understand the Meaning of Their Prompts? Albert Webson and Ellie Pavlick. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2022.
[PDF] [Code]
Supplemental reading:
  • Can language models learn from explanations in context? Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Mathewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, and Felix Hill. ArXiv 2204.02329 2022. [PDF]
Nov 24
Thanksgiving
(No class)
Nov 29
Image Segmentation Using Text and Image Prompts. Timo Lüddecke and Alexander Ecker. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
[PDF] [Supplemental] [Code]
Supplemental reading:
  • GLIPv2: Unifying Localization and Vision-Language Understanding. Haotian Zhang, Pengchuan Zhang, Xiaowei Hu, Yen-Chun Chen, Liunian Harold Li, Xiyang Dai, Lijuan Wang, Lu Yuan, Jenq-Neng Hwang, and Jianfeng Gao. ArXiv 2206.05836 2022. [PDF] [Code]
Data Generation
Dec 1
MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation. Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, and Yong Jae Lee. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.
[PDF] [Talk] [Code] [Video]
Supplemental reading:
  • Generating Object Stamps. Youssef Alami Mejjati, Zejiang Shen, Michael Snower, Aaron Gokaslan, Oliver Wang, James Tompkin, and Kwang In Kim. ArXiv 2001.02595 2020. [PDF]
Dec 6
Zero-Shot Text-to-Image Generation. Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. ArXiv 2102.12092 2021.
[PDF] [Blog]
Supplemental reading:
  • Diffusion Models Beat GANs on Image Synthesis. Prafulla Dhariwal and Alexander Nichol. Neural Information Processing Systems (NeurIPS) 2021. [PDF] [Supplemental] [Reviews]
Dec 20
Final project report due
(No class)
Learning Goals

Students who complete this course will:

Grading

The following standards will be used to assign grades. Anyone who does not meet the standards for a B will receive no credit (NC).

To Earn an A
To Earn a B
Estimated Time Commitment
Activity                              Hours
Class Meetings                        28
Readings                              65
Submitting Discussion Questions       10
Preparing to Lead Discussion          2
Project Research                      60+
Project Proposal / Status             10
Project Final Report                  5
Total                                 180+
General Course Policies

Masking, COVID-19, and Other Health Issues

Everyone attending class is required to wear a high-quality mask (KN95 or better). Students who are leading discussions may remove their masks while presenting. Attendance and discussion question policies will be flexible with respect to COVID-19 and other health issues. Please contact the instructor if you have any issues.

Diversity & Inclusion

The Brown computer science department has made it its mission to create and sustain a diverse and inclusive environment in which all students, faculty, and staff can thrive. In this course, that responsibility falls on us all, students and teaching staff alike. In particular, Brown's Nondiscrimination and Anti-Harassment Policy applies to all participants.

If you have not been treated in an inclusive manner by any of the course members, please contact either me (Stephen) or the department chair (Prof. Tamassia). Laura Dobler is also available as a resource for members of underrepresented groups. Additional resources are listed on the department's website. We, the computer science department, take all complaints about discrimination, harassment, and other unprofessional behavior seriously.

In addition, Brown welcomes students from all around the country and the world, and their unique perspectives enrich our learning community. To empower students whose first language is not English, an array of support is available on campus, including language and culture workshops and individual appointments. For more information, contact the English Language Learning Specialists at ellwriting@brown.edu.

Academic Integrity

Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Brown Academic Code and Brown Code of Student Conduct. For project work, feel free to build on third-party software, datasets, or other resources, as long as you credit them in your report(s) and clearly state what work is solely your own. As a general policy (for this course and for the rest of your academic career): if you use any idea, text, code, or data that you did not create, then cite it.

Accommodations

Brown University is committed to full inclusion of all students. Please inform me if you have a disability or other condition that might require accommodations or modification of any of these course procedures. You may email me, come to office hours, or speak with me after class, and your confidentiality is respected. I will do whatever I can to support accommodations recommended by SAS. For more information, contact Student Accessibility Services (SAS) at 401-863-9588 or SAS@brown.edu.

Mental Health

Being a student can be very stressful. If you feel you are under too much pressure or there are psychological issues that are keeping you from performing well at Brown, I encourage you to contact Brown's Counseling and Psychological Services (CAPS). They provide confidential counseling and can provide notes supporting accommodations for health reasons.

Incomplete Policy

I expect everyone to complete the course on time. However, I understand that there may be factors beyond your control, such as health problems and family crises, that prevent you from finishing the course on time. If you feel you cannot complete the course on time, please discuss with me (Stephen) the possibility of receiving a grade of Incomplete and setting a schedule for finishing the course in the upcoming year.

Thanks to Tom Doeppner, Laura Dobler, and Daniel Ritchie for borrowed text.