Natural language understanding is a holy grail of AI. And with machine learning advancing at such a rapid pace, breakthroughs in automatic language understanding seem to be just around the corner. But what exactly are the current barriers in automating human-like language capabilities? This course will dissect what makes language understanding so challenging, including both theoretical aspects (logic, formal semantics, pragmatics, knowledge representation) and practical methods (graphical models, game theory, neural networks). The course will be project-based, and will emphasize reading and critiquing current research in computer science, linguistics, and cognitive science.
The focus of this course is on understanding the state of the art in computational representations of natural language. The material is intended to provide a survey of theories of semantic representations and the ways that those theories have been operationalized as computational models for automatic natural language understanding. The goal of the course is to help you develop the following:
This is not meant as a deep learning course or an engineering course. Although there will be applied assignments, the goal of the course is not to teach you how to build NLP systems. This is a graduate seminar with a heavy reading component–you’ll be expected to do the required reading and should be prepared to fill in background information on your own when needed. Class time will be spent discussing connections between papers and comparing current theories and models at a high level, not explaining the technical details of individual papers.
Machine Learning (CSCI 1420) or Computational Linguistics (CSCI 1460)
Grades will be based primarily on assignments and participation in in-class discussions. There will be no quizzes or exams. There will be a final project consisting of a coding and a written component, worth 20% of the final grade. The final grade will be out of 100 points, as follows. Grading rubrics will allow for fractional points.
Deliverable | Points |
---|---|
HW1: Word Vectors | 15 |
HW2: Semantic Parsing | 15 |
HW3: Neural Sentence Representations | 15 |
HW4: Computational Pragmatics | 15 |
Discussion Questions | 10 |
Discussion Lead/Deep Dive | 10 |
Final Project | 20 |
Rubrics will be released at the same time as the assignment is released. We (Instructor and TA) reserve the right to alter the rubric after assignments are turned in if needed, but only if it is in the students’ favor (e.g. down-weighting a topic ex post facto if it appears we didn’t prepare the students sufficiently well). Don’t bank on this happening, though, i.e. no colluding to make it seem like you all understand things less well than you do. :)
Regrades will be available upon request and with thorough justification. Students who want their assignment regraded must email me with an explanation of why the regrade is necessary. If approved, the student will come to my office hours and walk through the problem and the regrade in person. Even if I accept the justification and agree to review the assignment in office hours, this does not guarantee that the grade will be changed.
Assignments are due at 11:59pm EST on their listed due date. Discussion questions for readings are due at 11:59pm the day before the lecture in which the readings will be covered. Students are allowed 4 “no questions asked” late days each, which can be used at any point during the semester (all on one assignment, or distributed across assignments). Late days can only be used on the programming assignments (HW1, HW2, HW3, and HW4). Late days cannot be used on the final project, or on the discussion-related deliverables (questions or deep-dive writeup), since these aren’t relevant unless completed before the day of the discussion. Note that there is leniency in the grading to allow for weeks when students fall behind on reading (see Assignments section).
Regular assignments must be completed individually–i.e. every student must turn in their own assignment. It is expected and encouraged that you all discuss the assignments and work through ideas and concepts together, but the work that each student turns in must be their own. If there is doubt about whether or not an assignment was completed independently, the student will be asked to meet with me to walk through their code and demonstrate a line-by-line understanding of the assignment that they submitted.
The final project can be completed individually or in groups of two. Groups are expected to undertake two people’s worth of work.
Points will be allocated as laid out in the individual assignment/project rubrics, and final grades will be determined based on overall points. Grades will not be curved to fit a particular distribution. In other words, you are evaluated relative to our learning objectives, not relative to each other. Letter grades will be assigned based on the point values below. To save time and energy, there will be no “rounding up”, since it happens that, no matter where the line is, someone is always just below it. The one exception will be for students who are on the border between F and F+. In this case, we will entertain arbitrarily complex arguments for why the student deserves an F+, because, at that point, why not.
Grade | Points |
---|---|
A | [93, 100] |
A- | [90, 93) |
B+ | [87, 90) |
B | [83, 87) |
B- | [80, 83) |
C+ | [77, 80) |
C | [73, 77) |
C- | [70, 73) |
D+ | [67, 70) |
D | [63, 67) |
D- | [60, 63) |
F+ | [57, 60) |
F | [0, 57) |
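As an illustration of how the half-open intervals and the “no rounding up” rule interact, here is a minimal sketch of the lookup in Python. The function name `letter_grade` and the list names are hypothetical, introduced only for this example; the cutoffs are copied from the table above. This is not an official grading script.

```python
from bisect import bisect_right

# Lower bound of every band above F, in ascending order (from the table above).
CUTOFFS = [57, 60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93]
# LETTERS[i] is the grade earned once i cutoffs have been reached.
LETTERS = ["F", "F+", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def letter_grade(points: float) -> str:
    """Map a point total in [0, 100] to a letter grade without rounding."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # bisect_right counts the cutoffs that are <= points, so a score that is
    # exactly on a boundary lands in the higher band and a score just below
    # it stays in the lower band.
    return LETTERS[bisect_right(CUTOFFS, points)]
```

Because rubrics allow fractional points, a total such as 92.99 is possible; under this sketch it maps to A-, while exactly 93 maps to A.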
There will be four technical assignments throughout the semester, which will make up 60% of the grade (15 points each). In addition, students are expected to do the readings and participate in the in-class discussions. The specific deliverables for the course are described below.
Due September 25, 11:59pm
This assignment focuses on building and manipulating distributional representations of words–i.e. representations of words as vectors determined by the context in which the word is used. There will be two pieces: one looking at sparse, high-dimensional vectors and one looking at dense, low-dimensional vectors.
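To make the idea of a sparse, high-dimensional distributional vector concrete, here is a minimal sketch in Python. It is not the assignment’s starter code or required method: the function names, the window size, and the toy corpus are all assumptions made for illustration. Each word is represented by counts of the words that appear within a fixed window around it, and two words are compared with cosine similarity. Dense, low-dimensional vectors are not shown here.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Build one sparse count vector per word from tokenized sentences.

    Each vector maps a context word to the number of times it occurs
    within `window` positions of the target word.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:  # a word is not its own context
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[ctx] for ctx, count in u.items() if ctx in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for this example only.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vecs = cooccurrence_vectors(corpus, window=2)
similarity = cosine(vecs["cat"], vecs["dog"])
```

The vectors are sparse because each word co-occurs with only a small fraction of the vocabulary, which is why a dictionary of counts is used here rather than a full array.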
Due October 11, 11:59pm
Description and rubric TBD
Due October 30, 11:59pm
Description and rubric TBD
Due November 15, 11:59pm
Description and rubric TBD
Due: Bi-Weekly (11:59pm the night before each lecture)
You are expected to do the readings each week and participate in in-class discussions. To incentivize this, we will ask you to submit at least one discussion question per reading before the class in which the reading will be discussed. These questions should focus on high-level, conceptual questions (e.g. questions regarding assumptions made by a model, or limitations of a technical approach) rather than simply clarification questions (unless the clarification questions are especially salient/relevant to understanding the work). Questions will be submitted via a Google form linked from the website alongside the reading.
It is okay not to submit questions for every reading. Grades will be assigned holistically at the end of the semester. Every student who submits questions for at least 70% of the readings will receive 7/10 for the discussion questions. The remaining 3 points will be assigned based on the quality and depth of the questions and/or the student’s in-class contributions. Submitting questions for 100% of the readings will not necessarily guarantee you 10/10. On weeks that you are not able to complete the readings, please just own it and leave the form blank rather than trying to BS it. (Yes, I’ll be able to tell. State-of-the-art language models can produce questions that sound beautifully fluent but are devoid of content. Own your humanity. Don’t be automatable.)
Due: Varies (11:59pm night before chosen paper is discussed)
In addition to submitting questions weekly, each student will serve as a “discussion lead” for one paper during the semester. This will require choosing a paper to read in depth and submitting a short (two-page) writeup describing the main contributions of the work and situating it in relation to the material discussed in the course so far. Students will be expected to lead the discussion on the day that their paper is being covered. Students can choose which paper they want to lead and should pick the topic that they find most exciting and interesting. It is okay if multiple students are “leading” on the same paper; however, we reserve the right to cap the number of students on a paper if things get out of hand. Students will not give a formal in-class presentation of their paper, and grading will be based on the quality of the two-page writeup and/or the student’s in-class contributions on the day of their discussion.
Due December 17, 2:00pm (Final Exam Time)
For the final project, you will devise and implement an approach to solving one of the 2019 SemEval Shared Tasks. These represent a range of challenging problems for modern NLP, and will allow students to apply the ideas of representation discussed in the course to a current open research problem. The final deliverable will be an 8-page conference-style description of the motivation, approach, and results. Code must also be submitted, although the grade will be based on the final writeup. Grades on the final project will follow the spirit of the course: i.e. good final projects will devise an approach to the problem that is well-motivated and situated in relation to the semantic theories discussed in the course. It is less important that students achieve state-of-the-art performance on the task.
I will entertain custom final project proposals if students have ideas of projects/problems they are particularly excited to pursue that don’t fit within the SemEval tasks. However, designing your own final project will mean developing a clear train-dev-test loop (with data and evaluation metric) akin to what is provided by SemEval. Students who want to pitch a custom final project must have it approved by me before November 17. There is a 100% chance I will not approve the idea on the first meeting, so please talk to me well before the 17th so you have time to iterate on the concept.
This is a discussion-based course, in which the goal is to debate and critically analyze models and methods that are currently at the very forefront of NLP research. This class will only work if everyone is open-minded, respectful, and willing to take all ideas seriously. This means giving every student’s opinions equal attention and consideration, regardless of their background (social or academic). No one here (myself included!) knows everything there is to know about this topic–that is why we are studying it. It is expected that you will feel in over your head at some point: struggling to understand a paper doesn’t mean you don’t belong in this class. Don’t judge your colleagues or yourself for finding material difficult.
Your ability to learn and thrive in this class is my priority, and anything that prevents you from being fully present, physically and mentally, is my concern. Everyone should feel comfortable speaking freely both in class and outside of class (e.g. working in groups on assignments). If, for any reason–academic, personal, social, health, or other–you do not feel like you are able to be your unabridged, best self, please talk to me.