Instructor: James Tompkin
TAs: Eric Xiao (HTA), Jackson Gibbons, Daniel Nurieli, Eleanor Tursman, Martin Zhu.
General Course Policy
This class runs quiet hours from 9pm to 9am every day. Please do not expect a response from us via any channel during these hours. Likewise, we won't ask you to do anything during quiet hours, such as hand in projects.
This course provides an introduction to computer vision, including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification, scene understanding, and deep learning with neural networks. We will develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking, boundary detection, and recognition. We will develop the intuitions and mathematics of the methods in class, and then learn about the difference between theory and practice in projects.
This course is strongly based upon James Hays' computer vision course, previously taught at Brown as CS143, and currently taught at Georgia Tech as CS 4476. Significant thanks to him and his staff, across the years, for all their hard work.
No prior experience with computer vision is assumed, although previous knowledge of visual computing or signal processing will be helpful (e.g., CSCI 1230). The following skills are necessary for this class:
For those of you unfamiliar with MATLAB, please install it, run it, then go to 'Help' and then 'Getting Started with MATLAB' (or find it here online). Run through the tutorial to familiarize yourself (1 hour). Then, use this MATLAB Tutorial, from Roth et al., as a guide and reference. If you need any help, please come to office hours and we'll assist you.
Your final grade is determined entirely by six programming projects. Late projects lose 10% per day. However, you have three late days for the whole course: the first 24 hours after the due date and time count as one late day, up to 48 hours counts as two, and up to 72 hours counts as three. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in and distributed at the end of the semester so that you get the most points possible.
Late days cover unexpected clustering of due dates, travel commitments, interviews, hackathons, etc. Do not ask for extensions to due dates—we give you a pool of late days to manage yourself.
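As a rough illustration of the late policy, the sketch below rounds lateness up to whole days and computes the deduction left after applying late days. The function names are hypothetical, and reading "10% each day" as 10 percentage points per uncovered day is an assumption; the course staff's actual calculation may differ.

```python
import math

# Assumed reading of the policy: 10 percentage points per uncovered late day.
PENALTY_PER_DAY = 10


def days_late(hours_late: float) -> int:
    """Round lateness up to whole days: up to 24 h is 1 day, up to 48 h is 2, and so on."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def penalty(hours_late: float, late_days_applied: int) -> int:
    """Percentage points deducted after applying some late days from the pool."""
    uncovered = max(0, days_late(hours_late) - late_days_applied)
    return PENALTY_PER_DAY * uncovered
```

For example, a project handed in 30 hours late counts as two days late: `penalty(30, 1)` leaves a 10-point deduction, and `penalty(30, 2)` leaves none.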
This class can be taken as a capstone. To do so, you will need to complete 10 points of extra credit in each of the projects.
There is no requirement to buy a textbook. The goal of the course is to be self-contained, but sections from two textbooks will be suggested for more formalization and information. These two books are available free online. If you find a word or concept that you do not understand, then please also consider the computer vision dictionary listed third.
Projects are due every two weeks on Friday at 9pm. Hand-in is electronic via the cs1430_handin script.
It is strongly recommended that all projects be completed in MATLAB. All starter code will be provided for MATLAB. Students may implement projects through other means but it will be significantly more difficult.
|Tentative Assignments||Highlighted projects|
|1. Image Filtering and Hybrid Images|
|2. Local Feature Matching|
|3. Camera Calibration and Fundamental Matrix Estimation with RANSAC|
|4. Scene Recognition with Bag of Words|
|5. Face Detection with a Sliding Window|
|6. Convolutional Neural Nets|
|Wed||25||Jan||Introduction to Computer Vision||PPTX,PDF||Szeliski 1|
|Image Formation and Filtering|
|Fri||27||Jan||Light and Color||PPTX,PDF||Szeliski 2.2 and 2.3||1 out|
|Mon||30||Jan||Image Filtering||PPTX,PDF||Szeliski 3.2|
|Wed||01||Feb||Thinking in Frequency||PPTX,PDF||Szeliski 3.4|
|Fri||03||Feb||Thinking in Frequency, part 2||PPTX,PDF; MATLAB Live FFT2; Brian Pauw Live FFT2 Code||Szeliski 3.5.2 and 8.1.1|
|Feature Detection and Matching|
|Mon||06||Feb||Edge Detection||PPTX,PDF||Szeliski 4.2|
|Wed||08||Feb||Cancelled due to instructor sickness|
|Fri||10||Feb||Interest Points and Corners||PPTX,PDF||Szeliski 4.1.1||1 due; 2 out|
|Mon||13||Feb||Local Image Features||PPTX,PDF||Szeliski 4.1.2|
|Wed||15||Feb||Feature Matching||PPTX,PDF||Szeliski 4.1.3 and 4.3.2|
|Fri||17||Feb||Model Fitting and RANSAC||PPTX,PDF||Szeliski 6.1 and 2.1|
|Mon||20||Feb||President's Day—no class|
|Cameras, Multiple Views and Motion|
|Wed||22||Feb||Cameras and Optics||PPTX,PDF||Szeliski 2.1, esp. 2.1.5|
|Fri||24||Feb||Stereo Introduction||Szeliski 11||2 due; 3 out|
|Mon||27||Feb||Camera Calibration||Szeliski 6.2.1|
|Wed||01||Mar||Epipolar Geometry and Structure from Motion||Szeliski 7|
|Fri||03||Mar||Feature Tracking and Optical Flow||Szeliski 8.1 and 8.4|
|Mon||06||Mar||Optical Flow Continued|
|Machine Learning Crash Course|
|Wed||08||Mar||Machine Learning: Unsupervised Learning||Szeliski 5.3|
|Fri||10||Mar||Machine Learning: Supervised Learning||Szeliski 5.3||3 due; 4 out|
|Mon||13||Mar||Recognition Overview and Bag of Features||Szeliski 14|
|Wed||15||Mar||Large-scale Instance Recognition||Szeliski 14.3.2|
|Fri||17||Mar||Large-scale Instance Recognition Continued|
|Mon||20||Mar||Large-scale Category Recognition and Advanced Feature Encoding|
|Wed||22||Mar||Detection with Sliding Windows: Viola Jones||Szeliski 14.1 and 14.2|
|Fri||24||Mar||Detection with Sliding Windows: Dalal Triggs||Szeliski 14.1||4 due; 5 out|
|Mon||03||Apr||Pascal VOC and Big Data||Szeliski 14.5|
|Wed||05||Apr||Big Data 2|
|Fri||07||Apr||Human Computation and Crowdsourcing*|
|Mon||10||Apr||Modern Boundary Detection and Sketches||Szeliski 4.2|
|Wed||12||Apr||Context, Spatial Layout, and Scene Parsing|
|Fri||14||Apr||Neural Networks||5 due; 6 out|
|Mon||17||Apr||Convolutional Networks for Recognition|
|Wed||19||Apr||Object Detectors Emerge in Deep Scene CNNs|
|Mon||24||Apr||MS COCO and Deeper Deep Architectures|
|Wed||26||Apr||Structured Output from Deep Learning|
|Fri||28||Apr||Reading Period Starts||6 due|
|Unsupervised Learning and Style Transfer*|
|Mon||01||May||Generative Networks - Colorization*|
|Wed||03||May||Research paper class*|
|Fri||05||May||Research paper class*|
|Mon||08||May||Research paper class*|
The materials from this class rely significantly on slides prepared by other instructors, especially James Hays, Derek Hoiem, and Svetlana Lazebnik. Each slide set and assignment contains acknowledgements. Feel free to use these slides for academic or research purposes, but please maintain all acknowledgements.
Comments and questions to James Tompkin.
Our intent is that this course provide a welcoming environment for all students who satisfy the prerequisites. Our TAs have undergone training in diversity and inclusion; all members of the CS community, including faculty and staff, are expected to treat one another in a professional manner. If you feel you have not been treated in a professional manner by any of the course staff, please contact either the instructor, James, or the department chair, Prof. Cetintemel. If you have a diversity issue, please contact Laura Dobler, or speak to a student advocate for diversity and inclusion. We will take all complaints about unprofessional behavior seriously.
Prof. Krishnamurthi has good notes on this area.
Academic dishonesty will not be tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Brown Academic and Student Conduct Codes.
Feel free to talk to your friends about the concepts in the projects, and work through the ideas behind problems together, but be sure to always write your own code and perform your own write-up. You are expected to implement the core components of each project on your own, but the extra credit opportunities often build on third-party data sets or code. Feel free to include results built on other software, as long as you credit correctly in your handin and clearly demarcate your own work. In general, if you use an idea, text, or code from elsewhere, then cite it.
Brown University is committed to full inclusion of all students. Please inform me if you have a disability or other condition that might require accommodations or modification of any of these course procedures. You may email me, come to office hours, or speak with me after class, and your confidentiality is respected. We will do whatever we can to support accommodations recommended by SEAS. For more information contact Student and Employee Accessibility Services (SEAS) at 401-863-9588.
Being a student can be very stressful. If you feel you are under too much pressure or there are psychological issues that are keeping you from performing well at Brown, we encourage you to contact Brown's Counseling and Psychological Services. They provide confidential counseling and can provide notes supporting extensions on assignments for health reasons.
We expect everyone to complete the course on time. However, we certainly understand that there may be factors beyond your control, such as health problems and family crises, that prevent you from finishing the course on time. If you feel you cannot complete the course on time, please discuss with James Tompkin the possibility of being given a grade of Incomplete for the course and setting a schedule for completing the course in the upcoming year.
Thanks to Prof. Doeppner for the text on accommodation, mental health, and incomplete policy.
Laptops are discouraged, please, except for class-relevant activities, e.g., to help answer questions and show items relevant to discussion. No social media, email, etc., because it distracts not just you but other students as well. Read Shirky on this issue ("Why I Just Asked My Students to Put Their Laptops Away"), or Rockmore ("The Case for Banning Laptops in the Classroom").
We will release course lecture material online. In considering laptop use for note taking, please be aware that research has shown note taking on paper to be more effective for learning than typing on a laptop (Mueller and Oppenheimer), as it pushes you to summarize the content instead of transcribing it.