CSCI2951-B Data-Driven Vision and Graphics

Fall 2014, MWF 2:00 to 2:50, CIT 316.
Instructor: James Hays

Course Description

Course Catalog Entry
This graduate seminar course investigates current research topics in image-based, data-driven graphics and vision. We will examine data sources, features, and algorithms useful for understanding and manipulating visual data. We will pay special attention to methods that harness large-scale or Internet-derived data. Vision topics such as scene understanding and object detection will be linked to graphics applications such as photo editing and image-based rendering. These topics will be pursued through independent reading, class discussion and presentations, and state-of-the-art projects.

The goal of this course is to give students the background and skills necessary to perform research in image-based graphics and vision. Students should understand the strengths and weaknesses of current approaches to research problems and identify interesting open questions and future research directions. Students should also improve their critical reading and communication skills.

Course Requirements

Reading and Summaries

Students will be expected to read one paper for each class. For each assigned paper, students must write a two- or three-sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries must be posted to the class blog by 11:59pm the day before each class. Feel free to reply to other comments on the blog and help each other understand confusing aspects of the papers. The blog discussion will be the starting point for the class discussion. If you are presenting, you don't need to post a summary to the blog.

Class participation

All students are expected to take part in class discussions. If you do not fully understand a paper, that is OK; we can work through the unclear aspects together in class. If you are unable to attend a specific class, please let me know ahead of time (and have a good excuse!).

Presentation(s)

Depending on enrollment, students will lead the discussion of one or two papers during the semester. Ideally, students would implement some aspect of the presented material and perform experiments that help understand the algorithms. Presentations and all supplemental material should be ready one week before the presentation date so that students can meet with the instructor, go over the presentation, and possibly iterate before the in-class discussion. For the presentations it is fine to use slides and code from outside sources (for example, the paper authors) but be sure to give credit.

Semester projects

Students are expected to complete a state-of-the-art research project on topics relevant to the course. Students will propose a research topic partway through the semester. After a project topic is finalized, students will meet occasionally with the instructor to discuss progress. Students will present their progress on their semester project twice during the course and the course will end with final project presentations. Students will also produce a conference-formatted write-up of their project. Projects will be published on this web page. The ideal project is something with a clear enough direction to be completed in a couple of months, and enough novelty that it could be published in a peer-reviewed venue with some refinement and extension.

Prerequisites

Strong mathematical skills (linear algebra, calculus, probability and statistics) and previous imaging (graphics, vision, or computational photography) courses are needed. It is strongly recommended that students have taken one of the following courses (or equivalent courses at other institutions):

If you aren't sure whether you have the background needed for the course, you can try reading some of the papers below or you can simply come to class during the shopping period.

Textbook

We will not rely on a textbook, although the free, online textbook "Computer Vision: Algorithms and Applications" by Richard Szeliski is a helpful resource.

Grading

Your final grade will be made up of:

20% Reading summaries posted to class blog
20% Classroom participation and attendance
20% Research presentation(s)
40% Semester project

Office Hours:

James Hays, Monday and Wednesday 3:00-4:00pm, CIT 375

Schedule

Date	Paper	Paper, Project page, and Material	Presenter
Sept 3	Introduction; the state of vision and graphics		James
Image representations, mid-level vision, and their applications
Sept 5	80 million tiny images: a large dataset for non-parametric object and scene recognition. A. Torralba, R. Fergus, W. T. Freeman. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30(11), 2008.	pdf, project page	James
Sept 8	Object recognition from local scale-invariant features, David Lowe, ICCV 1999.	pdf, project page	Gen
Sept 10	Video Google: A Text Retrieval Approach to Object Matching in Videos. Sivic, J. and Zisserman, A. Proceedings of the International Conference on Computer Vision (2003)	pdf, project page	Gen
Sept 12	Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs. In Proceedings of IEEE Conference Computer Vision and Pattern Recognition, 2005.	pdf	James
Sept 15	Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006.	pdf, slides	James
Sept 17	Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3.	project page	James
Sept 19	SUN Database: Exploring a Large Collection of Scene Categories. J. Xiao, K. Ehinger, J. Hays, A. Oliva, and A. Torralba. IJCV 2014.	project page, pdf	James
Crowd-sourcing and Human Computation
Sept 22	LabelMe: a Database and Web-based Tool for Image Annotation. B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman. International Journal of Computer Vision, 2008.	pdf, project page	James
Sept 24	Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009.	project page	David
Sept 26	ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009	pdf, project page	Sonia
Sept 29	How do humans sketch objects? Mathias Eitz, James Hays, and Marc Alexa. Siggraph 2012.	project page	Nakul
Oct 1	Micro Perceptual Human Computation for Visual Tasks. Yotam Gingold, Ariel Shamir, Daniel Cohen-Or. ACM Transactions on Graphics (ToG) 2012	project page	Sarah
Discriminative Patches
Oct 3	What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. Siggraph 2012.	project page	Geng
Oct 6	Painting-to-3D Model Alignment Via Discriminative Visual Elements. Mathieu Aubry, Bryan Russell, Josef Sivic. ToG 2013.	project page	Brandon
Oct 8	Project Status Updates.		Everyone
Oct 10	Project Status Updates.		Everyone
Oct 13	No Classes
Learned Representations, Deep Learning
Oct 15	CVPR 2014 Tutorial on Deep Learning. Graham Taylor, Marc'Aurelio Ranzato, and Honglak Lee. Read only the first two sets of slides, labeled Introduction and Supervised Learning.	CVPR 2014 tutorial	Eli
Oct 17	ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton. NIPS 2012.	pdf	Eric
Oct 20	DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell. 2013.	arXiv	Takehiro
Oct 22	Visualizing and Understanding Convolutional Networks. Matthew D Zeiler, Rob Fergus. ECCV 2014.	pdf	Patsorn
Oct 24	Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. R. Girshick, J. Donahue, T. Darrell, J. Malik. CVPR 2014.	arXiv	Hasnain
Texture, Image Statistics
Oct 27	PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing. Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. Siggraph 2009.	project page	Chau
Oct 29	Image Melding: combining inconsistent images using patch-based synthesis. Soheil Darabi, Eli Shechtman, Connelly Barnes, Dan B Goldman, Pradeep Sen. Siggraph 2012.	project page	Johannes
Oct 31	Internal Statistics of a Single Natural Image. Maria Zontak and Michal Irani. CVPR 2011.	pdf, project page	Nate Burnell
Leveraging Image Databases
Nov 3	Photo Clip Art. Jean-François Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007).	project page	Guan
Nov 5	Sketch2Photo: Internet Image Montage. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu. ACM Transactions on Graphics (SIGGRAPH Asia 2009).	project page	Christine
Nov 7	PatchNet: A Patch-based Image Representation for Interactive Library-driven Image Editing. Shi-Min Hu, Fang-Lue Zhang, Miao Wang, Ralph R. Martin, Jue Wang. Siggraph Asia 2013.	project page	James
Nov 10	AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections. Jun-Yan Zhu, Yong Jae Lee, Alexei Efros. Siggraph 2014.	project page	Nate Bowditch
Nov 12	Project Status Updates.		Everyone
Nov 14	Project Status Updates.		Everyone
One last database paper
Nov 17	Microsoft COCO: Common Objects in Context. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. ECCV 2014.	project page, paper	James
Attribute-based Representations
Nov 19	Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth. CVPR 2009	project page	Michael
Nov 21	The SUN Attribute Database: Beyond Categories for Deeper Scene Understanding. Genevieve Patterson, Chen Xu, Hang Su, James Hays. IJCV 2014.	project page	James
Intrinsic Images: factorizing illumination and reflectance
Nov 24	Ground-truth dataset and baseline evaluations for intrinsic image algorithms. R. Grosse, M.K. Johnson, E.H. Adelson and W.T. Freeman. ICCV 2009	project page	James
Nov 26	No Classes
Nov 28	No Classes
Dec 1	Intrinsic Images in the Wild. Sean Bell, Kavita Bala, Noah Snavely. Siggraph 2014.	project page	Chris
Structure from Motion
Dec 3	Photo tourism: Exploring photo collections in 3D. Noah Snavely, Steven M. Seitz, Richard Szeliski. Siggraph 2006.	pdf, project page	Ammar
Dec 5	First Person Hyperlapse Videos. Johannes Kopf, Michael Cohen, Richard Szeliski. Siggraph 2014.	project page	Tim
Misc
Dec 8	Depixelizing Pixel Art. Johannes Kopf and Dani Lischinski. Siggraph 2011.	project page	Thomas
Dec 10	Transient Attributes for High-Level Understanding and Editing of Outdoor Scenes. Pierre-Yves Laffont, Zhile Ren, Xiaofeng Tao, Chao Qian, James Hays. Siggraph 2014.	project page	Philippe
Exam slot, Dec 17 2pm	Final Project Presentations		Everyone

Acknowledgements

Ideas for the organization and content of this course came from many other researchers such as Svetlana Lazebnik, Kristen Grauman, Antonio Torralba, Derek Hoiem, and Alexei Efros.

Related Graduate Seminars at other Universities (although these have more of a vision and learning emphasis)

Learning-Based Methods in Vision (Alexei Efros, CMU and Leonid Sigal, Disney Research Pittsburgh)
Object Recognition (Kristen Grauman, UT Austin)
Object Recognition and Scene Understanding (Antonio Torralba, MIT)
Machine Learning Techniques in Image Analysis (Svetlana Lazebnik, UNC)
Internet Vision (Tamara Berg, Stony Brook)
Visual Scene Understanding (Derek Hoiem, UIUC)