CSCI2951-B Data-Driven Vision and Graphics

Fall 2010, MWF 10:00 to 10:50, CIT 345.
Instructor: James Hays

Course Description

Course Catalog Entry
This graduate seminar course investigates current research topics in image-based graphics and vision. We will examine data sources, features, and algorithms useful for understanding and manipulating visual data. We will pay special attention to methods that harness large-scale or Internet-derived data. Vision topics such as scene understanding and object detection will be linked to graphics applications such as photo editing and image-based rendering. These topics will be pursued through independent reading, class discussion and presentations, and state-of-the-art projects.

The goal of this course is to give students the background and skills necessary to perform research in image-based graphics and vision. Students should understand the strengths and weaknesses of current approaches to research problems and identify interesting open questions and future research directions. Students should also improve their critical reading and communication skills.

Course Requirements

Reading and Summaries

Students will be expected to read one or two papers for each class. For every assigned paper, students must write a two- or three-sentence summary and identify at least one question or topic of interest for possible in-class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries should be emailed to the instructor by 11:59pm the day before each class. Please put the course number, "2951", somewhere in the subject line. If you are presenting, you don't need to turn in a summary.

Class participation

All students are expected to take part in class discussions. If you do not fully understand a paper, that is OK: we can work through the unclear aspects together in class. If you are unable to attend a specific class, please let me know ahead of time (and have a good excuse!).

Presentation(s)

Depending on enrollment, students will present one or two papers (or groups of papers) throughout the semester. Students are expected to implement some aspects of the presented material and perform experiments that help understand the algorithms. Presentations and all supplemental material should be ready one week before the presentation date so that students can meet with the instructor and go over the presentation and possibly iterate before the in-class presentation. For the presentations it is fine to use slides or code from outside sources (for example, the paper authors) but be sure to give credit.

Semester projects

Students are expected to complete a state-of-the-art research project on topics relevant to the course. Students will propose a research topic part way through the semester. After a project topic is finalized, students will meet occasionally with the instructor to discuss progress. The course will end with final project presentations. Students will also produce a conference-formatted write-up of their project. The ideal project is something with a clear enough direction to be completed in a couple of months, and enough novelty such that it could be published in a peer-reviewed venue with some refinement and extension.

Prerequisites

Strong mathematical skills (linear algebra, calculus, probability and statistics) and previous imaging (graphics, vision, or computational photography) courses are needed. It is strongly recommended that students have taken one of the following courses (or equivalent courses at other institutions):

If you aren't sure whether you have the background needed for the course, you can try reading some of the papers below or you can simply come to class during the shopping period.

Textbook

We will not rely on a textbook, although the free, online textbook "Computer Vision: Algorithms and Applications" by Richard Szeliski is a helpful resource.

Grading

Your final grade will be made up of:

15% Classroom participation and attendance
20% Reading summaries
25% Research presentation(s)
40% Semester project

Helpful Links:

Matlab Tutorial

Office Hours:

James Hays, Monday and Wednesday 11:00-12:00

Schedule

Date	Paper	Paper, Project page, and Material	Presenter
W, Sept 1	Introduction		James
F, Sept 3	The state of vision and graphics		James
Image representations, mid-level vision, and their applications
M, Sept 6	No Classes
W, Sept 8	80 million tiny images: a large dataset for non-parametric object and scene recognition. A. Torralba, R. Fergus, W. T. Freeman. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30(11), 2008.	pdf, project page	James
F, Sept 10	Using Contours to Detect and Localize Junctions in Natural Images. Michael Maire, Pablo Arbelaez, Charless Fowlkes, and Jitendra Malik. Computer Vision and Pattern Recognition (CVPR), 2008.	pdf, project page	James
M, Sept 13	Object recognition from local scale-invariant features, David Lowe, ICCV 1999.	pdf, project page	James
optional reading	Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs. In Proceedings of IEEE Conference Computer Vision and Pattern Recognition, 2005.	.pdf
optional reading	Robust wide baseline stereo from maximally stable extremal regions. J. Matas, O. Chum, M. Urban, and T. Pajdla. Proceedings of the British Machine Vision Conference, 2002.	.pdf
optional reading	Learning local image descriptors. Simon Winder and Matthew Brown. CVPR 2007.	.pdf
W, Sept 15	Video Google: A Text Retrieval Approach to Object Matching in Videos. Sivic, J. and Zisserman, A. Proceedings of the International Conference on Computer Vision (2003)	pdf, project page	James
optional reading	Scalable Recognition with a Vocabulary Tree. David Nister and Henrik Stewenius. CVPR 2006	.pdf
optional reading	Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. Philbin, J. , Chum, O. , Isard, M. , Sivic, J. and Zisserman, A. CVPR 2008.	.pdf
F, Sept 17	Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006.	pdf, project page	David
M, Sept 20	Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3.	project page	James
W, Sept 22	Matching Local Self-Similarities across Images and Videos. Eli Shechtman and Michal Irani. IEEE Conference on Computer Vision and Pattern Recognition 2007 (CVPR'07)	project page	Silvia
Recognition
F, Sept 24	A Discriminatively Trained, Multiscale, Deformable Part Model. P. Felzenszwalb, D. McAllester, D. Ramanan. Computer Vision and Pattern Recognition (CVPR) 2008.	pdf, project page	Konstantin
M, Sept 27	An Empirical Study of Context in Object Detection. Santosh K. Divvala, Derek Hoiem, James H. Hays, Alexei A. Efros, Martial Hebert. Computer Vision and Pattern Recognition (CVPR) 2009.	project page	James
optional reading	Object Recognition by Scene Alignment. B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman. Advances in Neural Information Processing Systems (NIPS), 2007.	pdf
W, Sept 29	SUN Database: Large-scale Scene Recognition from Abbey to Zoo. J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. IEEE Conference on Computer Vision and Pattern Recognition (CVPR2010).	project page	James
F, Oct 1	It's All About the Data. Tamara L. Berg, Alexander Sorokin, Gang Wang, David A. Forsyth, Derek Hoiem, Ali Farhadi, Ian Endres. Proceedings of the IEEE, Special Issue on Internet Vision, August 2010, 98-8, 1434-1453.	.pdf by email	James
M, Oct 4	Utility data annotation with Amazon Mechanical Turk. Alexander Sorokin, David Forsyth. In the First IEEE Workshop on Internet Vision at CVPR 08	pdf, project page	Genevieve
W, Oct 6	ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009	pdf, project page	Sirion
F, Oct 8	What does classifying more than 10,000 image categories tell us? J. Deng, A. Berg, K. Li and L. Fei-Fei. Proceedings of the 12th European Conference of Computer Vision (ECCV). 2010.	pdf	Konstantin
Image Geolocation
M, Oct 11	No Classes
W, Oct 13	IM2GPS: estimating geographic information from a single image. James Hays and Alexei Efros. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2008.	project page	James
optional reading	Landmark classification in large-scale image collections. D. Crandall, Y. Li, and D. Huttenlocher. ICCV 2009.	pdf
F, Oct 15	Image Sequence Geolocation with Human Travel Priors. Evangelos Kalogerakis, Olga Vesselova, James Hays, Alexei A. Efros, Aaron Hertzmann. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2009.	project page	Donnie
Hallucinating Super-resolution
M, Oct 18	Example-based super-resolution. William T. Freeman, Thouis R. Jones, and Egon C. Pasztor. MERL Technical Report.	pdf	James
W, Oct 20	Context-Constrained Hallucination for Image Super-Resolution. J. Sun and M. F. Tappen. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2010).	pdf	Travis
F, Oct 22	Image Upsampling via Texture Hallucination. Y. HaCohen, R. Fattal, D. Lischinski. IEEE International Conference on Computational Photography (ICCP 2010).	project page	Geoff
LabelMe and Applications
M, Oct 25	LabelMe: a Database and Web-based Tool for Image Annotation. B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman. International Journal of Computer Vision, 2008.	pdf, project page	Sirion
W, Oct 27	Building a database of 3D scenes from user annotations. B. C. Russell and A. Torralba. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.	pdf, project page	George
F, Oct 29	Photo Clip Art. Jean-François Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007).	project page	Yun
Applications of Unlabeled Scene Matching
M, Nov 1	Creating and exploring a large photorealistic virtual space. J. Sivic, B. Kaneva, A. Torralba, S. Avidan and W. T. Freeman. First IEEE Workshop on Internet Vision, associated with CVPR 2008.	pdf	Seth
W, Nov 3	Image restoration using online photo collections. K. Dale, M.K. Johnson, K. Sunkavalli, W. Matusik and H. Pfister. International Conference on Computer Vision, 2009.	project page	Yun
F, Nov 5	CG2REAL. M.K. Johnson, K. Dale, S. Avidan, H. Pfister, W.T. Freeman and W. Matusik. IEEE Trans. on Visualization and Computer Graphics, to appear 2010.	project page	Donnie
M, Nov 8	Segmenting Scenes by Matching Image Composites. B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman. NIPS 2009	pdf	David
Saliency and Image Synthesis
W, Nov 10	Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009.	project page	Travis
optional reading	PhotoSketch: A sketch based image query and compositing system. Mathias Eitz, Kristian Hildebrand, Tamy Boubekeur, and Marc Alexa. ACM SIGGRAPH 2009 - Talk Program.	project page
F, Nov 12	Sketch2Photo: Internet Image Montage. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu. ACM Transactions on Graphics (SIGGRAPH Asia 2009).	project page	Michael
Social Vision
M, Nov 15	Autotagging Facebook: Social Network Context Improves Photo Annotation. Stone, Z.; Zickler, T.; Darrell, T. First IEEE Workshop on Internet Vision, (2008).	project page	Genevieve
W, Nov 17	Estimating Age, Gender and Identity using First Name Priors. A. Gallagher, T. Chen. IEEE Conference on Computer Vision and Pattern Recognition 2008.	project page	Seth
F, Nov 19	Understanding Images of Groups of People. A. Gallagher, T. Chen. IEEE Conference on Computer Vision and Pattern Recognition 2009.	project page	James
M, Nov 22	Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth. CVPR 2009	project page	James
W, Nov 24	No Classes
F, Nov 26	No Classes
M, Nov 29	Class canceled
Photo Tourism
W, Dec 1	Photo tourism: Exploring photo collections in 3D. Noah Snavely, Steven M. Seitz, Richard Szeliski. ACM Transactions on Graphics (SIGGRAPH Proceedings), 25(3), 2006.	pdf, project page	George
F, Dec 3	Scene Summarization for Online Image Collections. Ian Simon, Noah Snavely, and Steven M. Seitz. In ICCV, 2007.	pdf	Geoff
Video
M, Dec 6	LabelMe video: Building a Video Database with Human Annotations. J. Yuen, B. C. Russell, C. Liu, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009.	pdf, project page	Silvia
W, Dec 8	A data-driven approach for event prediction. Jenny Yuen, Antonio Torralba. European Conference on Computer Vision (ECCV), 2010.	pdf	Michael
M, Dec 13, 2pm	Final Project Presentations		Everybody

Acknowledgements

Ideas for the organization and content of this course came from many other researchers, including Svetlana Lazebnik, Kristin Grauman, Antonio Torralba, Derek Hoiem, and Alexei Efros.

Related Graduate Seminars at other Universities (although these have more of a vision and learning emphasis)

Object Recognition (Kristin Grauman, UT)
Object Recognition and Scene Understanding (Antonio Torralba, MIT)
Learning-Based Methods in Vision (Alexei Efros, CMU)
Machine Learning Techniques in Image Analysis (Svetlana Lazebnik, UNC)
Internet Vision (Tamara Berg, Stony Brook)
Visual Scene Understanding (Derek Hoiem, UIUC)