CSCI2951-B Data-Driven Vision and Graphics
Spring 2012, MWF 11:00 to 11:50, CIT 506.
Instructor: James Hays

Course Description
Course Catalog Entry
This graduate seminar course investigates current research topics in image-based, data-driven graphics and vision. We will examine data sources, features, and algorithms useful for understanding and manipulating visual data. We will pay special attention to methods that harness large-scale or Internet-derived data. Vision topics such as scene understanding and object detection will be linked to graphics applications such as photo editing and image-based rendering. These topics will be pursued through independent reading, class discussion and presentations, and state-of-the-art projects.
The goal of this course is to give students the background and skills necessary to perform research in image-based graphics and vision. Students should understand the strengths and weaknesses of current approaches to research problems and identify interesting open questions and future research directions. Students should also improve their critical reading and communication skills.
Course Requirements
Reading and Summaries
Students will be expected to read one or two papers for each class. For every assigned paper, students must write a two- or three-sentence summary and identify at least one question or topic of interest for class discussion. Interesting topics for discussion could relate to strengths and weaknesses of the paper, possible future directions, connections to other research, uncertainty about the conclusions of the experiments, etc. Reading summaries should be emailed to the instructor by 11:59pm the day before each class. Please put the course number, "2951", somewhere in the subject line. If you are presenting, you don't need to turn in a summary.
Class participation
All students are expected to take part in class discussions. If you do not fully understand a paper, that is OK. We can work through the unclear aspects of a paper together in class. If you are unable to attend a specific class, please let me know ahead of time (and have a good excuse!).
Presentation(s)
Depending on enrollment, students will lead the discussion of one or two papers during the semester. Ideally, students would implement some aspect of the presented material and perform experiments that help understand the algorithms. Presentations and all supplemental material should be ready one week before the presentation date so that students can meet with the instructor, go over the presentation, and possibly iterate before the in-class discussion. For the presentations it is fine to use slides and code from outside sources (for example, the paper authors) but be sure to give credit.
Semester projects
Students are expected to complete a state-of-the-art research project on topics relevant to the course. Students will propose a research topic part way through the semester. After a project topic is finalized, students will meet occasionally with the instructor to discuss progress. The course will end with final project presentations. Students will also produce a conference-formatted write-up of their project. The ideal project is something with a clear enough direction to be completed in a couple of months, and enough novelty such that it could be published in a peer-reviewed venue with some refinement and extension.
Prerequisites
Strong mathematical skills (linear algebra, calculus, probability and statistics) and previous imaging (graphics, vision, or computational photography) courses are needed. It is strongly recommended that students have taken one of the following courses (or equivalent courses at other institutions):
- CSCI 1230, Introduction to Computer Graphics
- CSCI 1290, Computational Photography
- CSCI 1430, Introduction to Computer Vision
- CSCI 2240, Interactive Computer Graphics
- ENGN 1610, Image Understanding
Textbook
We will not rely on a textbook, although the free, online textbook "Computer Vision: Algorithms and Applications" by Richard Szeliski is a helpful resource.
Grading
Your final grade will be made up from:
- 20% Classroom participation and attendance
- 20% Reading summaries
- 20% Research presentation(s)
- 40% Semester project
Office Hours:
James Hays, Monday and Wednesday 1:00-2:00pm
Schedule
Date | Paper | Paper, Project page, and Material | Presenter |
W, Jan 25 | Introduction | | James |
F, Jan 27 | The state of vision and graphics | | James |
M, Jan 30 | 80 million tiny images: a large dataset for non-parametric object and scene recognition. A. Torralba, R. Fergus, W. T. Freeman. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30(11), 2008. | pdf, project page | James |
W, Feb 1 | Object recognition from local scale-invariant features, David Lowe, ICCV 1999. | pdf, project page | James |
optional reading | Histograms of Oriented Gradients for Human Detection. Navneet Dalal and Bill Triggs. In Proceedings of IEEE Conference Computer Vision and Pattern Recognition, 2005. | | |
F, Feb 3 | Video Google: A Text Retrieval Approach to Object Matching in Videos. Sivic, J. and Zisserman, A. Proceedings of the International Conference on Computer Vision (2003) | pdf, project page | James |
optional reading | Robust wide baseline stereo from maximally stable extremal regions. J. Matas, O. Chum, M. Urban, and T. Pajdla. Proceedings of the British Machine Vision Conference, 2002. | | |
M, Feb 6 | Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006. | pdf, project page | James |
W, Feb 8 | Scene Completion Using Millions of Photographs. James Hays, Alexei A. Efros. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3. | project page | James |
F, Feb 10 | Object Detection with Discriminatively Trained Part Based Models. P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, September 2010 | pdf, project page | James |
M, Feb 13 | Matching Local Self-Similarities across Images and Videos. Eli Shechtman and Michal Irani. IEEE Conference on Computer Vision and Pattern Recognition 2007 (CVPR'07) | project page | Jung Uk |
W, Feb 15 | SUN Database: Large-scale Scene Recognition from Abbey to Zoo. J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. IEEE Conference on Computer Vision and Pattern Recognition (CVPR2010). | project page | James |
F, Feb 17 | Local Intensity Order Pattern for Feature Description. Z. Wang, B. Fan and F. Wu. ICCV 2011. | project page | James |
M, Feb 20 | No Classes | | |
W, Feb 22 | Evaluating Image Features Using a Photorealistic Virtual World. B. Kaneva, A. Torralba and W. T. Freeman. ICCV 2011. | project page | Vazheh |
F, Feb 24 | It's All About the Data. Tamara L. Berg, Alexander Sorokin, Gang Wang, David A. Forsyth, Derek Hoiem, Ali Farhadi, Ian Endres. Proceedings of the IEEE, Special Issue on Internet Vision, August 2010, 98-8, 1434-1453. | IEEE Xplore link to pdf | James |
M, Feb 27 | ImageNet: A Large-Scale Hierarchical Image Database. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei. IEEE Computer Vision and Pattern Recognition (CVPR), 2009 | pdf, project page | Sungmin |
W, Feb 29 | Recognizing Jumbled Images: The Role of Local and Global Information in Image Classification. Devi Parikh. ICCV 2011 | | Vihang |
F, Mar 2 | Micro Perceptual Human Computation for Visual Tasks. Yotam Gingold, Ariel Shamir, Daniel Cohen-Or. ACM Transactions on Graphics (ToG) 2012 | project page | Hang |
M, Mar 5 | IM2GPS: estimating geographic information from a single image. James Hays and Alexei Efros. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2008. | project page | Vazheh |
W, Mar 7 | Image Sequence Geolocation with Human Travel Priors. Evangelos Kalogerakis, Olga Vesselova, James Hays, Alexei A. Efros, Aaron Hertzmann. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2009. | project page | Paul |
F, Mar 9 | Avoiding confusing features in place recognition. Jan Knopp, Josef Sivic, and Tomas Pajdla. ECCV 2010. | project page | Chen |
M, Mar 12 | Example-based super-resolution. William T. Freeman, Thouis R. Jones, and Egon C. Pasztor. MERL Technical Report. | | Andy |
W, Mar 14 | Image Upsampling via Texture Hallucination. Y. HaCohen, R. Fattal, D. Lischinski. IEEE International Conference on Computational Photography (ICCP 2010). | project page | Kefei |
F, Mar 16 | Detail Hallucination from Internet-scale Scene Matching. Libin "Geoffrey" Sun and James Hays. IEEE International Conference on Computational Photography (ICCP 2012). | | Geoff |
M, Mar 19 | LabelMe: a Database and Web-based Tool for Image Annotation. B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman. International Journal of Computer Vision, 2008. | pdf, project page | Kefei |
W, Mar 21 | Project Status Updates. | | Everyone |
F, Mar 23 | Project Status Updates. | | Everyone |
M, Mar 26 | No Classes | | |
W, Mar 28 | No Classes | | |
F, Mar 30 | No Classes | | |
M, Apr 2 | Building a database of 3D scenes from user annotations. B. C. Russell and A. Torralba. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009. | pdf, project page | Kilho |
W, Apr 4 | Photo Clip Art. Jean-François Lalonde, Derek Hoiem, Alexei A. Efros, Carsten Rother, John Winn and Antonio Criminisi. ACM Transactions on Graphics (SIGGRAPH 2007). | project page | Zhaoxin |
F, Apr 6 | Nonparametric Scene Parsing via Label Transfer. Ce Liu, Jenny Yuen, and Antonio Torralba. TPAMI 2011. | project page | Ryan |
M, Apr 9 | Image restoration using online photo collections. K. Dale, M.K. Johnson, K. Sunkavalli, W. Matusik and H. Pfister. International Conference on Computer Vision, 2009. | project page | Paul |
W, Apr 11 | CG2REAL. M.K. Johnson, K. Dale, S. Avidan, H. Pfister, W.T. Freeman and W. Matusik. IEEE Trans. on Visualization and Computer Graphics 2010. | project page | Chen |
F, Apr 13 | Learning to predict where humans look. T. Judd, K. Ehinger, F. Durand, and A. Torralba. IEEE International Conference on Computer Vision (ICCV), 2009. | project page | Jung Uk |
M, Apr 16 | Sketch2Photo: Internet Image Montage. ACM SIGGRAPH ASIA 2009, ACM Transactions on Graphics. Tao Chen, Ming-Ming Cheng, Ping Tan, Ariel Shamir, Shi-Min Hu. | project page | Tala |
W, Apr 18 | Describing Objects by Their Attributes. A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth. CVPR 2009 | project page | Fuyi |
F, Apr 20 | SUN Attribute Database: Discovering, Annotating, and Recognizing Scene Attributes. Genevieve Patterson and James Hays. CVPR 2012 | | Genevieve |
M, Apr 23 | Relative Attributes. Devi Parikh and Kristen Grauman. ICCV 2011. | project page | Kilho |
W, Apr 25 | Baby Talk: Understanding and Generating Simple Image Descriptions. Girish Kulkarni, Visruth Premraj, Sagnik Dhar, Siming Li, Yejin Choi, Alexander C. Berg, Tamara L. Berg. CVPR 2011 | | Hang |
F, Apr 27 | Im2Text: Describing Images Using 1 Million Captioned Photographs. Vicente Ordonez, Girish Kulkarni, Tamara L. Berg. NIPS 2011 | | Sungmin |
M, Apr 30 | Data-Driven Suggestions for Creativity Support in 3D Modeling. Siddhartha Chaudhuri and Vladlen Koltun. In ACM Transactions on Graphics 2010. | project page | Andy |
W, May 2 | Learning 3D Mesh Segmentation and Labeling. Evangelos Kalogerakis, Aaron Hertzmann, and Karan Singh. ACM Transactions on Graphics, 2010. | project page | Ryan |
F, May 4 | Color Compatibility From Large Datasets. Peter O'Donovan, Aseem Agarwala, and Aaron Hertzmann. ACM Transactions on Graphics, 2011. | project page | Zhaoxin |
F, May 18 2pm | Final Project Presentations | | Everybody |