CSCI 2951-I:
Computer Vision for Graphics and Interaction

Fall 2022, MW 15:00 to 16:20, CIT 101

Faculty StyleGAN interpolation video

Contact

Everything is through Slack.
James' office hour appointment slot signups are here (top left).

Course Description

How does computer vision enable new interactive graphical applications, and how can we improve them?

Computer vision strives to understand, interpret, and reconstruct information about the real world from image and video data. Computer graphics models dynamic virtual worlds to be synthesized in realistic or stylized ways. In visual computing, these fields are converging since both disciplines create and exploit models describing the visual appearance of objects and scenes. Interaction allows us to explore these worlds, and to use ourselves and our environments directly as interfaces. Machine learning and deep learning allows us to define mappings between domains across vision, graphics, and interaction, and to generate new data such as images from recombining existing databases. Combined, these disciplines enable applications from the seemingly simple, like semantic photo editing, to the seemingly science fiction, like mixed reality.

In this seminar course, we will discover the state of the art algorithmic contributions in computer vision which make these new applications possible. We will concentrate on recent research results that were published at top-tier conferences and journals from problem fields such as reconstruction of static and dynamic 3D scenes, computational photography and videography, multi-view camera systems, generative methods for image formation, and vision-based interaction devices. Each week, we will read state-of-the-art papers, present them, and discuss their contributions, impact, and limitations. Then, we will develop projects which implement and extend these ideas. Beyond computer vision, this course will help us learn how to quickly interpret and assess academic papers, and how to give effective and engaging presentations.

Please join us!

Yo dawg. We herd you like augmented reality, so we put AR on your AR so your pokemon can pokemon go while you pokemon go.

Learning Objectives

Upon completion of this course, students will have:

Practical experience reading academic papers, and the skill to digest them quickly;
Created effective presentations to explain state-of-the-art techniques, by learning how to critique and how to respond to critique;
Formed, discussed, and evaluated many project ideas, and gained experience creating structured research project proposals;
Developed practical research project skills and demonstrated these on an unsolved computer vision problem;
Familiarity with the state of the art in reconstructing and generating images computationally.

Course Structure

The course is spilt into two halves. In the first half, we will read research papers, present them, and discuss their contributions, impact, and limitations. We will build upon our analyses and discussions to propose projects which would further the state of the art. Then, in the second half, we will try and do it: we will break into teams and implement projects which further the state of the art.

First Half—Review, Ideation, and Proposal

We will read two papers per week, and think about their successes and limitations. For each paper, everyone will submit two+ questions plus a project ideas for discussion by noon on the day of the seminar. (3-6 hours per week)
In class, we will present the papers. Each student will present at least once, and depending on enrollment this could be individually or in groups. Before the presentation, each student must meet with James to go over the work. (5 hours prep; 30 minute presentation)
As a class and with a student discussion leader (assigned randomly on the day), we will critique the presentation and discuss its strengths and how it could have been improved. (10 minutes)
After the presentation, together we will discuss the papers in detail to understand them and to generate new ideas. Each student will act as the discussion moderator at least once. (30 minutes)
In groups, we will develop these ideas into project proposals to try to extend the state of the art. We will critique these proposals in class. (2 proposals; 8 hours per proposal; self-determined but cycling groups).

Second Half—Implementation

Teams and projects will finalize.
Teams will implement their research projects.
In class, we will work together to resolve problems and integrate ideas. We will review and critique progress at regular intervals.
Finally, we will present our projects to the rest of the visual computing group, and eat cake.

Grading

25% paper questions, contribution and improvement in discussion and critique.
25% presentation skill improvement and overall quality.
50% project effort and outcome.

Time Commitment

Tasks	Hours
In class	40
Paper reading	35
Paper presentation	5
Projects:
—Proposals	10
—Discussion	10
—Implementation	80
Total:	180

Capstone

This class can be taken as a capstone. Talk to James about the expected standard and additional work across the course.

Prerequisites

This is a graduate course, but undergraduates are welcome! As a graduate class, we expect you to be somewhat self-guided; be prepared to read beyond the course material, and to explore and discover for yourself. Students should know something about visual computing before taking this course, e.g., having taken an existing vision, graphics, or deep learning course. Any other interested students should get in touch with James!

CSCI 1230 Introduction to Computer Graphics: Should be OK for more experienced students! Knowing something about machine/deep learning will help.
CSCI 1290 Computational Photography: This is a hybrid graphics/vision course, so it should put you in good stead.
CSCI 1300 Interface Design: It will be tough, unless you know some more fundamental techniques in visual computing.
CSCI 1420 Machine Learning: Please expect to learn some graphics by yourself.
CSCI 1430 Computer Vision / ENGN 1610 Image Understanding: Some of the graphics concepts may be tougher; this isn't primarily a 'recognition' course.
CSCI 1490 Deep Learning: Many topics will touch on deep learning; please expect to learn about graphics.
CSCI 2240 Interactive Computer Graphics: Great! Please expect to learn about machine/deep learning by yourself.

Late Submissions and Late Days

Due to the form of the class, there are no late submissions or late days. We expect you to attend every session, but let James know if you have any special requirements. For sickness and other issues of wellbeing, please obtain a note from health services and we will accommodate.

Course Notes

Paper Reading

We expect you to read every paper in preparation for the upcoming presentation and discussion. Reading these papers may be difficult initially, and students are not expected to understand everything. However, students are expected to actively engage in discussions to further their understanding of the presented material, with the help of the instructor and the class, within a supportive and creative atmosphere. Ideas that are developed during the seminar discussion are intended to directly influence your projects.

Paper Presentations

One student presents the work each session and is the designated `primary reviewer'. Given that everyone will have already read the paper, the presentation should aim to critically review the work:

What is the research context for this paper? What connections exist to previous work we have read?
What is the research problem that this paper is trying to solve? Cut away the extraneous details and explain it in simple terms.
What is the contribution over exist works, and how significant is this contribution?
What was difficult to understand in their method? Any interesting nuance or tricks?
How is the work validated, and in what areas could this be improved?
What is good about the work?
What are the limitations of the work? How could it be improved?
What comes next?

Demos are welcome! Code or executables may be available for the techniques, and you should feel free to show them off. Likewise, for video material, but don't just play it without providing any insight.

Leading the Discussion

Each session, one student will be randomly selected as the discussion leader. They will receive a digest of the submitted questions before the seminar. Their goal is to briefly summarize the strengths and weaknesses of the technique, raise questions appropriately throughout the discussion, covers future work ideas, and keep order.

Reading References

How to Read and Present Academic Papers:

Reading Slides PDF 2MB
Keshav, How to Read a Paper, SIGCOMM Computer Communication Review (2007) DOI
Fong, Reading a Computer Science Research Paper, SIGCSE Bulletin (2009) DOI
McGuire, How to Read [Rendering] Research Papers, UWaterloo Advanced Ray Tracing Course (2019) PPTX
Presenting Slides PPTX 10MB / PDF 4MB
Fatahalian, Tips for Giving Clear Talks
McGuire, How to Present a Research Paper, UWaterloo Advanced Ray Tracing Course (2019) PPTX

Tentative Schedule

Date			Topic	Reading / Slides
First half—Review, Ideation, and Proposal
Wed	07	Sep	Intro
Mon	11	Sep	How to read, present, question	Approximating Reflectance Functions using Neural Networks, Gargan and Neelamkavil, Eurographics Workshop on Rendering Techniques 1998
Wed	13	Sep	Paper—overview	Neural Fields in Visual Computing and Beyond, Xie and Takikawa et al., Eurographics STAR 2022
Mon	19	Sep	Paper—basic application	Learning Continuous Image Representation with Local Implicit Image Function, Chen et al., CVPR 2021
Wed	21	Sep	Paper—basic application	DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, Park et al., CVPR 2019
Mon	26	Sep	Paper—architecture	Implicit Neural Representations with Periodic Activation Functions, Sitzmann and Martel et al., NeurIPS 2020
Wed	28	Sep	Paper—signals	BACON: Band-limited Coordinate Networks for Multiscale Scene Representation, Lindell et al., CVPR 2022
Mon	03	Oct	Paper—forward maps	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, Mildenhall, Srinivasan, Tanick et al., ECCV 2020
Wed	05	Oct	Paper—hybrid representations	Instant Neural Graphics Primitives with a Multiresolution Hash Encoding, Müller et al., SIGGRAPH 2022
Mon	10	Oct	Indiginous People's Day—no class
Wed	12	Oct	Trial proposal session
Mon	17	Oct	Paper—priors/conditioning	pixelNeRF: Neural Radiance Fields from One or Few Images, Yu et al., CVPR 2021
Wed	19	Oct	Paper—manipulation	Decomposing NeRF for Editing via Feature Field Distillation, Kobayashi et al., NeurIPS 2022
Mon	24	Oct	Paper—motion	VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution, Chen et al., CVPR 2022
Wed	26	Oct	Paper—material/lighting	NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination, Zhang et al., SIGGRAPH Asia 2021
Mon	31	Oct	Project market	Halloween (bonus points for costumes)
Wed	02	Nov	Final project proposal session
Second half—Implementation
Mon	07	Nov	Paper—generative images	DreamFusion: Text-to-3D using 2D Diffusion, Poole et al., arXiv 2022; Zero-Shot Text-Guided Object Generation with Dream Fields, Jain et al., CVPR 2022
Wed	09	Nov	Paper—generative shape	Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing, Yang and Bao et al., ECCV 2022
Mon	14	Nov	Paper—digital humans	SMPLicit: Topology-aware Generative Model for Clothed People, Corona et al., CVPR 2021; PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization, Saito et al., ICCV 2019
Wed	16	Nov	Studio session
Mon	21	Nov	Studio session
Wed	23	Nov	Thanksgiving—no class
Mon	28	Nov	Studio session
Wed	30	Nov	Project review
Mon	05	Dec	Studio session
Wed	07	Dec	Studio session
Fri	09	Dec	New England Computer Vision Workshop @ MIT	Class field trip
Fri	16	Dec	Final presentations

General Policy

Welcome!

Our intent is that this course provide a welcoming environment for all students who satisfy the prerequisites. Our TAs have undergone training in diversity and inclusion, and all members of the CS community, including faculty and staff, are expected to treat one another in a professional manner. If you feel you have not been treated in a professional manner by any of the course staff, please contact any of James (the instructor), John Hughes (Dept. Chair), Tom Doeppner (Director of Undergraduate Studies), David Laidlaw (Director of Graduate Studies), or Laura Dobler (diversity and inclusion staff member). We will take all complaints about unprofessional behavior seriously. Your suggestions are encouraged and appreciated. Please let James know of ways to improve the effectiveness of the course for you personally, or for other students or student groups. To access student support services and resources, and to learn more about diversity and inclusion in CS, please visit http://cs.brown.edu/about/diversity/resources/.

Prof. Krishnamurthi has good notes on this area.

Quiet Hours

This class runs quiet hours from 9pm to 9am every day. Please do not expect a response from us via any channel. Likewise, we won't ask you to do anything between these times, either, like hand in projects.

Academic Integrity, Collaboration, and Citation

Feel free to talk to your friends about the concepts in the projects, and work through the ideas behind problems together, but be sure to always write your own code and perform your own write up. You are expected to implement the core components of each project on your own, but the extra credit opportunties often build on third party data sets or code. Feel free to include results built on other software, as long as you credit correctly in your handin and clearly demark your own work. In general, if you use an idea, text, or code from elsewhere, then cite it.

Brown-wide, academic dishonesty is not tolerated. This includes cheating, lying about course matters, plagiarism, or helping others commit a violation. Plagiarism includes reproducing the words of others without both the use of quotation marks and citation. Students are reminded of the obligations and expectations associated with the Brown Academic and Student Conduct Codes.

Accommodations

Brown University is committed to full inclusion of all students. Please inform me if you have a disability or other condition that might require accommodations or modification of any of these course procedures. You may email me, come to office hours, or speak with me after class, and your confidentiality is respected. We will do whatever we can to support accommodations recommended by SEAS. For more information contact Student and Employee Accessibility Services (SEAS) at 401-863-9588 or . Students in need of short-term academic advice or support can contact one of the deans in the Dean of the College office.

Mental Health

Being a student can be very stressful. If you feel you are under too much pressure or there are psychological issues that are keeping you from performing well at Brown, we encourage you to contact Brown's Counseling and Psychological Services. They provide confidential counseling and can provide notes supporting extensions on assignments for health reasons.

Incomplete Policy

We expect everyone to complete the course on time. However, we certainly understand that there may be factors beyond your control, such as health problems and family crises, that prevent you from finishing the course on time. If you feel you cannot complete the course on time, please discuss with James Tompkin the possibility of being given a grade of Incomplete for the course and setting a schedule for completing the course in the upcoming year.

Electronic Etiquette

Laptops are discouraged, please, except for class-relevant activities, e.g., to help answer questions and show items relevant to discussion. No social media, email, etc., because it distracts not just you but other students as well. Read Shirky on this issue ("Why I Just Asked My Students to Put Their Laptops Away"), or Rockmore ("The Case for Banning Laptops in the Classroom").

We will release course lecture material online. In considering laptop use for note taking, please be aware that research has shown note taking on paper to be more efficient than on a laptop keyboard (Mueller and Oppenheimer), as it pushes you to summarize the content instead of transcribe it.

Acknowledgements

Portions of this seminar design are from Stefanie Tellex's CSCI 2951-R course, from Christian Theobalt and his CVfCG course @ Max-Planck-Institute for Informatics, with special thanks to Christian Richardt. Thanks also to James Hays and CSCI2951-T Data-Driven Computer Vision course @ Brown University, with special thanks to Genevieve Patterson.

Thanks to Karras et al. (StyleGAN, CVPR 2019) and to Dmitry Nikitko, whose software I used to make the Brown CS faculty bust teaser.

Thanks to Tom Doeppner and Laura Dobler for the text on accommodation, mental health, and incomplete policy.

Thank you to the previous students who helped to improve this class. Previous course runs:

2019 Fall

2018 Spring

2017 Fall

CSCI 2951-I: Computer Vision for Graphics and Interaction