Final Projects
For questions about final projects, please post to the CS242 Piazza discussion site or come to our office hours.
Final projects are worth 40% of your overall course grade: 5% for a 2-3 page project proposal due on November 11, 2016; 10% for a short oral presentation given on December 13, 2016; 25% for a final technical report due on December 20, 2016. You must identify your project topic and project team by October 26, 2016. Successful projects typically have the following key steps:
- Find an application whose structure seems promising for probabilistic graphical models. Ensure data is easily available.
- Identify a baseline application of statistical machine learning (not necessarily graphical models) to related data. Your work will be easier if you can find existing research papers, technical reports, or software packages with relevant results.
- Propose a few related graphical models with varying degrees of complexity. Usually these models are similar to ones discussed in lecture or the research literature, but some details should be novel.
- Implement learning and/or inference algorithms that are appropriate for your graphical model and data. You are very welcome to build on existing software packages, but should do more than simply run off-the-shelf code.
- Validate your approach via careful experiments. Application performance numbers are important, but you should also confirm the correctness of your learning algorithms, and inspect learned model parameters or structure.
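One common way to confirm the correctness of a learning algorithm, as the last step above asks, is to run it on synthetic data drawn from a model whose parameters you chose yourself, and check that the learned parameters come close to the true ones. The sketch below illustrates the idea on a deliberately simple model, a two-state Markov chain fit by counting transitions. The model, the function names, and the error measure are all illustrative assumptions, not a required protocol.

```python
import random


def sample_chain(transition, length, rng):
    """Sample a state sequence from a first-order Markov chain, starting in state 0."""
    states = [0]
    for _ in range(length - 1):
        row = transition[states[-1]]
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states


def estimate_transition(states, num_states):
    """Maximum-likelihood estimate of the transition matrix from transition counts."""
    counts = [[0] * num_states for _ in range(num_states)]
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row if a state was never visited.
        estimate.append([c / total if total else 1.0 / num_states for c in row])
    return estimate


def max_abs_error(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical ground-truth parameters, chosen only for this check.
true_transition = [[0.9, 0.1], [0.3, 0.7]]
rng = random.Random(0)
states = sample_chain(true_transition, 20000, rng)
learned = estimate_transition(states, 2)
print("max abs error:", max_abs_error(true_transition, learned))
```

For a real project the same pattern applies with your own model in place of the Markov chain: the error should shrink as the amount of synthetic data grows, and a failure to do so usually points to a bug in the learning code.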
Broadly, you should try to study combinations of graphical models, learning algorithms, and/or datasets which have not been previously explored. If you're interested in a different (perhaps more theoretical) style of project, talk to the instructor. Graduate students are of course encouraged to identify topics which support their thesis research.
Project Teams (October 26)
By Wednesday, October 26, you need to have identified the topic of your project and the members of your project team. Confirm this by e-mailing the following information to cs242tas@lists.cs.YOU-KNOW-WHERE.edu:
- A short description of the project topic and goals. Explicitly describe the connection to graphical models.
- An example of at least one dataset that is available to you and that you think could be useful for this project.
- The names of the members of your project team. One team member should send the message, cc'ing the others.
- We strongly encourage teams of two or three members. If you would like to do a solo project, include a description of your relevant previous experience with the proposed project topic. Typically students with sufficient background for solo projects have graduate research experience, but this is not a strict requirement.
The course staff will use this information to give you early feedback on project feasibility, and suggest related work.
Project Proposals (November 11)
The project proposal should be 2-3 pages long, including all figures and references. We encourage, but do not require, you to use the NIPS LaTeX style file (be sure to uncomment \nipsfinalcopy). Proposals must be submitted as a single pdf file, by e-mail to cs242tas@lists.cs.YOU-KNOW-WHERE.edu, by November 11. Your proposal should contain the following information:
- A clear description of the problem or application you intend to address. Why is it worth studying?
- A discussion of related work, including references to at least three relevant research articles or technical reports. Which aspects of your project are novel?
- A figure illustrating a graphical model which plays a role in your project. We recommend creating such figures in a vector drawing program, such as Adobe Illustrator, PowerPoint, Inkscape, or Xfig.
- A figure illustrating a preliminary experiment with some data related to your project. This could be some sort of visualization of the raw data, or the results of running a simple (supervised or unsupervised) machine learning method.
- A description of the learning and/or inference challenges that you need to solve to apply a graphical model to your data. It is fine if you do not yet know what algorithms are appropriate, but discuss the challenges that need to be solved.
- An experimental evaluation protocol. How will you know that you've succeeded?
- A concrete timeline for accomplishing your project by the end of the course. What are the biggest challenges?
Project Presentations (December 13)
Short project presentations will be given during the normal 2:30pm lecture time on Tuesday, December 13. Each project team will be allocated a total of 8 minutes, including questions, and thus should prepare roughly 6 minutes of material. Your presentation should include:
- Background material appropriate for an audience who knows the course material, but not your application area.
- A clear, high-level description of the graphical model(s) and learning/inference algorithm(s) used in your work.
- Visualization and analysis of your results. It is OK if you have not completed all experiments you intend to include in your project report, but there should be some partial results. Negative results (poor performance) are also OK, as long as you thoughtfully analyze why things aren't working as well as you'd hoped.
- Contributions from all team members. We are evaluating clarity of presentation as well as the quality of the work, and thus each team member should participate in the talk.
Project Reports (December 20)
The technical report should be between 6 and 10 pages long, in the style of top machine learning conferences. Although the results need not be sufficiently novel for publication, the presentation and experimental protocols should be of high quality. We encourage, but do not require, you to use the NIPS LaTeX style file (be sure to uncomment \nipsfinalcopy). Reports must be submitted as a single pdf file, by e-mail to cs242tas@lists.cs.YOU-KNOW-WHERE.edu, before 11:59pm on Tuesday, December 20, 2016. Your report should include:
- A clear description of the problem addressed, and a summary of related work with appropriate references. Include figures illustrating the graphical model(s) used in your project.
- A mathematically precise description of the statistical models and learning algorithms that you consider. For the parts of your project which are novel contributions, include derivations which are sufficiently detailed for knowledgeable experts to reproduce your work. Where you adapt previous work, include detailed references.
- To help verify that your statistical learning algorithm is working properly, at least one plot showing the learning objective (joint log-probability for an MCMC method, a log-likelihood bound for a variational method, etc.) as a function of the number of learning iterations.
- Some sort of visualization of the learned model structure; summary performance numbers are not sufficient. For example, for many graphical models it is possible to plot the learned clusters or features or states, sample from the posterior or predictive distributions, visualize results on low-dimensional toy data, etc.
- A description of implementation details, including references for any code that was adapted and reused, a high-level summary of the functionality that your code implements, the programming language(s) you used, etc.
- Mandatory: A description of how each team member contributed to the project.
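As an illustration of the learning-objective plot requested above, here is a minimal sketch that records the log-likelihood at every iteration of EM for a two-component, one-dimensional Gaussian mixture. The model, the function names, and the tolerance are assumptions made for this example only; your own project will track whatever objective matches your algorithm (for instance a variational bound). For exact EM the recorded values should never decrease, so a drop beyond numerical tolerance usually signals a bug.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def em_gmm_1d(data, num_iters=30, seed=0):
    """Run EM for a two-component 1D Gaussian mixture.

    Returns (weights, means, variances, history), where history[t] is the
    data log-likelihood under the parameters held at the start of iteration t.
    """
    rng = random.Random(seed)
    weights = [0.5, 0.5]
    means = rng.sample(data, 2)  # initialize means at two distinct data points
    variances = [1.0, 1.0]
    history = []
    for _ in range(num_iters):
        # E-step: responsibilities; the log-likelihood falls out as a by-product.
        resp = []
        total_ll = 0.0
        for x in data:
            log_joint = [
                math.log(weights[k]) + log_normal_pdf(x, means[k], variances[k])
                for k in range(2)
            ]
            log_marginal = logsumexp(log_joint)
            total_ll += log_marginal
            resp.append([math.exp(lj - log_marginal) for lj in log_joint])
        history.append(total_ll)
        # M-step: closed-form updates of weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            # Small floor guards against a collapsed component (an assumption
            # of this sketch; if it triggers, monotonicity is not guaranteed).
            variances[k] = max(var, 1e-6)
    return weights, means, variances, history


def is_nondecreasing(history, tol=1e-6):
    """Check that the objective never drops by more than tol between iterations."""
    return all(b >= a - tol for a, b in zip(history, history[1:]))


# Synthetic data from two well-separated clusters (illustrative values).
data_rng = random.Random(1)
data = [data_rng.gauss(-2.0, 1.0) for _ in range(100)]
data += [data_rng.gauss(3.0, 1.0) for _ in range(100)]
weights, means, variances, history = em_gmm_1d(data)
print("objective is non-decreasing:", is_nondecreasing(history))
```

The `history` list is what you would plot against the iteration number, using whichever plotting tool you prefer, to produce the figure for your report.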