Homework Assignments

The specific schedule of topics and due dates below is tentative, and may change as the course progresses.

# | Due | Description | Problems | Materials
1 | 2/13 | Naive Bayes Spam Classification | questions | Enron email
2 | 2/16 | Gaussian Naive Bayes; ML & Bayesian Estimation | - | Gamma telescope; ROC Matlab script
3 | 2/23 | Handwritten Digit Classification | questions | MNIST digits
4 | 3/01 | Linear Regression | questions | Motorcycle impact; bases: poly, rbf
5 | 3/12 | Logistic Regression | questions; template A; template B | Classifier plotting; Gamma telescope; Toy data: A, B, C
6 | 4/05 | Regularization & Sparsity | questions | Dorothea drug discovery; LogisticLossSimple script
7 | 4/12 | Gaussian Processes & Laplace Approximations | - | Pedestrians: intensity, HOG
8 | 4/26 | K-Means Clustering & EM for Mixture Models | template A; template B | MNIST digits and Word features; Rand index script; Modified mixDiscreteInferLatent.m
9 | 5/03 | Collaborative Filtering via Factor Analysis & EM Algorithm | - | Movielens ratings; Sparse factor analysis EM script
10 | 5/10 | Hidden Markov Models; Topic Models & MCMC; Optional Extra Credit | template A; template B | Alice in Wonderland: Train, Test; Alice pre-processed: Train, Test; Text scripts: readText, textToNum, numToText; Topic data: Bars, Daily Kos; Topic scripts: multrand, visualizeBars

Homework assignments combine mathematical derivations with programming exercises in Matlab. If you have questions, come to our office hours or e-mail cs195fheadtas-at-cs. To hand in your solutions, which are due at noon before lecture, use the following procedure:

1. You will need a CS department computer account to submit your answers and access homework solutions. If this is your first CS course, e-mail cs195fheadtas-at-cs for help. To submit your answers from outside the CIT building, you will need to log in remotely via ssh, scp, or VPN. The Sunlab consultants can help with configuration.
2. Type your answers in LaTeX, numbered by question and part as in the assignment. Be sure to include your full name and your CS account username at the top of the first page. Compile your document into a pdf. LaTeX editing is often easiest with specialized editors such as Texmaker or Kile (available on department workstations). All of your answers should be in a single pdf: include result plots via the LaTeX figure environment, and Matlab source code in clearly labeled sections at the end of the pdf via the verbatim environment. Please be clear, and make it easy for the graders to check your work!
3. You may work on homework problems in groups, and discuss your work with each other. However, each student must program and write up their solutions independently. Include the names of any collaborators on the front page of your homework solutions. You may not directly copy solutions from other students, or from materials distributed in previous versions of this course.
4. Change (cd) to the directory containing your work. When you list its files (ls), the only file should be hw.pdf. Matlab source code should be included at the end of the pdf, as described above. The Matlab code doesn't need to be extensively documented, but it should be readably commented, and we may run it. You should not turn in any folders.
5. Execute /course/cs195f/bin/cs195f_handin hw?, replacing ? by the appropriate homework number. This command has been tested, but if it fails for any reason, e-mail your solutions to cs195fheadtas-at-cs with a full description of the problem, including any warning messages (and/or visit office hours).
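
As a rough illustration of the write-up format described in the procedure above, here is a minimal LaTeX skeleton. It is only a sketch, not an official course template: the name, username, section titles, the plot file roc_curve.pdf, and the sample Matlab lines are all placeholders. The plot file must exist in the same directory for the document to compile.

```latex
% Hypothetical skeleton for hw.pdf; every name and file below is a placeholder.
\documentclass[11pt]{article}
\usepackage{amsmath}   % math environments for derivations
\usepackage{graphicx}  % provides \includegraphics for result plots

\begin{document}

% Full name, CS account username, and collaborators at the top of page one.
\noindent Full Name (CS username: yourlogin) \\
Collaborators: none

\section*{Question 1}
\subsection*{Part (a)}
A derivation goes here, for example
\[
  \hat{\theta} = \arg\max_{\theta} \, p(x \mid \theta).
\]

\subsection*{Part (b)}
% Result plots are included via the figure environment.
\begin{figure}[h]
  \centering
  \includegraphics[width=0.6\textwidth]{roc_curve.pdf} % placeholder plot file
  \caption{Placeholder caption describing the plotted result.}
\end{figure}

% Matlab source goes in clearly labeled sections at the end, in verbatim.
\section*{Matlab Code: Question 1}
\begin{verbatim}
% placeholder: paste the Matlab source for Question 1 here
x = linspace(0, 1, 100);
plot(x, x.^2);
\end{verbatim}

\end{document}
```

Compiling this with pdflatex and naming the output hw.pdf would give a single file containing answers, plots, and code, which is the layout the hand-in steps expect.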

Late Submissions

Homework assignments are due at noon on the due date, and each is worth a maximum of 100 points. For every day or fraction of a day that an assignment is turned in late, 25 points will be subtracted from the earned score. When assigning final grades, the lowest homework score will be ignored. Exceptions to this policy are only given in unusual circumstances, and any extensions must be requested in advance by e-mail to the instructor.
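
One way to read the late penalty above as a formula is sketched below. The symbols are introduced here for illustration only: $s_{\text{earned}}$ is the score before any penalty and $d \ge 0$ is the lateness in days, measured from noon on the due date. The policy does not say whether a penalized score is floored at zero, so no floor is assumed.

```latex
% Sketch of the late penalty; symbols are illustrative, not course notation.
% Requires amsmath for \text.
\[
  s_{\text{final}} = s_{\text{earned}} - 25 \,\lceil d \rceil ,
  \qquad d \ge 0 \text{ days late.}
\]
% Example: handed in 30 hours late, so d = 1.25 and \lceil d \rceil = 2.
% An earned score of 90 then becomes 90 - 25 \cdot 2 = 40.
```

An on-time submission has $d = 0$ and loses nothing, while any lateness at all, even a few minutes, counts as a full day.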

Midterm and Final Exams

The midterm exam will be given on Tuesday, March 13 at the normal lecture time (1:00-2:20pm in CIT 227). The final exam will be given on Wednesday, May 16 from 2:00-5:00pm in Kassar House, Fox Auditorium.

These will be pencil-and-paper exams, with answers written in blue books distributed at the start of the exam. Computers and calculators will not be needed, and you are not allowed to use them. A formula sheet will be distributed with the exam, similar to this example from last year.

Graduate Credit Projects

Master's and doctoral students in the Brown computer science department can receive 2000-level graduate credit by completing an additional course project. This credit fulfills internal computer science degree requirements, but it is not recorded by Brown's registrar, and is probably not useful to students majoring in other fields.

For the project, you should apply material from (or closely related to) the course to a problem or dataset that you care about. Your experiments and analysis need not be sufficient for publication in a major conference, but should go beyond the typical homework problem. You should try to study combinations of statistical models, learning algorithms, datasets, and/or features which have not been previously explored.

A poor or incomplete project won't hurt your grade, but will mean you don't receive graduate credit. To successfully complete a project, you must fulfill the following requirements.

1. Prepare a short, 2-3 paragraph proposal describing the problem you will study, the models and learning algorithms you will experiment with, and why you think this application is promising. Identify a specific dataset (which you have access to) to be used in your experiments. Also include at least two references relevant to your project (either sections of books or research articles). Proposals should be submitted by e-mail to the instructor by 11:59pm on Thursday, April 5.
2. Prepare a short oral presentation summarizing the results of your project. Presentations will be given during the normal lecture time (1:00-2:20pm in CIT 227) on Thursday, May 10.
3. Prepare a 4-8 page technical report describing the results of your project, in the LaTeX style of your homework solutions. Your report should analyze the quantitative performance of the tested methods, using good practices taught in class (e.g., use of separate validation and test sets). You should also explain any interesting qualitative features of your experimental results, both good and bad. Reports should be submitted by e-mail to the instructor by 11:59pm on Friday, May 18.