Project 1


This is an independent assignment. You may discuss this assignment with course staff only.

Project Description

(Independent) Choose a problem related to the work we have done so far in the course. This is very open-ended, and you can explore one of two options:

  1. Perform the same sort of analyses we did for the Senate on a different dataset. This dataset may be political (e.g. the House), it may include a number of years/sessions, or it may be in a completely different domain. For instance, maybe you have a list of 300 kinds of mushrooms, each with yes/no descriptions of different properties. You'd like to cluster them into groups by their properties and see if there is any biological significance to the grouping.
  2. Perform a different analysis on the Senate dataset to answer a different political question. For example, “Blue Dog” Democrats are members of the Democratic Party from the southern states who identify themselves as moderate. Are Blue Dog Democrats more similar to other Democrats or to other Southerners? For a question like this, we expect you to build a project that, for example, lets us select any subset of senators and assess which of the other groups they are most similar to. In class, we looked at only a subset of votes to make the results easier to compute, but for the project we expect you to include a comprehensive set of votes to lend significance to your results.
For your project, you must present a testable hypothesis, carry out the required analyses, report your findings in a clear and understandable way, and discuss some of the trends you find.
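
To make the second option more concrete, here is a minimal sketch (in Python rather than a spreadsheet) of how a selected subset of senators could be compared against other groups. Everything in it is an assumption made for illustration: the senator names, the group names, the toy vote records, and the vote encoding (1 = yea, -1 = nay, 0 = absent). It is not the method used in class, and the same arithmetic can be done with spreadsheet formulas.

```python
from itertools import product

# Hypothetical toy data: each senator maps to votes on the same four bills.
# Assumed encoding: 1 = yea, -1 = nay, 0 = absent or abstained.
votes = {
    "A": [1, 1, -1, 1],
    "B": [1, 1, -1, -1],
    "C": [-1, -1, 1, 1],
    "D": [-1, 1, 1, 1],
    "E": [1, -1, -1, 1],
}


def similarity(u, v):
    """Sum of products: +1 per agreement, -1 per disagreement, 0 if either is absent."""
    return sum(a * b for a, b in zip(u, v))


def mean_similarity(group_a, group_b):
    """Average pairwise similarity between members of two disjoint groups."""
    pairs = list(product(group_a, group_b))
    return sum(similarity(votes[s], votes[t]) for s, t in pairs) / len(pairs)


# The subset we are curious about, and the groups we compare it against.
selected = ["E"]
groups = {"group_1": ["A", "B"], "group_2": ["C", "D"]}

scores = {name: mean_similarity(selected, members) for name, members in groups.items()}
closest = max(scores, key=scores.get)
print(scores, closest)
```

With real data, the number of votes and the size of each group both affect how much weight a difference in scores can bear, which is one reason a comprehensive set of votes matters.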

Handin 1: Project Proposal (Due Friday, February 17 at 11:59 pm)

Before you start working with spreadsheets, write a concise description of the project you would like to carry out. One to two pages is probably enough for this proposal, but the more detail you provide, the better feedback we can give. The description should cover everything in the proposal rubric, including the following parts:

  1. Background: a few sentences that put your project idea in context and state the overall goal of your project.
  2. Claim: the specific hypothesis you plan to test.
  3. Data: a clear description of the data you plan to use, including:
    • What the data represents.
    • The format of the data and how you are going to import it into your spreadsheet.
    • Where it is going to come from (include the url).
    • What it currently looks like. You can capture a screenshot by pressing PrtScr (for “Print Screen”) and pasting it into a document. If you have a Mac, press Cmd + Shift + 4 and click and drag the selection box around the area of your screen you want to capture (this will save your image to the desktop).
  4. Analysis Steps: a list of steps to carry out your analysis. Be concrete about what you plan to do.
    • The first step(s) should be about importing and formatting your data.
    • Describe the formulas you will use. For example, “I will write a formula that looks at a property of two mushroom species and reports a 1 if they match and a -1 if they don't match.”
    • Describe how you plan to present your results. Will you use conditional formatting, a chart, or other visualization? If none of these is appropriate, why?
    • Describe how the end product(s) of your analysis (e.g., the ranking of senators we produced in class) might support your claim, or show it is false.
  5. Potential Roadblocks: a list of potential hangups and needed resources. There are all kinds of roadblocks you might encounter; try hard to imagine what might trip you up. The following examples are to get you started thinking:
    • The data might have to be formatted in a special way, or you plan to import LOTS of data (e.g. from many years/sessions).
    • There's a calculation that might be tricky.
    • You think there might be a more interesting claim to test but you don't know how to test it.
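
As an illustration of the level of concreteness we are looking for in the analysis steps, the match formula from the mushroom example can be sketched as follows. This is only a sketch in Python under stated assumptions: the species names and their yes/no properties are invented, and in a spreadsheet the per-property comparison would be a formula along the lines of =IF(A2=B2, 1, -1).

```python
def match_score(a, b):
    """Report 1 if two property values match and -1 if they don't."""
    return 1 if a == b else -1


def species_similarity(props_a, props_b):
    """Add up the match scores across all properties of two species."""
    return sum(match_score(a, b) for a, b in zip(props_a, props_b))


# Hypothetical data: yes/no answers to the same four questions per species.
mushrooms = {
    "species_1": ["yes", "no", "yes", "yes"],
    "species_2": ["yes", "no", "no", "yes"],
    "species_3": ["no", "yes", "no", "no"],
}

# Rank every other species by its similarity to a chosen reference species.
reference = "species_1"
ranking = sorted(
    (
        (name, species_similarity(mushrooms[reference], props))
        for name, props in mushrooms.items()
        if name != reference
    ),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

A ranking like this is one possible end product; your proposal should say how such a result would support your claim or show it is false.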
Check the rubric for the proposal -- did you cover everything? Share a Google document named YourName_Proj1_Proposal with

Handin 2: Project (Due Friday, March 3 at 11:59 pm)

Carry out the project you proposed. It's OK if the project changes — that's why it was a proposal. Check the rubric for the project — did you cover everything? When you're finished, share a Google folder FirstLast_Proj1 with The folder should contain the following:

  1. Any spreadsheets and other files you used to produce your results.
  2. A link to a web page that includes the following (which can be copied from your proposal):
    • A description of the project.
    • A description of the methods you used to manipulate the data (with references to any intermediate spreadsheets that the course staff should look at).
    • The results of your analysis, displayed in a clear and informative manner.
    • A discussion of the trends you see in your analysis. You should point out both expected and unexpected results. For example, “I found that Lincoln Chafee is listed as the Republican who votes most similarly to Ted Kennedy. In 2007 he left the Republican Party and became an Independent.”
    • A reflection on the project. Did you have to change anything from the project you originally proposed? What problems did you run into? What would you have done differently?
  3. A Google doc named README that contains (1) the URL of your web page and (2) a list of all files contained in the folder with a short description of what they are.
  4. Finally, share your website with