Project 2

Various Deadlines


This is an independent assignment. You may discuss this assignment with course staff only.

Project Description

(Independent) This project may take one of two forms, which we'll call the "hypothesis" and "computation" forms.

Pose a computational question based on textual data, or describe a computational activity you'll perform on textual data. You may use your own data source, or choose from the data sources we discussed in class:

  1. Project Gutenberg:
  2. Dictionary:
  3. American Presidency Project Debates:

For your project you must either

  1. Present a testable hypothesis, carry out the required analyses, report your findings in a clear and understandable way, and discuss your results (just as in project 1) or
  2. Describe a computation you'll perform, including the form that the results will take, demonstrate that your computations do in fact work as claimed, and discuss the results on some data.

To present your results, you may report descriptive summary statistics (such as count, mean, median, standard deviation, etc.). You will also be able to import your results into Google spreadsheets or Excel to analyze basic trends (via color formatting and plotting). However, you will not be graded on any spreadsheet work beyond presenting your results in a clear manner. This applies to both the "hypothesis" and "computation" forms of the project.
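
As a rough illustration of the summary statistics mentioned above, the sketch below computes count, mean, median, and standard deviation with Python's built-in statistics module and writes them to a CSV file that a spreadsheet can open. The function names, the file name summary.csv, and the sample numbers are all made up for this example; your own project will have different data and may need different statistics.

```python
import csv
import statistics


def summarize(values):
    """Return count, mean, median, and sample standard deviation of a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # stdev() needs at least two values, so fall back to 0.0 for one value
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def summary_rows(summaries):
    """Turn a {label: summary} dictionary into rows, starting with a header row."""
    rows = [["label", "count", "mean", "median", "stdev"]]
    for label, s in summaries.items():
        rows.append([label, s["count"], s["mean"], s["median"], s["stdev"]])
    return rows


def write_summary_csv(filename, summaries):
    """Write the rows to a CSV file that Excel or Google Sheets can import."""
    with open(filename, "w", newline="") as f:
        csv.writer(f).writerows(summary_rows(summaries))


if __name__ == "__main__":
    # Hypothetical data: words per paragraph in two made-up texts.
    words_per_paragraph = {"text_a": [120, 95, 143, 110], "text_b": [80, 102, 77]}
    summaries = {label: summarize(v) for label, v in words_per_paragraph.items()}
    write_summary_csv("summary.csv", summaries)
```

Keeping the calculation (summarize) separate from the file writing makes the calculation easy to test on small lists where you already know the answer.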

Project Themes

Below are several themes for projects that you are free to build upon in your project. You do not need to limit yourself to these ideas.

Regardless of your hypothesis or computation, you must write new Python functions for your analysis. You can use code from the homeworks and in-class activities, but you cannot rely solely on functions that were provided in class, like compareTwo(); you must add new functions and/or modify existing ones. What kind of new code could you add?

Project Proposal (Due Friday, October 30th at 11:59pm)

Project Description (Google document called FirstLast_Proj2_Proposal)

Write a concise (one-page) description of the project you would like to execute. This description should include the following parts, but double-check the rubric for full details:

  1. Background: put your project idea in context.
  2. Claim: the specific hypothesis you plan to test (which is a statement, not a question), or the computation you plan to carry out.
  3. Data: a brief description of your data source.
  4. Programming Elements: a few sentences describing the problems you will need to write Python functions for.
  5. Potential Roadblocks: a list of potential obstacles.
  6. Backup Plan: Suppose your project is much harder than you anticipate. What parts of the project would you change to still get somewhat interesting results?

Skeleton Code (

Write a Python file that contains an outline of the code you anticipate writing. This file should compile (that is, run without syntax errors)! It should include the following:

  1. Comments at the top of the file describing what the program does.
  2. Functions that you will write (of course, you might change this later).
  3. Function descriptions (in triple quotes) that describe (1) what the function does, (2) what the inputs are, and (3) what the outputs are.
  4. Some lines of code and comments that help describe what the functions will contain.

Don't get too wrapped up in the details here — the goal of the skeleton code is to provide you with an outline of what you have to program.
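
To give a sense of scale, here is one possible skeleton for an imagined project that compares how often a word appears in two texts. The file name, function names, and project idea are invented for illustration only. Each function has a docstring and a placeholder return value, so the file runs even though nothing is implemented yet.

```python
# wordfreq_skeleton.py (hypothetical file name)
# Compares how often a chosen word appears in two text files
# and prints both counts.


def read_words(filename):
    """
    Reads a text file and splits it into lowercase words.
    Input: filename (str), the path to a text file.
    Output: a list of strings, one per word.
    """
    # open the file, read its contents, lowercase them, split on whitespace
    return []  # placeholder so the file runs


def count_word(words, target):
    """
    Counts how many times target appears in words.
    Inputs: words (list of str), target (str).
    Output: an int.
    """
    # loop over words and add 1 each time a word equals target
    return 0  # placeholder so the file runs


def main():
    """
    Runs the whole comparison.
    Inputs: none. Output: none (prints the results).
    """
    # call read_words() on each of the two files
    # call count_word() on each list of words
    # print the two counts side by side
    pass


if __name__ == "__main__":
    main()
```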


Create a Google folder called FirstLast_Proj2_Proposal. Replace FirstLast with your actual first and last name or we will take off points. Make sure the folder contains both your skeleton code and the proposal document.

Share the folder with .

Project (Due Sunday, November 15 at 11:59pm)

Carry out the project you proposed. It's OK if the project changes — that's why it was a proposal.

Python Program

After filling in your skeleton code, you are almost done. However, to make this code usable for others, you must do a few more things.

  1. Provide instructions on how to run your program (in comments).
  2. Provide at least one test function and/or test file that verifies that your code does what it should do. Include instructions in the comments.
    • If you have a tricky function that has a regular expression, write a test file and show that the function returns the proper result.
    • If you have a function that counts occurrences of words, write a test file and show that the function returns the proper result.
  3. Handle data and input errors and notify the user.
    • One way to notify the user is to print a string (such as "Error! Input should be an integer, not a float") and return nothing.
    • Remember the type() function. The following expressions all evaluate to True:
      type('a') is str
      type([1,2,3]) is list
      type(2.5) is float
    • Suppose you are using data where you know that each line should be split into exactly 13 elements. To skip any lines that do not have 13 elements, you could write:
      if len(myList) != 13:
        print("Skipping line with != 13 elements", myList)
        continue  # inside a loop over lines, this moves on to the next line
      # continue with code...
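
The points above can be combined in a short sketch: a word-counting function that checks its input types with type(), prints an error and returns nothing when they are wrong, and a test function that verifies the results on small inputs with known answers. The names count_word and test_count_word are hypothetical and not part of any provided course code; treat this as one possible approach rather than a required structure.

```python
def count_word(words, target):
    """
    Counts how many times target appears in words.
    Inputs: words (list of str), target (str).
    Output: an int, or None if an input has the wrong type.
    """
    if type(words) is not list:
        print("Error! words should be a list, not a", type(words).__name__)
        return None
    if type(target) is not str:
        print("Error! target should be a string, not a", type(target).__name__)
        return None
    count = 0
    for word in words:
        if word == target:
            count += 1
    return count


def test_count_word():
    """Checks count_word() on small inputs where the answer is known."""
    assert count_word(["a", "b", "a"], "a") == 2
    assert count_word([], "a") == 0
    # wrong input types should print an error and return None
    assert count_word("not a list", "a") is None
    assert count_word(["a"], 5) is None
    print("All count_word tests passed")


# To run the test: run this file, or call test_count_word() in the shell.
test_count_word()
```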


You will create a website that presents your analysis and results. It should contain the following things:

  1. Project description and hypothesis.
  2. Concise explanation of your methods.
  3. Your results, presented in a clear and informative manner.
  4. Discussion of the results of your analysis or computation. You should point out expected and unexpected results.
  5. Reflection on the project. What went well? What didn't?
  6. Python and data/test files available for download.

Refer to the Project 2 Rubric for more details on the code and website requirements.

DON'T FORGET to change the permissions on your website so that we are able to view it. You may make the site public or restrict it to only people with a Brown email address if you like, but we must be able to access it in some way.


Create a Google Folder named FirstLast_Proj2. Please make sure to replace FirstLast with your first and last name or we will take off points. It should contain the following:

  1. All files you used in your project, including Python files, Excel files/Google Spreadsheets, data files, and test files.
  2. A Google document named 'README' that contains (1) the URL of your web page and (2) a list of all files contained in the folder, each with a short description of what it is.

Share the folder with .