Homework 3-2

Due December 1, 2015, 9:00 am


For the following problems you may discuss the concepts that will help solve these problems with classmates and course staff. You are never allowed to copy down the answers of your classmates, as that is a violation of the collaboration policy.


This is a somewhat long homework assignment (do not start too late), as it involves using three separate interfaces - Google's Geocoding API, Twitter's API, and Google Earth. This homework should be helpful to your final project as well.

Task 1 - Google Geocoding and Reverse Geocoding

Download getcoords.py. This file contains two functions that use Google's Geocoding API. We want you to play around with these functions, so open the program and run it.

  1. Test the getCoordinates function by calling it with the following three strings as input:
    1. 'Brown University'
    2. 'White House'
    3. 'Chipotle, Thayer St., Providence'.
  2. Do the same with printLocation with the following string inputs:
    1. '41.8236,-71.4222'
    2. '-15.7939,-47.8828'
    3. '51.5072,0.1275'.
  3. There's much more information included in the result from querying Google in printLocation. Edit the end of printLocation so that it also prints the current city. Repeat step 2 to test your changes. Hint: Another term for city is 'locality'.
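
As an illustration of the 'locality' hint, here is a minimal sketch of picking one component out of a geocoding result. It assumes the result has already been parsed from JSON into a Python dictionary with an 'address_components' list, which is how the Geocoding API's JSON output is structured. The sample data is hand-made, and the function name find_component is hypothetical; it is not part of getcoords.py, whose variable names may differ.

```python
# Hand-made sample shaped like one entry of the Geocoding API's "results" list.
# The values are illustrative only, not actual API output.
sample_result = {
    "formatted_address": "Providence, RI, USA",
    "address_components": [
        {"long_name": "Providence", "short_name": "Providence",
         "types": ["locality", "political"]},
        {"long_name": "Rhode Island", "short_name": "RI",
         "types": ["administrative_area_level_1", "political"]},
    ],
}


def find_component(result, component_type):
    """Return the long_name of the first component with the given type, or None."""
    for component in result.get("address_components", []):
        if component_type in component.get("types", []):
            return component.get("long_name")
    return None


print(find_component(sample_result, "locality"))
```

Returning None when no component matches is a deliberate choice here: some coordinates (for example, points in the ocean) may have no locality at all, so the caller should be ready for a missing value.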

Task 2 - Twitter

  1. Download twcol.py, fill in your authentication info (as seen in class), and run it. The execution should generate two files, one resulting from a term search and another resulting from a user timeline query.
  2. Play around with the inputs of getTweetsTimeline and getTweetsSearch at the bottom of the file (beneath the line that says MAIN PROGRAM). Check the contents of term.csv and user.csv after using different inputs. You should see different tweets show up in these files based on where you centered the searches.
  3. This is useful, but now we want to pull specific pieces of data from that file. Download twparse.py. It contains a function designed to find and print the latitude and longitude from each line (i.e., each tweet) in 'term.csv', although it's incomplete. Replace 'YOUR_RE_HERE' with an actual regular expression that will successfully find the latitude and longitude from EACH line of the CSV file. The lines of the CSV file are passed one at a time to the function. Your regular expression should contain:
    1. a literal opening parenthesis (Hint: escape with backslash to avoid special meaning)
    2. followed by a sequence of digits with a period somewhere on it, optionally preceded by a minus sign (Hint: you'll have to use '\.' to mean an actual period, because period has a special meaning in a regular expression, matching any character)
    3. followed by a slash
    4. followed by another sequence of digits with a period somewhere in it, optionally preceded by a minus sign. This is the same regular expression as in item 2
    5. followed by a literal closing parenthesis (Hint: escape with backslash to avoid special meaning)
    6. Enclose the sub-regular expressions from items 2 and 4 in parentheses (without a preceding backslash, preserving their special meaning), which will define sub-match groups. You are then able to capture them using match.group(1) and match.group(2).
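
The escaping and grouping ideas above can be sketched on a different, made-up format. The example below deliberately uses square brackets and a semicolon rather than the delimiters in term.csv, so it is not the answer to this task. The sample line and the names PAIR_RE and find_pair are hypothetical. It also assumes each number has at least one digit on both sides of the period.

```python
import re

# Hypothetical format "[number;number]", NOT the format used in term.csv.
#   \[ and \]       literal square brackets (escaped, since [ ] are special)
#   -?              an optional minus sign
#   \d+\.\d+        digits, a literal period, then more digits
#   ( ... )         unescaped parentheses define capture groups 1 and 2
PAIR_RE = re.compile(r'\[(-?\d+\.\d+);(-?\d+\.\d+)\]')


def find_pair(line):
    """Return the two captured numbers as floats, or None if nothing matches."""
    match = PAIR_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


print(find_pair('sample text,[12.5;-3.25],more text'))  # prints (12.5, -3.25)
```

Checking for None before calling match.group matters because re.search returns None on lines that do not contain the pattern, and calling group on None raises an error.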

Task 3 - Google Earth

Download kmlgen.py, open it, and take a look at the functions. You will be filling them in as described below.

  1. Look at the getPlacemark function. It takes in a name, description, longitude, and latitude. Change it to produce a string that represents a KML Placemark element that can be read by Google Earth. For more details, see the template variable in the getPlacemark function - your output string should be in this format, but with the appropriate pieces replaced with the function's inputs.
  2. Next take a look at the getKML function. This function takes a list of lists, where each inner list is of the form ['text tweet 1', lat, long]. Edit getKML to call getPlacemark using each inner list as input to get a KML Placemark corresponding to each tweet. Once you've done that, combine them all into a single string representing a KML file with multiple placemarks, and return that string.

    Take care - there is some extra content in a KML file beyond what getPlacemark produces, and you'll need to add these pieces to the string you return. Refer to CIT.kml.txt to figure out what else you'll need.

  3. Now you have a function that produces the text of a KML file, but you need to save it to your computer. Create another function that takes in an entire KML file's contents as a string. Have it write that string to a file with extension ".kml". If you call this new function with the output of getKML as input, you'll generate a file that maps the given points and can be opened in Google Earth!
  4. Using getKML and your new function, add a few lines at the bottom of your code to create a KML file from the array called test_input. If everything worked, then you should be able to open the resulting .kml file in Google Earth.
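
For orientation, here is a rough sketch of the overall flow: build a placemark string, wrap placemarks in a KML document, and write the text to a file. It is not the intended solution. The function names make_placemark, make_kml, and write_text_file are hypothetical, the sample row is made up, and the markup follows the general KML 2.2 layout rather than the template in kmlgen.py or CIT.kml.txt, which you should follow for your submission. It is written for Python 3.

```python
from xml.sax.saxutils import escape


def make_placemark(name, description, longitude, latitude):
    """Build one Placemark element; KML coordinates are longitude,latitude,altitude."""
    return (
        "<Placemark>"
        "<name>{0}</name>"
        "<description>{1}</description>"
        "<Point><coordinates>{2},{3},0</coordinates></Point>"
        "</Placemark>"
    ).format(escape(name), escape(description), longitude, latitude)


def make_kml(placemarks):
    """Wrap a list of Placemark strings in the XML header and kml/Document tags."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n" + "\n".join(placemarks) + "\n</Document>\n</kml>\n"
    )


def write_text_file(path, contents):
    """Write a string to the given path, replacing any existing file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(contents)


# Made-up row in the ['text', lat, long] order described above.
rows = [["hypothetical tweet <text>", 41.8268, -71.4025]]
placemarks = [make_placemark("Tweet", text, lon, lat) for text, lat, lon in rows]
kml_text = make_kml(placemarks)
print(kml_text)
```

Calling write_text_file("example.kml", kml_text) would then save the document. Two details are worth noticing: the input rows list latitude before longitude, while KML's coordinates element expects longitude first, and escape is used so that characters such as < or & in a tweet do not break the XML.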


Rename all four of your files FirstLast_original_file_name.py, place them all in a folder named FirstLast_HW3-2 and share that folder with .

Note: Before you turn in your Python files, make sure they run properly! (Save your Python file, then select Run > Run Module or press F5 on your keyboard.) If nothing appears in the Shell, don't worry, as long as no red error messages appear. If your files don't run, i.e., if red error messages appear in the Shell, points will be taken off!