cs0931handinfall2015@gmail.com
.
This is a long assignment, but that's mostly due to the amount that you have to read. For each problem, there's not too much you have to do, so don't be put off by it.
In class, we've talked about measuring a similarity or distance between a pair of senators. That distance is abstract. It represents how far apart the two senators are in the "space" of voting records. Now we'll consider a more concrete example in order to think about these distances.
Bad Sheet
tab. It shows the September prices for two items (Bread and Rice) and how many of each item we bought, and then the cost of the items, which is computed as D3*B3 + D4*B4
, i.e., the number of breads times the price of bread, plus the number of rices times the price of rice. That total is shown in cell B6
.
Down below, we've got both September prices (copied from those above) and October prices, and a (correct) formula for the increase in price from September to October, which takes the total October cost, D3*C9 + D4*C10
and subtracts the total September cost, D3*B9 + D4*B10
. Notice that we're using the costs from rows 9 and 10.
Change the price of bread in cell B3
from $2.35 to $2.50. You'll see the September cost go up. But the cell labelled "Increase" (D9) won't change at all. Look at the formulas and explain why in cell E12
.
Good Sheet
. On this one, things are a little clearer: the only places the person looking at this sheet should change anything are highlighted in yellow (with a note at the top explaining this). The September prices in rows 9 and 10 are not copied from rows 3 and 4, but rather are defined by formulas that refer to rows 3 and 4: when the price of bread in B3 is changed, the price in B9 automatically updates. Note, too, that all dollar amounts have been formatted to display nicely. (That was true of the "Bad Sheet" tab as well.)
The cells in this sheet can be divided into four classes:
BadBetterGood
. On this one, the first two columns import Senator names as we did in class. The key formula, in cell B1
, is =importXML("http://www.senate.gov/legislative/LIS/roll_call_votes/vote1132/vote_113_2_00008.xml", "//members/member/first_name")
You can see that the legislative session, the congress number, and the vote number are all in the url, and the numbers "113" and "2" appear twice. Someone wanting to see a list of senators for the first session of the 110th congress would have to do some careful editing of the URL.
Note that in this case, the senator names are "data", but there don't seem to be any parameters. To the degree that there are, they're hidden inside the importXML
formula: the URL and the XPath.
Columns D and E do a better job. Once again the data you can change is highlighted. There are several well-named entries (like "Which congress"), and even the item to extract has been separated out into its own field.
There's a problem, though: the list of senators is missing.
If you look at the "Goal" entry, I've written down the thing you should be trying to produce in the URL entry (for the 113th congress, second session, anyhow). The formula I've used for the URL doesn't produce that, as you can see.
Modify the formula in E7
(you'll need several more ampersands and several more strings!) to make it exactly reproduce the "goal" URL. When you do, the Query results should appear.
Change the "Which session:" to "1". Does the right thing happen? What about if you change the congress to "110"?
The formula for the query is =importXML(E7,E5)
. That makes sense: E7
contains the URL, E5
contains the XPath. But we can do better.
=importXML(url,xpath)
H7
, went to
Data > Named Ranges
and named it "url"; I did the same thing with cell H5
, naming it "xpath". After doing that, I'm allowed to refer to those cells by name rather than by letter-number coordinates. It makes the formula a good deal easier to read.
Change things so that instead of the member names being listed lower down in column H, their votes are listed. Was it easy?
Finally, we want to check in and make sure you feel you're becoming comfortable with the ideas needed to do a successful spreadsheet project of your own.
Task3
tab and list two or more sources of data you might be interested in exploring in your project. Describe in your own words each data set in a few words and provide a link. If you can't find sources yet, that is okay: instead, briefly describe two or more kinds of data you might be interested in exploring (e.g., votes, education, some kind of survey data). Write your answers in B1
B3
, please rate how you feel the class is going so far:
B5
.Share the following files with cs0931handinfall2015@gmail.com
FirstLast_HW1-3
In the above file name, "FirstLast" should be replaced with your first and last name. If it is not, we will take off points. Make sure every task has been completed.