Activity 1-2

In this activity, you'll be finding the senators that agreed or disagreed the most with Senator Warren. You will use IMPORTXML to pull directly from the Senate's archived voting records and run computations on them. This is a similar exercise to your homework, so it will be very beneficial to walk through this carefully.

At the end of class, if you are done, please check this activity off with the TAs. Otherwise, please complete and share your spreadsheet with by the end of the day.

Task 1

Open up a new spreadsheet. Each senate vote we want to download will require a separate XML request, so we'll need to build a url for each request. In cells B1:K1, create a range from 1 - 10. You can do this by typing 1 in cell B1, then 2 in cell C1. Instead of continuing with 3, 4, etc, select both B1 and C1 and then fill until K1. This is the easiest way to create a range.

The first vote looks like

You'll notice the value 00001, this is the vote number. Vote 2 will need 00002. Vote 10 will need 00010. We can't just concatenate a text string of four zeros to the number in the row above because that wouldn't work for vote 10. We can use Text() to pad the vote number with 0s to the left of the value. In cells B2 to K2, convert the vote number to formatted text with the format string "00000". This should pad the value to a total width of 5 0s.

In cells B3 to K3 build the url like shown above, but with the appropriate formatted number string in the url. Remember, you can concatenate text with the & symbol.

Verify that your URL formula works, by copying and pasting on of the URLs you built to a web browser and checking that it returns the sort of XML data you are looking for.

Inspect the XML document returned. You'll find the data broken down for each senator. For our list of names, we want the name of the senator, state and party given in the "member_full" element.

Task 2

In cell A4, we will import the the names of the senators using IMPORTXML. You can use the URL in cell B3. Because we are interested in name contained in the "member_full" element, our xpath_query should look like "//member_full". The "//" means that we will look at all elements that match the pattern "member_full"

In cell B4, import the votes of the senators using the links from the row above. In a web browser, inspect the XML voting data and try to find an element that looks like it contains the correct voting information. Use this to create your xpath_query. If the data loads correctly, fill the rest of the columns to import the XML data.

This might take a few minutes. To avoid the spreadsheet recomputing these values many times, which can down our computations, we're going to copy and paste these values to a new sheet. There are actually many ways to copy and paste the information from one sheet to another. We are interested in copying the values only, so the spreadsheet doesn't continue to recompute the IMPORTXML command, which takes a long time.

Create a new sheet. Then select all the cells with information in Sheet1. Go to Edit->Copy, then click on the second sheet and click Edit->"Paste Special"->"Paste Values Only". If you just use command+c and command+v you'll repaste all the formulas and you will run 10 more instances of importxml. You'll notice that the keyboard shortcut for pasting values only is command+shift+v. This will likely come in handy in the future.

Task 3

Create a new sheet. In this sheet we will code each vote (1='Yea', -1='Nay', 0='Not Voting'). Copy the vote number from the previous sheet to the new sheet starting at cell B1. Copy the names of each senator to column A in the new sheet, starting at A2. Create a formula in cell B2 that codes the vote appropriately using a nested if statement. Remember IF takes three parameters, a logical expression, the output when the expression is true, and the output when it is false).

Task 4

Create a new sheet. In this sheet we will determine whether each senator agreed or disagreed with Senator Warren. Again copy and paste the vote numbers and senator names to the new sheet. Then in cell B2 compute whether the senator on that row agreed with Senator Warren for that vote. The result should be 1 if they both voted 'Yea' or both voted 'Nay', -1 if they disagreed and 0 if either abstained. Hint: You should be able to do this with a simple mathematical operation.

Task 5

Now we will compute the two pieces of our alignment score. The numerator is the number of agreements minus the number of disagreements and the denominator is total number of agreements and disagreements.

Because of the chosen coding scheme of agreements (1=agree, -1=disagree, 0=otherwise), we can easily compute the number of agreements - disagreements. The coded agreements are just 1s, -1s and 0s, so it might look like [1, 0, -1, 1, 1, -1]. The number of agreements is just the sum of the positive numbers, and the number of disagreements is the negative sum of the negative numbers. In this case, A=3 and D=2. Because, the disagreements were already negative, to compute the A - D numerator, we can just sum all the coded agreement values to get our final A - D value. Take a few minutes to convince yourself with a few examples that this is the case.

To compute the number of agreements + disagreements number, we can compute a sum similarly to the previous step, but convert all 1s and -1s to 1. All this requires is computing the absolute value of the coded agreement using the ABS function.

Create a new sheet and copy the column of senator names over to the new sheet, starting at A2. Type "A - D" in cell B1, "A + D" in cell C1 and "Alignment" in D1. For each senator, compute the sum of their coded agreements with Senator Warren into column B. Then repeat this again in column C, but compute the absolute value of each coded score before summing the column. For column D compute the final alignment score.

As a sanity check, Sen Warren should have a perfect alignment score. If that's not the case, or you notice other errors, check the results for each sheet and evaluate whether they look correct or not.

Task 6

Lastly, we'll conditionally format the cells to provide a clearer illustration where the senators that agree with Warren, vs those that don't.

You can do this by selecting the alignment column, then clicking the menu option Format -> Conditional Formatting. This will bring up the Conditional Formatting popup. In this popup, click the Color Scale tab. Under the preview option, select a two-color option, and then click Done.

Which senators does she align with the most and the least?

Once you're done, please check off your lab with a TA or share your file with by midnight, 1/31.