Save Link As...
or Save Target As...
or something like that (this depends on your web browser). Give it a name that includes the url information and make sure it has .xml
at the end..xml
file to a .csv
file for us. Don't worry if you don't know yet know what this means.File > Help
.Do the sections that pertain to your operating system.
Start
button (lower left). In the Search Programs and Files
box, type cmd and click on the first item (it will be either cmd
or cmd.exe
). This should open up a window that has some text ending in your name followed by a greater-than sign. It looks something like this:
C:\Users\jmiles>
cd
(for "change directory") command. Type
cd Desktop
C:\Users\jmiles\Desktop>
Applications/Utilities
and open the application called Terminal
(if you're on a lab machine, the path is Cluster/Applications/Utilities
). This will open up a window that has some text ending in your name followed by a dollar sign. It looks something like this:
Macintosh-102:~ Jadrian$
cd
(for "change directory") command. Type
cd Desktop
Macintosh-102:Desktop Jadrian$
java -jar xml2csv-conv.jar
java
tells the computer to run a program written in Java, -jar
tells the computer it is a JAR file (as opposed to some other format), and xml2csv-conv.jar
is the name of the program you ran.
When you ran the above command, you got some kind of error along with a bunch of instructions on how to run the program. This is all useful information, but for now we just want to somehow give this program an XML file and have it produce a CSV file.
If you did not get instructions on how to run the program, perhaps you do not have the Java programming language installed on your computer. If you want to run this program on your computer at a later date, meet with a TA and they will set you up. For now, do the rest of this task with your neighbor.
myfile.xml
and you want a CSV file called myfile.csv
. At the command prompt, you would type:
java -jar xml2csv-conv.jar myfile.xml myfile.csv
Data
tab. Go to Get External Data > From Other Sources > From XML Data Import
. Use this to open the XML file that you just downloaded. If you get a warning, click OK
. When asked where to put the data, make sure the upper left cell of your worksheet is highlighted, and click OK
again. Import
and select CSV file
. Navigate to the CSV file you made (the one with the correct number of rows!). Select Delimited
, then click Next
. Since this is a CSV file, we only want Comma
to be selected for delimiters; click Next
. Then click Finish
. When asked where to put the data, make sure the upper left cell of your worksheet is highlighted, and click OK
. View > Normal
to see the table all at once.Save
congress112_allvotes.xlsx
onto the Desktop (saving it allows you to edit it, rather than viewing it immediately). If the file got saved as a .zip
file, change the extension to .xlsx
. (This problem arises when you use Internet Explorer; if you use Firefox, it should not occur. To change the file's extension, right-click on the file and select Rename
.) Open congress112_allvotes.xlsx
in Excel (this may take a few minutes).session
vote_number
vote_question_text
member_full
vote_cast
vote_question_text
column. Some of the cells in this column contain the words "On Passage of the Bill..." We want to keep only the rows with similar text in the vote_question_text
column. Hint: this should only take a few minutes; it does not involve deleting the non-bill-vote rows one by one.vote_id
column in cell F1
. Excel automatically infers that this is a new part of the table. In row 2 of the vote_id
column, write a formula to combine the “session” for this row with the “vote_number” for this row, placing a colon between them. Use Fill down
to apply this formula to all the other rows. (Excel may do this automatically because you're working on a table; whether it does so or not depends on how it's been set up.) The table should look like this:
A | B | C | D | E | F | |
---|---|---|---|---|---|---|
1 | session | vote_number | vote_question_text | member_full | vote_cast | vote_id | 2 | 1 | 19 | On Passage... | Akaka (D-HI) | Yea | 1:19 | 3 | 1 | 19 | On Passage... | Alexander (R-TN) | Yea | 1:19 | 4 | ... | ... | ... | ... | ... | ... |
vote_id
, and title it numerical_vote
. In the second row of that column, enter a formula that has a 0
if the senator did not vote, a 1
if s/he voted “Nay,” and a 2
if s/he voted “Yea.” If s/he has some other vote than these three, your cell should contain the word “ERROR”. Verify that no cell contains the word “ERROR.” Hint: Try writing an expression that uses multiple IF
functions "nested" within each other. The table should now look like this:
A | B | C | D | E | F | G | |
---|---|---|---|---|---|---|---|
1 | session | vote_number | vote_question_text | member_full | vote_cast | vote_id | numerical_vote | 2 | 1 | 19 | On Passage... | Akaka (D-HI) | Yea | 1:19 | 2 | 3 | 1 | 19 | On Passage... | Alexander (R-TN) | Yea | 1:19 | 2 | 4 | ... | ... | ... | ... | ... | ... | ... |
Insert
tab, and insert a pivot table, using the suggested Table 1
(that's what Excel automatically named our vote data). For Mac, click on Data > PivotTable Report...
and select the entire table. Click OK
. Either way, insert the pivot table into a new worksheet. If you don't remember how to use pivot tables, review the tutorial.member_full
, the column labels be vote_id
, and the value in the cells be the numerical_vote
that you just created. Remember that you can drag-and-drop the fields from the top right side list to one of the bottom right side areas. For example, try dragging the member_full
field down into the "Row Labels" box. Value Field Settings
; choose Max
from the list that appears, and click OK
. For Mac, click PivotTable > Field Settings
, select Max
, and click OK
.Grand Total
row and column. For Windows, select Option > PivotTable > Options > Option
,select the Totals & Filters
tab, and uncheck the Show grand totals
checkbox. For Mac, select PivotTable > Table Options
and uncheck the Grand totals
checkboxes.