Sort and Filter
Upload the csv file spreadsheet 1-4 to your google drive and open it as a spreadsheet, using what you learned in the last lecture. This is a dataset of meteorites observed falling from space to Earth. The dataset is adapted from kaggle, which originally got its data from NASA.
Use logical statements to fill in in the general classification column to the right of the Geolocation.
- Stony meteorites have recclass "Chrondrite", "Achrondrite", or "Stone-uncl"
- Iron meteorites have recclass "Iron" and have mass over 10
The new column should label meteorites as "Stony" or "Iron" using
Sort the entire dataset by largest to smallest id number and write down the name and year of the meteor with the largest id.
We don't want to sort the header row, so we freeze it. To freeze just the top row of your spreadsheet, go to the View menu, then click on Freeze, then 1 row. (Forget where a command is? You can use the searchbox under the Help menu to find a command by its name.) Note that you don't have to highlight anything in particular to freeze a row.
Once the header is protected, you can sort the spreadsheet by any column by clicking on the arrow next to its letter. You can also sort by the column in which you've selected a cell using the Data menu.
Use sorting to find the mass and name of the three heaviest achrondrites. This requires sorting by multiple columns (one for reccclass and one for mass), which can be done by selecting the entire spreadsheet and using the "Sort range..." command. The order in which you choose to sort by the two columns will drastically change your results. Not sure which way to do it? Try both ways and note the difference. Why does one of them work better?
You can use the
SORT formula to copy over some fraction of the spreadsheet in a particular assortment. The
SORT formula lets you specify which range of the spreadsheet to copy over, which equally sized range to sort it by, and whether the sort starts from the smallest or largest. For example, if you wanted to copy over the years from smallest to largest, you would use the formula
SORT(E2:E, E2:E, true) because we want to sort the year column by itself in smallest to largest order. Create a new sheet of the spreadsheet and on it, provide the names of all of the meteorites sorted by mass from heaviest to lightest.
Turn on filtering by clicking "Filter" in the Data menu while you've selected a cell somewhere in your dataset.
Display all of the meteorites in a particular geographic region by latitude and longitude. This will require clicking on the arrows next to both latitude and longitude and filtering by condition.
Not sure where to start? Providence is 41.8240, -71.4128. Boston is 42.33196, -71.020173. So you could look for all meteors that fell in the 40 to 45 and -70 to -75 range, which approximately covers New England. To filter the meteors that fell in the proper New England latitude, click on the arrow next to "reclat" in column F, then "filter by condition," then "Is between" 40 and 45. Note that the "Is between" filtering includes both endpoints. Now you just need to add a filter for longitude. Feel free to instead look up a different region and find its meteorites, but write down the coordinates you chose.
Before you remove the filters, find the number and average mass of the meteorites in the region. Filters just hide part of the spreadsheet, so
AVERAGE will include some extra cells. Instead, use the SUBTOTAL function. The first argument of
SUBTOTAL is 1 for an average and 2 for a count.
Turn off filtering to show the entire spreadsheet again.
You can also use the
FILTER formula to copy over some fraction of the spreadsheet. On a separate sheet, use the
FILTER formula to provide all the meteorites that weigh less than 10 grams.
Once you're done, please check off your lab with a TA or share your file with firstname.lastname@example.org by midnight, 2/7.