Class summary:   Introduction to Tables
1 Tabular Data
1.1 Summary of Table Operations So Far

Class summary: Introduction to Tables

Copyright (c) 2017 Kathi Fisler

Try a new problem today: managing a gradebook. I want to write programs to help me do various computations over students’ grades on exams (average scores on an exam, average scores across exams, improvement, histograms of scores, etc). Let’s start with two students:

Allie got 85 on exam1 and 90 on exam2 Carl got 75 on exam1 and 60 on exam2

Let’s say I want to have a function that lets me lookup the grade that a specific student got on a specific exam. How might we write that code with what we have now?

  fun lookup-grade(student :: String, asgn :: String) -> Number:

    doc: "Return grade of a given student on a given exam"

    if (student == "Allie"):

      if (asgn == "exam1"):

        85

      else if (asgn == "exam2"):

        90

      else:

        raise("No such assignment")

      end

    else if (student == "Carl"):

      if (asgn == "exam1"):

        75

      else if (asgn == "exam2"):

        60

      else:

        raise("No such assignment")

      end

    else: raise("No such student")

    end

  end

(the raise operator is how we have Pyret programs generate errors in the case of unexpected or invalid inputs.)

This code should bother us for several reasons:

Key idea in CS: As a general rule of thumb, we want ways to keep related data together. We also want to keep data separate from computations that process the data (so we can reuse the data).

As we discussed this setup, some students raised the idea of spreadsheets. Spreadsheets are good for grouping data into rows and columns – with a spreadsheet, we could keep all of a student’s data together in a row, while keeping all data about a particular assignment together in a column. If only we had a spreadsheet!

Actually, Pyret has a data type that gives us the structure of a spreadsheet (but without all the built-in formulas). It’s called a Table.

1 Tabular Data

Here’s our gradebook written in a Pyret table:

  include tables

  include shared-gdrive("cs111-2018.arr", "1XxbD-eg5BAYuufv6mLmEllyg28IR7HeX")

  

  gradebook = table: name, SNC, exam1, exam2

    row: "Allie", false, 85, 90

    row: "Carl",  false, 75, 60

    row: "Elan", true, 95, 63

    row: "Lavon", false, 87, 88

    row: "Nunu", true, 70, 0

  end

What computations might you want to do with this table?

  • compute course grades

  • get histogram of performance on each exam

  • look at delta from first exam to second exam

  • check whether SNC or letter-grade students did better on exam2

  • get names of students who did poorly on the first exam

  • etc

To do these analyses, need to be able to do operations on tables. What sort of operations do you need?

  • filter out some rows (to look at only low grades)

  • re-order the rows (to see high or low scores first)

  • perform computations based on particular some columns (i.e., SNC and exam2)

  • add a column with the overall course grade

Our first task was to learn how to do these computations in Pyret. Here are examples of these computations:

  #----------------------------------------------

  # order the rows by descending values on exam1

  # false means sort descending; true would sort ascending

  sort-by(gradebook, "exam1", false)

  

  #----------------------------------------------

  # keep only those rows in which the SNC column contains true

  # to do this, we need a function that takes a row and produces

  #  a boolean indicating whether to keep the row.

  #

  # the notation r["SNC"] extracts the SNC value from the given row.

  # you don't know what rows are yet, but all you need for now is the

  # ability to look up values in their columns.

  

  fun taking-snc(r :: Row) -> Boolean:

    doc: "get the value in the given row's SNC column"

    r["SNC"]

  end

  

  # filter-by produces a table with the rows from the given gradebook

  # for which the given function returns true.

  filter-by(gradebook, taking-snc)

  

  #----------------------------------------------

  # keep only those rows in which the SNC column contains false

  fun not-taking-snc(r :: Row) -> Boolean:

    not(r["SNC"])

  end

  

  filter-by(gradebook, not-taking-snc)

  

  #----------------------------------------------

  # keep those students whose grades dropped from exam1 to exam2

  # this shows how to do a computation over two values in a row

  

  fun exam2-lower(r :: Row) -> Boolean:

    r["exam1"] > r["exam2"]

  end

  

  filter-by(gradebook, exam2-lower)

  

  # extract the first row from a table

  gradebook.row-n(0)

  # this . notation means "dig into" the table.  It's what we'll use

  # when we have to pull info out of a table.

  

  # extract the exam1 value from the first row of the table

  gradebook.row-n(0)["exam1"]

  

Some notes:

1.1 Summary of Table Operations So Far

As a summary, here are the functions shown in these examples (you won’t find them in the Pyret documentation – you need to use our documentation of table operations instead:

  filter-by :: (t :: Table, (test :: Row->Boolean)) -> Table

  sort-by :: (t :: Table, col :: String, ascending :: Boolean) -> Table

the notation row["colname"] extracts the value stored in the named column in the given row

What is the key takeaway from this segment?

Key idea in CS: Once data are made up of smaller pieces of data, we want to organize the data to make it easier to maintain and process. Tables are good for data about multiple entities, each of which has the same attributes.