Class summary:   Introduction to Tables
1 Tabular Data
2 Table Operations:   Keeping and Ordering
3 Practice:   Keeping and Ordering
4 Adding Columns to Tables
5 Transforming Tables
6 Practice:   Extending and Transforming

Class summary: Introduction to Tables

Copyright (c) 2017 Kathi Fisler

This material goes with intro to tabular data from the textbook

The text uses a check construct that we have not covered yet – ignore the check statement for a day or two.

1 Tabular Data

Start with a gradebook table:

  gradebook = table: name, SNC, exam1, exam2

    row: "Alina", false, 85, 90

    row: "Carl",  false, 75, 60

    row: "Elan", true, 95, 63

    row: "Lavon", false, 87, 88

    row: "Nunu", true, 70, 92

  end

What computations might you want to do with this table?

  • compute course grades

  • get histogram of performance on each exam

  • look at delta from first exam to second exam

  • check whether SNC or letter-grade students did better on exam2

  • get names of students who did poorly on the first exam

  • etc

To do these analyses, need to be able to do operations on tables. What sort of operations do you need?

  • filter out some rows (to look at only low grades)

  • re-order the rows (to see high or low scores first)

  • filter out some columns (to focus on SNC and exam2)

  • add a column with the overall course grade

Going to learn how to do these in Pyret. Let’s focus on the first three, then come back to the fourth a bit later.

2 Table Operations: Keeping and Ordering

  # order the rows by descending values on exam1

  order gradebook: exam1 descending end

  # keep only those rows in which the SNC column contains true

  sieve gradebook using SNC:

    SNC

  end

  # keep only those rows in which the SNC column contains false

  sieve gradebook using SNC:

     not(SNC)

  end

  # keep only the exam2 column

  select exam2 from gradebook end

Notice that these operations return a table but leave the original gradebook table unchanged (this may look unusual to those with prior programming experience).

Each of these operations has a specific form (syntax) to follow (the all-caps words are where we customize the operation to a specific computation):

  order TABLE: column CRITERION end

  

  sieve TABLE using COLUMN:

     CRITERION

  end

  

  sieve TABLE using COLUMN, COLUMN, ...:

     CRITERION

  end

  

  select COLUMN from TABLE end

Programming Tip: When you first learn to program, getting all the punctuation and keywords in the right place can get frustrating. Be patient. Once you’ve written these kinds of expressions multiple times, your brain will start to internalize the patterns. You don’t get to that point just by reading code though – you have to write or type expressions using the operators. If you are finding this frustrating, write extra examples using our example tables to help train your brain.

3 Practice: Keeping and Ordering

What if we wanted a table of students taking the course SNC, sorted from high to low grades on exam2? First off, what table do we expect to get?

  table: name, SNC, exam1, exam2

    row: "Nunu", true, 70, 92

    row: "Elan", true, 95, 63

  end

Programming Tip: Always think out a couple of concrete examples of what you want your program to produce BEFORE you write the program. This helps make sure that the question/goal is clear up front. Otherwise, if the answer "looks wrong", you won’t be sure whether the problem is in your code or your understanding of the question.

How would we achieve this in code?

  # table of students taking the course SNC

  snc-only = sieve gradebook using SNC:

    SNC

  end

  

  # order SNC table by decreasing exam2

  order snc-only:

    exam2 descending

  end

What if we wanted a table of those students who got at least 85 on both exams? What table do we expect to get?

  table: name, SNC, exam1, exam2

    row: "Alina", false, 85, 90

    row: "Lavon", false, 87, 88

  end

How would we achieve this in code?

  # create table of students with at least 85 on both exams

  both-over-85 = sieve gradebook using exam1, exam2:

    (exam1 >= 85) and (exam2 >= 85)

  end

Note that can use expressions of any complexity as the sieve criterion, but that expression can only use columns named in the using clause.

Load the weather table and practice writing the expressions mentioned in the file.

The intensity of peppers are rated using the Scoville scale, (which essentially measures the ratio of pepper extract to sugar water needed to cancel the heat of the pepper). We want to be able to look up the name of a pepper based on its scoville rating. Bell peppers have a rating of 0. Paprika ranges from 100-1000 units, jalapeno from 3,500-10,000 units, cayenne from 30,000-50,000, and habanero from 100,000-350,000.

Write this program two ways: once using if-statements and once using tables. What are the strengths and weaknesses of each approach?

Now that you’ve practiced how the operators should get used, let’s try finding mistakes in incorrect programs. Load the broken menu programs and work on correcting the errors. Try summarizing what the error was at a high level, to help you internalize what went wrong.

4 Adding Columns to Tables

Return to the gradebook table. To help in computing course grades, we’d like to add a column that averages the two exam scores.

  grades-with-average = extend gradebook using exam1, exam2:

    exam-avg: (exam1 + exam2) / 2

  end

You don’t have to save the resulting table under a name (like grades-with-average), but doing so makes it easier to use the table in another computation later.

Go back to the weather table. Add a column named heat-alert that contains a boolean. The column should contain true when the temperature was at least 100 and there was no rainfall and false otherwise.

5 Transforming Tables

Sometimes, we simply want to alter the contents of cells, but without adding new columns. The transform operator takes care of this. For example, we might want to round up all of the exam averages to the nearest whole number.

  transform grades-with-average using exam-avg:

    exam-avg: num-round(exam-avg)

  end

If you ask to see grades-with-average in the interactions window now, will it show rounded grades or unrounded ones? Why? How do we get to see the table with the rounded averages again?

Did we have to extend and transform in two steps? Could we have rounded the average in the first place, at the time that we extended the table with the exam-avg column?

We could have done it in one step, by writing:

  grades-with-average = extend gradebook using exam1, exam2:

    exam-avg: num-round((exam1 + exam2) / 2)

  end

The difference between the approaches is that one gives you access to the table without the rounded data – which approach you take depends on whether you need that data. Alternatively, you may have only realized later that you wanted the numbers rounded; using transform lets you respond to changes that had to get made after the column was created.

Key idea in CS: In both programming and computer science in general, we often deal with changes in the problem we were trying to solve. This is a theme we will come back to again and again in this course.

6 Practice: Extending and Transforming

Transform the gradebook so that all students are taking the course for a grade, rather than as SNC.

Extend the gradebook with a column that records the highest exam grade that each student received.

Oops! There was a grading error on the first exam. Everyone should get 5 additional points on exam1. Write an expression to do that on the table that also records the highest grade.

Look at the table you produced – is the table correct? If not, where is there an error? If it is, where might there have been an error that you should watch out for?

Extend the table with a column that assigns a letter grade to each students. Students taking the course SNC get "pass" with an exam average of at least 65, and "NC" otherwise. Students taking the course for a grade get an "A" with an exam average of at least 87, a "B" with an exam average from 73 to 86, "C" with an exam average from 65 to 72, and "NC" otherwise.