Class summary: Introduction to Lists
1 Looking up values by keys
2 Lists: Two Motivating Problems
3 Lists: a new kind of data for sets
4 What are Lists?
5 Extracting Lists from Tables
6 Operations on Lists
7 Categorizing Pizza Toppings
7.1 Operations: Filter, Member, Distinct
7.2 Operations Recap
7.3 Map
7.4 An Aside: Tables versus Lists
8 Combining Map and Filter
9 Summary of List Operations

Copyright (c) 2017 Kathi Fisler

1 Looking up values by keys

We want a function that takes the name of a person and returns the number of tickets they have ordered:

  fun tickets-for(t :: Table, who :: String) -> Number:
    doc: "Extract tickcount value for order with given name"
    ...
  where:
    tickets-for(event-data-clean, "Alvina") is 3
    tickets-for(event-data-clean, "Ernie") is 0
  end

We filled in the body as follows:

  fun tickets-for(t :: Table, who :: String) -> Number:
    doc: "Extract tickcount value for order with given name"
    matches = filter-by(t, lam(r): r["name"] == who end)
    matches.row-n(0)["tickcount"]
  where:
    tickets-for(event-data-clean, "Alvina") is 3
    tickets-for(event-data-clean, "Ernie") is 0
  end

What happens if we try the following?

  tickets-for(event-data-clean, "Kathi")

Our current code assumes that filter-by will return a non-empty table. We should instead check that we got a non-empty table, and raise an error if we did not:

  fun tickets-for(t :: Table, who :: String) -> Number:
    doc: "Extract tickcount value for order with given name"
    matches = filter-by(t, lam(r): r["name"] == who end)
    if matches.length() > 0:
      matches.row-n(0)["tickcount"]
    else:
      raise("Tickets-for: table has no row with name " + who)
    end
  where:
    tickets-for(event-data-clean, "Alvina") is 3
    tickets-for(event-data-clean, "Ernie") is 0
    tickets-for(event-data-clean, "Kathi") raises "no row"
  end

The where clause shows how to check whether a call to the function raises an error: instead of writing is in the example, we write raises. The string after raises must be a substring of the raised error message for the test to pass.
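
As a minimal sketch of the substring matching that raises performs, here is a small function that is separate from the ticket example. The name safe-divide and its error message are made up for illustration.

```pyret
fun safe-divide(n :: Number, d :: Number) -> Number:
  doc: "Divide n by d, raising an error when d is zero (hypothetical example)"
  if d == 0:
    raise("safe-divide: denominator is zero")
  else:
    n / d
  end
where:
  safe-divide(6, 3) is 2
  # passes because "denominator" appears inside the raised message
  safe-divide(1, 0) raises "denominator"
end
```

The test would fail if the string after raises did not occur anywhere in the message, so it helps to pick a fragment that is unlikely to change.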

2 Lists: Two Motivating Problems

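Consider the following two questions:

  1. Which orders used a discount code that is not valid?
  2. Who ordered tickets with the "STUDENT" discount?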

We have an idea of how to write the first one – a filter-by with a helper function that uses or to check the code against a collection of options:

  fun check-discounts1(t :: Table) -> Table:
    doc: "keep only the rows whose discount code is not valid"
    fun invalid-code(r :: Row) -> Boolean:
      not(
        (r["discount"] == "STUDENT") or
        (r["discount"] == "BIRTHDAY") or
        (r["discount"] == "") or
        (r["discount"] == "EARLYBIRD"))
    end
    filter-by(t, invalid-code)
  end

There’s something unsatisfying about this solution, though: every time the set of codes changes, we have to change the function. It would be much nicer if the codes could be written independently of the function. Then, the sales department could change the codes without having to bother the programmers every time.

So the real question is how can we rewrite this function so that the set of valid codes is written down outside the function?

3 Lists: a new kind of data for sets

  valid-discounts = [list: "STUDENT", "BIRTHDAY", "", "EARLYBIRD"]

  fun check-discounts(t :: Table) -> Table:
    doc: "keep only the rows whose discount code is not valid"
    fun invalid-code(r :: Row) -> Boolean:
      not(L.member(valid-discounts, r["discount"]))
    end
    filter-by(t, invalid-code)
  where:
    check-discounts(event-data)
      is
    add-row(
      add-row(
        add-row(event-data.empty(), event-data.row-n(3)),
        event-data.row-n(4)),
      event-data.row-n(6))
  end

Here is a version written with anonymous functions/lambda.

  fun check-discounts2(t :: Table) -> Table:
    doc: "keep only the rows whose discount code is not valid"
    filter-by(t, lam(r): not(L.member(valid-discounts, r["discount"])) end)
  where:
    check-discounts2(event-data)
      is
    add-row(
      add-row(
        add-row(event-data.empty(), event-data.row-n(3)),
        event-data.row-n(4)),
      event-data.row-n(6))
  end
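
To see the membership check on its own, without a table around it, the following sketch tests single discount codes against the list. The helper name is-invalid and the code "BOGUS" are assumptions made for this example; L is assumed to be the lists library imported under that name.

```pyret
import lists as L

valid-discounts = [list: "STUDENT", "BIRTHDAY", "", "EARLYBIRD"]

fun is-invalid(code :: String) -> Boolean:
  doc: "true when code is not one of the valid discount codes"
  not(L.member(valid-discounts, code))
where:
  is-invalid("STUDENT") is false
  # the empty string (no discount) is listed as valid
  is-invalid("") is false
  # string comparison is case-sensitive, so this code is not found
  is-invalid("student") is true
  is-invalid("BOGUS") is true
end
```

Changing the set of codes now means editing only the valid-discounts line; the function body stays the same.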

4 What are Lists?

Lists are one of the key data structures in programming. They feature:

As we will see, there are many built-in operations on lists.

5 Extracting Lists from Tables

Turning to the second question, how could we get a list of names of people with the "STUDENT" discount? (Perhaps we want to validate those names against data from a school.)

We know how to filter the table down to only those rows that have "STUDENT" in the discount column. How do we get the names from those rows? We use a table operator called get-column that pulls out the values from a column as a list:

  filter-by(
     event-data-clean,
     lam(r): r["discount"] == "STUDENT" end).get-column("name")

Alternatively, using an intermediate name for the filtered table:

  rows =
    filter-by(
       event-data-clean,
       lam(r): r["discount"] == "STUDENT" end)
  rows.get-column("name")

We’ll do a lot more with lists as we go forward.
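
The following sketch runs the same idea on a tiny table so the resulting list is visible. The table small-orders and its rows are invented stand-ins for event-data-clean, and the filtering step uses Pyret's built-in sieve expression instead of the filter-by function used in these notes, so that the example does not depend on the course library.

```pyret
# a tiny stand-in for event-data-clean; these rows are invented
small-orders = table: name, discount
  row: "Alvina", "STUDENT"
  row: "Ernie", ""
  row: "Zander", "STUDENT"
end

# keep only the rows whose discount column is "STUDENT"
students = sieve small-orders using discount:
  discount == "STUDENT"
end

# pull the name column out of the filtered table as a list
student-names = students.get-column("name")

check:
  student-names is [list: "Alvina", "Zander"]
end
```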

6 Operations on Lists

So far, we’ve had an introduction to lists, a way to group together a collection of items (such as a collection of names, grades, dates, images, etc). We saw how to create lists by hand (using [list: ...]) and how to extract a list from the column of a table (using .get-column(colname)).

Next, we cover some of the (many) operations on lists. There’s a full list of the operations in the Pyret lists documentation; we’ll look at just a handful of them today.

We’ll step away from tables and work with lists on their own for now.

7 Categorizing Pizza Toppings

Imagine that you are running a pizzeria and need to track different categories of pizza toppings. Let’s do that by setting up the following lists:

  meats = [list: "sausage", "pepperoni", "chicken", "shrimp"]
  veggies = [list: "spinach", "peppers", "onion"]
  unusual = [list: "egg", "pickle"]
  premium = [list: "pickle", "shrimp"]

What do we notice about lists from these examples? Lists can have any number of items. The items within a list are separated by commas.

7.1 Operations: Filter, Member, Distinct

The staff at your office have to vote on which toppings to get as part of the weekly pizza lunch. You have a list of all the votes that people have cast.

  topping-votes =
    [list: "peppers", "pepperoni", "onion", "onion", "onion"]

Here are various expressions that show the list operations of distinct, member, and filter. We introduced L.member and L.distinct in the last lecture. L.filter is analogous to the filter-by operation on tables: filter takes a function that determines whether to keep elements from the list in the output list.

  # Which different veggies were ordered?
  unique-veggies =
    L.distinct(
      L.filter(lam(t): L.member(veggies, t) end, topping-votes))

  # What toppings to include on a vegetarian pizza? Leave off the meats
  veg-friendly =
    L.filter(lam(t): not(L.member(meats, t)) end,
      topping-votes)

If you weren’t sure how to start on something like "which different veggies were ordered", you can start by writing out the tasks:

  1. Create a function that determines whether a string is in the veggie list
  2. Keep only the veggies from the topping-votes list
  3. Remove duplicates from the list of veggies

Each of these tasks is a separate expression in the code: the lam(t): L.member ... is the function, L.filter extracts the veggies, and L.distinct removes the duplicates.
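
The three tasks can also be written as three named steps, which makes each intermediate list easy to inspect. This is only a sketch of the same computation; the names is-veggie and veggie-votes are introduced here for illustration.

```pyret
import lists as L

veggies = [list: "spinach", "peppers", "onion"]
topping-votes = [list: "peppers", "pepperoni", "onion", "onion", "onion"]

# Task 1: a function that determines whether a string is in the veggie list
fun is-veggie(t :: String) -> Boolean:
  L.member(veggies, t)
end

# Task 2: keep only the votes that are veggies
veggie-votes = L.filter(is-veggie, topping-votes)

# Task 3: remove duplicates
unique-veggies = L.distinct(veggie-votes)

check:
  veggie-votes is [list: "peppers", "onion", "onion", "onion"]
  unique-veggies is [list: "peppers", "onion"]
end
```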

7.2 Operations Recap

What operations do we have so far?

  Operation  | Type                           | Notes
  L.member   | List, item -> Boolean          | Indicates whether item is in the list
  L.distinct | List -> List                   | Returns the unique values from input list
  L.filter   | (elt -> Boolean), List -> List | Returns list of items from input list on which function returns true (in same order as in input list)

7.3 Map

Now let’s try another problem – it’s vegetarian-awareness week, and we want to replace all the meats in the list with tofu.

Let’s think about what the input and output of this computation should be. We are starting with

  [list: "peppers", "pepperoni", "onion", "onion", "onion"]

which should become

  [list: "peppers", "tofu", "onion", "onion", "onion"]

Note there is exactly one item in the output list for each item in the input list.

Which of our existing list operations can we use for this? We need something that produces a list in which some of the items differ from those in the input list. None of the operations we have so far achieves this, so we need something else.

What we need is an operation called L.map, which is similar to transform-column or build-column from tables – L.map produces a list with one item corresponding to each item in the given list, in the same order.

  # Make all ingredients vegetarian by replacing meat with tofu
  fun replace-if-meat(str :: String) -> String:
    doc: "If string is a meat, return tofu, else return the string"
    if L.member(meats, str):
      "tofu"
    else:
      str
    end
  end

  vegetarian-delight = L.map(replace-if-meat, topping-votes)
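
For comparison, here is a sketch of the same computation with the replacement rule written inline as a lambda, together with a check of the expected result. It assumes L is the lists library imported under that name.

```pyret
import lists as L

meats = [list: "sausage", "pepperoni", "chicken", "shrimp"]
topping-votes = [list: "peppers", "pepperoni", "onion", "onion", "onion"]

# same replacement rule as replace-if-meat, written inline
vegetarian-delight =
  L.map(lam(t): if L.member(meats, t): "tofu" else: t end end,
    topping-votes)

check:
  vegetarian-delight is [list: "peppers", "tofu", "onion", "onion", "onion"]
  # map never drops or adds items, so the lengths agree
  L.length(vegetarian-delight) is L.length(topping-votes)
end
```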

7.4 An Aside: Tables versus Lists

It would seem we could have just as well put our topping information in a table rather than all of these lists. For example:

  topping   | meat | veggie | unusual | ...
  sausage   | X    |        |         |
  egg       |      |        | X       |
  pepperoni | X    |        |         |
  spinach   |      | X      |         |
  ...       | ...  | ...    | ...     | ...

Stop and discuss – what are the tradeoffs between one table and our multiple-lists approach?

Here are some observations on this:

Whether you use tables or lists depends on the data you have and how you plan to use it. For the programs we’ve written today, the lists were sufficient and lightweight, so they were the better choice. Other programs might have benefitted from the table-shaped data. This is our first real example of starting to consider choices in how we represent information when designing programs.

8 Combining Map and Filter

Here’s one last example.

For tweeting and texting, people want to reduce the number of characters they have to type. For example, instead of writing "are you home?", they might write "R U home?".

This feels like a problem for L.map – we want to convert each string in the original message to a shortened string.

Here’s a function that shortens common strings:

  fun shorten(w :: String) -> String:
    string-replace(
      string-replace(
        string-replace(w, "for", "4"),
        "you", "U"),
      "are", "R")
  end

How can we shorten all of the words in a message? Let’s assume we have a list of all the words. Then we can use L.map.

  msg-words = [list: "unfortunately", "you", "are", "late"]

  msg-trim = L.map(shorten, msg-words)
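
It can help to check what this produces. Because string-replace works on any occurrence of the text, not just whole words, the "for" inside "unfortunately" is replaced as well. The sketch below repeats the definitions so that it runs on its own.

```pyret
import lists as L

fun shorten(w :: String) -> String:
  string-replace(
    string-replace(
      string-replace(w, "for", "4"),
      "you", "U"),
    "are", "R")
end

msg-words = [list: "unfortunately", "you", "are", "late"]
msg-trim = L.map(shorten, msg-words)

check:
  # "for" is replaced even in the middle of a longer word
  shorten("unfortunately") is "un4tunately"
  msg-trim is [list: "un4tunately", "U", "R", "late"]
end
```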

What if we want to find all of the words that are still long after shortening? We combine map and filter:

  # uses msg-words as defined above; a name cannot be defined twice in one program
  long-after-trim =
    L.filter(lam(w): string-length(w) > 4 end,
      L.map(shorten, msg-words))
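
When map and filter are combined, the order in which they are applied can change the answer. The sketch below illustrates this with a made-up message; the names is-long, words, shorten-then-keep, and keep-then-shorten are assumptions introduced for the example.

```pyret
import lists as L

fun shorten(w :: String) -> String:
  string-replace(
    string-replace(
      string-replace(w, "for", "4"),
      "you", "U"),
    "are", "R")
end

fun is-long(w :: String) -> Boolean:
  string-length(w) > 4
end

# hypothetical message, chosen so the two orders give different answers
words = [list: "before", "you", "forget"]

# shorten every word first, then keep the ones that are still long
shorten-then-keep = L.filter(is-long, L.map(shorten, words))

# keep the words that start out long, then shorten those
keep-then-shorten = L.map(shorten, L.filter(is-long, words))

check:
  shorten-then-keep is [list: ]
  keep-then-shorten is [list: "be4e", "4get"]
end
```

The question in the text asks for words that are still long after shortening, which corresponds to the first of the two orders.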

9 Summary of List Operations

Let’s extend our table of list operations to include map. We’ll also add L.length, which is useful for getting the size of a list.

In the types, the notation List<type> means a list whose elements are of the named type. When the type isn’t fixed, we use generic names like item and elt to show the relationship between the types of the lists and the types of the functions used to produce them.

  Operation  | Type                                     | Notes
  L.member   | List, item -> Boolean                    | Indicates whether item is in the list
  L.distinct | List -> List                             | Returns the unique values from input list
  L.filter   | (elt -> Boolean), List<elt> -> List<elt> | Returns list of items from input list on which function returns true (in same order as in input list)
  L.map      | (elt -> item), List<elt> -> List<item>   | Returns result of calling function on each element of given list, in order
  L.length   | List -> Number                           | Returns length of the list
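
As a quick reference, the sketch below exercises each operation from the table once on a small list of numbers. The list nums is invented for this example, and L is assumed to be the lists library imported under that name.

```pyret
import lists as L

# example data; the duplicate 3s sit next to each other
nums = [list: 1, 3, 3, 2]

check:
  L.member(nums, 2) is true
  L.member(nums, 7) is false
  L.distinct(nums) is [list: 1, 3, 2]
  L.filter(lam(n): n > 1 end, nums) is [list: 3, 3, 2]
  L.map(lam(n): n * 10 end, nums) is [list: 10, 30, 30, 20]
  L.length(nums) is 4
end
```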