Class summary: Processing Lists
Copyright (c) 2017 Kathi Fisler
Throughout this course, we’ve written programs that leverage the structure of data. We started with images, showing that the structure of code that builds a piece of data reflects the structure of the datum itself. We moved onto tables, writing programs to process the rows and cells that make up tables.
Now we have lists. Last lecture, we learned several built-in functions for processing lists. But the operations we learned cover only some of the computations we might want to do with lists. For example, what if we wanted the sum of a list of numbers? Or what if we wanted to make sure all numbers in a list are less than 10? Neither L.map nor L.filter, which return lists, can be used alone to write functions that return numbers or booleans. Sometimes, we will need to write our own functions to process or aggregate lists. Today, we start learning how to do that.
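To make the limitation concrete, here is a small sketch showing that both built-ins hand back lists. It assumes the lists library is imported under the name L, as in these notes; the helper names add-one and is-small are made up for illustration.

```pyret
import lists as L

# hypothetical helpers, not from the lecture
fun add-one(n :: Number) -> Number: n + 1 end
fun is-small(n :: Number) -> Boolean: n < 10 end

check:
  # both results are lists, never a single number or boolean
  L.map(add-one, [list: 4, 7, 5]) is [list: 5, 8, 6]
  L.filter(is-small, [list: 8, 12, 6]) is [list: 8, 6]
end
```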
This material is covered very nicely in a chapter of a textbook by Brown CS Professor Shriram Krishnamurthi. These notes highlight the unique features of how we covered this material in class; the details are in sections 6.1 through 6.4.2 of PAPL chapter 6 (which we followed fairly closely in class), as well as in the lecture capture.
1 Summing a list of numbers
Imagine that we had written a function called my-sum (sum is built-in, so we need to use a different name). What might the where: block look like?
fun my-sum(lst :: List<Number>) -> Number:
  ...
where:
  my-sum([list: 4, 7, 5]) is 4 + 7 + 5
  my-sum([list: 7, 5]) is 7 + 5
  my-sum([list: 5]) is 5
  my-sum([list:]) is 0
end
Stare at these examples – do you notice a pattern?
Yes, each where answer could be written using the line below it:
where:
  my-sum([list: 4, 7, 5]) is 4 + my-sum([list: 7, 5])
  my-sum([list: 7, 5]) is 7 + my-sum([list: 5])
  my-sum([list: 5]) is 5 + my-sum([list:])
  my-sum([list:]) is 0
end
Do you see the pattern here? When we want to sum a list, we can add the first item on the list to the sum of the list with everything except the first item (that list is called the rest of the list). If we had an easy way to get at the first and rest parts of a list, our examples guide us to writing the function.
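Pyret lists do expose these two parts directly, as fields named first and rest. The following is a minimal sketch of pulling a list apart that way; it is separate from the cases-based approach used for the final solutions later in these notes, and the name nums is just for illustration.

```pyret
check:
  nums = [list: 4, 7, 5]
  nums.first is 4
  nums.rest is [list: 7, 5]
  # the sum is the first item plus the sum of the rest
  nums.first + (7 + 5) is 4 + 7 + 5
end
```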
2 Checking all Numbers Less than 10
Let’s write the where clause for another problem:
fun all-below-10(lst :: List<Number>) -> Boolean:
  ...
where:
  all-below-10([list: 8, 2, 6]) is
    (8 < 10) and (2 < 10) and (6 < 10)
  all-below-10([list: 2, 6]) is (2 < 10) and (6 < 10)
  all-below-10([list: 6]) is (6 < 10)
  all-below-10([list:]) is true
end
Which, substituting subexpressions, yields:
where:
  all-below-10([list: 8, 2, 6]) is
    (8 < 10) and all-below-10([list: 2, 6])
  all-below-10([list: 2, 6]) is
    (2 < 10) and all-below-10([list: 6])
  all-below-10([list: 6]) is
    (6 < 10) and all-below-10([list:])
  all-below-10([list:]) is true
end
2.1 The Case-based Definition of Lists
See section 6.1 for details.
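As a brief sketch of the idea: a list is either empty, or a link of a first item onto another list. The check below spells out how the [list: ...] notation corresponds to nested links, and the helper describe (a made-up name, not from the lecture) shows a cases expression telling the two cases apart.

```pyret
check:
  # [list: ...] is shorthand for a chain of links ending in empty
  [list: 4, 7, 5] is link(4, link(7, link(5, empty)))
  [list: 5] is link(5, empty)
  [list:] is empty
end

fun describe(lst :: List<Number>) -> String:
  doc: "report which of the two cases built lst (hypothetical helper)"
  cases (List) lst:
    | empty => "empty"
    | link(fst, rst) => "link"
  end
where:
  describe(empty) is "empty"
  describe([list: 4, 7, 5]) is "link"
end
```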
3 Leveraging Cases to Write Our Functions
The lecture capture shows how we built up these final solutions piece by piece.
fun my-sum(lst :: List<Number>) -> Number:
  cases (List) lst:
    | empty => 0
    | link(fst, rst) => fst + my-sum(rst)
  end
where:
  my-sum([list: 4, 7, 5]) is 4 + 7 + 5
  my-sum([list: 7, 5]) is 7 + 5
  my-sum([list: 5]) is 5
  my-sum([list:]) is 0
end

fun all-below-10(lst :: List<Number>) -> Boolean:
  cases (List) lst:
    | empty => true
    | link(fst, rst) => (fst < 10) and all-below-10(rst)
  end
where:
  all-below-10([list: 8, 2, 6]) is
    (8 < 10) and (2 < 10) and (6 < 10)
  all-below-10([list: 2, 6]) is (2 < 10) and (6 < 10)
  all-below-10([list: 6]) is (6 < 10)
  all-below-10([list:]) is true
end
4 all-below-10
Write a function that takes a list of numbers and returns a Boolean indicating whether every number in the list is smaller than 10.
fun all-below-10(lst :: List<Number>) -> Boolean:
  doc: "Determine whether all numbers are below 10"
  cases (List) lst:
    | empty => true
    | link(fst, rst) => (fst < 10) and all-below-10(rst)
  end
where:
  # an example on a specific list
  all-below-10(link(3, link(8, link(2, empty))))
    is (3 < 10) and (8 < 10) and (2 < 10)
  # an example on the rest of that list
  all-below-10(link(8, link(2, empty)))
    is (8 < 10) and (2 < 10)
  # rewriting the first example, using the second to show
  # the call on the rest
  all-below-10(link(3, link(8, link(2, empty))))
    is (3 < 10) and all-below-10(link(8, link(2, empty)))
end
One interesting discussion here concerned the result of true in the empty case. Let’s use an example to see why we have to return true rather than false. We’ll expand out the code step by step as we evaluate a call to all-below-10:
all-below-10([list: 7, 8])
(7 < 10) and all-below-10([list: 8])
(7 < 10) and (8 < 10) and all-below-10(empty)
The result of all-below-10(empty) cannot interfere with the result on the items in the list. Since both 7 and 8 are less than 10, the function should return true on this list. If all-below-10(empty) returns false, the entire and expression will return false. To avoid interfering with the computation of and, we must return true in the empty case.
By analogous reasoning, if the computation involves or, we must return false in the empty case.
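Here is a sketch of that or-based counterpart. The function any-below-10 is a hypothetical companion to all-below-10, not something we wrote in class; it shows false in the empty case leaving the or over the actual items undisturbed.

```pyret
fun any-below-10(lst :: List<Number>) -> Boolean:
  doc: "Determine whether at least one number is below 10"
  cases (List) lst:
    | empty => false
    | link(fst, rst) => (fst < 10) or any-below-10(rst)
  end
where:
  # the trailing false comes from the empty case and cannot
  # turn the answer into true on its own
  any-below-10([list: 12, 15]) is (12 < 10) or (15 < 10) or false
  any-below-10([list: 12, 7]) is true
  any-below-10(empty) is false
end
```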
5 multi-stripe-flag
Remember that we started out with producing images of flags? At the start of the course, we wrote a function to produce a flag with three stripes:
fun three-stripe-flag(top :: String, middle :: String, bot :: String) -> Image:
  doc: "produce image of three equal-height horizontal stripes"
  above(rectangle(120, 30, "solid", top),
    above(rectangle(120, 30, "solid", middle),
      rectangle(120, 30, "solid", bot)))
end
What if we wanted to allow an arbitrary number of stripes? We would take the sequence of stripe colors in as a list. (Don’t worry about scaling the stripe heights for now.) Here’s the list-based version:
fun multi-stripe-flag(colors :: List<String>) -> Image:
  doc: ```produce flag with horizontal stripes in given colors
       from top to bottom```
  cases (List) colors:
    | empty => rectangle(1, 1, "solid", "white")
    | link(fst, rst) =>
      above(rectangle(120, 30, "solid", fst),
        multi-stripe-flag(rst))
  end
where:
  # Normally, we don't write wheres for functions that return images,
  # but I'm doing it here to illustrate the pattern of building
  # the image
  multi-stripe-flag([list: "red", "blue", "orange"])
    is
  above(rectangle(120, 30, "solid", "red"),
    multi-stripe-flag([list: "blue", "orange"]))
end
Here, we returned a tiny, nearly invisible image in the empty case. There actually is a way to have an empty image in Pyret, but we didn’t have that code on hand, so we wrote it this way instead. If that little rectangle bothers you, trust that we could indeed have eliminated it.
The key thing to note here is that the nested sequence of above expressions that we wrote in the three-stripe version is precisely what we get from the multi-stripe version – if you unroll the calls to multi-stripe-flag, you’ll see that we get the same pattern of nested above expressions as we wrote manually. This is one of the beauties of writing functions that call themselves in this way – they expand repeated patterns for us without us having to manually replicate code.
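To see the unrolling spelled out, the sketch below repeats the multi-stripe-flag definition so that it runs on its own, then compares a three-color call against the fully expanded chain of above expressions. It assumes the image library is loaded with include image, and it compares images with is, as the where block in these notes does. Note the extra 1-by-1 white rectangle at the bottom of the chain, which comes from the empty case.

```pyret
include image

fun multi-stripe-flag(colors :: List<String>) -> Image:
  doc: "produce flag with horizontal stripes in given colors, top to bottom"
  cases (List) colors:
    | empty => rectangle(1, 1, "solid", "white")
    | link(fst, rst) =>
      above(rectangle(120, 30, "solid", fst),
        multi-stripe-flag(rst))
  end
end

check:
  # one nested above per list item, then the empty-case rectangle
  multi-stripe-flag([list: "red", "white", "blue"])
    is
  above(rectangle(120, 30, "solid", "red"),
    above(rectangle(120, 30, "solid", "white"),
      above(rectangle(120, 30, "solid", "blue"),
        rectangle(1, 1, "solid", "white"))))
end
```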
6 List versus link when writing functions that return lists
We developed a function that adds 1 to every element in a list of numbers.
fun add-1-all(lst :: List<Number>) -> List<Number>:
  cases (List) lst:
    | empty => empty
    | link(fst, rst) => link(fst + 1, add-1-all(rst))
  end
where:
  add-1-all([list: 3, 7, 4]) is link(3 + 1, add-1-all([list: 7, 4]))
  add-1-all([list: 7, 4]) is link(7 + 1, link(4 + 1, empty))
  add-1-all(empty) is empty
end
We went over why we had to write
    | link(fst, rst) => link(fst + 1, add-1-all(rst))
instead of
    | link(fst, rst) => [list: fst + 1, add-1-all(rst)]
Roughly, using list: as in the second case makes a chain of nested lists, not a flat list.
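The difference is easy to check directly. In this sketch, add-1-nested is a deliberately wrong variant written only for illustration (the name is made up, and its return annotation is loosened to plain List because the result is no longer a list of numbers).

```pyret
check:
  # link attaches one item to the front of an existing list
  link(4, [list: 8, 5]) is [list: 4, 8, 5]
  # list: makes a two-item list whose second item is itself a list
  [list: 4, [list: 8, 5]] is-not [list: 4, 8, 5]
  [list: 4, [list: 8, 5]].length() is 2
end

fun add-1-nested(lst :: List<Number>) -> List:
  doc: "WRONG on purpose: builds nested lists instead of a flat list"
  cases (List) lst:
    | empty => empty
    | link(fst, rst) => [list: fst + 1, add-1-nested(rst)]
  end
where:
  add-1-nested([list: 7, 4]) is [list: 8, [list: 5, empty]]
end
```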
We also talked about the importance of writing an example on the empty case so you can figure out what that case does before you are in the middle of writing code.
See the lecture capture video for more details.
7 Design Recipe: A Systematic Process for Designing Programs
Let’s formally write out the sequence of steps that we’ve been informally using in class to design solutions to programming problems:
Write the name, inputs, input types, and output type for the function.
Write some examples of what the function should produce. The examples should cover all structural cases of the inputs (e.g., empty vs. non-empty lists), as well as interesting scenarios within the problem.
Identify the tasks that the problem requires. Label each task with some information on how you will handle it (e.g., use a built-in function like L.filter, or write a new function). If you are writing a new function, note the inputs and output for the task, along with the types.
For each task that requires you to write a function, start by copying the template for the main function input (e.g., the list template).
Fill in the templates and bodies for all task functions.
Combine the code for the tasks into code for the entire problem.
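As a hedged illustration of the template-copying and fill-in steps, the sketch below shows the list template as a comment, followed by one way of filling it in. The task (counting items) and the name count-items are hypothetical, chosen only to keep the example small.

```pyret
# List template: the skeleton shared by the list functions in these notes.
#
# fun name(lst :: List<...>) -> ...:
#   cases (List) lst:
#     | empty => ...
#     | link(fst, rst) => ... fst ... name(rst) ...
#   end
# end

# The template filled in for a hypothetical task: counting the items.
fun count-items(lst :: List<Number>) -> Number:
  doc: "count how many numbers are in lst"
  cases (List) lst:
    | empty => 0
    | link(fst, rst) => 1 + count-items(rst)
  end
where:
  # examples cover both structural cases: empty and non-empty
  count-items([list: 3, 7, 4]) is 1 + count-items([list: 7, 4])
  count-items([list: 4]) is 1
  count-items(empty) is 0
end
```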
7.1 An example task list
Imagine that we asked you to write a program to compute the average number of vowels in a list of words. Here’s a possible task plan:
Define the set of vowels
Get a list of characters in a word [could use string-explode]
Count how many vowels are in a given word [could use L.filter, L.member, and L.length with the results from the previous two steps].
Sum the total number of vowels across all words in a list [could write a new function total-vowels that takes a list and returns a number].
Count the words in the list [could use L.length]
Compute the average number of vowels from the previous two values [using division].
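One possible way to turn this task plan into code is sketched below. It assumes the lists library is imported as L, that only lowercase vowels count, and that the word list is non-empty (an empty list would divide by zero). The names vowels, count-vowels, and average-vowels are illustrative choices; total-vowels is the name suggested in the plan.

```pyret
import lists as L

# Task: define the set of vowels (lowercase only, by assumption)
vowels = [list: "a", "e", "i", "o", "u"]

fun is-vowel(c :: String) -> Boolean:
  L.member(vowels, c)
end

# Tasks: get the characters of a word, then count the vowels among them
fun count-vowels(word :: String) -> Number:
  doc: "count the lowercase vowels in word"
  chars = string-explode(word)
  L.length(L.filter(is-vowel, chars))
where:
  count-vowels("list") is 1
  count-vowels("queue") is 4
  count-vowels("") is 0
end

# Task: sum the vowel counts across all words, using the list template
fun total-vowels(words :: List<String>) -> Number:
  cases (List) words:
    | empty => 0
    | link(fst, rst) => count-vowels(fst) + total-vowels(rst)
  end
where:
  total-vowels([list: "sum", "of", "lists"]) is 1 + 1 + 1
  total-vowels(empty) is 0
end

# Tasks: count the words and divide (assumes words is non-empty)
fun average-vowels(words :: List<String>) -> Number:
  total-vowels(words) / L.length(words)
where:
  average-vowels([list: "sum", "of", "lists", "queue"]) is (1 + 1 + 1 + 4) / 4
end
```

Each function corresponds to one or two lines of the plan, so a failing example points at the task that needs fixing.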