Performance and Tree intro
Program performance
We’ve talked about making programs more readable and concise. We have not yet talked very much about how to make programs fast. This won’t be a focus of the course (and you generally won’t be graded on program performance), but it’s worth seeing some examples of how to reason informally about program performance.
Consider these two versions of a program to find the characters in a text (as on HW 6).

    import lists as L

    fun is-character(str :: String) -> Boolean:
      string-to-upper(str) == str
    end

    fun find-characters0(the-text :: List<String>) -> List<String>:
      L.distinct(L.filter(is-character, the-text))
    end

    fun find-charactersR1(the-text :: List<String>) -> List<String>:
      cases (List) the-text:
        | empty => empty
        | link(fst, rst) =>
          if is-character(fst):
            L.distinct(link(fst, find-charactersR1(rst)))
          else:
            L.distinct(find-charactersR1(rst))
          end
      end
    end

Do you think one of them is faster? Why?
find-characters0 seems like it should be faster, since it calls distinct only once instead of once per member of the list.
How much faster is it? How long does each one take? To answer these questions, we’ll need to know the definitions of filter and distinct:

    fun filter(f :: Function, lst :: List) -> List:
      cases (List) lst:
        | empty => empty
        | link(fst, rst) =>
          if f(fst):
            link(fst, filter(f, rst))
          else:
            filter(f, rst)
          end
      end
    end

    fun distinct(lst :: List) -> List:
      cases (List) lst:
        | empty => empty
        | link(fst, rst) =>
          if L.member(rst, fst):
            distinct(rst)
          else:
            link(fst, distinct(rst))
          end
      end
    end

Let’s take a look at filter. It looks once at every element of the list, so it runs in an amount of time proportional to the size of the list. We call filter’s running time linear: for each member of the list, filter does a constant number of operations (assuming each call to f takes constant time), so its runtime grows linearly with the size of its input.
How about distinct? At first glance, distinct looks like filter: it does some work for every element of the list. But distinct calls member. How long does each call to member take?
member might have to look at every remaining element of the list; for instance, it will do this for characters that appear only once. So let’s say member’s running time is linear. Since distinct calls member once for every member of the list, its running time is quadratic: it’s proportional to the square of the list’s length.
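To make the quadratic claim concrete, here is a rough sketch that counts comparisons rather than measuring time. The helper names member-cost and distinct-cost are made up for this illustration, and the count assumes the worst case, where no element repeats and member has to scan the whole rest of the list.

```pyret
import lists as L

# Hypothetical helper: comparisons member makes when the element is absent
# (the worst case), which is one per element of the list it searches.
fun member-cost(lst :: List) -> Number:
  L.length(lst)
end

# Worst-case comparisons for distinct: one member call per element,
# each one scanning the rest of the list.
fun distinct-cost(lst :: List) -> Number:
  cases (List) lst:
    | empty => 0
    | link(fst, rst) => member-cost(rst) + distinct-cost(rst)
  end
end

check:
  distinct-cost(L.range(0, 10)) is 45
  distinct-cost(L.range(0, 20)) is 190
end
```

Doubling the list from 10 to 20 elements roughly quadruples the count (45 to 190), which is what growth proportional to the square of the length looks like.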
So: what’s the running time of find-characters0? We’re running a linear operation, then a quadratic operation. So our running time is quadratic (as the list gets large, the quadratic term dominates the linear term).
How about find-charactersR1? Here, we’re running distinct (a quadratic operation) once for every element of the list. So our running time is cubic!
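The same kind of back-of-the-envelope count shows the cubic growth. This sketch works on list lengths instead of actual lists, and assumes the worst case where every string is a character name and none repeats, so nothing is ever removed. The names distinct-cost-n and r1-cost are hypothetical helpers, not part of the homework code.

```pyret
# Worst-case comparisons distinct makes on n distinct elements:
# (n - 1) + (n - 2) + ... + 0.
fun distinct-cost-n(n :: Number) -> Number:
  (n * (n - 1)) / 2
end

# The recursive version calls distinct at every level of the recursion,
# on a list as long as the remaining input in the worst case.
fun r1-cost(n :: Number) -> Number:
  if n == 0:
    0
  else:
    distinct-cost-n(n) + r1-cost(n - 1)
  end
end

check:
  r1-cost(10) is 165
  r1-cost(20) is 1330
end
```

Here doubling the input from 10 to 20 multiplies the count by about 8 (165 to 1330), compared with about 4 for a single call to distinct.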
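FYI, here’s a more efficient recursive version. It calls member (a linear operation) once per element instead of calling distinct, so its running time is quadratic: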

    fun find-charactersR2(the-text :: List<String>) -> List<String>:
      cases (List) the-text:
        | empty => empty
        | link(fst, rst) =>
          if is-character(fst) and not(L.member(rst, fst)):
            link(fst, find-charactersR2(rst))
          else:
            find-charactersR2(rst)
          end
      end
    end

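As a quick sanity check, here is this version run on a small made-up word list (not the HW 6 text). The definitions are repeated so the snippet runs on its own. Note that a name is kept only when it does not appear later in the list, so the last occurrence of each name is the one that survives.

```pyret
import lists as L

fun is-character(str :: String) -> Boolean:
  string-to-upper(str) == str
end

fun find-charactersR2(the-text :: List<String>) -> List<String>:
  cases (List) the-text:
    | empty => empty
    | link(fst, rst) =>
      if is-character(fst) and not(L.member(rst, fst)):
        link(fst, find-charactersR2(rst))
      else:
        find-charactersR2(rst)
      end
  end
end

check:
  # Made-up sample input: all-caps words stand in for character names.
  text = [list: "HAMLET", "to", "be", "HORATIO", "HAMLET"]
  find-charactersR2(text) is [list: "HORATIO", "HAMLET"]
  find-charactersR2(empty) is empty
end
```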
Ancestry data
Imagine we’re trying to do a genealogy project: we’re looking at eye color heritability. Our data are from the 18th-century House of Hanover; specifically, the family tree of King George III of the United Kingdom. Here’s a chunk of the tree:

How would we represent these data? We could use a table:
| name | eye-color | mother | father |
|---|---|---|---|
| "George" | "green" | "Auguste" | "Frederick" |
| "Auguste" | "green" | "Magdalena" | "Friedrich II" |
| "Frederick" | "brown" | "" | "" |
Let’s say we want to write a function to get someone’s grandparents. How would we do it?
We’d have to first get parents, then get grandparents. Each would involve filtering the table based on name, and dealing with empty data (e.g., if a parent field is empty then we can’t get the grandparent). We could do it, but it would be unpleasant.
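To see what that looks like, here is one possible sketch of the table approach, using Pyret’s table query forms (sieve and extract). The function names parents-of, all-parents, and grandparents-of are made up for this illustration, and the sketch assumes missing parents are stored as "" as in the table above.

```pyret
import lists as L

ancestors = table: name, eye-color, mother, father
  row: "George", "green", "Auguste", "Frederick"
  row: "Auguste", "green", "Magdalena", "Friedrich II"
  row: "Frederick", "brown", "", ""
end

# Names of the recorded parents of the person called n (empty fields dropped).
fun parents-of(n :: String) -> List<String>:
  matches = sieve ancestors using name: name == n end
  mothers = extract mother from matches end
  fathers = extract father from matches end
  L.filter(lam(p): not(p == "") end, mothers.append(fathers))
end

# Parents of everyone in a list of names, collected into one list.
fun all-parents(names :: List<String>) -> List<String>:
  cases (List) names:
    | empty => empty
    | link(fst, rst) => parents-of(fst).append(all-parents(rst))
  end
end

fun grandparents-of(n :: String) -> List<String>:
  all-parents(parents-of(n))
end

check:
  parents-of("George") is [list: "Auguste", "Frederick"]
  grandparents-of("George") is [list: "Magdalena", "Friedrich II"]
end
```

Every lookup scans the table by name, and the empty-string convention has to be handled by hand, which is the unpleasantness described above.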
Could we use a datatype for this?

    data AncTree:
      | person(name :: String, eye-color :: String, mother :: ???, father :: ???)
    end

What should the types of mother and father be? We could use String, but that leaves us with the same problem we had before: we’d have to search for a person with the right name. How about this?

    data AncTree:
      | person(name :: String, eye-color :: String, mother :: AncTree, father :: AncTree)
    end

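With this shape, finding grandparents means following fields rather than searching by name. One thing the definition leaves open is how to represent an ancestor we know nothing about. The sketch below assumes a hypothetical no-info variant for that purpose; this is only one possible choice, and the helper names are made up for illustration. Eye colors that the table does not give are left as "".

```pyret
data AncTree:
  | no-info  # hypothetical variant: nothing is recorded for this ancestor
  | person(name :: String, eye-color :: String, mother :: AncTree, father :: AncTree)
end

# The name of a tree node as a list: empty when there is no information.
fun known-name(t :: AncTree) -> List<String>:
  cases (AncTree) t:
    | no-info => empty
    | person(n, _, _, _) => [list: n]
  end
end

# Names of the recorded parents of the person at the root of t.
fun parent-names(t :: AncTree) -> List<String>:
  cases (AncTree) t:
    | no-info => empty
    | person(_, _, m, f) => known-name(m).append(known-name(f))
  end
end

# Grandparents are just the parents of each parent: no searching by name.
fun grandparent-names(t :: AncTree) -> List<String>:
  cases (AncTree) t:
    | no-info => empty
    | person(_, _, m, f) => parent-names(m).append(parent-names(f))
  end
end

# Eye colors not given in the table are left as "".
magdalena = person("Magdalena", "", no-info, no-info)
friedrich = person("Friedrich II", "", no-info, no-info)
auguste = person("Auguste", "green", magdalena, friedrich)
frederick = person("Frederick", "brown", no-info, no-info)
george = person("George", "green", auguste, frederick)

check:
  parent-names(george) is [list: "Auguste", "Frederick"]
  grandparent-names(george) is [list: "Magdalena", "Friedrich II"]
end
```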
We’ll talk more about trees on Wednesday and Friday.