## Class summary: Introduction to Trees

This material is not yet in the textbook.

### 1 Data Structures for Family Trees

Here is a picture of a small family tree. The picture shows the mother and father of each person in the tree, when known:

Assume that we want to represent this picture in code, so we can ask questions about family trees (such as whether one person is an ancestor of another).

Propose a data structure for family trees in which people refer to their parents (ignore people referring to their children for now).

#### 1.1 Family Trees as Tables

Many of you will see this and think of a table. Here’s a table that captures everyone other than Robert (we’ll come back to Robert later):

    family = table: name :: String, mother :: String, father :: String
      row: "Anna", "Susan", "Charlie"
      row: "Susan", "Ellen", "Bill"
      row: "Bill", "Laura", "John"
    end

Let’s say I wanted to write a function to compute someone’s grandparents (at least, those grandparents known in the tree):

    fun grandparents(of-name :: String) -> List:
      false # this is the wrong answer -- leaving a hole
    where:
      grandparents("Anna") is [list: "Ellen", "Bill"]
      grandparents("Laura") is [list:]
      grandparents("Kathi") is [list:]
    end

What would be involved in doing that computation? What subtasks would we identify/what functions would we write?

• Need to go from a name to the mother

• Need to go from a name to the father

Let’s write one of these functions to see what it would look like:

    import lists as L

    fun get-mother(of-name :: String, from-family :: Table):
      person-row = sieve from-family using name: name == of-name end
      L.get(extract mother from person-row end, 0)
    where:
      get-mother("Anna", family) is "Susan"
    end

What happens if the person we asked for isn’t in the table (meaning that we don’t know their family history)? Right now, we get a Pyret error. The error arises because we shouldn’t try to use L.get unless we know that we found a row for the named person. We could modify the code, but that would be premature.

As always, start with examples: what should the function produce if the named person doesn’t have a row in the table?

• if we raise an error, we can’t use this function to get whichever grandparents are known (the raise would terminate the function)

• if we use something like "unknown", we can’t tell the difference between a real name and this value (both are strings)

• in practice, we want to return an answer of a _different type_, to avoid both problems. Here, we could return false (the boolean) to indicate that the person wasn’t found.

    fun get-mother(of-name :: String, from-family :: Table):
      person-row = sieve from-family using name: name == of-name end
      # extract produces a list, so L.length can tell us whether a row was found
      mothers = extract mother from person-row end
      if L.length(mothers) > 0:
        L.get(mothers, 0)
      else:
        false
      end
    where:
      get-mother("Anna", family) is "Susan"
      get-mother("Fred", family) is false
    end
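With a lookup that returns false for missing people, the grandparents hole can be filled in. The following is only a sketch of one possible approach, not the course’s official solution: get-father is assumed to mirror get-mother, and known-parents is a hypothetical helper introduced here to discard the false results.

```pyret
import lists as L

family = table: name :: String, mother :: String, father :: String
  row: "Anna", "Susan", "Charlie"
  row: "Susan", "Ellen", "Bill"
  row: "Bill", "Laura", "John"
end

fun get-mother(of-name :: String, from-family :: Table):
  person-row = sieve from-family using name: name == of-name end
  mothers = extract mother from person-row end
  if L.length(mothers) > 0: L.get(mothers, 0) else: false end
end

# Assumed counterpart to get-mother (not defined in the notes)
fun get-father(of-name :: String, from-family :: Table):
  person-row = sieve from-family using name: name == of-name end
  fathers = extract father from person-row end
  if L.length(fathers) > 0: L.get(fathers, 0) else: false end
end

# Hypothetical helper: keep only the parents that were actually found
fun known-parents(of-name :: String, from-family :: Table) -> List<String>:
  both = [list: get-mother(of-name, from-family), get-father(of-name, from-family)]
  both.filter(is-string)
end

fun grandparents(of-name :: String) -> List<String>:
  doc: "list the known grandparents of the named person"
  # look up the parents, then collect the known parents of each of them
  L.fold(
    lam(acc, parent): acc.append(known-parents(parent, family)) end,
    empty,
    known-parents(of-name, family))
where:
  grandparents("Anna") is [list: "Ellen", "Bill"]
  grandparents("Laura") is [list:]
  grandparents("Kathi") is [list:]
end
```

Because false is not a String, is-string separates real names from missing ones, which is the point of returning a value of a different type.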

What would we do if we wanted to include a person (such as John) for whom we knew the name of one parent, but not the other? What would we put into the table? We might have to put false, following our approach of using a distinct type to capture missing information (but then we have to leave off the types):

    family = table: name, mother, father
      row: "Anna", "Susan", "Charlie"
      row: "Susan", "Ellen", "Bill"
      row: "Bill", "Laura", "John"
      row: "John", false, "Robert" # this is the new row
    end

This is the tables approach. Let’s try another approach that builds on data blocks instead. Later, we’ll contrast the approaches and see their strengths and weaknesses.

#### 1.2 Creating a Data Block for Trees

For this approach, we want to create a data block for Family Trees that has a variant (constructor) for setting up a person. Look back at our picture – what information makes up a person? Their name, their mother, and their father. That suggests the following pattern, which basically turns a row into a data block:

    data FamTree:
      | person(
          name :: String,
          mother :: String,
          father :: String
        )
    end

Try to build the family tree from the picture using this data:

    anna-person = person("Anna", "Susan", "Charlie")
    susan-person = person("Susan", "Ellen", "Bill")

Wait, this seems weird: we have one family (tree), but we’re setting up separate people? Do we maybe want a list of this information instead?

    family-lst =
      [list:
        person("Anna", "Susan", "Charlie"),
        person("Susan", "Ellen", "Bill")
      ]

This is better (one piece of data for the entire family tree), but it still seems to be missing the "tree-ness" of the picture. Note that in the picture, it is easy to get from Anna to her grandparents. Here, there’s this list and we have to look across the people to find the next generation. Could we do better?
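To make "looking across the people" concrete, here is a rough sketch of what finding a mother involves with the list representation. The function mother-of is hypothetical (it is not part of the notes), and it returns false when the person is not in the list, following the convention from the tables section.

```pyret
data FamTree:
  | person(name :: String, mother :: String, father :: String)
end

family-lst =
  [list:
    person("Anna", "Susan", "Charlie"),
    person("Susan", "Ellen", "Bill")]

# Hypothetical helper: search the whole list for the named person
fun mother-of(a-name :: String, people :: List<FamTree>):
  matches = people.filter(lam(p): p.name == a-name end)
  cases (List) matches:
    | empty => false
    | link(first-match, rest) => first-match.mother
  end
where:
  mother-of("Anna", family-lst) is "Susan"
  mother-of("Ellen", family-lst) is false
  # reaching a grandmother takes a second search through the list
  mother-of(mother-of("Anna", family-lst), family-lst) is "Ellen"
end
```

Each step up a generation is another pass over the list, which is the cost the tree-shaped definition below avoids.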

Remember that we can make the mother and father be any type we would like. They don’t have to be Strings. In fact when we look at the picture, what we see up the mother and father sides is an entire family tree. Wouldn’t this then be better?

    data FamTree2:
      | person2(
          name :: String,
          mother :: FamTree2,
          father :: FamTree2
        )
    end

Try writing the family tree using this definition instead. Do the part starting just from Susan for now.

Hopefully, you got this far, but there’s a question of what to put in the ellipses (the cases in which we don’t know what person goes in there):

    susan-as-tree =
      person2("Susan",
        person2("Ellen", ..., ...),
        person2("Bill",
          person2("Laura", ..., ...),
          person2("John", ..., ...)))

How do we fill in the ellipses? When we did tables, we used false for this. Let’s try that:

    susan-as-tree =
      person2("Susan",
        person2("Ellen", false, false),
        person2("Bill",
          person2("Laura", false, false),
          person2("John", false, false)))

Oops, that didn’t work. Why not? Our data block requires the mother and father to be family trees, but false isn’t a family tree. Maybe we could relax the type of mother/father to allow a family tree or a boolean, but there’s actually a better approach. We were only using false because we needed some kind of data that we could distinguish from a real name. We can get the same effect by adding another variant of family tree, one corresponding to an "empty" tree (or a tree with no people):

    data FamTree:
      | unknown()
      | person(
          name :: String,
          mother :: FamTree,
          father :: FamTree
        )
    end

Now, we can finish our example:

    susan-tree =
      person("Susan",
        person("Ellen", unknown(), unknown()),
        person("Bill",
          person("Laura", unknown(), unknown()),
          person("John", unknown(), unknown())))

Or we can build up the entire family:

 the-family = person("Anna", susan-tree, person("Charlie", unknown(), unknown()))

How would we find Susan’s mother?

susan-tree.mother

This gives the entire person structure. What if I want her name?

susan-tree.mother.name
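Field access like this only works when the tree is a person. An unknown() tree has no name field, so asking for the name of Ellen’s mother this way would produce an error. A minimal sketch of a safer accessor follows; name-or-false is a hypothetical helper, and returning false for an unknown tree is an assumption carried over from the tables section.

```pyret
data FamTree:
  | unknown()
  | person(name :: String, mother :: FamTree, father :: FamTree)
end

susan-tree =
  person("Susan",
    person("Ellen", unknown(), unknown()),
    person("Bill",
      person("Laura", unknown(), unknown()),
      person("John", unknown(), unknown())))

# Hypothetical helper: the name at the root of a tree, or false if unknown
fun name-or-false(ft :: FamTree):
  cases (FamTree) ft:
    | unknown() => false
    | person(name, mother, father) => name
  end
where:
  name-or-false(susan-tree.mother) is "Ellen"
  name-or-false(susan-tree.mother.mother) is false
  name-or-false(susan-tree.father.father) is "John"
end
```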

We still need to come back to the discussion comparing tables and trees, but first, let’s write some programs over trees.

### 2 Programming Over Trees

Write in-family, which takes a name and a FamTree and determines whether there is a person in the tree with that name. Don’t forget to write examples!

    fun in-family(a-name :: String, ft :: FamTree) -> Boolean:
      doc: "determine whether family has a person with the given name"
      cases (FamTree) ft:
        | unknown() => false
        | person(name, mother, father) =>
          (name == a-name) or in-family(a-name, mother) or in-family(a-name, father)
      end
    where:
      in-family("Bill", unknown()) is false
      in-family("Zoe", unknown()) is false
      in-family("Susan", the-family) is true
      in-family("Zoe", the-family) is false
      in-family("John", the-family) is true
    end

#### 2.1 The Trees Template

Did you dive in and try writing in-family from scratch? Remember that when we did lists, we had the notion of a template that captured how we traverse (aka, walk along) the entire data structure. The template expanded the data structure into cases, then made a recursive call on the rest of the list (which was also a list).

We can use that same approach here, developing a template for trees. In the tree case, however, there are recursive calls on each of the mother and the father. Here is the template for a family tree:

    fun ft-func(ft :: FamTree) -> ???:
      cases (FamTree) ft:
        | unknown() => ...
        | person(name, mother, father) =>
          ... name ... ft-func(mother) ... ft-func(father)
      end
    end
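As one illustration of filling in the template, here is a sketch of a function that counts the people in a tree. The name count-people is an assumption made for this example; the data definition and the-family are repeated from above so the code runs on its own.

```pyret
data FamTree:
  | unknown()
  | person(name :: String, mother :: FamTree, father :: FamTree)
end

susan-tree =
  person("Susan",
    person("Ellen", unknown(), unknown()),
    person("Bill",
      person("Laura", unknown(), unknown()),
      person("John", unknown(), unknown())))

the-family = person("Anna", susan-tree, person("Charlie", unknown(), unknown()))

fun count-people(ft :: FamTree) -> Number:
  doc: "count the known people in the tree"
  cases (FamTree) ft:
    | unknown() => 0
    # one for this person, plus everyone on the mother's and father's sides
    | person(name, mother, father) =>
      1 + count-people(mother) + count-people(father)
  end
where:
  count-people(unknown()) is 0
  count-people(susan-tree) is 5
  count-people(the-family) is 7
end
```

The two recursive calls come straight from the template; only the base value and the way the results are combined had to be chosen.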

Think about starting from this template as you try the next example.

#### 2.2 Another Example

Write count-gens, which takes a FamTree and determines the maximum number of generations up any branch of the tree. Don’t forget to write examples!

    fun count-gens(ft :: FamTree) -> Number:
      doc: "produce number of generations in longest branch of the tree"
      cases (FamTree) ft:
        | unknown() => 0
        | person(name, mother, father) =>
          1 + num-max(count-gens(mother), count-gens(father))
      end
    where:
      count-gens(unknown()) is 0
      count-gens(the-family) is 4
    end

### 3 Tables Versus Trees

### 3Tables Versus Trees

Let’s get back to the discussion about tables vs trees – what are the benefits of each?

Trees:

• allow direct access to parents, rather than needing another table lookup to find parents

• better support multiple people with the same name in the family

• structure captures generations naturally

Tables:

• capture siblings easily

• feel more like a database of data on people

There are clearly tradeoffs here. In Computer Science, trees are often used instead of tables, because of the direct access to parents (and generally capturing the structure of the underlying data).
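To see what direct access to parents buys us, here is a sketch of grandparents written over trees rather than tables. The helpers root-name and parents-of are hypothetical names introduced for this example, and the data definitions are repeated so the code runs on its own.

```pyret
data FamTree:
  | unknown()
  | person(name :: String, mother :: FamTree, father :: FamTree)
end

susan-tree =
  person("Susan",
    person("Ellen", unknown(), unknown()),
    person("Bill",
      person("Laura", unknown(), unknown()),
      person("John", unknown(), unknown())))

the-family = person("Anna", susan-tree, person("Charlie", unknown(), unknown()))

# Hypothetical helper: the root's name as a list, or an empty list if unknown
fun root-name(ft :: FamTree) -> List<String>:
  cases (FamTree) ft:
    | unknown() => empty
    | person(name, mother, father) => [list: name]
  end
end

# Hypothetical helper: names of the known parents of the root person
fun parents-of(ft :: FamTree) -> List<String>:
  cases (FamTree) ft:
    | unknown() => empty
    | person(name, mother, father) => root-name(mother).append(root-name(father))
  end
end

fun tree-grandparents(ft :: FamTree) -> List<String>:
  doc: "names of the known grandparents of the person at the root"
  cases (FamTree) ft:
    | unknown() => empty
    | person(name, mother, father) => parents-of(mother).append(parents-of(father))
  end
where:
  tree-grandparents(the-family) is [list: "Ellen", "Bill"]
  tree-grandparents(susan-tree) is [list: "Laura", "John"]
  tree-grandparents(unknown()) is [list:]
end
```

No searching is needed here: each parent is already sitting inside the person, so going up a generation is just a matter of taking the tree apart.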

### 4 More Practice

Open the file of exercises (posted to the schedule page) and work on whichever set of exercises fits your level and interest.

If you finish those, extend the data block so that a person also has a birth year and an eye color. Think of some programs that you could write now that you have this information as well.
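If you want a starting point for the extension, the following is one possible shape, offered only as a sketch. The field names birth-year and eye-color, the function count-eye-color, and all of the sample birth years and eye colors are assumptions invented for illustration; none of them come from the family tree above.

```pyret
data FamTree:
  | unknown()
  | person(
      name :: String,
      birth-year :: Number,
      eye-color :: String,
      mother :: FamTree,
      father :: FamTree)
end

# Placeholder data: the years and colors below are made up for this example
sample-tree =
  person("Susan", 1970, "brown",
    person("Ellen", 1945, "blue", unknown(), unknown()),
    person("Bill", 1943, "brown", unknown(), unknown()))

fun count-eye-color(ft :: FamTree, color :: String) -> Number:
  doc: "count the people in the tree whose eye color matches"
  cases (FamTree) ft:
    | unknown() => 0
    | person(name, birth-year, eye-color, mother, father) =>
      this-one = if eye-color == color: 1 else: 0 end
      this-one + count-eye-color(mother, color) + count-eye-color(father, color)
  end
where:
  count-eye-color(unknown(), "blue") is 0
  count-eye-color(sample-tree, "brown") is 2
  count-eye-color(sample-tree, "green") is 0
end
```

The function still follows the trees template; the new fields simply show up as extra names in the person case.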