Lab 8: Sorting

You are not required to complete any part of this lab, this is posted purely as a resource.
Some of the content in this lab may not have been covered in class due to the adjusted schedule.

Billy the Teen is working the night shift as a guard for a local shoe shoppe with the hope of catching the dastardly cowboy-boots burglar. To get the scoop (and the stolen boots), the sheriff and his deputy go to talk to the cobbler.

They find out that the devious burglar has been replacing the boots he stole—which are real leather—with faux leather versions! The cobbler has been having a hard time distinguishing the real leather from the cheap faux leather, meaning a lot of customers have been coming in with refund requests! Indubitably, this is a job for Sheriff Bridgette and Deputy Randy.

Part 1: List Sorting

There are many ways to distinguish real leather from fake leather. Real leather, for example, tends to have an uneven texture with blemishes and other imperfections while faux leather looks uniform and smooth. Help the cobbler by writing a sorting algorithm to sort the real leather boots from the fake ones.

What’s a sorting algorithm?
There are many ways to sort a list. Some are fast, some are slow. Some work very well on certain kinds of inputs and very poorly on others.

A sorting algorithm is a series of steps that outlines a particular way to sort a list. We will examine two sorting algorithms throughout this lab: selection sort and merge sort.

SELECTION SORT

If the input list is empty: return the empty list.
If the input list contains a single element: return a list containing just that element.
Otherwise:
- Find the smallest element in the list
  - e.g.: [list: 3,6,1,8,9]
- Put it at the beginning of the list
  - e.g.: [list: 1,3,6,8,9]
- Link this element to the recursive call on the rest of the list
Here is a gif of selection sort for the people who need to see things moving around to understand:

MERGE SORT

If the input list is empty: return the empty list.
If the input list contains a single element: return a list containing just that element.
Otherwise:
- Split the list in half
- Run mergesort (recursively) on both halves
- Merge the now-sorted halves
  - Merge is a helper function that merges two sorted lists into a single sorted list.
Below is an illustration of how the different components of the merge sort algorithm fit together recursively:

Or, for those with more dynamic visual needs:

Think about why both algorithms (selection and merge sort) are guaranteed to produce a sorted list. Post on Campuswire if you’re not sure!
Pull up this file, which provides the code for both algorithms.
Look at the code for selection sort. We’ll use the number of comparisons (such as <) as our measure of the runtime for this algorithm.
1. Think back to your collection of test inputs for selection sort. Are any of those inputs likely to require more comparisons than the other inputs? Do any require fewer comparisons than the others?
2. Write a formal prediction of the worst-case running time of selection sort. You can express this as a broad class of the function (e.g., “constant,” “linear,” “quadratic,” etc.) or as an equation in terms of N, the list size.
3. The file we gave you loads a new library called efficiency-sort-grapher. Read the instructions at the bottom of the file to experiment with the runtimes of selection sort and merge sort.
Predict how many element comparisons occur within the worst case with selection sort.
Now write a prediction for the runtime of merge sort! For this algorithm, use the number of “links” as the measure of runtime.
We have written several functions that allow you to test your predictions on lists of any size: link to code. Use them to experiment with the runtimes of selection sort and merge sort!

Which sorting algorithm should the cobbler use? In what case(s)?

CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!

Part 2: Working with Ancestor Trees

The following activity gives you a chance to understand the in-tree function covered in class and to write another function (or two) on ancestor trees.

1. Tracing `in-tree`

This file contains an AncTree datatype and the in-tree function.

For each of the following expressions, write out the sequence of calls to in-tree that get made while evaluating those expressions:

in-tree("Robert", annaT)
in-tree("Susan", annaT)

Open the tracer-version of Pyret. Copy and paste the code from the file into the definitions area. Press “Trace and Run.” Enter the following expression at the prompt:

in-tree("Robert", annaT)

then press “Trace Last.” Set the dropdown menu to read “Depth First,” and use the arrow keys to the right of the dropdown to see the order of the function calls that get made. Does this order match what you predicted?

Now run in-tree on a name that is not in the tree. Again use “Trace Last” to see the sequence of calls that get made. Does this sequence match what you would have expected?

2. Counting Generations

We’d like to be able to compute how many generations are represented in the tree. In the case of annaT, there are 5 generations (from Anna up to Robert).

Your goal is to write a function called count-gens that takes an AncTree as an input and produces a number indicating the number of generations in the tree.

Start by writing examples on small trees with no more than two generations. Your examples should have unknown people in various places. Can you connect the results of one tree to the next? For example, if you started from the tree for Ellen in our running example, how would you use the results of Laura and John’s trees to compute the number of generations upwards from Ellen?

Remember to break into cases on AncTree and to call count-gens on each of the mother and father branches/trees.

As you did for in-tree, copy your code into the Tracer version and use “Trace Last” to explore the order in which the people in the tree get processed.

3. Reflection

Write a block comment answering the following questions to see whether you are understanding trees:

Imagine that you are running in-tree and are at the spot for “Ellen.” The computation starts up the tree for Ellen’s mother. Which other people get checked before Ellen’s father? What part of the code determines the order in which people are processed?
When we call in-tree on a name that is in the father’s side of the bottom-most person (such as Anna), the computation first explores all the way up the mother’s tree. When there are no more mothers, the mother tree is unknown. in-tree returns false on an unknown tree. Why doesn’t the entire computation just stop and return false at this point? What makes the computation continue after unknown is reached the first time?
A classmate doesn’t care for all of this recursion and instead tries to write the following code for in-tree. Would this approach work? Why or why not? The dot notation here to get components from values works fine—that aspect is not a problem. What does this scenario tell you about recursion and when/why we use it?
```
fun in-tree(name :: String, anct :: AncTree) -> Boolean:
  (anct.name == name) or
  (anct.mother.name == name) or
  (anct.father.name == name) or
  (anct.mother.mother.name == name) or
  (anct.mother.father.name == name)
end
```
What questions do you still have about trees and/or recursion? Post them on Campuswire to help us identify what still needs clarification.

4. In case you want more …

Can you write a function to count how many blue-eyed people are in the tree? How about a function to return a list of all the names in the tree? Feel free to try these if you want more practice.

Part 3: Working More with Lists

This section is designed moreso for students who wish to take CS18 next semester.

The shoe shoppe’s reputation has been saved, but the cobbler’s sales are still not what they used to be. He suspects that this dip in revenue is because the process by which he finds the faux leather utilizes a slow, inefficient lookup function…

`lookup` on lists

Consider the function lookup(num :: Number, lst :: List<Number>) -> Number which, when given a list of numbers, returns the index of a particular number in the list.

Predict this function’s runtime. What would its runtime be in the best case? How about the worst case?
How does runtime change if the input list is sorted or unsorted, assuming it changes at all?
In the Pyret definitions window, write thelookup function.

CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!

`lookup` on trees

Randy suggests using a tree to speed up the lookup function!

Here is a general tree datatype that makes a tree containing numbers rather than people:

data Tree:
    | leaf
    | node(value :: Number, left-child :: Tree, right-child :: Tree)
end

Every tree is either a leaf or a node with two children, where each child is also a tree.

We also talked about binary search trees, which are special kinds of trees where every node obeys the following two rules:

Its left subtree contains only values less than it
Its right subtree contains only values greater than it

Copy the data definition for a tree into the Pyret definitions window. Then, rewrite the following list as a binary search tree: [list: 7, 3, 5, 9, 10, 22, 17, 1]

A balanced binary search tree is a binary search tree that has minimal depth, equal to the number of jumps from the root (the first/topmost node) to the farthest leaf.

For example, this tree is unbalanced and has a depth of 2:

But this tree is balanced and has a depth of 1:
How balanced can you make the tree from the first question? Sketch a few versions of it on paper.
Predict lookup's runtime on a binary search tree. Is it different than your prediction for lookup's runtime on lists?
Now, write bst-lookup(num :: Number, tree :: Tree) -> Tree. Because trees don’t have indices the way that lists do, the output of bst-lookup should be the actual node containing the value.
If you have time, write the function bst-insert(num :: Number, tree :: Tree), which takes in a number and a balanced binary search tree and inserts that number into the tree. Remember that your new tree must also be a binary search tree (and, thus, must also follow the rules for binary search trees outlined above).

Part 4: Bonus (i.e., Optional): More Sorting Algorithms

This section is intended to work as a supplement for students who are considering taking CS18 next semester. If you are not considering CS18, feel free to try it out anyway! If you are, this content will probably help with the transition.

Below are descriptions of two additional sorting algorithms: insertion sort and quicksort. Implement them and predict their runtimes.

INSERTION SORT

Insertion sort works by inserting elements into their correct spots, one at a time.

If the input list is empty: return the empty list.
If the input list contains a single element: return a list containing just that element.
Run insertion-sort (recursively) on the rest of the list.
Insert the current first element into the correct place in the now-sorted rest of the list.

Below is an animation of insertion sort to help you visualize the process better:

QUICKSORT

Quicksort works by choosing a single element to be a “pivot” and rearranging the rest of the list based on that pivot.

If the input list is empty: return the empty list.
If the input list contains a single element: return a list containing just that element.
Choose the first element of the list to be the “pivot.”
Split all of the elements in the rest of the list into two separate lists: one containing all elements less than the pivot and one containing all elements greater than the pivot.
Run quicksort (recursively) on each of the new lists.
Join elements into a new list in the following order: first the one with elements less than the pivot, then the pivot, then the one with elements greater than the pivot.

Below is an animation of quicksort to help you visualize the process better:

A Happy Ending

Thanks for your help! Bridgette, Randy, and the cobbler went out to the saloon to celebrate the grand reopening of his shoppe! The boots on display were beautiful! Now, the Sheriff and Deputy can finally prepare to catch the cowboy-boots burglar!

Lab 8: Sorting

Part 1: List Sorting

SELECTION SORT

MERGE SORT

CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!

Part 2: Working with Ancestor Trees

1. Tracing in-tree

2. Counting Generations

3. Reflection

4. In case you want more …

Part 3: Working More with Lists

lookup on lists

CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!

lookup on trees

Part 4: Bonus (i.e., Optional): More Sorting Algorithms

INSERTION SORT

QUICKSORT

A Happy Ending

1. Tracing `in-tree`

`lookup` on lists

`lookup` on trees