Lab 8: Sorting

You are not required to complete any part of this lab, this is posted purely as a resource.
Some of the content in this lab may not have been covered in class due to the adjusted schedule.

Billy the Teen is working the night shift as a guard for a local shoe shoppe with the hope of catching the dastardly cowboy-boots burglar. To get the scoop (and the stolen boots), the sheriff and his deputy go to talk to the cobbler.

They find out that the devious burglar has been replacing the boots he stole—which are real leather—with faux leather versions! The cobbler has been having a hard time distinguishing the real leather from the cheap faux leather, meaning a lot of customers have been coming in with refund requests! Indubitably, this is a job for Sheriff Bridgette and Deputy Randy.

Part 1: List Sorting

There are many ways to distinguish real leather from fake leather. Real leather, for example, tends to have an uneven texture with blemishes and other imperfections while faux leather looks uniform and smooth. Help the cobbler by writing a sorting algorithm to sort the real leather boots from the fake ones.

What’s a sorting algorithm?
There are many ways to sort a list. Some are fast, some are slow. Some work very well on certain kinds of inputs and very poorly on others.

A sorting algorithm is a series of steps that outlines a particular way to sort a list. We will examine two sorting algorithms throughout this lab: selection sort and merge sort.

SELECTION SORT

MERGE SORT

  1. Think about why both algorithms (selection and merge sort) are guaranteed to produce a sorted list. Post on Campuswire if you’re not sure!

  2. Pull up this file, which provides the code for both algorithms.

  3. Look at the code for selection sort. We’ll use the number of comparisons (such as <) as our measure of the runtime for this algorithm.

    1. Think back to your collection of test inputs for selection sort. Are any of those inputs likely to require more comparisons than the other inputs? Do any require fewer comparisons than the others?
    2. Write a formal prediction of the worst-case running time of selection sort. You can express this as a broad class of the function (e.g., “constant,” “linear,” “quadratic,” etc.) or as an equation in terms of N, the list size.
    3. The file we gave you loads a new library called efficiency-sort-grapher. Read the instructions at the bottom of the file to experiment with the runtimes of selection sort and merge sort.
  4. Predict how many element comparisons occur within the worst case with selection sort.

  5. Now write a prediction for the runtime of merge sort! For this algorithm, use the number of “links” as the measure of runtime.

  6. We have written several functions that allow you to test your predictions on lists of any size: link to code. Use them to experiment with the runtimes of selection sort and merge sort!

Which sorting algorithm should the cobbler use? In what case(s)?


CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!


Part 2: Working with Ancestor Trees

The following activity gives you a chance to understand the in-tree function covered in class and to write another function (or two) on ancestor trees.

1. Tracing in-tree

This file contains an AncTree datatype and the in-tree function.

For each of the following expressions, write out the sequence of calls to in-tree that get made while evaluating those expressions:

Open the tracer-version of Pyret. Copy and paste the code from the file into the definitions area. Press “Trace and Run.” Enter the following expression at the prompt:

then press “Trace Last.” Set the dropdown menu to read “Depth First,” and use the arrow keys to the right of the dropdown to see the order of the function calls that get made. Does this order match what you predicted?

Now run in-tree on a name that is not in the tree. Again use “Trace Last” to see the sequence of calls that get made. Does this sequence match what you would have expected?

2. Counting Generations

We’d like to be able to compute how many generations are represented in the tree. In the case of annaT, there are 5 generations (from Anna up to Robert).

Your goal is to write a function called count-gens that takes an AncTree as an input and produces a number indicating the number of generations in the tree.

Start by writing examples on small trees with no more than two generations. Your examples should have unknown people in various places. Can you connect the results of one tree to the next? For example, if you started from the tree for Ellen in our running example, how would you use the results of Laura and John’s trees to compute the number of generations upwards from Ellen?

Remember to break into cases on AncTree and to call count-gens on each of the mother and father branches/trees.

As you did for in-tree, copy your code into the Tracer version and use “Trace Last” to explore the order in which the people in the tree get processed.

3. Reflection

Write a block comment answering the following questions to see whether you are understanding trees:

4. In case you want more …

Can you write a function to count how many blue-eyed people are in the tree? How about a function to return a list of all the names in the tree? Feel free to try these if you want more practice.

Part 3: Working More with Lists

This section is designed moreso for students who wish to take CS18 next semester.

The shoe shoppe’s reputation has been saved, but the cobbler’s sales are still not what they used to be. He suspects that this dip in revenue is because the process by which he finds the faux leather utilizes a slow, inefficient lookup function…

lookup on lists

Consider the function lookup(num :: Number, lst :: List<Number>) -> Number which, when given a list of numbers, returns the index of a particular number in the list.

  1. Predict this function’s runtime. What would its runtime be in the best case? How about the worst case?
  2. How does runtime change if the input list is sorted or unsorted, assuming it changes at all?
  3. In the Pyret definitions window, write thelookup function.

CHECKPOINT: Make sure you have written comments to answer all the questions. Post on CW if you have any questions!


lookup on trees

Randy suggests using a tree to speed up the lookup function!

Here is a general tree datatype that makes a tree containing numbers rather than people:

data Tree:
    | leaf
    | node(value :: Number, left-child :: Tree, right-child :: Tree)
end

Every tree is either a leaf or a node with two children, where each child is also a tree.

We also talked about binary search trees, which are special kinds of trees where every node obeys the following two rules:

  1. Copy the data definition for a tree into the Pyret definitions window. Then, rewrite the following list as a binary search tree: [list: 7, 3, 5, 9, 10, 22, 17, 1]

    A balanced binary search tree is a binary search tree that has minimal depth, equal to the number of jumps from the root (the first/topmost node) to the farthest leaf.

    For example, this tree is unbalanced and has a depth of 2:

    But this tree is balanced and has a depth of 1:

  2. How balanced can you make the tree from the first question? Sketch a few versions of it on paper.

  3. Predict lookup's runtime on a binary search tree. Is it different than your prediction for lookup's runtime on lists?

  4. Now, write bst-lookup(num :: Number, tree :: Tree) -> Tree. Because trees don’t have indices the way that lists do, the output of bst-lookup should be the actual node containing the value.

  5. If you have time, write the function bst-insert(num :: Number, tree :: Tree), which takes in a number and a balanced binary search tree and inserts that number into the tree. Remember that your new tree must also be a binary search tree (and, thus, must also follow the rules for binary search trees outlined above).

Part 4: Bonus (i.e., Optional): More Sorting Algorithms

This section is intended to work as a supplement for students who are considering taking CS18 next semester. If you are not considering CS18, feel free to try it out anyway! If you are, this content will probably help with the transition.

Below are descriptions of two additional sorting algorithms: insertion sort and quicksort. Implement them and predict their runtimes.

INSERTION SORT

Insertion sort works by inserting elements into their correct spots, one at a time.

QUICKSORT

Quicksort works by choosing a single element to be a “pivot” and rearranging the rest of the list based on that pivot.

A Happy Ending

Thanks for your help! Bridgette, Randy, and the cobbler went out to the saloon to celebrate the grand reopening of his shoppe! The boots on display were beautiful! Now, the Sheriff and Deputy can finally prepare to catch the cowboy-boots burglar!