CS195Y Lecture 16

3/3/2017


Announcements

Last time, we had this Empty sig that was causing some issues. Since then, I've removed it and cleaned up Descent a bit.

There's a more interesting property we'd like to check about our descent model. What makes a descent good? Say you're writing a binary tree in your favorite programming language. How do you know it works? Imagine an algorithm that checks if the value is at the root and just stops there. What's wrong with it?

We want to find a node with the number iff a node with the number exists in the tree.

assert findIFFThere {
    all d: Descent | {
        d.val in d.t.nodes.num iff d.path[d.path.lastIdx].num = d.val
    }
}

Does this seem right? We could run it, but let's examine the assertion for a bit. What could go wrong?

Q: What should the result of this be when the node isn't in the tree?

Tim: On the left, d.val in d.t.nodes.num is hopefully false, since d.t.nodes.num is all the numbers in the tree. And the last value of the descent (d.path[d.path.lastIdx].num) ought to just not be d.val.

Q: Are we vulnerable if we're looking for 5, but 3's in the tree so we end on 3?

Tim: Since we use d.val, if we end on something else that's perfectly fine. Both sides of the iff will be false.

Let's take a step back and just look at an instance with run {}.

Is this a tree? No, we get an instance with a self-loop, so it's not a tree. It's also not ordered...

Now, if we check the assertion, we get some counterexamples. What happened?

Student: We defined Descent to only work on binary search trees, but we're not requiring that we have one.

findIFFThere is a bit too strong: it requires Descent to work even when the tree isn't a binary search tree.

assert findIFFThere {
    all d: Descent | {
        isBSTree[d.t] implies {
            d.val in d.t.nodes.num iff d.path[d.path.lastIdx].num = d.val
        }
    }
}

This is something to be aware of with your Alloy specs. Sometimes your assertions are a bit too strong, and sometimes they can be too weak.
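
One cheap guard against the "too weak" direction: once the body sits behind isBSTree[d.t] implies ..., the assertion would pass vacuously if no Descent in scope ever had a binary search tree. Below is a minimal sketch of a sanity run for that. The sig declarations and the body of isBSTree are simplified stand-ins (assumptions, not the lecture's actual model) so the fragment loads on its own; in the real model you would keep only the run command.

```alloy
-- Stand-in declarations (assumptions, not the lecture's actual model).
sig Node { num: Int }
sig Tree {
    nodes: set Node,
    root: lone Node,
    lefts, rights: Node -> lone Node
}
sig Descent {
    val: Int,
    t: Tree,
    path: seq Node
}

-- Simplified placeholder: rooted, acyclic, ordered. The lecture's isBSTree may differ.
pred isBSTree[t: Tree] {
    let kids = t.lefts + t.rights | {
        kids in t.nodes -> t.nodes          -- edges stay inside the tree
        t.nodes = t.root.*kids              -- every node is reachable from the root
        no n: t.nodes | n in n.^kids        -- no cycles (rules out self-loops)
        all n: t.nodes | lone kids.n        -- at most one parent
        all n: t.nodes | {
            all l: n.(t.lefts).*kids | l.num < n.num
            all r: n.(t.rights).*kids | r.num > n.num
        }
    }
}

-- If this run finds no instance, the guarded assertion is passing vacuously.
run someBSTDescent {
    some d: Descent | isBSTree[d.t] and some d.t.nodes
} for 2 Tree, 1 Descent, 5 Node, 4 Int, 4 seq
```

If the run does produce an instance, the implication in findIFFThere has at least one case where its left-hand side holds, so a passing check says something.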

Now, I want to move on and use our Descent to prove that our addNode algorithm works. A big part of adding a node is doing a descent to find where it should go, so we've already done a lot of the work.

What fields should an AddNode operation have? Think about the event idiom. How should our sig fact be structured, using a similar style to Descent?

sig AddNode {
    toadd: Int,
    pre, post: Tree,
    finding: Descent
} {

}

If you were doing this in Java or OCaml or some other language, you might break this into cases depending on whether or not pre is empty. We can do the same thing in Alloy. It's like the paired implications I used earlier, but this is Alloy's equivalent of an if-else statement.
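
As a small standalone illustration of the => ... else form, here is a toy model. Box, item, and label are hypothetical names invented for this sketch and have nothing to do with the tree sigs. The assertion restates the same constraint as a pair of implications, so checking it is one way to see that the two styles say the same thing.

```alloy
-- Hypothetical toy model: only here to show the shape of `=> ... else`.
sig Box {
    item: lone Int,
    label: Int
} {
    -- "if the box is empty, label is 0; otherwise label is the item"
    no item => {
        label = 0
    } else {
        label = item
    }
}

-- The same constraint written as paired implications.
assert sameAsPairedImplications {
    all b: Box | (no b.item => b.label = 0) and (some b.item => b.label = b.item)
}

check sameAsPairedImplications for 3 Box, 4 Int
```

The else form saves you from writing the negated condition by hand, which is where paired implications tend to go wrong.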

sig AddNode {
    ...
} {
    no pre.root => {
        -- no search needed; create single node tree
        -- Alloy doesn't have a way to make a new tree, so we check this by describing what post should look like
        -- It turns out that because of the constraints we have, we don't need to explicitly say that the one node
        -- in post is the root.
        one post.nodes
        no post.lefts
        no post.rights
        post.nodes.num = toadd
    } else {
        -- Have our finding actually find what we're looking for
        finding.val = toadd
        finding.t = pre

        -- Q: is this an overconstraint?
        -- The way you could check this is to make two versions of AddNode and look for differences in their behavior
        -- with a predicate.
        pre.root = post.root -- We're assuming no rebalancing

        -- Have we stopped at the value we were trying to find?
        finding.path.last.num = finding.val => {
            -- The thing we're trying to add is already here. We could either force pre and post to be the same
            -- structurally and have the same fields, or just reuse the tree (which is what we'll do here)
            pre = post
        } else {
            -- We have to add it!
            let lastdata = finding.path.last |
                -- Q: Is `Node - pre.nodes` also an overconstraint?
                some newnode : Node - pre.nodes | { -- new Node();
                    newnode.num = toadd
                    post.nodes = pre.nodes + newnode

                    -- Break into two cases to see which pointer I look at
                    lastdata.num < toadd implies {
                        -- Q: What if you had 7, 6, and 8, and were adding 5? Don't you have to add a link from 5 to 7?
                        -- Tim: We're only adding leaves and not doing any rebalancing, so we don't! Instead, we'd have 6 -> 5.
                        post.rights = pre.rights + (lastdata -> newnode)
                        post.lefts = pre.lefts
                    }
                    lastdata.num > toadd implies {
                        post.lefts = pre.lefts + (lastdata -> newnode)
                        post.rights = pre.rights
                    }

                    -- There is some harmless overconstraint here, but we can always root it out with comparison predicates.
                }
        }
    }
}
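
The comments above suggest hunting for overconstraint by comparing two versions of the operation. Here is a minimal sketch of that pattern. The sigs are stripped-down stand-ins, and looseAdd / strictAdd are hypothetical predicates written for this example (not the lecture's AddNode), with strictAdd adding the "root never changes" constraint on top of looseAdd.

```alloy
-- Stand-in declarations (assumptions for this sketch only).
sig Node {}
sig Tree {
    nodes: set Node,
    root: lone Node
}

-- Hypothetical looser version: an add only ever grows the node set.
pred looseAdd[pre, post: Tree] {
    pre.nodes in post.nodes
    some post.root
}

-- Hypothetical stricter version: same, plus the root is never replaced.
pred strictAdd[pre, post: Tree] {
    looseAdd[pre, post]
    pre.root = post.root
}

-- Any instance is a behavior the extra constraint rules out.
run looseButNotStrict {
    some pre, post: Tree | looseAdd[pre, post] and not strictAdd[pre, post]
} for 2 Tree, 3 Node
```

Each instance the Analyzer shows is something the stricter version forbids; you then decide whether that behavior was ever legitimate (harmless overconstraint) or whether the stricter version is hiding a real case.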

Q: Why don't we make an extension to Tree that forces it to be a binary search tree, since we're doing a lot with them?

Tim: We totally could here, but I'm doing it this way because on your final projects, several of you will probably have multiple kinds of objects that you want to support. Starting out with predicates gives you more flexibility to do that, especially as you figure out what those objects should look like.

What does it mean for an add to be correct? If we start with a binary search tree, adding ought to produce a binary search tree.

assert addpreserves {
    all a: AddNode | {
        isBSTree[a.pre] implies isBSTree[a.post]
    }
}

-- We're reasoning about any possible AddNode in isolation, so only have one.
check addpreserves for 2 Tree, 1 Descent, 1 AddNode, 5 Node, 4 Int, 4 seq

This passes, but what should we be suspicious of?

All we know is that it's a binary search tree. We don't know if it actually succeeded in adding the node.
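
One way to close that gap is a second assertion about the contents of post. The sketch below is an assumption about how that might be phrased, not something from the lecture: it says the numbers in post are exactly the numbers in pre plus toadd (in Alloy, + on sets of Int is union, not arithmetic). The sigs are bare stubs with no sig fact so the fragment loads on its own; with only these stubs the check is expected to find counterexamples, and whether it passes against the real AddNode depends on the full model.

```alloy
-- Bare stubs (assumptions); the real model has more fields and a sig fact on AddNode.
sig Node { num: Int }
sig Tree { nodes: set Node }
sig AddNode {
    toadd: Int,
    pre, post: Tree
}

-- After an add, post holds every number pre held, plus toadd, and nothing else.
assert addActuallyAdds {
    all a: AddNode | a.post.nodes.num = a.pre.nodes.num + a.toadd
}

check addActuallyAdds for 2 Tree, 1 AddNode, 5 Node, 4 Int
```

In the lecture's model you would keep only the assert and check (adding the Descent and seq scopes from the addpreserves command) and consider guarding it with isBSTree[a.pre], as in addpreserves.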

Do we actually need seq here? In this case, we only have one descent, so there's only one seq and we could have used an ordering instead.