CS195Y Lecture 30

4/19/17

Rosette and SMT Solving

Last time

We showed how Z3 could be used to factor a polynomial without using the quadratic formula. We specified an equation and asked for the numbers that preserve equality between the factored and unfactored versions - the roots.

We saw that error messages are often unhelpful: sometimes you'll get an 'unsat' result when it's really just a syntax error.

Why are we using the Racket binding for Z3 instead of Python or Java? Because we'll be able to use those bindings in conjunction with actual Racket programs and prove things about those programs in the way we did with mathematical equations.

SMT Solvers

How do SMT solvers work? We've seen how to make SAT solvers that can reason about boolean formulas, but how can we make them more expressive, to the point that we can prove things about ints or strings? We want to be able to reason about more complicated, high-level notions - not just simple boolean formulas.

How does integrating integers into SMT solvers work? There are two main strategies: eager and lazy.

Eager

Take an expression that has integers in it and represent those integers with boolean values. For example, take the expression x > 5. If we set the max bitwidth to be 3 (and don't have negative numbers), then we know that every integer is composed of 3 bits, each of which can be either 1 or 0 (true or false).

x0 is the 1's place, x1 is the 2's and x2 is the 4's

We know that 6 and 7 satisfy x > 5 (with the bounded bitwidth), and we can translate them into a boolean formula using the binary representation.

6 = ((not x0) and x1 and x2) = 110 in binary (written x2 x1 x0)
7 = (x0 and x1 and x2) = 111 in binary
and the whole thing can be written: (((not x0) and x1 and x2) or (x0 and x1 and x2))

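This method involves knowing the integers and then transforming them into the boolean representation. Instead, you can solve it more directly by looking at the boolean representation of 5 (x0 and (not x1) and x2) and giving a formula over the bits of x that compares them against the bits of 5, starting from the highest place: x is larger if it has a true bit in some place xn where 5 has a false one and the two agree on every higher place. With 3 bits, the only such place is x1, so the formula is (x1 and x2).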
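To make the eager encoding concrete, here is a minimal Python sketch (not from lecture) that enumerates all 8 assignments of the three bits and checks the boolean formulas against x > 5. The function names are made up for illustration, and the simplified formula (x1 and x2) is our own reading of the "more direct" encoding, so treat it as an assumption that the code verifies by brute force.

```python
from itertools import product

def value(x0, x1, x2):
    """Integer encoded by the bits: x0 is the 1's place, x1 the 2's, x2 the 4's."""
    return 1 * x0 + 2 * x1 + 4 * x2

def enumerated(x0, x1, x2):
    """Formula from the notes: the encoding of 6 or the encoding of 7."""
    return ((not x0) and x1 and x2) or (x0 and x1 and x2)

def simplified(x0, x1, x2):
    """Assumed direct encoding: both high bits set, low bit unconstrained."""
    return x1 and x2

def check_encodings():
    """Return True if both formulas agree with x > 5 on every 3-bit assignment."""
    for x0, x1, x2 in product([False, True], repeat=3):
        expected = value(x0, x1, x2) > 5
        if enumerated(x0, x1, x2) != expected:
            return False
        if simplified(x0, x1, x2) != expected:
            return False
    return True

if __name__ == "__main__":
    print(check_encodings())
```

A real eager solver would hand the boolean formula to a SAT solver rather than enumerate assignments; the enumeration here is only a way to check the encoding by hand.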

Lazy

Solving larger expressions with the eager method would be really slow - so many new terms are added for each integer. Instead, we will be 'lazy.' Lazy, in computer science, means that we defer evaluation of something until it's absolutely necessary. In the eager version we evaluated everything we could up front; here we'll do it gradually.

How do we solve this:
(y <= 5 or x > y) and (y > 6 or x > 5)

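Observe that y <= 5 and y > 6 can't both be true at the same time. We'll rewrite this into a boolean skeleton of the formula to get rid of the numbers, replacing each comparison with a boolean variable (A for y <= 5, B for x > y, C for y > 6, D for x > 5):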

(A or B) and (C or D)

and now it's something a regular SAT solver can work with. It gives us an instance (like {A, C}), and the engine that knows how to deal with numbers can check it using a system of inequalities: 6 < y <= 5, and see that it is unsat. So then it just asks for another instance, and it might give us {A, D}, for example.

This is called an offline method because we defer all the integer work to a separate engine. An online method could add its own constraints into the original formula based on what it knows about the numbers (like the incompatibility of A and C).
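The offline loop can be sketched in a few lines of Python. This is an illustration under loud assumptions, not how Z3 is implemented: boolean_models stands in for the SAT solver by enumerating assignments to A, B, C and D, and theory_check stands in for the integer engine by searching a small bounded range of values (a real engine decides the inequalities without any bound). All names here are hypothetical.

```python
from itertools import product

# Atoms of (y <= 5 or x > y) and (y > 6 or x > 5), named as in the skeleton.
ATOMS = {
    "A": lambda x, y: y <= 5,
    "B": lambda x, y: x > y,
    "C": lambda x, y: y > 6,
    "D": lambda x, y: x > 5,
}

def skeleton(m):
    """The boolean skeleton (A or B) and (C or D)."""
    return (m["A"] or m["B"]) and (m["C"] or m["D"])

def boolean_models():
    """Stand-in for a SAT solver: yield each assignment satisfying the skeleton."""
    names = sorted(ATOMS)
    for bits in product([True, False], repeat=len(names)):
        m = dict(zip(names, bits))
        if skeleton(m):
            yield m

def theory_check(m, lo=-10, hi=10):
    """Stand-in for the integer engine (assumption: bounded brute-force search).

    Returns some (x, y) giving every atom the truth value in m, or None.
    """
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            if all(ATOMS[name](x, y) == val for name, val in m.items()):
                return (x, y)
    return None

def lazy_solve():
    """Ask for boolean instances until the integer engine accepts one."""
    rejected = []
    for m in boolean_models():
        witness = theory_check(m)
        if witness is not None:
            return m, witness, rejected
        rejected.append(m)
    return None, None, rejected

if __name__ == "__main__":
    model, witness, rejected = lazy_solve()
    print(model, witness, len(rejected))
```

An online variant would, after rejecting an instance containing both A and C, add a constraint like (not (A and C)) to the skeleton so the SAT solver never proposes that combination again.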

Uninterpreted functions with equality

A function is a relation, and an uninterpreted function is a relation that does not have any constraints on what is in it. As we saw in Alloy, you can define a relation R, but that relation can be anything at all until you add some constraints to specify meaning. By contrast, an interpreted relation like less-than-or-equal has strictly defined behavior: 10 <= 1 is false, so (10, 1) is not a member of that relation.

A theory is a set of constraints that force the relation to behave the way that we expect; they enforce a certain interpretation. For example, what is the theory for the equality relation?

1. For all x, x = x - the reflexive property
2. For all x and y, x = y implies y = x - symmetric property
3. For all x, y and z, x = y and y = z implies x = z - transitive property

Is that enough to constrain equality to what we really mean? Not quite. We actually need some notion of congruence over other relations: i.e. if two things are equal, they should be treated in the same way by all other relations.

4. For all relations R: for all x1 ... xn and y1 ... yn, if x1 = y1 ... xn = yn, then it must be true that R(x1 ... xn) iff R(y1 ... yn)

Now we have a theory of equality! We have defined what it means for two things to be equal. Some theories are easy to solve, others are harder and some are undecidable.
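As a sanity check on the four axioms, here is a small Python sketch (not from lecture) that tests them on a finite domain. The candidate equality is given as a set of pairs, and other relations as sets of tuples; the function names and the example domain are made up for illustration. Note that axiom 4 quantifies over all relations, while this code can only check one given relation at a time.

```python
from itertools import product

def is_equivalence(domain, eq):
    """Axioms 1-3: eq is reflexive, symmetric and transitive on domain."""
    reflexive = all((x, x) in eq for x in domain)
    symmetric = all((y, x) in eq for (x, y) in eq)
    transitive = all(
        (x, z) in eq
        for (x, y) in eq
        for (y2, z) in eq
        if y == y2
    )
    return reflexive and symmetric and transitive

def respects(domain, eq, rel, arity):
    """Axiom 4 for one relation: equal arguments get the same answer from rel."""
    for xs in product(domain, repeat=arity):
        for ys in product(domain, repeat=arity):
            pairwise_equal = all((x, y) in eq for x, y in zip(xs, ys))
            if pairwise_equal and ((xs in rel) != (ys in rel)):
                return False
    return True

# Hypothetical example: a and b are considered equal, c is on its own.
DOMAIN = {"a", "b", "c"}
EQ = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b"), ("b", "a")}

GOOD = {("a",), ("b",)}  # treats a and b the same way
BAD = {("a",)}           # distinguishes a from b, violating congruence

if __name__ == "__main__":
    print(is_equivalence(DOMAIN, EQ))
    print(respects(DOMAIN, EQ, GOOD, 1))
    print(respects(DOMAIN, EQ, BAD, 1))
```

The BAD relation shows why axioms 1-3 alone are not enough: EQ passes all three, yet a relation can still tell a and b apart unless congruence is required.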