CS195Y Lecture 26

4/20/16

LiquidHaskell (LH)

Case 1: Absolute Value

Let’s write a function for absolute value. What’s its type?

It takes an integer (usually) and produces another integer. We write this as

abz :: Int -> Int
abz n | 0 < n     = n
      | otherwise = 0 - n

This is correct and passes Haskell’s type checker, but the types don’t quite express the property we want of abz's output: that it is a non-negative integer.

We can express this by writing a LH refinement:

{-@ abz :: Int -> {v : Int | v >= 0} @-}
abz :: Int -> Int
abz n | 0 < n     = n
      | otherwise = 0 - n

With the refinement, LH checks whether the function always returns a non-negative number. In this case LH can prove that it does, so when we hit the check button, LH will say that it’s SAFE. Note that checking is done statically, just like type checking when compiling Haskell code, without running the Haskell program at all.

Indeed, LH would be able to catch buggy implementations which return a negative integer.

{-@ abz :: Int -> {v : Int | v >= 0} @-}
abz :: Int -> Int
abz n | 0 < n     = n
      | otherwise = (-1) - n

Now, when we hit the check button, LH will say that it’s UNSAFE.
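To see concretely why the buggy version cannot satisfy the refinement, here is a small sketch that runs it on a few inputs and collects those with a negative result. The names abzBuggy and witnesses are made up for this illustration. GHC ignores the refinement comment and compiles the file as usual; LH is the tool that would reject it.

```haskell
-- Hypothetical name: abzBuggy is the buggy variant from the notes, renamed
-- for this illustration. LH would report UNSAFE for this refinement.
{-@ abzBuggy :: Int -> {v : Int | v >= 0} @-}
abzBuggy :: Int -> Int
abzBuggy n | 0 < n     = n
           | otherwise = (-1) - n

-- Inputs from a small range on which the claimed property v >= 0 fails.
witnesses :: [Int]
witnesses = [n | n <- [-3 .. 3], abzBuggy n < 0]
```

On this range the only failing input is n = 0, where the second branch returns (-1) - 0 = -1. Unlike this run-time search, LH finds the problem without running the program and without us picking a range.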

More on Refinements:

A refinement ({-@ ..... @-}) is a Haskell block comment ({- ..... -}) with additional bookend @ characters, so it is still syntactically a Haskell block comment. Therefore, it does not affect the program at run time at all. It’s called a “refinement” because it refines a Haskell type to be more expressive, as in the example above.

To comment out a refinement, simply delete the first @ character. This will make the whole block a normal Haskell comment, and LH will not consider this a refinement.
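As a minimal sketch of the difference, the first function below has an active refinement, while the second has its refinement turned into a plain comment by deleting the first @. The name abzUnchecked is made up for this illustration.

```haskell
-- Active refinement: LH reads this block and checks abz against it.
{-@ abz :: Int -> {v : Int | v >= 0} @-}
abz :: Int -> Int
abz n | 0 < n     = n
      | otherwise = 0 - n

-- Disabled refinement: without the first @ this is an ordinary Haskell
-- comment, so LH has no refinement to check for abzUnchecked (a hypothetical
-- name), even though its second branch can return a negative number.
{- abzUnchecked :: Int -> {v : Int | v >= 0} @-}
abzUnchecked :: Int -> Int
abzUnchecked n | 0 < n     = n
               | otherwise = (-1) - n
```

Either way, the Haskell compiler sees only comments, so both functions compile and run exactly as they would without the annotations.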

Case 2: Division

From Monday’s lecture, we know that the following divide could cause an exception when d = 0

divide :: Int -> Int -> Int
divide n d = div n d

We will provide a refinement that prevents anyone from calling divide with d = 0.

{-@ divide :: Int -> {v : Int | v != 0 } -> Int @-}
divide :: Int -> Int -> Int
divide n d = div n d

Now, if we write:

myval = divide 6 3

LH will say that it’s SAFE. On the other hand, if we write:

myval2 = divide 6 0

Then, LH will say that it’s UNSAFE.
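The two calls above use literal divisors. When the divisor is only known at run time, the caller has to establish d != 0 itself, for example with a guard. The sketch below assumes a helper called safeDivide, which is not from the lecture. LH is expected to accept it, because inside the first branch the guard tells it that d is non-zero.

```haskell
{-@ divide :: Int -> {v : Int | v != 0 } -> Int @-}
divide :: Int -> Int -> Int
divide n d = div n d

-- Hypothetical helper (not from the lecture): divide is only called in the
-- branch where the guard has already ruled out d = 0.
safeDivide :: Int -> Int -> Maybe Int
safeDivide n d
  | d /= 0    = Just (divide n d)
  | otherwise = Nothing
```

For example, safeDivide 6 3 gives Just 2 and safeDivide 6 0 gives Nothing, so the exception can no longer occur.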

Case 3: Average

We define a function that computes the average of a list of Ints.

average :: [Int] -> Int
average xs = divide total n
  where
    total = sum xs
    n = length xs

Alright, so what happens if we hit the check button? It’s UNSAFE! If average gets an empty list, length xs will be 0. Consequently, divide will get 0 as its second argument.

Broadly, there are two ways to solve this problem:

1. Explicitly define the behavior of average on an empty list, such as average [] = 0. LH would then understand that divide is only invoked when the list is non-empty.

   However, why choose 0? The average of an empty list is really undefined. We just don’t want this case to happen!

2. Write a refinement for average so that LH knows that we are not going to pass an empty list to the function:
```
{-@ average :: {l: [Int] | len l > 0} -> Int @-}
average :: [Int] -> Int
average xs = divide total n
  where
    total = sum xs
    n = length xs
```
Now, LH knows that the input is going to be a non-empty list, so it can prove that length xs is going to be non-zero. Thus, LH says it’s SAFE.
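With the refinement in place, the burden moves to the callers of average: each call site must show that its list is non-empty. One way a caller might do this is by pattern matching first. The wrapper averageMaybe below is a made-up name for this sketch, not part of the lecture.

```haskell
{-@ divide :: Int -> {v : Int | v != 0 } -> Int @-}
divide :: Int -> Int -> Int
divide n d = div n d

{-@ average :: {l: [Int] | len l > 0} -> Int @-}
average :: [Int] -> Int
average xs = divide total n
  where
    total = sum xs
    n = length xs

-- Hypothetical wrapper: average is only called on a list that has matched
-- the non-empty pattern, so no arbitrary value is invented for [].
averageMaybe :: [Int] -> Maybe Int
averageMaybe []         = Nothing
averageMaybe xs@(_ : _) = Just (average xs)
```

Here averageMaybe [1, 2, 3] gives Just 2, and averageMaybe [] gives Nothing instead of an arbitrary number such as 0.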

As you can see, in a refinement you can use this magic len in addition to Boolean operations. This len is called a measure. Note that a measure is different from a function: a measure is used inside a refinement, while a function is used inside code, so len and length are different things. If we define our own datatype, LH has a mechanism to let us define our own measures as well, though we will not talk about this for now.

How does LH work?

Let’s analyze

{-@ abz :: Int -> {v : Int | v >= 0} @-}
abz :: Int -> Int
abz n | 0 < n     = n
      | otherwise = 0 - n

LH tries to find a counterexample that makes this fail.

In the first branch, the condition is that 0 < n, and the goal is to prove that the result is greater than or equal to zero. To find a counterexample, LH tries to find

some n : Int | 0 < n and not (n >= 0)

In the second branch, the condition is that 0 >= n (otherwise), and the goal is to prove that the result is greater than or equal to zero. To find a counterexample, LH tries to find

some n : Int | 0 >= n and not (0 - n >= 0)

Obviously, these two are unsatisfiable. While this is something humans can pick out readily, it’s more challenging for computers. A model checker like Alloy would perform an exhaustive check within some integer bounds that we have specified. However, what we really want is to check for all Int! (Technically, Int in Haskell is bounded, but it’s still big enough that Alloy wouldn’t be able to handle it efficiently.)
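To make the contrast concrete, here is a rough sketch of what a bounded exhaustive check of the two counterexample queries looks like when written directly in Haskell. The function names and the bound parameter are assumptions for this illustration. This is not how LH works; it only mimics the "search within a bound" style.

```haskell
-- Hypothetical helpers: each one enumerates every n in [-bound .. bound] and
-- keeps those that satisfy the branch condition but violate the goal.

-- First branch: some n | 0 < n and not (n >= 0)
branch1Counterexamples :: Int -> [Int]
branch1Counterexamples bound =
  [n | n <- [-bound .. bound], 0 < n, not (n >= 0)]

-- Second branch: some n | 0 >= n and not (0 - n >= 0)
branch2Counterexamples :: Int -> [Int]
branch2Counterexamples bound =
  [n | n <- [-bound .. bound], 0 >= n, not (0 - n >= 0)]
```

Both lists are empty for a bound such as 1000, but that only tells us about the integers we enumerated. The cost grows with the bound, and no finite bound covers every integer.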

On the other hand, SMT solvers like Z3 come with theories that let us check whether constraints are satisfiable or not. And indeed, under the hood LH actually uses Z3 to check for satisfiability! Recall that SMT solvers have 3 possible outputs: sat, unsat, and unknown. “unknown” could happen when the constraints we input are undecidable. However, LH limits the expressivity of the refinements that we can write to make sure that the constraints it generates are decidable. That means LH will always be able to output either SAFE or UNSAFE.

(There are some tools called “proof assistants” which let us write very expressive constraints. This comes at the cost that we have to prove these constraints by hand, as computers cannot prove them automatically due to undecidability.)

With the above mechanism, LH can confidently say SAFE.

Case 4: Insertion Sort’s `insert`

Let’s talk about insertion sort’s insert function:

insert :: Int -> [Int] -> [Int]
insert x [] = [x]
insert y (x:xs)
  | y < x = y:(x:xs)
  | otherwise = x:(insert y xs)

One very important property of a recursive function that we should care about is that it terminates. The reason is that an infinite loop vacuously returns a value of any type[1]. For example, if LH did not care about termination (there’s a mode to disable the termination check), the following would result in SAFE:

{-@ insertBad :: Int -> lst: [Int] -> {v : [Int] | len v = 1 } @-}
insertBad :: Int -> [Int] -> [Int]
insertBad x [] = [x]
insertBad y (x:xs) = insertBad y (x:xs)

But we wouldn’t say that an infinite loop is a list of length 1!

How do we prove that a function will terminate? Intuitively, there must be “something” that gets smaller and smaller in every iteration. Moreover, that “something” must be bounded (we don’t want infinite decreasing). With these two conditions (decreasing and bounded), eventually that “something” must hit the ground, which is when the function terminates.

By default, LH checks that every function terminates. Specifically, LH wants us to provide an expression called a decreasing expression. As the name suggests, the expression should be strictly decreasing from one recursive call to the next. It should be composed of arguments of the function and evaluate to a natural number (>= 0). Some examples of valid decreasing expressions for insert are

len lst
2 * (len lst)
(len lst) + 1

where lst refers to the second argument of the function. All of our choices of decreasing expression work because they are decreasing as computation continues. Also, the results are all greater than or equal to 0. On the other hand

-(len lst)
(len lst) - 1

Neither would be a valid decreasing expression for insert: -(len lst) increases as the list gets shorter, and (len lst) - 1 is negative when lst is empty.
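These claims can be checked by hand on one run of insert. The sketch below lists the length of the second argument at each successive call in the worst case (the new element is larger than everything, so insert recurses all the way down to []), and then evaluates a candidate expression at each call. All names here are made up for the illustration, and a candidate is given as a function of len lst.

```haskell
-- Lengths of the second argument at each successive call of insert when it
-- recurses all the way to the empty list: len lst, len lst - 1, ..., 0.
callLengths :: [Int] -> [Int]
callLengths lst = [length (drop k lst) | k <- [0 .. length lst]]

-- Value of a candidate expression at each of those calls.
metricTrace :: (Int -> Int) -> [Int] -> [Int]
metricTrace expr lst = map expr (callLengths lst)

-- A candidate passes on this run if its values strictly decrease from call
-- to call and are never negative.
validOn :: (Int -> Int) -> [Int] -> Bool
validOn expr lst = strictlyDecreasing && nonNegative
  where
    vals               = metricTrace expr lst
    strictlyDecreasing = and (zipWith (>) vals (drop 1 vals))
    nonNegative        = all (>= 0) vals
```

For lst = [1, 2, 3] the lengths are 3, 2, 1, 0. The candidates id, (* 2), and (+ 1) all pass. negate gives -3, -2, -1, 0, which increases, and subtract 1 gives 2, 1, 0, -1, which ends below zero. A single run like this is only a sanity check; LH establishes the property for all inputs.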

Also, by default, LH looks at the first argument that is “decreasable” (e.g., Int) and assumes that this argument is the decreasing expression. This obviously fails for our function insert, as our first argument has nothing to do with the termination of the function. Therefore, if you hit the check button, LH will show UNSAFE.

As mentioned above, it would be nice if we could name the second argument so that we can refer to it in the decreasing expression. And yes, we can!

{-@ insert :: Int -> lst: [Int] -> [Int] @-}
insert :: Int -> [Int] -> [Int]
insert x [] = [x]
insert y (x:xs)
  | y < x = y:(x:xs)
  | otherwise = x:(insert y xs)

Now, the second argument is known as lst. We then can write a decreasing expression:

{-@ insert :: Int -> lst: [Int] -> [Int] / [len lst] @-}
insert :: Int -> [Int] -> [Int]
insert x [] = [x]
insert y (x:xs)
  | y < x = y:(x:xs)
  | otherwise = x:(insert y xs)

If we hit the check button again, LH will say that it’s SAFE.

There are other things that we could refine for insert, such as that the output will be ordered, or that the length of the output will be the length of the second argument (lst) plus one. We will do the latter as it’s easier.

Because now we have the name of the second argument, we could relate several variables together:

{-@ insert :: Int -> lst: [Int] -> {v : [Int] | len v = len lst + 1 } / [len lst] @-}
insert :: Int -> [Int] -> [Int]
insert x [] = [x]
insert y (x:xs)
  | y < x = y:(x:xs)
  | otherwise = x:(insert y xs)

This property (len v = len lst + 1) will be crucial in proving that insertion sort returns a list of the same length as its input.
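As a sketch of where this is heading, here is one way the full sort might be written on top of insert. The function insertSort and its refinement are assumptions, not something checked in this lecture. The idea is that each call to insert adds exactly one element to an already sorted tail, so the lengths line up.

```haskell
{-@ insert :: Int -> lst: [Int] -> {v : [Int] | len v = len lst + 1 } / [len lst] @-}
insert :: Int -> [Int] -> [Int]
insert x [] = [x]
insert y (x:xs)
  | y < x = y:(x:xs)
  | otherwise = x:(insert y xs)

-- Hypothetical sort built on insert. The refinement says that the output is
-- exactly as long as the input.
{-@ insertSort :: xs: [Int] -> {v : [Int] | len v = len xs } @-}
insertSort :: [Int] -> [Int]
insertSort []     = []
insertSort (x:xs) = insert x (insertSort xs)
```

In the second equation, the recursive call returns a list of length len xs, and insert turns it into a list of length len xs + 1, which is the length of x:xs.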

LH’s Syntax

Basic

A refinement is written essentially the same way as the type annotation that we would normally write. However, we can replace types with set comprehensions whose conditions refine the types.

{ <id> : <Type> | <condition> }

For example, we could refine Int to

{ v : Int | v >= 5 || v <= -10 }

which reads: "a type/set whose value/element that we will call v is of type Int, and v >= 5 or v <= -10"

Conventionally, we will use v to refer to a value/element in the type/set.
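As a small sketch of this refined type in use, the function below returns a value that always lies in the set described above. The name awayFromZero is made up for this illustration.

```haskell
-- Hypothetical function whose result is always >= 5 or <= -10.
{-@ awayFromZero :: Int -> { v : Int | v >= 5 || v <= -10 } @-}
awayFromZero :: Int -> Int
awayFromZero n
  | n >= 0    = n + 5   -- n >= 0, so n + 5 >= 5
  | otherwise = n - 10  -- n < 0, so n - 10 <= -10
```

The comments spell out the reasoning for each branch, in the same style as the abz analysis: the branch condition plus the returned expression must imply the condition on v.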

Naming

We could name any argument (but not the return type, since there’s no need to do so) by putting the name and a colon in front of a type. For example:

val: Int

lst: { v : [Int] | len v <= 2 }

Note that the condition of each set comprehension can only refer to names that appear before the argument (excluding the name of the argument itself). Thus:

{-@ fooBad :: {v : Int | v <= a } -> a: Int -> Int @-}

is not valid. The correct way is to write:

{-@ fooGood :: b: Int -> {v : Int | b <= v } -> Int @-}

Similarly:

{-@ barBad :: myval: {v : Int | myval >= 5 } -> Int @-}

is not valid. The correct way to write this is:

{-@ barGood :: {v : Int | v >= 5 } -> Int @-}
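Putting naming to work, here is a hedged sketch in which both arguments are named so that the condition on the result can mention them. The function maxOf is made up for this illustration.

```haskell
-- Hypothetical function: a and b are named in the refinement, and the
-- condition on the result v refers to both of them.
{-@ maxOf :: a: Int -> b: Int -> {v : Int | v >= a && v >= b } @-}
maxOf :: Int -> Int -> Int
maxOf a b
  | a >= b    = a
  | otherwise = b
```

This follows the ordering rule above: the condition on the result appears after a and b, so it is allowed to mention both.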

Decreasing Expression

A decreasing expression has the form:

/ [<expression>]

It should appear at the end of the refinement. For example:

{-@ barGood :: myval: {v : Int | v >= 5 } -> Int / [myval + 100] @-}

There’s no need to write a decreasing expression for non-recursive functions.
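For a recursive function where no single argument shrinks, the decreasing expression can combine several named arguments. The sketch below uses a made-up function range that counts upward from lo to hi; neither argument gets smaller, but the gap hi - lo does.

```haskell
-- Hypothetical example: lo grows and hi stays fixed, so neither argument
-- decreases on its own, but hi - lo shrinks by one on every recursive call.
{-@ range :: lo: Int -> hi: Int -> [Int] / [hi - lo] @-}
range :: Int -> Int -> [Int]
range lo hi
  | lo < hi   = lo : range (lo + 1) hi
  | otherwise = []
```

The recursive call only happens when lo < hi, so at that point hi - lo is positive, and the call is made with hi - (lo + 1), which is one smaller and still non-negative.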

[1]: In programming languages, infinite loops belong to the “bottom” type, which is a subtype of every type. That’s why an infinite loop vacuously belongs to any type.