Handling errors
Right now, here’s how we’re defining correctness for our compiler:
For all programs p, if the interpreter produces a value when run on p, the compiler produces machine code that produces that same value.
But the interpreter doesn’t produce a value for every program! On (add1 false), for instance, the interpreter throws an exception.
For these programs, we’re currently making no claims about our compiler’s behavior. Maybe it will return an error of some kind: on (add1 false), for instance, we get an error from the runtime because it doesn’t know how to print the value. On totally invalid programs like (hello hello), our compiler will raise the same error as our interpreter, since we don’t know how to compile programs like that.
But on some of these programs, our compiler will actually produce a value (or really, produce a machine-code program that produces a value). (add1 (sub1 false)), for instance, produces false in the compiler even though the interpreter doesn’t recognize it as a valid program.
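To see why (add1 (sub1 false)) can come back as false, it helps to simulate the unchecked arithmetic on tagged values. The sketch below assumes a tagging scheme that this section does not spell out: numbers shifted left by 2 bits, and booleans shifted left by 7 bits with tag 0b0011111. The names false_repr, one_repr, add1_unchecked and sub1_unchecked are hypothetical and exist only for this illustration.

```ocaml
(* Assumed tagging scheme (not stated in this section). *)
let num_shift = 2
let bool_shift = 7
let bool_tag = 0b0011111

(* Runtime representation of false under that assumption. *)
let false_repr = (0 lsl bool_shift) lor bool_tag

(* Without any checks, sub1 and add1 simply subtract and add the
   representation of the number 1, whatever bits are in the register. *)
let one_repr = 1 lsl num_shift
let sub1_unchecked (v : int) : int = v - one_repr
let add1_unchecked (v : int) : int = v + one_repr

let result = add1_unchecked (sub1_unchecked false_repr)

(* The two operations cancel out, so the bits of false survive. *)
let () = Printf.printf "%b\n" (result = false_repr) (* prints "true" *)
```

Under these assumptions the intermediate value is not a valid number or boolean, but the final bit pattern is exactly false again, so the runtime prints it without complaint.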
Today, we’ll fix this issue, modifying our compiler to handle these errors.
Modifying the runtime
First, we’ll add an error-handling function to the runtime. We’ll call this function from our compiled programs when an error occurs.
/* Needs <stdio.h> (printf) and <stdlib.h> (exit) included in runtime.c. */
void error() {
  printf("ERROR");
  exit(1);
}
As usual, we’ll need to recompile the runtime:
gcc -c runtime.c -o runtime.o
Modifying the compiler
First, we’ll need to modify our compiler’s output so that we can call our new error function:
let compile (program : s_exp) : string =
  [Global "entry"; Extern "error"; Label "entry"]
  @ compile_exp Symtab.empty (-8) program
  @ [Ret]
  |> List.map string_of_directive
  |> String.concat "\n"
That Extern "error" directive is sort of the inverse of Global: it tells the assembler that our program will be linked against a program that includes a definition for the error label.
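As a rough picture of what the new directive turns into, here is a minimal sketch of the header and footer the compiler might emit. It assumes NASM-style syntax (global, extern) and uses a cut-down directive type and string_of_directive written just for this example; the real definitions in the course's assembly library may print things differently.

```ocaml
(* Cut-down, hypothetical versions of the directive type and printer. *)
type directive =
  | Global of string
  | Extern of string
  | Label of string
  | Ret

let string_of_directive (d : directive) : string =
  match d with
  | Global l -> "global " ^ l
  | Extern l -> "extern " ^ l   (* defined elsewhere, resolved at link time *)
  | Label l -> l ^ ":"
  | Ret -> "\tret"

(* The wrapper that compile puts around the code for the expression itself. *)
let header_and_footer : string =
  [Global "entry"; Extern "error"; Label "entry"; Ret]
  |> List.map string_of_directive
  |> String.concat "\n"

let () = print_endline header_and_footer
```

Global exports entry so the runtime can call into our code; Extern imports error so our code can jump into the runtime.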
We’ll jump to this label whenever we want to signal an error at runtime. For instance, add1 should raise an error if its argument isn’t a number:
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) :
    directive list =
  match exp with
  (* some cases elided ... *)
  | Lst [Sym "add1"; arg] ->
      compile_exp tab stack_index arg
      (* check the tag on a copy in r8 so the argument in rax is untouched *)
      @ [ Mov (Reg R8, Reg Rax)
        ; And (Reg R8, Imm num_mask)
        ; Cmp (Reg R8, Imm num_tag)
        ; Jnz "error" ]
      @ [Add (Reg Rax, operand_of_num 1)]
We raise an error by jumping to our error function. In general, calling C functions will be more complex than this, since we want to preserve our heap pointer and values on our stack, but since the error function stops execution we don’t need to worry about any of that.
We can extract these directives into a helper function:
let ensure_num (op : operand) : directive list =
  [ Mov (Reg R8, op)
  ; And (Reg R8, Imm num_mask)
  ; Cmp (Reg R8, Imm num_tag)
  ; Jnz "error" ]
(We should only call ensure_num when we’re not using the value in r8!)
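The mask-and-compare sequence can be modeled directly on integers. The sketch below assumes num_mask = 0b11 and num_tag = 0b00 (the section uses these names without showing their values) and a representation of false equal to 0b0011111; is_num is a hypothetical helper that exists only to illustrate what the And/Cmp pair computes.

```ocaml
(* Assumed constants; their actual values are not shown in this section. *)
let num_shift = 2
let num_mask = 0b11
let num_tag = 0b00

(* Mirrors And followed by Cmp: Jnz "error" is taken exactly when this
   function returns false. *)
let is_num (v : int) : bool = v land num_mask = num_tag

let five = 5 lsl num_shift   (* a tagged number *)
let false_repr = 0b0011111   (* assumed representation of false *)

let () =
  Printf.printf "%b %b\n" (is_num five) (is_num false_repr)
  (* prints "true false" *)
```

With this check in front of every numeric operation, (sub1 false) jumps to error before any arithmetic happens, so (add1 (sub1 false)) can no longer produce a value.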
We can use this to add error handling to functions that should take numbers:
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) :
    directive list =
  match exp with
  (* some cases elided ... *)
  | Lst [Sym "add1"; arg] ->
      compile_exp tab stack_index arg
      @ ensure_num (Reg Rax)
      @ [Add (Reg Rax, operand_of_num 1)]
  | Lst [Sym "+"; e1; e2] ->
      compile_exp tab stack_index e1
      @ ensure_num (Reg Rax)
      @ [Mov (stack_address stack_index, Reg Rax)]
      @ compile_exp tab (stack_index - 8) e2
      (* check e2 before loading e1 into r8, since ensure_num clobbers r8 *)
      @ ensure_num (Reg Rax)
      @ [Mov (Reg R8, stack_address stack_index)]
      @ [Add (Reg Rax, Reg R8)]
and so on. We can write a similar function for pairs:
let ensure_pair (op : operand) : directive list =
  [ Mov (Reg R8, op)
  ; And (Reg R8, Imm heap_mask)
  ; Cmp (Reg R8, Imm pair_tag)
  ; Jnz "error" ]
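A pair check would slot into the pair accessors the same way ensure_num slots into add1. The following is only a sketch: the types are cut-down stand-ins, the values heap_mask = 0b111 and pair_tag = 0b010 are assumptions, and compile_left is a hypothetical helper that takes the already-compiled code for the argument rather than being a case of the real compile_exp.

```ocaml
(* Cut-down, hypothetical stand-ins for the assembly types. *)
type register = Rax | R8
type operand = Reg of register | Imm of int | MemOffset of operand * operand
type directive =
  | Mov of operand * operand
  | And of operand * operand
  | Cmp of operand * operand
  | Jnz of string

(* Assumed tag values; not shown in this section. *)
let heap_mask = 0b111
let pair_tag = 0b010

let ensure_pair (op : operand) : directive list =
  [ Mov (Reg R8, op)
  ; And (Reg R8, Imm heap_mask)
  ; Cmp (Reg R8, Imm pair_tag)
  ; Jnz "error" ]

(* Code for (left e), given code for e that leaves its result in rax:
   check the tag first, then strip it and load the first field. *)
let compile_left (arg_code : directive list) : directive list =
  arg_code
  @ ensure_pair (Reg Rax)
  @ [Mov (Reg Rax, MemOffset (Reg Rax, Imm (-pair_tag)))]
```

The important part is the ordering: the check runs before the memory access, so a non-pair value reaches error instead of being dereferenced as an address.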
Compiler correctness revisited
We can now make a stronger statement about compiler correctness:
For all programs p, if the interpreter produces a value when run on p, the compiler produces machine code that produces that same value. If the interpreter produces an error, the compiler will either produce an error or produce a program that produces an error.
We can add support for erroring programs to our tester:
let interp_err (program : string) : string =
  try interp program with BadExpression _ -> "ERROR"
let compile_and_run_err (program : string) : string =
  try compile_and_run program with BadExpression _ -> "ERROR"

let difftest (examples : string list) =
  let results =
    List.map
      (fun ex -> (compile_and_run_err ex, Interp.interp_err ex))
      examples
  in
  List.for_all (fun (r1, r2) -> r1 = r2) results
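When a differential test fails, a single boolean does not say which example disagreed. One possible variation, sketched below, returns the mismatching examples instead. The function difftest_failures and its labeled arguments are hypothetical, and fake_compiled and fake_interp are stand-ins with hard-coded outcomes so the snippet runs on its own; in practice you would pass compile_and_run_err and interp_err.

```ocaml
(* Returns every example on which the two runners disagree, together with
   both outputs. *)
let difftest_failures
    ~(run_compiled : string -> string)
    ~(run_interp : string -> string)
    (examples : string list) : (string * string * string) list =
  List.filter_map
    (fun ex ->
      let compiled = run_compiled ex in
      let interpreted = run_interp ex in
      if compiled = interpreted then None
      else Some (ex, compiled, interpreted))
    examples

(* Stand-in runners for illustration only; they do not compile or
   interpret anything. *)
let fake_compiled = function
  | "(add1 false)" -> "ERROR"
  | "(if true 1 (hello hello))" -> "ERROR"
  | _ -> "1"

let fake_interp = function
  | "(add1 false)" -> "ERROR"
  | _ -> "1"

let failures =
  difftest_failures ~run_compiled:fake_compiled ~run_interp:fake_interp
    ["(add1 false)"; "(if true 1 (hello hello))"]
```

Here failures contains one entry, for the second example, with the compiler's "ERROR" next to the interpreter's "1".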
We have one lingering problem: there are some programs that produce an error in our compiler but not in our interpreter. An example is (if true 1 (hello hello)). Since the interpreter never evaluates (hello hello), it happily produces the value 1. The compiler, however, will throw an error at compile time. We could fix this by adding a check to the interpreter to ensure that the programs it’s trying to interpret are well-formed (i.e., don’t contain expressions like (hello hello)), even if they aren’t type-correct.
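Such a check could be a recursive walk over the whole expression that runs before interpretation starts. The sketch below is a minimal version under stated assumptions: it defines its own s_exp type (the Num constructor is assumed; Sym and Lst match the code above), it covers only a tiny subset of forms (numbers, booleans, add1, sub1, + and if), and the name well_formed is hypothetical.

```ocaml
(* Assumed shape of s-expressions. *)
type s_exp = Num of int | Sym of string | Lst of s_exp list

(* True when every subexpression uses a known form with the right number
   of arguments. Operand types are deliberately not checked. *)
let rec well_formed (exp : s_exp) : bool =
  match exp with
  | Num _ -> true
  | Sym ("true" | "false") -> true
  | Lst [Sym ("add1" | "sub1"); arg] -> well_formed arg
  | Lst [Sym "+"; e1; e2] -> well_formed e1 && well_formed e2
  | Lst [Sym "if"; test_exp; then_exp; else_exp] ->
      well_formed test_exp && well_formed then_exp && well_formed else_exp
  | _ -> false

(* (add1 false): well-formed, though not type-correct. *)
let ok = well_formed (Lst [Sym "add1"; Sym "false"])

(* (if true 1 (hello hello)): rejected even though the bad branch is
   never evaluated. *)
let bad =
  well_formed
    (Lst [Sym "if"; Sym "true"; Num 1; Lst [Sym "hello"; Sym "hello"]])
```

An interpreter that raised BadExpression whenever well_formed returns false would then agree with the compiler on (if true 1 (hello hello)), while (add1 false) would still fail only when it is actually evaluated.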