12 Representation Decisions
Go back and look again at our interpreter for function as values [REF]. Do you see something curiously non-uniform about it?
No, really, do. Do you?
Consider how we chose to represent our two different kinds of values: numbers and functions. Ignoring the superficial numV and closV wrappers, focus on the underlying data representations. We represented the interpreted language’s numbers as Racket numbers, but we did not represent the interpreted language’s functions (closures) as Racket functions (closures).
That’s our non-uniformity. It would have been more uniform to use Racket’s representations for both, or also to not use Racket’s representation for either. So why did we make this particular choice?
We were trying to illustrate and point, and that point is what we will explore right now.
12.1 Changing Representations
They are too much if what we want is a more restricted number system. For instance, Java prescribes a very specific set of fixed-size representations (e.g., int is specified to be 32-bit). Numbers that fall outside these sets cannot be directly represented as numbers, and arithmetic must respect these sets (e.g., overflowing so that adding 1 to 2147483647 does not produce 2147483648).
They are too little if we want even richer numbers, whether quaternions or numbers with associated probabilities.
The reason we did so is because we weren’t really interested in the study of numbers; rather, we were interested in programming language features such as functions-as-values. As language designers, however, you should be sure to ask these hard questions up front.
Now let’s talk about our representation of closures. We could have instead represented closures by exploiting Racket’s corresponding concept, and correspondingly, function application with unvarnished Racket application.
Replace the closure data structure with Racket functions representing functions-as-values.
(define-type Value [numV (n : number)] [closV (f : (Value -> Value))]) (define (interp [expr : ExprC] [env : Env]) : Value (type-case ExprC expr [numC (n) (numV n)] [idC (n) (lookup n env)] [appC (f a) (local ([define f-value (interp f env)] [define a-value (interp a env)]) ((closV-f f-value) a-value))] [plusC (l r) (num+ (interp l env) (interp r env))] [multC (l r) (num* (interp l env) (interp r env))] [lamC (a b) (closV (lambda (arg-val) (interp b (extend-env (bind a arg-val) env))))]))
Observe a curious shift. In our previous implementation, the environment was extended in the appC case. Here, it’s extended in the lamC case. Is one of these incorrect? If not, why did this change occur?
This is certainly concise, but we’ve lost something very important: understanding. Saying that a source language function corresponds to lambda tells us virtually nothing: if we already knew precisely what lambda does we might not be studying it, and if we didn’t, this mapping would teach us absolutely nothing (and might, in fact, pile confusion on top of our ignorance). For the same reason, we did not use Racket’s state to understand the varieties of stateful operators [REF].
Once we’ve understood the feature, however, we should feel to use it as a representation. Indeed, doing so might yield much more concise interpreters because we aren’t doing everything manually. In fact, some later interpreters [REF] will become virtually unreadable if we did not exploit these richer representations.It’s a little like saying, “Now that we understand addition in terms of increment-by-one, we can use addition to define multiplication: we don’t have to use only increment-by-one to define it.” Nevertheless, exploiting host language features has perils that we should safeguard against.
When programs go wrong, programmers need a careful presentation of errors. Using host language features runs the risk that users will see host language errors, which they will not understand. Therefore, we have to carefully translate error conditions into terms that the user of our language will understand, without letting the host language “leak through”.
Worse, programs that should error might not! For instance, suppose we decide that functions should only appear in top-level positions. If we fail to expressly check for this, desugaring into the more permissive lambda may result in an interpreter that produces answers where it should have halted with an error. Therefore, we have to take great care to permit only the intended surface language to be mapped to the host language.
As another example, consider the different mutation operations. In our language, attempting to mutate an unbound variable produces an error. In some languages, doing so results in the variable being defined. Failing to pin down our intended semantics is a common language designer error, saying instead, “It is whatever the implementation does”. This attitude (a) is lazy and sloppy, (b) may yield unexpected and negative consequences, and (c) makes it hard for you to move your language from one implementation platform to another. Don’t ever make this mistake!
12.3 Changing Meaning
Mapping functions-as-values to lambda works especially because we intend for the two to have the same meaning. However, this makes it difficult to change the meaning of what a function does. Lemme give ya’ a hypothetic: suppose we wanted our language to implement dynamic scope.Don’t let this go past the hypothetical stage, please. In our original interpreter, this was easy (almost too easy, as history shows). But try to make the interpreter that uses lambda implement dynamic scope. It can similarly be difficult or at least subtle to map eager evaluation onto a language with lazy application [REF].
Convert the above interpreter to use dynamic scope.
The point is that the raw data structure representation does not make
anything especially easy, but it usually doesn’t get in the way,
either. In contrast, mapping to host language features can make some
The moral is that this is a good property to exploit only we want to
“pass through” the base language’s meaning—
12.4 One More Example
Let’s consider one more representation change. What is an environment?
(define-type-alias Env (symbol -> Value))
(define (mt-env [name : symbol]) (error 'lookup "name not found"))
(define (extend-env [b : Binding] [e : Env]) (lambda ([name : symbol]) : Value (if (symbol=? name (bind-name b)) (bind-val b) (lookup name e))))
(define (lookup [n : symbol] [e : Env]) : Value (e n))