8 Lazy
8.1 Introduction
For this assignment, you will implement a modified version of Paret (called Lazy Paret) that uses call-by-need (i.e. lazy) evaluation and to support pairs.
8.2 Reading
Chapter 17.1 of PLAI 2/e. You may also find Chapter 13.3–13.4 of PAPL helpful.
8.3 Features to Implement
8.3.1 Pairs
You will add pairs to the language:
(pair f s) constructs a pair with two elements, f and s.
(first p) gets the first element of a pair p.
(second p) gets the second element of a pair p.
(is-pair v) evaluates to true if v evaluates to a v-pair Value; otherwise, it evaluates to false.
Pairs in Lazy Paret are not type homogeneous; that is, both elements of a pair do not have to be the same type.
8.3.2 Lazy Evaluation
Lazy evaluation means that arguments to functions should not be evaluated if they are not used in the body. As a result, many places where your language used to work with Values will instead work with Computations:
(define-type Computation |
(c-suspend |
[body : Expr] |
[env : Env])) |
|
(define-type-alias Env (Hashof Symbol Computation)) |
Function calls and the pair constructor must be lazy: that is, a function must not evaluate its argument if it is not used, and a pair must not evaluate its parts if they are not used. let’s value should also be lazy.
8.3.2.1 Strictness Points
On the other hand, certain parts of the language should force their arguments to be evaluated. These strictness points should force an expression to have a value. Specifically:
The operators +, ++, num=, str=, and is-pair should force all of their arguments to be evaluated.
if should force its condition to be evaluated as well as the target branch of the conditional (that is, if the condition evaluates to true, then the consq branch should be forced, and if the condition evaluates to false, then the altern branch should be forced).
first and second should force the accessed element of the pair to be evaluated.
Identifier resolution should force the referenced identifier to be evaluated.
The top-level should force its result to be evaluated (so it can show the end-user an answer).
All forced evaluations should be shallow; that is, you should not recursively force values inside of a concrete value. In particular, you should not recursively force values inside of pairs.
8.3.3 Desugaring Simplification
For simplicity, the and and or sugar forms have been removed from the language; however, your language still needs to handle the let form.
8.4 Assignment
You will implement the Lazy Paret language in two different ways: as an interpreter and as a Racket #lang.
8.4.1 Part 1: Interpreter
As usual, we have provided a function
parse, which consumes an expression in the language’s concrete syntax (described in
Grammar) and returns the abstract syntax representation of that expression (an
Expr).
You will implement the following two functions:
desugar :: Expr -> Expr
interp :: Expr -> Value
The specification for
desugar and
interp is identical to their counterparts in
Interpreter.
Pay close attention to the Lazy Paret specifications for let; make sure that your desugarer respects how the arguments to these forms should be lazily (or eagerly) evaluated. As a reminder, let should not allow for recursive definitions.
8.4.1.1 Error Handling
Your Lazy Paret interpreter should raise and handle all of the error cases from
Interpreter plus the addition of the following pair error case:
(define-type LazyInterpError |
(err-if-got-non-boolean [val : Value]) |
(err-bad-arg-to-un-op [op : UnaryOperator] |
[val : Value]) |
(err-bad-arg-to-op [op : Operator] |
[val : Value]) |
(err-unbound-id [name : Symbol]) |
(err-not-a-function [val : Value])) |
8.4.1.2 Grammar
The interpreter grammar has been updated to reflect the new pair operators:
<un-op> ::= first | second | is-pair |
<bin-op> ::= + | ++ | num= | str= | pair |
|
<expr> ::= <num> |
| <string> |
| <id> |
| true |
| false |
| (<un-op> <expr>) |
| (<bin-op> <expr> <expr>) |
| (if <expr> <expr> <expr>) |
| (let (<id> <expr>) <expr>) |
| (lam <id> <expr>) |
| (<expr> <expr>) |
8.4.1.3 Abstract Syntax
The abstract syntax has been updated to support pairs through the addition of the v-pair variant to the Value type.
(define-type Value |
(v-num |
[value : Number]) |
(v-str |
[value : String]) |
(v-bool |
[value : Boolean]) |
(v-fun |
[param : Symbol] |
[body : Expr] |
[env : Env]) |
(v-pair |
[first : Computation] |
[second : Computation])) |
|
(define-type Expr |
(e-num |
[value : Number]) |
(e-str |
[value : String]) |
(e-bool |
[value : Boolean]) |
(e-op |
[op : Operator] |
[left : Expr] |
[right : Expr]) |
(e-un-op |
[op : UnaryOperator] |
[arg : Expr]) |
(e-if |
[cond : Expr] |
[consq : Expr] |
[altern : Expr]) |
(e-lam |
[param : Symbol] |
[body : Expr]) |
(e-app |
[func : Expr] |
[arg : Expr]) |
(e-id |
[name : Symbol]) |
(sugar-let |
[id : Symbol] |
[value : Expr] |
[body : Expr])) |
|
(define-type Operator |
(op-pair) |
(op-plus) |
(op-append) |
(op-num-eq) |
(op-str-eq)) |
|
(define-type UnaryOperator |
(op-first) |
(op-second) |
(op-is-pair)) |
8.4.1.4 Testing
For the purposes of testing, we have defined an eval function that calls parse, desugar, and interp for you. eval consumes a program in the Paret language’s concrete syntax (S-Exp) and returns a Paret Value:
You should use eval in your testing file when writing test cases. You should not directly test desugar and interp individually in your test file (though you are welcome to and encouraged to individually test these functions in your code file).
In addition to the testing forms documented in the
Testing Guidelines, we provide the following testing forms:
(test-raises-lazy-interp-error? name expr lazy-interp-error)
Tests that the given expr raises the given lazy-interp-error.
(test-raises-err-if-got-non-boolean? name expr)
(test-raises-err-bad-arg-to-un-op? name expr)
(test-raises-err-bad-arg-to-op? name expr)
(test-raises-err-unbound-id? name expr)
(test-raises-err-not-a-function? name expr)
Tests that the given expr raises any instance of the named LazyInterpError. Useful when you want to test that an error was raised but the fields of the expected error contain implementation-specific values (such as a pair).
Like closures, programs can evaluate to a pair value. Because desugaring and closure representation is implementation-specific, the representation of Computations in pair values is implementation-specific. Thus, in your test submission, you should only test whether eval returns a pair, not which specific pair it returned; for example:
; good |
(test-pred "Good pair test" |
v-pair? (eval `(pair 1 2))) |
|
; bad |
(test-equal? "Bad pair test" |
(eval `(pair 1 2)) |
(v-pair (c-suspend (e-num 1) (hash empty)) |
(c-suspend (e-num 2) (hash empty)))) |
Likewise, if you are writing a test that raises a LazyInterpError that contains a pair Value, make sure to not test for the specific contents of the v-pair:
; good |
(test-raises-err-not-a-function? "Good error test with pairs" |
((pair 1 2) "arg")) |
|
; bad |
(test-raises-lazy-interp-error? "Bad error test with pairs" |
((pair 1 2) "arg") |
(err-not-a-function |
(v-pair (c-suspend (e-num 1) (hash empty)) |
(c-suspend (e-num 2) (hash empty))))) |
8.4.1.5 Starter Code
We’ve provided starter code for Part 1 at
lazy-interp.rkt and support code at
lazy-interp-support.rkt. You are not allowed to change the signature of
eval,
desugar, and
interp, but you are welcome to add any helper functions that you need for your implementation.
8.4.2 Part 2: Macros (with Caching)
You will now implement the lazy Pyret language again, but this time, you will use Racket macros to define your own lazy #lang. You will also implement a caching mechanism to improve performance.
Other than the caching mechanism requirement and unless otherwise stated in the requirements below, the semantics of your macros implementation should be identical to the interpreter semantics described above.
8.4.2.1 Implementing a #lang
In previous assignments, we have defined and used macros in the same file. However, macros are written in #lang racket, and Racket has an eager semantics. Instead, we want to define a lazy #lang and write our programs in that #lang.
To avoid name-confusion with the already existing #lang lazy, we will call your language my-lazy instead. Creating a fully-fledged #lang my-lazy requires some knowledge of Racket intricacies that are outside the scope of the course, so we will adopt a simpler solution.
You will instead create a my-lazy.rkt module (described below) to define a lazy language. Then, you can use this language in a new file by writing:
#lang s-exp "my-lazy.rkt" |
|
<your code goes here> |
8.4.2.2 Macro Particulars
In addition to providing the constructs defined in the Paret specification above, you need to override the way function application works to enforce laziness. Racket provides a convenient way to do this: the #%app macro. Every application in a regular program—for example, (f x)—is actually an invocation of the #%app macro: it’s really (#%app f x). The default #%app macro in Racket enforces eager evaluation. Your job is to define an #%app macro that enforces lazy evaluation (which can rely on eager evaluation underneath).
Doing this requires a little care. If you override #%app in my-lazy.rkt, you will end up changing the function application behavior within the module itself. Instead, we will use the rename-out construct in Racket to export a local macro as the #%app macro for our client program:
(provide (rename-out [lazy-app #%app])) ; overrides #%app with lazy-app |
|
(define-syntax (lazy-app fun arg ...) ; implementation of lazy application |
; TODO: Implement me! |
....) |
The stencil uses this pattern to maintain the normal behavior of various other Racket constructs that would be helpful in your macro implementation, but that we ultimately want to override in the client program.
8.4.2.3 Alternative to Forcing the Top-Level
While
Strictness Points states that we should ensure that our language forces at the top-level, implementing this exact behavior requires some knowledge of Racket internals that are distracting for our purposes.
Instead, your macro implementation should provide a ! construct that enforces strictness for a given expression:
Like how
Computations were forced in
Part 1: Interpreter,
! should perform
shallow evaluation; that is, you should not recursively force values inside of a concrete value. However, if
! forces a suspended computation and gets back another suspended computation, the expression isn’t fully forced, and
! will need to recur.
We will use
! to simulate top-level strictness when we run tests against your implementation by wrapping the top-most expression with
(! ...), so as long as you implement
! correctly, your macros-based implementation of Lazy Paret does not have to handle top-level strictness in any special way. However, your macros-based implementation must enforce the remaining parts of the
Strictness Points specification.
8.4.2.4 Representing Computations in Racket
We’ve provided an analogous version of the Plait
Computation type from
Part 1: Interpreter using Racket
structs. This version includes the addition of a new
computation/value variant (explained in
Caching):
(struct computation ()) |
(struct computation/suspend computation (body)) |
(struct computation/value computation (value)) |
The purpose of the computation struct is similar to that of the Computation type from your interpreter. To implement lazy evaluation properly, you should use the computation/suspend struct to store non-evaluated expressions as thunks. computation/suspend struct does not have an “environment” parameter, since #lang racket will automatically provide proper identifier binding and lookup to your thunks.
You can use Racket’s
thunk form to easily create nullary functions (that is, functions of zero arguments) for the purpose of suspending computations.
8.4.2.5 Caching
A lazy language must necessarily not have side-effects. Therefore, there is no need to re-evaluate any lazy expression: it suffices to evaluate it once and then cache its value, trading space for time.
When a computation is forced, its result should be cached so it doesn’t have to be computed a second time. We have provided the ref-comp struct for this purpose:
(struct ref-comp (comp) #:mutable #:transparent) |
As mentioned above in
Representing Computations in Racket, we’ve updated the
computation struct with a new
computation/value variant, which represents an
evaluated computation. When arguments to functions are thunked, they should be stored in a
ref-comp via a
computation/suspend. If that argument needs to be evaluated (due to forcing), your implementation should evaluate the
body stored in the
computation/suspend, then update the
ref-comp with a
computation/value containing the computed value of that expression.
By marking the ref-comp struct as #:mutable, Racket automatically defines a set-ref-comp-comp! function that will allow you to update the ref-comp’s comp field in-place. You should use this to update a ref-comp after computing its value for the first time.
As a hint, your ! function will likely need to do different things based on whether or not the input ref-comp is of type computation/suspend or computation/value:
#lang racket |
(define (! rc) |
(match (ref-comp-comp rc) |
; A computation/suspend is a non-evaluated computation. |
[(computation/suspend body) |
; TODO: Evaluate the body to a value v, then update the ref-comp in-place |
; with (set-ref-comp-comp! rc (computation/value v)). |
....] |
; A computation/value is a computation that has been previously evaluated. |
[(computation/value value) |
....])) |
8.4.2.6 Variable Arity
Certain operators in Racket are variable arity procedures. We’d like to emulate this in our macros-based implementation of Lazy Paret. Specifically, if at least two arguments are supplied to the + and ++ operators, the operator should be applied pairwise from left-to-right across the arguments and the final result of those applications should be returned. (Raise an error if less than two arguments are supplied to those operators.)
Additionally, your implementation should also extend the lam and let forms to support a variable number of arguments and bindings.
The strictness point specification from
Lazy Evaluation still applies to all of these updated forms.
It’s worth observing that, in Racket, some operators (such as
+ and
++) are defined as functions (and not macros), which allows the operators themselves to be passed as arguments to other functions. However, for simplicity, we do not require you to implement the operators as functions in your macros implementation of Lazy Paret. The main requirement for your operators is that they support the
Updated Macros Grammar.
8.4.2.7 Implementation-Specific Pairs
In
Macros (with Caching), the support code does not give you an internal representation for
pairs. Thus, your macros-based implementation should define its own representation of
pairs. You are free to choose whatever representation that you want, but you should make sure that your representation allows the rest of your implementation to adhere to the
Lazy Evaluation semantics in the specification.
We will use the
is-pair function to test for
pair-ness when testing your code, so make sure that your
is-pair construct works correctly given your implementation’s representation of
pairs. (This is similar to how we instructed you to test for
v-pairs in the
Testing section from Part 1.)
8.4.2.8 define
Finally, your language should provide the define form from Racket, which will allow you to create named functions (and other bindings) in your test programs. The stencil code already provides define for you, which should be sufficient for the purposes of your implementation.
8.4.2.9 Error Handling
The set of possible error cases in your macros implementation is the same those from
Part 1: Interpreter with the addition of potential
arity mismatch errors (i.e. the number of arguments supplied to a function does not match with the number of arguments in the function definition).
The priority order and handling of error cases in your macros implementation is undefined, and you can assume that we will not test your macros implementation on any error cases.
8.4.2.10 Updated Macros Grammar
The Lazy Paret grammar for your macro-based implementation has been updated to reflect these changes:
<file> ::= <form> ... |
<form> ::= <define-statement> |
| <expr> |
|
<un-op> ::= first | second | is-pair | ! |
<bin-op> ::= num= | str= | pair |
<extended-op> ::= + | ++ |
|
<expr> ::= <num> |
| <string> |
| <id> |
| true |
| false |
| (if <expr> <expr> <expr>) |
|
| (<bin-op> <expr> <expr>) |
| (<un-op> <expr>) |
| (<extended-op> <expr> <expr> ...+) |
|
| (lam (<id> ...) <expr>) |
| (let ([<id> <expr>] ...) <expr>) |
| (<expr> ...+) |
where
<define-statement> follows the grammar for Racket
defines.
8.4.2.11 Testing Your Macros
To test your program, save a file (say, test.rkt) in the same directory, and put #lang s-exp "my-lazy.rkt" at the top. Once you’ve implemented your lazy language, the code in the test.rkt file should now follow lazy Paret semantics.
You should aim to write test code that exercises your caching feature to ensure it works. When you do this (and it works correctly), you may see a potentially dramatic change in performance.
8.4.2.12 Starter Code
We’ve provided starter code for your macro implementation at
my-lazy.rkt. This includes template macro definitions for Paret’s syntax forms as well as the updated
computation struct and new
ref-comp struct.
There are no wheats and chaffs for Part 2, and you do not need to submit a separate test suite.
8.5 Analysis
You should also concisely answer the following questions:
In the Caching section, we say “A lazy language must necessarily not have side-effects.” Explain why.
Instead of using ref-comps, we could implement Macros (with Caching) using a global hash table that maps the quoted version of computed expressions to their result. For instance, when we compute an expression (+ 1 2), we could map ’(+ 1 2) to the value 3. If we compute (+ 1 2) again, we check if the cache contains a mapping for ’(+ 1 2), and if so, return the stored value. The main advantage with this strategy over the ref-comp approach is that if an expression is passed as an argument in multiple places in the program, our cache only has to maintain one reference to that expression instead of multiple ref-comps. Does this caching strategy work? If so, explain why. If not, provide a counter-example.
Provide an example of a program that runs in your macros implementation that demonstrates the Caching power of your macros implementation. (Specifically, your submitted program for this question should run significantly slower if your macros implementation had not implemented caching.)
8.6 What To Hand In
You will submit three files for this assignment:
lazy-interp.rkt, which should be uploaded to the “Lazy - Interpreter (Code)” drop on Gradescope.
lazy-interp-tests.rkt, which should be uploaded to the “Lazy - Interpreter (Tests)” drop on Gradescope.
my-lazy.rkt, which should be uploaded to the “Lazy - Macros (Code)” drop on Gradescope.
Finally, submit your answers to the
Analysis questions to the “Analysis” drop on Gradescope. You will need to upload your answer to Question 3 as a
.rkt file.
You can update your submissions as many times as you want before the deadline.