# Desugaring ParselTongue

In this assignment, you will desugar full ParselTongue to core ParselTongue. The core interpreter will be provided as a black-box; you will compose your desugarer with the core to produce a working implementation of the full language.

## Materials

The template and test bundle should contain a directory called suite/, with several subdirectories, and two Racket files, typed-desugar-template.rkt and typed-lang.rkt.

Your task is to write a Racket program that implements a function called desugar, which converts between full ParselTongue and core ParselTongue. These two definitions are given in typed-lang.rkt First, ExprP is the abstract syntax for full ParselTongue, which has P-suffixed type constructors. These should obviously correspond to the expression forms from the ParselTongue specification. We're providing you with a definition of, and an interpreter for, the language described by ExprC, the core language that ParselTongue uses.

You are going to write desugar, which should follow the following template:

(define (desugar [exprP : ExprP]) : ExprC
)


Your desugar is successful when it passes all the tests included in the binary, and runnable with --test-with-desugar. (We thank you all for providing such exhaustive tests in the first assignment!)

The rest of this document describes the core language, and has a walkthrough of how to get started with desugaring.

## Features of the Core

The core language has a few features that don't show up in the surface language, and some features that warrant explanation:

• TrueC, FalseC, NumC, and StrC simply evaluate to the obvious corresponding values, just like NumC in the language we've built up in class.

• ObjectC creates an object value by evaluating all of the sub-expressions that make up the values of its fields

• GetFieldC evaluates its subexpressions to values, and if they are an object and a string, provides the corresponding field in the object. It will also throw errors if non-objects or non-strings are provided, in correspondence with the specification.

• SetFieldC evaluates its subexpressions to values, and if the first two are an object and a string, creates a new object with the string field replaced or added as appropriate. It will also throw errors if non-objects or non-strings are provided, in correspondence with the specification.

• FuncC behaves almost just as the lamC from the notes in chapter 8 - it closes over its environment and, when applied, creates a new location for each parameter. If more than one parameter of the same name are provided, the last one is bound to the argument in application.

• AppC behaves just as appC in the notes. It throws errors if the function position is not a function, and arity mismatch errors if the wrong number of arguments are provided. It evaluates its arguments left-to-right, and before it reports any arity mismatch errors.

• LetC creates a new location, binds the provided identifier to the new location in the environment, and evaluates its body in that environment.

• IdC evaluates to the value held at the location bound to the identifier in the current environment.

• Set!C behaves as setC in the notes - it replaces the value at the location bound to the identifier with the value that the right-hand-side evaluates to.

• (IfC test then else) evaluates its test, and if that results in the false value, it evaluates the else branch. Otherwise, it evaluates the true branch.

• SeqC evaluates its first expression, throws out the result (but keeps side-effects), and evaluates the second expression.

• Prim1C evaluates the unary operators print and tagof:

• print prints a string representation of its argument, as per the spec

• tagof is a primitive operation in the core that doesn't appear in the main language. tagof takes a single argument, and returns a string representing its type:

• "object" for objects
• "function" for functions
• "string" for strings
• "number" for numbers
• "boolean" for true and false

For example:

tagof(5) -> "number"
tagof(+("asdf","foo")) -> "string"
tagof({}) -> "object"

• Prim2C evaluates the binary operators num+, num-, string+, <, >, and ==

• num+ takes two number values and adds them together. Any non-numeric arguments result in an error with "Bad arguments for num+: ", followed by the string representations of its arguments

• num- takes two number values and subtracts the second from the first. Any non-numeric arguments result in an error with "Bad arguments for num-: ", followed by the string representations of its arguments

• string+ takes two string values and concatenates them Any non-string arguments result in an error with "Bad arguments for string+: ", followed by the string representations of its arguments

• < takes two number values and yields true if the first is smaller than the second, and false if not. Any non-number arguments result in an error with "Bad arguments for <: ", followed by the string representations of its arguments

• > takes two number values and yields true if the first is greater than the second, and false if not. Any non-number arguments result in an error with "Bad arguments for >: ", followed by the string representations of its arguments

• == takes any two values and compares them for equality using the Equal metafunction from the specification, returning true if they are Equal, and false otherwise.

• ErrorC is an expression form that halts the program with a string message. The message is the same string that print would output when given the same value, but the output occurs on stdout:

(ErrorC (StrC "This is an error!")) -> "This is an error!" on stderr
(ErrorC (ObjectC (list))) -> "object" on stderr


## Getting Started, and Some Desugaring Tricks

Writing desugar is nontrivial, and we don't expect you to do it without help. For many cases, desugaring actually _is_ trivial, and handing the top level expression down basically unchanged is the right thing to do. For example, when desugaring number expressions NumP, we can just convert them directly to the number expression of the core:
(define (desugar (exprP : ExprP)) : ExprC
(type-case ExprP exprP
[NumP (n) (NumC n)]
... more cases here ...
))


Other expressions, however, we express in terms of a combination of more fundamental features in the core. We'll go through two - while loops and arbitrary-arity subtraction.

### Desugaring While Loops

To get you started, we're going to walk through what you need for a pretty tricky desugaring - while. Notice that WhileP is defined in ParselTongue (as an ExprP), but is not defined at all in the core language (there is no WhileC). How do we pull off this desugaring?

Let's start by filling in a little bit of desugar:

#lang plai-typed

(require "typed-lang.rkt")

(define (desugar [exprP : ExprP]) : ExprC
(type-case ExprP exprP
...
[WhileP (test body)
... fill in here ...]
)


What do we "fill in here?" Well, the specification has this to say:

1. Evaluate test to a value v.
2. If v is the value false, then yield false as the result of the whole expression
3. Otherwise
1. Evaluate body to a value v2.
2. Evaluate test to a value v3.
3. If v3 is the value false, then yield v2 as the result of the whole expression.
4. Otherwise, go to sub-step 3.1 above and repeat.

We need to start by evaluating the test and checking if it's false. That's pretty straightforward, we can us IfC for that. Note the recursive call to desugar - we need to make sure we always have expressions that are ExprC's:

    [WhileP (test body)
(IfC (desugar test)
... step 3 goes here ...
(FalseC)]


The next thing to fill in is the "Otherwise..." case of the specification. Let's try the straightforward transliteration of the spec again, evaluating body and then test again

    [WhileP (test body)
(IfC (desugar test)
(SeqC (desugar body)
(IfC (desugar test)
... step 4 ...
... v2? ...
(FalseC)]


What should we put in place of v2 above? We could try putting the desugared expression there again:

    [WhileP (test body)
(IfC (desugar test)
(SeqC (desugar body)
(IfC (desugar test)
... step 4 ...
(desugar body)))
(FalseC)]


But this isn't a good idea. This program:

  defvar x = 0
while(<(x,2)) {
x = +(x,1)
}


would result in x being bound to a location that contains 3 rather than 2, because the body's effects would be duplicated by the second occurrence of the body!

This is an important consideration in desugaring a language with mutation - we need to be really careful about understanding just what expressions we're putting where!

We need to store the result of evaluating somewhere so we can only evaluate once. LetC is perfect for this:

    [WhileP (test body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
... step 4 ...
(IdC 'temp)))
(FalseC)]


Now, the value of evaluating the body is stored in the identifier temp for us to access later.

On to step 4. We just need to go back to sub-step 3.1, which was to evaluate the body, then the test. We can just put the same thing in again:

    [WhileP (test body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
(LetC 'temp
(desugar body)
(IfC (desugar test)
... step 4 ...
(IdC 'temp)))
(IdC 'temp)))
(FalseC)]


Uh oh. We can't just nest these expressions infinitely; we need some way to bundle up this computation and use it again. Thankfully, we know how to do that - we set up recursion! We can bind a dummy value to a name that we use in the body of a function we create, and then set the name to the function after, like so:

    [WhileP (test body)
;; dummy-fun will tell us it was called if we do so accidentally
(local ([define dummy-fun (FuncC (list) (ErrorC (StrC "Dummy function")))])
(IfC (desugar test)

;; while-var will hold the actual function once we tie
;; everything together
(LetC 'while-var dummy-fun
(LetC 'while-func

;; this function does the real work - it runs the body of
;; the while loop, then re-runs it if the test is true, and
;; stops if its false
(FuncC (list)
(LetC 'temp-var
(desugar body)
(IfC (desugar test)
(AppC (IdC 'while-var) (list))
(IdC 'temp-var))))

;; The Set!C here makes sure that 'while-var will resolve
;; to the right value later, and the AppC kicks things off
(SeqC (Set!C 'while-var (IdC 'while-func))
(AppC (IdC 'while-var) (list)))))

(FalseC)))]


This is (one way) to implement a While loop using only functions and mutation. Our loop has grown up! We'll leave for loops to you.

Note that we've carefully chosen to use variables with "-" in them (while-var, temp-var) User code cannot have identifiers that contain "-". What bad things could happen if we used identifiers that could overlap with user code?

### Desugaring Subtraction

The specification for ParselTongue allows any number of arguments to the - operator. If they are all numbers, it performs subtraction left-to-right. If any of the arguments aren't numbers, it throws an error. Our core only supports binary subtraction in the form of num-, so we clearly have some work to do here. Let's get the skeleton of the case written down:

(define (desugar [exprP : ExprP]) : ExprC
...
[PrimP (op args)
(case op
['- ... fill in subtraction here ...])]
...
)


And let's have a look at the specification, as well:

To evaluate the expression -(expr1, ..., exprn), for n > 0:

1. Evaluate expr1, ..., exprn to a list of values v1, ..., vn, starting with expr1, from left to right.
2. If n is 1 and v1 is a number, then yield v1.
3. Otherwise, if all of v1, ..., vn are numbers, then yield the result of applying the Racket - operation to v1, ..., vn
4. Otherwise, yield error("Bad arguments to -")

Note first that for 0 arguments, there is a special case:

  [PrimP (op args)
(case op
['- (cond
[(= 0 (length args)) ... error for 0, find it in the spec ...]
[(< 0 (length args)) (desugar-subtract args)])])]


The actual work is going to get a little longer than we want to do here, so we've made a helper for ourselves:

;; somewhere else in the file
(define (desugar-subtract (args : (listof ExprP))) : ExprC
...)


What do we want to do here? Well, the specification tells us that we need to evaluate all the arguments to values. And, we're going to need to use those values later. LetC is going to be helpful here again. But, we don't know up front how many arguments there will be, and we're going to need an identifier for each one. We need to create new identifiers on demand:

(define (make-ids (n : number)) : (listof symbol)
(build-list n (lambda (n) (string->symbol (string-append "var-" (to-string n))))))


Now we can at least have the identifiers that we need:

(define (desugar-subtract (args : (listof ExprP))) : ExprC
(local ([define ids (make-ids (length args))])
...))


Now we need to think about what we'll want these expressions to look like. We will at least want to let-bind each argument expression, so

-(e1, e2, e3)


should desugar to something like:

(LetC 'var1 (desugar e1)
(LetC 'var2 (desugar e2)
(LetC 'var3 (desugar e3)
... check for numbers and do the subtraction ...)))


We can fill in a bit more of what we want to check and do, also. Here's a proposal, though we don't know how to write the stuff inside and yet:

(LetC 'var1 (desugar e1)
(LetC 'var2 (desugar e2)
(LetC 'var3 (desugar e3)
(IfC (and (Prim2C '== (Prim1C 'tagof (IdC 'var1)) (StrC "number"))
(Prim2C '== (Prim1C 'tagof (IdC 'var2)) (StrC "number"))
(Prim2C '== (Prim1C 'tagof (IdC 'var3)) (StrC "number")))
(Prim2C 'num- (Prim2C 'num- (IdC 'var1) (IdC 'var2)) (IdC 'var3))


See what we did here? We're using a combination of tagof and ==, primitives we have available in the core, to check for number-ness of each of the arguments that we've evaluated. We don't have a primitive for and, so we'll need to get to that. We also use num-, the binary operation from the core, to actually do the subtraction. Note that the order of operations in num- is important; we need to subtract the second from the first before subtracting the third, otherwise

-(9,2,1)


might give us 8 back instead of 6.

If we want to put this all together for an arbitrary number of expressions, we need a few pieces:

;; cascade-lets will build up the nested lets, and use body as the
;; eventual body, preserving order of evaluation of the expressions
(define (cascade-lets (ids : (listof symbol))
(exprs : (listof ExprC))
(body : ExprC)) : ExprC
(cond [(empty? ids) body]
[(cons? ids)
(LetC (first ids) (first exprs) (cascade-lets (rest ids) (rest exprs) body))]))

;; check-type builds an expression that checks the type of the expression
;; given as an argument
(define (check-type (expr : ExprC) (type : string)) : ExprC
(Prim2C '== (Prim1C 'tagof expr) (StrC type)))

;; and builds up an and expression from its two pieces
(define (and (expr1 : ExprC) (expr2 : ExprC)) : ExprC
(IfC expr1 expr2 (FalseC)))

;; all builds up a series of ands over the expression arguments
(define (all (exprs : (listof ExprC))) : ExprC
(foldl (lambda (exp result) (and exp result)) (TrueC) exprs))

;; map-subtract builds an expression that maps 'num- over a list of expressions
(define (map-subtract (exprs : (listof ExprC))) : ExprC
(foldl (lambda (expr result) (Prim2C 'num- result expr)) (first exprs) (rest exprs)))


Now we can put all of these things together:

(define (desugar-subtract (args : (listof ExprP))) : ExprC
(local ([define ids (make-ids (length args))]
[define id-exps (map IdC ids)])
(IfC (all (map (lambda (e) (check-type e "number")) id-exps))
(map-subtract id-exps)


So we've taken our list of args and desugared it to type-tests and binary operators, preserving the right order of operation, side effects, and the error case. Subtraction is one of the most complicated cases in the desugaring; you should use what we've shown you here to build up what you need for +, which needs to make a decision between strings and numbers.

### Getting Started From Here

The zip bundle, along with the appropriate binary, are enough to get started running your desugarer composed with our core interpreter. Your implementation won't do much initially, because it doesn't contain even the code for desugaring booleans and strings:

$echo "true" | debian-dist/bin/assignment3-debian --interp-with-desugar typed-desugar-template.rkt Haven't desugared a case yet: (TrueC)  We've added the "Haven't desugared..." message in the else clause of the type-case so you'll be informed when you're trying to desugar a program and haven't yet tackled a type of expression. We did add the number case for you, so you can see subtraction working: $ echo "-(9,2,1)" | debian-dist/bin/assignment3-debian --interp-with-desugar typed-desugar-template.rkt
6


You should start by filling in the cases in desugar for the constants, and then start building up the other expressions so you can test more and more.

### Test Cases

We're providing a really big test suite for you, built out of 73 files of our own tests (in brown/), and 420 tests that are made up of 4 different submissions of the first assignment. These aren't necessarily the best submissions (we've only had time to look carefully at a fraction of the hundreds that were submitted!), but they are far from the worst, and cover a broad set of features well.

We're only going to grade you on passing or failing the tests in brown/ for this assignment, as that's all the grading script checks. You should try to hit all the tests in the contrib/ directories as well, because we are going to use an expanded test suite for the next assignment, where you'll be implementing the core of ParselTongue, and it will include some of the other 400+ tests.

Remember that you can use the --test-with-desugar option to test single subdirectories at a time (for example, the distributed tests have some directories that contain only object tests). Feel free to move tests around and write new ones that are specifically targeted at features you're working on desugaring; the grading script doesn't care at all about the test directory you downloaded, as it has all the tests it uses included internally.

## Executable

The executable for this assignment has several options:

--stdin <filename>

Takes a filename to use as the standard input (useful for Windows
folks)

--stdout <filename>

Takes a filename to redirect standard output to (useful for Windows
users)

--stderr <filename>

Takes a filename to redirect standard err to (useful for Windows
users)

NOTE:  --stdin, --stderr, and --stdout must come before other
options to have their effect.

--interp

Runs stdin with the correct interpreter.  This is included mainly
for reference and convenience.

--interp-with-desugar <filename>

Expects <filename> to be the name of a Racket file that contains an
implementation of desugar (as described above).  Composes the
desugarer there with the correct interpreter, reads the program from
stdin, and outputs the result on stdout/err.

Note on filenames for Windows users:  The script complains if you have a
leading ".\" in front of the filename for your desugarer.  Do this:

$path\to\bin\assignment3-win32 --interp-with-desugar my-desugar.rkt Not this:$ path\to\bin\assignment3-win32 --interp-with-desugar .\my-desugar.rkt

--brief

Briefer output for test-with-desugar and test-with normal

--test-with-desugar <filename> <path>

Expects <filename> to be the name of a Racket file that contains
an implementation of desugar (as described above).  Composes an
evaluator together as with --interp-with-desugar, but runs it on
all the tests at <path>.

--test-with-normal <path>

Runs all the tests at <path> with the normal interpreter.  Useful for
reference and convenience.

--report <filename>

Generates grade report on standard out (or the argument to --stdout),

• A single Racket file that defines desugar, using the template above