# cs173: Assignment 6

### Part I: Unification

Implement the unification algorithm from the 2002-11-13 notes. The algorithm should work for a generic term representation, as defined below.

A term is either:

• an constant, which contains a symbol,
• a variable, which can contain any value, or
• a constructor of the form C(t1, ..., tn), where C is a symbol and the ti are a list of sub-terms.

In addition, you will need data types for representing a constraint (a pair of terms) and substitution (a variable and a term). The unification algorithm will consume a list of constraints and produce either a list of substitutions or an error string.

Errors can arise from two situations: when the unification of two terms is impossible, or when the occurs check fails. In both cases, you should return a string with an appropriate error message.

Finally, when comparing variables for equality, use Scheme's built-in eq? function. For symbols, it behaves exactly as symbol=?; for other values, it compares them for identity (like Java's == comparison). We will rely on identical variables being deemed equivalent by eq? when solving the constraints generated in the following section.

### Part II: Generating Type Constraints

Following the 2002-11-11 notes, derive type constraints for this language:

```   <expr> ::= <num>
| true
| false
| {+ <expr> <expr>}
| {- <expr> <expr>}
| {* <expr> <expr>}
| {iszero <expr>}
| {bif <expr> <expr> <expr>}

| <id>
| {with {<id> <expr>} <expr>}
| {rec {<id> <expr>} <expr>}
| {fun {<id>} <expr>}
| {<expr> <expr>}

| tempty
| {tcons <expr> <expr>}
| {tempty? <expr>}
| {tfirst <expr>}
| {trest <expr>}
```
The only novelty of this language is that the list operations are now polymorphic; that is, you can create lists of values of any type. Write a function which consumes an expression of this language, and returns a list of constraints (of the type defined in Part I). The correspondence between type constraints and the terms in Part I is as follows:
• The constants are the base types: they should contain either the symbol 'number or 'boolean.
• The variables correspond to the expressions and identifiers whose types you wish to learn. Thus, they should contain either an expression, or a symbol representing an identifier. You may assume that all identifiers are unique; i.e., no name is ever bound more than once.
• The constructors are the type constructors. They should contain either:
• the symbol '-> and a list of two arguments, or
• the symbol 'listof and a list of one argument.
In some cases, you may need to a fresh identifier when defining constraints. The Scheme function gensym returns a unique identifier on every call.

### Part III: Inferring Types

To infer the type of a program, first parse it, then generate constraints, and finally unify the constraints. The result will be a list of substitutions; by looking up the subsitution for the entire expression, you can access its type.

Your code needs to define only one function, infer-type, which consumes a concrete representation of the program, and produces either an error string or a representation of the inferred type. Represent types concretely as:

```   <type> ::= number
| boolean
| (listof <type>)
| (<type> -> <type>)
| <string>
```
where strings are used to represent type variables. For example, the type of length would be:
```   ((listof "a") -> number)
```

### Extra Credit

For a very small amount of extra credit, write a program in this language for which your algorithm infers the type ("a" -> "b"). You shouldn't attempt this problem until you've fully completed the assignment.

### What Not To Do

You do not need to implement an interpreter for this language.

You do not need to implement let-based polymorphism.

## FAQ

1. What should I do about unbound identifiers?

You may assume all identifiers are bound exactly once; i.e., there are no unbound identifiers, and all no name is ever bound more than once.

2. Does the rhs of rec binding have to be a syntactic function?

No. However, you may assume that the rec-bound identifier only appears under a {fun ...} in the rhs of the binding. In other words, the following expressions are legal:

```     {rec {f {fun {x} {f x}}}
...}

{rec {f {with {y 4}
{fun {x} {f y}}}}
...}
```
while the following are not legal:
```     {rec {f f}
...}

{rec {f {+ 1 f}}
...}
```