Type Inference

In this assignment you will implement type inference for the variant of Paret used for type-checker. The only difference is that there are no longer type annotations, since you will be inferring them.

The Language

Refer to the specification in type-checker for the precise behavior and typing rules of the language. The only difference in the grammar is that lam and empty no longer take type annotations (and as a result, empty is no longer wrapped in parentheses). Here is the complete grammar:

<expr> ::= <num>
         | <id>
         | true | false
         | (+ <expr> <expr>)
         | (num= <expr> <expr>)
         | (link <expr> <expr>)
         | (if <expr> <expr> <expr>)
         | (lam (<id>) <expr>)
         | (let ((<id> <expr>)) <expr>)
         | (<expr> <expr>)
         | (first <expr>)
         | (rest <expr>)
         | (is-empty <expr>)
         | empty

Types

Unlike in type-checker, we can now have types with type variables in them, when the type of an expression is not fully constrained.

The type inference algorithm isn't particularly dependent on what kind of types you have in your language, so it makes sense to pick a generic representation that can easily be extended later. Thus the Type data definition has just two forms:

data Constructor:
  | c-num  # No fields
  | c-bool # No fields
  | c-fun  # Two fields: arg-type and return-type
  | c-list # One field: element-type
end

data Type:
  | t-var(name :: String)
  | t-con(con :: Constructor, fields :: List<Type>)
end

We also defined helper functions t-num, t-bool, t-fun, and t-list to work like the constructors for the Type from type-checker, so you can more easily port your test cases. For example, t-fun(t-num, t-bool) evaluates to t-con(c-fun, [list: t-con(c-num, empty), t-con(c-bool, empty)]). (Keep in mind you can't pattern match with these functions using cases since they are not real constructors.)

Constraints

The first step in type inference is to generate a set of constraints for all the expressions in the program. In order to enumerate the relationships between the types of the expressions, you will associate a unique type variable with each expression. New type variables can be generated with fresh-var().

You will also need to associate identifiers with type variables. For this purpose, you can use a TEnv during constraint generation to map from identifier names (e.g. x) to type variables (made using fresh-var()):

type TEnv = List<TypeCell>

data TypeCell:
  | type-cell(id-name :: String, var-type :: TypeVar)
end

A constraint equates a type variable with a type (possibly containing type variables associated with other expressions):

type TypeVar = String

data Constraint:
  | constraint(ty-var :: TypeVar, ty :: Type)
end

We are providing this type to you just for convenience. You can make your own and use that instead if you want a different representation. (For instance, if you want to follow the book's definitions more closely, you're free to do so, though it may make a couple cases trickier.)

Inferring types

Once you have generated a set of constraints, you need to unify them to get a coherent mapping from type variables to (possibly generic) types. The exact way you go about this is up to you.

Your job is to write a function type-infer that takes in the source of a Paret program and returns its inferred type.

You don't need to write an interpreter for the language. You also don't need to implement let polymorphism.

For fun, see if you can write a program that has the type t-fun(t-var("a"), t-var("b")).

Errors

There are a few errors you can encounter during type inference:

data TypeInferenceError:
  | unification-err(type1 :: Type, type2 :: Type)
  | occurs-check-err(tvar :: String, ty :: Type)
  | unbound-id-err(name :: String)
end

Raise unification-err if two types can't unify.
Raise occurs-check-err if a constraint fails the occurs check.
Raise unbound-id-err if you encounter an unbound identifier during constraint generation.

Testing

You will write test cases for the type-infer function. Your test file should not test constraint generation or unification directly, since these functions may be implementation specific. (It is, however, a good idea to test constraint generation and unification for yourself in your code file.)

type-infer returns a Type, so you will need to test types for equality. This can be tricky due to type variables. We provided a function normalize-type which takes a type, and renames all type variables to be lowercase letters in alphabetical order that they appear in the type. So for example,

check:
  normalize-type(t-fun(t-var("foo"), t-fun(t-var("bar"), t-var("foo"))))
              is t-fun(t-var("a"),   t-fun(t-var("b"),   t-var("a")))
end

Your type-infer should call normalize-type on its result. Then you can either write tests that expect an exact type, or you can simply call normalize your expected type to compare two types up to renaming. For example, both tests are fine:

check:
  type-infer("empty") is t-list(t-var("a"))
  type-infer("empty") is normalize-type(t-list(t-var("foo")))
end

Because the order of constraint generation and unification is implementation specific, you shouldn't write test cases that have more than one error.

Abstract syntax

data Expr:
  | e-op(op :: Operator, left :: Expr, right :: Expr)
  | e-un-op(op :: UnaryOperator, expr :: Expr)
  | e-if(cond :: Expr, consq :: Expr, altern :: Expr)
  | e-let(name :: String, expr :: Expr, body :: Expr)
  | e-lam(param :: String, body :: Expr)
  | e-app(func :: Expr, arg :: Expr)
  | e-id(name :: String)
  | e-num(value :: Number)
  | e-bool(value :: Boolean)
  | e-empty()
end

data Operator:
  | op-plus
  | op-num-eq
  | op-link
end

data UnaryOperator:
  | op-first
  | op-rest
  | op-is-empty
end

Assignment

To get started, open the code stencil and test stencil. You can see the full types and all the helper functions we mentioned in the definitions.

Submission

First submit your test cases (named "type-inference-tests.arr") in Captain Teach.

Finally, submit a zip file containing both your test and code files for type-inference. Call the files "type-inference-tests.arr" and "type-inference-code.arr".