Implementing Python (Part 1)
It's now time to apply all you've learned about writing interpreters, designing core languages, and desugaring to the core. This is the first of four cumulative assignments in which you'll implement a large subset of the Python programming language.
Assignment Structure
Unlike previous assignments, for which you had a self-contained, concrete specification, the Python assignments will act more as rough guidelines for reaching benchmarks of completion. We want YOU to decide what features need to be implemented first, what is in your core language, how features are desugared, etc.
This may include passing a certain number of tests or implementing a certain number of features - not everyone will have completed the same parts of Python by the time each assignment ends. Our benchmarks are merely rough suggestions to ensure you are making progress on your implementation, while allowing everyone the freedom to work as they desire.
Design Checks
The TAs will hold mandatory design checks, short 30 minute meetings during which you will be asked to discuss your core language design, representation choices, desugaring techniques, and implementation strategy. The first set of design checks will be held between October 28 - November 1. Keep posted for signup links.
Pair Programming
You will be working in pairs for the entirety of Python. Pair programming will help you divide the workload and enable a collaborative design effort. If you are particularly enterprising and would like to work on your own, we'll also allow you to work solo.
You must select a partner (or state you're working solo) no later than Monday, October 22. Once you've chosen your group, exactly one of you from each group should send an email to cs173tas@cs.brown.edu with the logins of both people in your group (we will allow one group of three to account for odd numbers, if necessary), or just your own if you're working solo.
The Test Suite
When you implemented ParselTongue, your implementation's correctness was determined by its compliance with a large suite of tests. You may have noticed that the reference implementation exposed design choices that were not discussed in the formal specification, and it was up to you to toy with the language to pin down its exact semantics.
The correctness of your Python implementation will analogously be judged on your compliance with a test suite. This suite is a subset of the official Python 3.2 test suite for the CPython implementation. We believe the subset we've chosen offers a representative sample of important core language features in Python. If you successfully pass all the tests in the suite, your implementation is correct, because Python itself must pass these same tests!
The real CPython test suite is based around Python's unittest
module, which uses a combination of inheritance and reflection to run tests.
Bootstrapping an implementation that relies on these advanced features simply
for testing would be difficult, so we've done some work to make your life
easier. We've taken the tests out of their unit test frameworks and
distributed them amongst the files in python-reference/
. Where
unittest
used calls like self.assertEqual
, we have
rewritten them to ___assertEqual
. We expect you to implement
these predicates as built-in checks in your implementation.
Your implementation will need to support (as built-in functions):
___assertTrue
___assertFalse
___assertIn
___assertNotIn
___assertEqual
___assertNotEqual
___assertRaises
___assertIs
___fail
The file python-lib.rkt
defines a potential
___assertTrue
for you, though you might choose a different
representation and implementation. In all cases, the functions should match
the specification as given in the documentation for Python's unit
testing framework for determining if they succeed or not (note that the
link is to the 2.7 documentation, which defines all the predicates that can be
used. Python 3's docs appear to be missing some of these, though they are
defined. This is the only documentation hiccup we've found). If they don't
succeed, they should terminate the program and print some error message on
standard error. If they do succeed, they should have no output and be an
effective no-op.
A note on errors in general: We've set up the test suite so no file has any expected standard out or standard error; something comes out on standard error only in the event of a test failure, and assertions don't produce any output. Since we don't expect you to flawlessly replicate CPython's stack trace output, it would be unfair to demand that standard output matches what Python does. You should feel free to produce your own error messages with whatever content is helpful to you during development.
We have included a way to run tests as usual, with files like
foo.py.expected
and foo.py.error
, which you're
encouraged to use in your own development.
Features to Implement
The test suite tests language features such as
- Iterators
- Lists
- Scoping
- "Types"
- Functions
- Objects
For an A, your implementation needs to handle all of the tests that we've
distributed. If you only are aiming for a B, you can ignore the tests in the
directories iter/
and range/
. Note that while there
are only 10 files across these directories, this represents a significant
reduction in features that you need to implement.
With this in mind, you must design a Python core and pass all the tests in at least three files from scope/ and three files from types/ from the test suite by the due date. This is a somewhat low bar; we expect that you can pass some others as well, but this ensures that we'll have something concrete to talk about at design-check time.
What We're Giving You
The repo on Github contains all the support code and tests for this assignment. Many of the individual files contain instructions on their use and what they do. Keep in mind that if you want, you can ignore any of our advice here and choose your own directory layout and design for your implementation entirely; ours is merely a starting point. A brief overview of what we have provided is here:
- python-syntax.rkt
-
This contains a (currently tiny) datatype that is intended to be a Racket representation of Python programs. You are free to change this to suit the needs of your particular design.
- python-core-syntax.rkt
-
This contains a (currently tiny) specification for a core language. We write our interpreter over this data structure.
- python-desugar.rkt
Contains the definition for
desugar
, a function from Python surface syntax (from python-syntax.rkt), to core syntax (python-core-syntax.rkt).- get-structured-python.rkt and parse-python.rkt
-
These two files work together to parse Python files into the definition given in python-syntax.rkt. parse-python.rkt uses your system's implementation of Python, in coordination with python-parser.py, to generate a JSON representation of Python source. This JSON representation lines up with the AST specification of Python programs. Each dictionary has a "type" field that corresponds to the name of a constructor in the specification. For example:
def foo(a, b=5): pass
When parsed, produces a JSON structure like:{ 'type' : 'Module', 'body' : [{ 'type' : 'FunctionDef', 'name' : 'foo', 'args' : { 'type' : 'arguments', 'args' : [{ 'type' : 'arg', 'arg' : 'a', 'annotation' : None, }, { 'type' : 'arg', 'arg' : 'b', 'annotation' : None }], 'vararg': None, 'varargannotation': None, 'kwonlyargs' : [], 'kw_defaults' : [], 'kwarg' : None, 'kwargannotation' : None, 'defaults' : [{ 'type' : 'Num', 'n' : 5 }], }, 'body' : [{ 'type' : 'Pass' }], 'decorator_list' : [], 'returns' : None }] }
Part of your task is taking this representation and turning it into whatever representation of Python programs you choose in python-syntax.rkt. You may, for example, decide to not distinguish between statements and expressions in your syntax data structure, which the Python grammar does. This mapping is defined in get-structured-python.rkt. - python-lib.rkt
-
This is a piece of the implementation that we didn't have for ParselTongue. There are a number of built-in functions in Python, like
map
andfilter
, that are quite verbose to define in desugaring. This file specifies built-in library functions that your Python implementation will have access to. These library functions are implemented in the core language, and there is a function defined that wraps their definitions around a piece of core Python code. This is where, for example, you should define the built-inassert
methods we ask you to write, and any other normally available functions that you might want to provide to desugared code.To get you started, we've implemented
___assertTrue
and a primitiveprint
in this style. Both are defined asCFunc
s --- functions in the core language --- and thepython-lib
procedure wraps an expression in let-bindings for these constructs. We've also defined the Python identifierTrue
as a core language valueCTrue
. - python-primitives.rkt
-
Python has many more primitives than ParselTongue, so we humbly recommend that you keep them separated in a different file.
- python-evaluator.rkt and run-tests.rkt
-
These contains some utilities for wrapping the standard out and standard error of an interpreter and desugarer, and testing them against a suite. If you want finer-grained control over the test suite, you can start here.
- python-main.rkt
-
There are currently four options in python-main.rkt to get you started, but you are of course encouraged to add more that help you on your way. Since we might update this, too, and keeping the web page and the file in sync is difficult, just run the --help option (or see the file), to see what it's capable of at the moment.
Some Starting Points
Mechanically, you should:Use
git
to clone the repository:$ git clone https://github.com/brownplt/cs173-python.git
Make sure you have Python 3.2 installed (release link), and know how to use it to run Python programs from the command line. Note that on the department machines this is installed already at
/course/cs173/python/Python-3.2.3/python
Make sure your Python release passes all the tests:
$ racket python-main.rkt --python-path your/python/path --test-py python-reference/
Make sure you can run the simplest programs that we've implemented for you (consider creating a wrapper script so you don't have to keep specifying python-path):
$ echo "print(5)" | racket python-main.rkt --python-path your/python/path --interp-py 5 $ echo "___assertTrue(True)" | racket python-main.rkt --python-path your/python/path --interp-py
$ echo "___assertTrue(False)" | racket python-main.rkt --python-path your/python/path --interp-py something about an AssertionFailure
types/
are some of the simplest. Start
with figuring out how to parse and run a few of those tests through the whole
stack --- this is probably the easiest path towards getting started. First
you'll need to figure out what they look like, and how you want to parse them
in get-structured-python.rkt
, and then push them through
desugaring and the interpreter.
After that, you'll need to get the assert
functions working to run
the rest of the tests. We're asking you to focus on scope/
because a number of other tests use function definitions and lambda, so you'll
need to have those worked out anyway.
Things to consider trying, and keep in mind:
Consider building a nicer way to display the core language than just as
plai-typed
structs. You'll probably be printing out expressions in the core to debug sometimes, and making that readable will go a long way.Feel free to break the tests we gave you down into even smaller pieces. We're presenting them to you at slightly finer granularity than Python's unit test suite, but there are still a few tests in each file, so breaking them up might make certain features more tractable.
Think carefully about your representation of values; what we've provided is almost certainly too naive to work. There is a complete list of the built-in values at the Python documentation. The tests we've chosen cover (at least) booleans, floats, ints, strings, tuples, dictionaries, lists, functions, simple ranges, and objects, but not everything on that page. Study both the tests and the documentation to understand what values you will need to be able to represent.
-
The Python documentation is your friend, as is the Python command line. Try things out to see what they do; often we won't know what a feature does precisely without running it ourselves. Note that the Python command line won't quite be perfect for you, because we've added these
___assert
things to the default environment. To remedy this, we've provided the ability to wrap your Python implementation in a script that appends the relevant testing functions in python-main.rkt. If you run with the --interp-py option, you'll get a Python that comes populated with the right environment (see py-prelude.py). -
Google and StackOverflow are also your friends; Python has a lot of support posted around the web, so if you don't understand a feature, search first! (It's what we're going to do if you ask us, just so you know.)
What to Turn In
You will need to turn in any Racket files used in your Python implementation, along with a single README file that explains how your code is structured and your design choices, all in a single zip file. We also ask that you submit the README separately as well, just to make our lives easier if we need quick access to it. This you do have to turn in on time so that we can give you a design check, which we're explicitly scheduling time for.
We aren't collecting a real grade report from you this time, so there's no
finalizing happening, and no magical binary. We do want a report from you, so
we've added a --progress-report
option to
python-main.rkt
. You should submit the result of running:
$ racket python-main.rkt --python-path your/python/path --progress-report python-reference/This standard format helps us automatically get a head count of how folks are doing. We'll distribute a real grader once we get a sense of what implementation strategies people are taking, since that will affect what grading scripts we're able to distribute.
You can turn in all the files at this upload link.