Class summary: Evaluating Programs that use Datatypes
Copyright (c) 2017 Kathi Fisler
1 Recap: Defining Datatypes
Last class, we learned how to create our own datatypes, and how to create values of those datatypes. Here’s the TimeData type that we created, and two examples of time values.
data TimeData:
  | time(hour :: Number, mins :: Number)
end

noon = time(12, 0)
fivepm = time(17, 0)
We also saw how to write functions that take our new datatype as input. Here is a different function than the one we wrote last week, which determines whether one time is earlier than another.
fun earlier(td1 :: TimeData, td2 :: TimeData) -> Boolean:
  doc: "determine whether first time is earlier than second"
  (td1.hour < td2.hour) or
  ((td1.hour == td2.hour) and (td1.mins < td2.mins))
where:
  earlier(time(5, 45), time(5, 46)) is true
  earlier(time(5, 45), time(5, 45)) is false
  earlier(time(15, 45), time(5, 46)) is false
  earlier(time(5, 45), time(15, 46)) is true
end
2 Revisiting how Programs Evaluate
We have talked before about how, when a function is called, Pyret substitutes the arguments for the parameters. If we try that here, it feels a bit heavyweight. What happens if we substitute the arguments from the first example in our where clause into the body of earlier?
earlier(time(5, 45), time(5, 46)) is true

(time(5, 45).hour < time(5, 46).hour) or
((time(5, 45).hour == time(5, 46).hour) and
  (time(5, 45).mins < time(5, 46).mins))
If our datatypes had even more components, the substituted function body could get rather large (and very hard to read). For a variety of reasons, programming languages don’t actually do substitution. Instead, they keep track of what values each parameter maps to, and they look up those values when they are accessed.
2.1 The Program Dictionary
Early in the semester, we said that when we run a program, Pyret maintains a record of all the definitions in the program. This record is called the Program Dictionary, and it is a mapping of names (of functions and constants) to values.
We have said that when programs run, Pyret substitutes arguments to functions in their bodies, looking up definitions when needed. Pyret doesn’t actually do substitution (though what it does is equivalent). Instead, Pyret adds the parameters to the dictionary, mapping them to the arguments supplied in the function call.
Assume your Pyret file contains the two code portions at the top of the handout (the datatype definition, the definitions of noon and fivepm, and the definition of earlier). When you run the file, Pyret creates the dictionary with the following contents:
noon --> time(12, 0)
fivepm --> time(17, 0)
earlier --> function at line 8
Now, assume you run earlier(time(5,45), time(5, 46)). Rather than substitute, Pyret extends the dictionary
noon --> time(12, 0)
fivepm --> time(17, 0)
earlier --> function at line 8
td1 --> time(5, 45)
td2 --> time(5, 46)
then evaluates the body of earlier. When it gets to an expression like td1.hour, it looks up td1 in the dictionary, then extracts the hour component.
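The lookup step can be mimicked in Pyret itself. This is only a sketch of the idea, not how Pyret is implemented: it assumes the dictionary can be modeled as an immutable string dictionary from Pyret's string-dict library, and the names top-level, during-call, and td1-hour are made up for illustration. The entry for earlier is left out to keep the sketch small.

```pyret
import string-dict as SD

data TimeData:
  | time(hour :: Number, mins :: Number)
end

# Assumed model: the entries created when the file is first run
top-level = [SD.string-dict: "noon", time(12, 0), "fivepm", time(17, 0)]

# Calling earlier(time(5, 45), time(5, 46)) adds entries for the parameters
during-call = top-level.set("td1", time(5, 45)).set("td2", time(5, 46))

# Evaluating td1.hour: look up td1, then extract the hour component
td1-hour = during-call.get-value("td1").hour

check "lookup, then field access":
  td1-hour is 5
  during-call.get-value("td2").mins is 46
  during-call.get-value("noon") is time(12, 0)
end
```

Nothing in the body of earlier is copied or rewritten here; the values are found by name at the moment they are needed.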
After running earlier, assume you enter td1 at the interactions prompt. What will you get? You’ll get an unbound identifier error. We’ve said that the parameters are only visible within the body of the function, so this error makes sense.
But the dictionary we wrote has entries for td1 and td2. So what do we learn from the unbound identifier error?
When a function call ends, Pyret removes the entries for its parameters from the dictionary.
Think of the dictionary being in segments, one for the entries made when you initially run the program (the functions and constants), and a separate segment for each function call:
noon --> time(12, 0)
fivepm --> time(17, 0)
earlier --> function at line 8
------- <call to earlier> -------
td1 --> time(5, 45)
td2 --> time(5, 46)
When a function call ends, its segment is removed from the dictionary.
If we then call earlier again, do we just replace the values for td1 and td2? No. Those entries were deleted when the first call ended, so we simply add a new segment with the new values for those parameters.
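The segment idea can be sketched in the same spirit. Again, this is an illustrative model under assumptions rather than Pyret's real machinery: the dictionary is approximated with an immutable string dictionary, and call-segment is a hypothetical helper that builds the dictionary in effect during one call to earlier. Because each call extends the top-level entries afresh, a second call never sees the first call's values, and the top-level entries never contain td1 or td2.

```pyret
import string-dict as SD

data TimeData:
  | time(hour :: Number, mins :: Number)
end

# Assumed model: the segment created when the file is first run
top-level = [SD.string-dict: "noon", time(12, 0), "fivepm", time(17, 0)]

# Hypothetical helper: the dictionary in effect during one call to earlier
fun call-segment(base, arg1 :: TimeData, arg2 :: TimeData):
  doc: "extend base with a segment mapping td1 and td2 to the arguments"
  base.set("td1", arg1).set("td2", arg2)
end

first-call = call-segment(top-level, time(5, 45), time(5, 46))
second-call = call-segment(top-level, time(15, 45), time(5, 46))

check "each call gets its own segment":
  first-call.get-value("td1") is time(5, 45)
  second-call.get-value("td1") is time(15, 45)
  # outside of any call, the parameters are not in the dictionary
  top-level.has-key("td1") is false
  top-level.has-key("td2") is false
end
```

The last two checks correspond to the unbound identifier error: once a call is over, only the top-level segment remains.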
2.2 Stepping Back
Why are we introducing the dictionary, instead of staying with substitution? Thinking in terms of dictionaries is easier once we have values from datatypes with many components (which get painful to copy as we do with substitution). But more importantly, the dictionary sets us up for some crucial ideas that we will confront when we get to Python in November.
3 Another Data Design Exercise
Calendar entries have a description, a date, a start time, and a duration (in minutes). You want to write a program that helps someone manage their calendar for an entire year. What combination of lists, tables, and datatypes would you use? Indicate the types of list contents, the types of table columns, and define any new datatypes that you need.
Possible ideas included:
Have a list of tables, one per month
Have a table with one row per day, then a list of entries for that day, where entries are a new datatype
Have a list of entries, where entries are a new datatype, with no separation into months
Which of these (or other) organizations makes sense depends heavily on what computations you want to perform. If you are going to do many calendar searches within a month, then an organization that clusters entries by months will make it easier to focus in on the entries you care about when doing a computation. If you will look mostly within a day, maybe you want to cluster by days, and so on.
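As one concrete point of comparison, here is a minimal sketch of the third idea (a single list of entries). The datatype names DateData and Entry, their field names, the function entries-in-month, and the sample entries are all made up for illustration; they are one possible answer, not the intended one.

```pyret
data DateData:
  | date(year :: Number, month :: Number, day :: Number)
end

data TimeData:
  | time(hour :: Number, mins :: Number)
end

# duration is in minutes, as in the exercise statement
data Entry:
  | entry(descr :: String, on-date :: DateData, start :: TimeData, duration :: Number)
end

# made-up sample data
calendar = [list:
  entry("Lecture", date(2017, 10, 2), time(11, 0), 50),
  entry("Dentist", date(2017, 10, 17), time(15, 30), 45),
  entry("Midterm", date(2017, 11, 6), time(9, 0), 80)]

# With one flat list, a month search has to examine every entry
fun entries-in-month(entries :: List<Entry>, m :: Number) -> List<Entry>:
  doc: "keep only the entries whose date falls in month m (1 through 12)"
  entries.filter(lam(e): e.on-date.month == m end)
where:
  entries-in-month(calendar, 10).length() is 2
  entries-in-month(calendar, 11).map(lam(e): e.descr end) is [list: "Midterm"]
  entries-in-month(calendar, 1) is empty
end
```

With this organization, every month search walks the whole year's entries. An organization that clusters entries by month would let the same search start from just that month's entries, at the cost of a more involved structure.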
This example sets us up for a discussion we will have over the rest of the course: how do we organize data to perform tasks efficiently? Depending on your data organization, computations can be faster or slower. We’re ready to start exploring these issues.