Memory
Identity and equality
Let’s say we’re making a 2D game similar to the one in Project 2. We represent coordinates as a dataclass:
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int
(This code can be found here.)
Our character starts out at (0, 0):
>>> char_coord = Coord(0, 0)
Our character has a hat and a sword. They should both start out at the same position as the character:
>>> hat_coord = Coord(0, 0)
Maybe we don’t want to type out the coordinate again, so for the sword we say:
>>> sword_coord = char_coord
We move our character:
>>> char_coord.x = 1
What happens to the sword and the hat?
>>> hat_coord
Coord(x=0, y=0)
>>> sword_coord
Coord(x=1, y=0)
There’s something strange going on here. If we move the character back,
>>> char_coord.x = 0
we can look at the program directory:
name | value |
---|---|
char_coord | Coord(x=0, y=0) |
hat_coord | Coord(x=0, y=0) |
sword_coord | Coord(x=0, y=0) |
These all look the same. But we just saw that there’s something else going on: sword_coord and char_coord are connected in some way. They’re actually pointing at the same data. We’re going to need a more accurate model than the program directory in order to predict how Python programs will behave!
A new model
The weird thing that happened in the program above was that we created two Coord instances, but had three names. So, it seems like we’ll need to separate out data from names.
In our new model, we’ll still have the program directory. But we’ll also have the memory, which is where mutable data live. The memory for the program above looks like this:
Memory | |
---|---|
loc 1000 | Coord(x=0, y=0) |
loc 1001 | Coord(x=0, y=0) |
Those weird loc ... things are memory locations. Whenever we create mutable data, Python puts it into memory at the next location. We start the location count at 1000 just to make it really clear what’s a location and what’s not; the specific numbers are otherwise unimportant.
So, memory is where our two coordinates live. What does the program directory look like?
Program directory | |
---|---|
char_coord | loc 1000 |
hat_coord | loc 1001 |
sword_coord | loc 1000 |
char_coord and sword_coord are pointing at the same location in memory. When we change the data at that location, that change is reflected in both names.
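The picture above can be checked from Python itself. The sketch below rebuilds the three names and compares them two ways: == asks whether two Coords hold equal field values (a dataclass generates this comparison by default), while is asks whether two names point at the same object. The variable names same_value, hat_is_char, and sword_is_char are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    x: int
    y: int


char_coord = Coord(0, 0)
hat_coord = Coord(0, 0)   # a second, separate Coord with equal fields
sword_coord = char_coord  # a second name for the first Coord

# == compares field values (generated by @dataclass)
same_value = char_coord == hat_coord
# is compares identity: do both names point at the same object?
hat_is_char = char_coord is hat_coord
sword_is_char = char_coord is sword_coord

print(same_value, hat_is_char, sword_is_char)  # True False True
```

In terms of the model, == looks at the values stored in memory, and is compares the locations recorded in the directory.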
Let’s see a few examples of how statements execute in this new programming model.
Lookups
We’re starting out with this program state:
Program directory | |
---|---|
char_coord | loc 1000 |
hat_coord | loc 1001 |
sword_coord | loc 1000 |

Memory | |
---|---|
loc 1000 | Coord(x=0, y=0) |
loc 1001 | Coord(x=0, y=0) |
What happens when we execute this line at the console?
>>> sword_coord.y
In order to evaluate this statement, we’ll first look up sword_coord in the program directory. It’s pointing at loc 1000, so we’ll look at that location in memory. We’re looking at the y field, so we look up that field in the value at loc 1000 and find 0.
Updates (1)
How about this one?
>>> char_coord.x = 1
We first look up char_coord in the directory and find loc 1000. We then change the x field of the Coord at loc 1000 to 1. So, our state is now:
Program directory | |
---|---|
char_coord | loc 1000 |
hat_coord | loc 1001 |
sword_coord | loc 1000 |

Memory | |
---|---|
loc 1000 | Coord(x=1, y=0) |
loc 1001 | Coord(x=0, y=0) |
Updates (2)
>>> char_coord = Coord(0, 4)
We’re building a new Coord, so we’ll need to add it to memory at loc 1002. Then, we change the directory entry of char_coord to point at this new location. We’re not changing the contents of memory at loc 1000; we’re changing what the char_coord name points to!
Program directory | |
---|---|
char_coord | loc 1002 |
hat_coord | loc 1001 |
sword_coord | loc 1000 |

Memory | |
---|---|
loc 1000 | Coord(x=1, y=0) |
loc 1001 | Coord(x=0, y=0) |
loc 1002 | Coord(x=0, y=4) |
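The difference between the two kinds of update can be observed directly. The following sketch replays both updates on a character and a sword that start out sharing one Coord; the name after_mutation is illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    x: int
    y: int


char_coord = Coord(0, 0)
sword_coord = char_coord      # both names point at the same Coord

char_coord.x = 1              # updates memory: seen through both names
after_mutation = sword_coord.x

char_coord = Coord(0, 4)      # updates the directory: char_coord now names a new Coord

print(after_mutation)             # 1
print(sword_coord)                # Coord(x=1, y=0)
print(char_coord)                 # Coord(x=0, y=4)
print(char_coord is sword_coord)  # False
```

After the reassignment, the sword still sees the old Coord, because only the directory entry for char_coord changed.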
Update rules
- We add to memory when a data constructor is used
- We update memory when a field of existing data is reassigned
- We add to the directory when a name is used for the first time (this includes parameters and internal variables when a function is called)
- We update the directory when a name that is already in the directory is reassigned to a different value
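To make these rules concrete, here is a toy simulation of the model using two ordinary dictionaries. Everything in it (directory, memory, next_loc, construct, bind, get_field, set_field) is a hypothetical name invented for this sketch; it illustrates the model in these notes, not how Python is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    x: int
    y: int


# Toy model only: these structures are illustrative, not Python internals.
directory = {}   # name -> location
memory = {}      # location -> data
next_loc = 1000


def construct(x, y):
    """A data constructor adds to memory and yields the new location."""
    global next_loc
    loc = next_loc
    memory[loc] = Coord(x, y)
    next_loc += 1
    return loc


def bind(name, loc):
    """Add a new name to the directory, or update an existing entry."""
    directory[name] = loc


def get_field(name, field):
    """Lookup: follow the name to its location, then read the field there."""
    return getattr(memory[directory[name]], field)


def set_field(name, field, value):
    """Follow the name to its location and update the data stored there."""
    setattr(memory[directory[name]], field, value)


bind("char_coord", construct(0, 0))           # loc 1000
bind("hat_coord", construct(0, 0))            # loc 1001
bind("sword_coord", directory["char_coord"])  # shares loc 1000
set_field("char_coord", "x", 1)               # changes the data at loc 1000
bind("char_coord", construct(0, 4))           # loc 1002; loc 1000 is untouched

print(directory)
# {'char_coord': 1002, 'hat_coord': 1001, 'sword_coord': 1000}
print(get_field("sword_coord", "x"))  # 1
```

Running it reproduces the final state from the Updates (2) example: three locations in memory, with sword_coord still pointing at loc 1000.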
Atomic values
Some values are atomic: they don’t have components and can’t be modified in place. These values include numbers, booleans, and strings. When variables are bound to these values, we record them directly in the program directory.
>>> a = 2
>>> b = 3
>>> char2_coord = Coord(a, b)
Directory

name | value |
---|---|
char_coord | loc 1002 |
hat_coord | loc 1001 |
sword_coord | loc 1000 |
a | 2 |
b | 3 |
char2_coord | loc 1003 |
Memory

location | value |
---|---|
loc 1000 | Coord(x=1, y=0) |
loc 1001 | Coord(x=0, y=0) |
loc 1002 | Coord(x=0, y=4) |
loc 1003 | Coord(x=2, y=3) |
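One consequence of recording atomic values directly in the directory is that reassigning one name never affects another. The sketch below illustrates this under the model described here; copy_of_a is a hypothetical name added for the demonstration.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    x: int
    y: int


a = 2
b = 3
char2_coord = Coord(a, b)  # the Coord stores the values 2 and 3, not the names

copy_of_a = a              # the directory entry for copy_of_a records 2
a = 10                     # reassigning a updates only a's directory entry

print(copy_of_a)    # 2
print(char2_coord)  # Coord(x=2, y=3)
```

Neither copy_of_a nor char2_coord notices the change to a, in contrast to the shared Coord that char_coord and sword_coord pointed at earlier.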
The move function
Let’s write a function to change a Coord:
def move(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
How would we write this function? Here are two possibilities:
def move1(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c.x = c.x + dx
    c.y = c.y + dy

def move2(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c = Coord(c.x + dx, c.y + dy)
Are these two functions different? Which one is right?
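One way to investigate is to trace each function by hand with the update rules, and then check the prediction by running both on a fresh Coord. The harness below is a minimal sketch for that check; try_move, result1, and result2 are hypothetical names that are not part of the original code, and the output is deliberately left for you to predict.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    x: int
    y: int


def move1(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c.x = c.x + dx
    c.y = c.y + dy


def move2(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c = Coord(c.x + dx, c.y + dy)


def try_move(move) -> Coord:
    """Hypothetical helper: call move on a fresh Coord, return what the caller sees."""
    start = Coord(0, 0)
    move(start, 1, 2)
    return start


result1 = try_move(move1)
result2 = try_move(move2)
print("after move1:", result1)
print("after move2:", result2)
```

Remember that calling a function adds its parameters to the directory, so c is a new name inside each call.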