Memory

Identity and equality

Let’s say we’re making a 2D game similar to the one in Project 2. We represent coordinates as a dataclass:

from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

(This code can be found here.)

Our character starts out at (0, 0):

>>> char_coord = Coord(0, 0)

Our character has a hat and a sword. They should both start out at the same position as the character:

>>> hat_coord = Coord(0, 0)

Maybe we don’t want to type out the coordinate again, so for the sword we say:

>>> sword_coord = char_coord

We move our character:

>>> char_coord.x = 1

What happens to the sword and the hat?

>>> hat_coord
Coord(x=0, y=0)
>>> sword_coord
Coord(x=1, y=0)

There’s something strange going on here. If we move the character back,

>>> char_coord.x = 0

then we can look at the program directory:

name value
char_coord Coord(x=0, y=0)
hat_coord Coord(x=0, y=0)
sword_coord Coord(x=0, y=0)

These all look the same. But we just saw that there’s something else going on. sword_coord and char_coord are connected in some way: they’re actually pointing at the same data. We’re going to need a more accurate model than the program directory in order to predict how Python programs will behave!
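Python can tell these two situations apart directly. The sketch below repeats the setup above and compares the coordinates in two ways: == asks whether two values have equal fields (a dataclass compares field by field by default), while is asks whether two names refer to the very same object. The result names (same_value and so on) are only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

char_coord = Coord(0, 0)
hat_coord = Coord(0, 0)
sword_coord = char_coord

# Equality: do the two Coords have the same field values?
same_value = char_coord == hat_coord           # True

# Identity: do the two names refer to the same object?
hat_is_char = char_coord is hat_coord          # False, two separate Coords
sword_is_char = char_coord is sword_coord      # True, one Coord with two names
```

So the hat is equal to the character’s coordinate without being identical to it, while the sword is both.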

A new model

The weird thing that happened in the program above was that we created two Coord instances, but had three names. So, it seems like we’ll need to separate out data from names.

In our new model, we’ll still have the program directory. But we’ll also have the memory, which is where mutable data live. The memory for the program above looks like this:

Memory
loc 1000 Coord(x=0, y=0)
loc 1001 Coord(x=0, y=0)

Those weird loc ... things are memory locations. Whenever we create mutable data, Python puts it into memory at the next location. We start the location count at 1000 just to make it really clear what’s a location and what’s not; the specific numbers are otherwise unimportant.

So, memory is where our two coordinates live. What does the program directory look like?

Program directory  
char_coord loc 1000
hat_coord loc 1001
sword_coord loc 1000

char_coord and sword_coord are pointing at the same location in memory. When we change the data at that location, that change is reflected in both names.
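The locations in this model have a rough counterpart in real Python: the built-in id function returns an integer that is unique to each object while it exists (in CPython it happens to be the object’s memory address). The actual numbers will not be 1000 and 1001, and they change from run to run, so the sketch below only compares them. The result names are made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

char_coord = Coord(0, 0)
hat_coord = Coord(0, 0)
sword_coord = char_coord

# Two names, one location: the ids match.
shares_location = id(char_coord) == id(sword_coord)    # True

# Two separate Coords: the ids differ.
separate_location = id(char_coord) != id(hat_coord)    # True

# Changing the data at the shared location is visible through both names.
char_coord.x = 1
sword_x = sword_coord.x    # 1
hat_x = hat_coord.x        # 0
```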

Let’s see a few examples of how statements execute in this new programming model.

Lookups

We’re starting out with this program state:

Program directory  
char_coord loc 1000
hat_coord loc 1001
sword_coord loc 1000
Memory  
loc 1000 Coord(x=0, y=0)
loc 1001 Coord(x=0, y=0)

What happens when we execute this line at the console?

>>> sword_coord.y

In order to evaluate this expression, we’ll first look up sword_coord in the program directory. It’s pointing at loc 1000, so we’ll look at that location in memory. We’re looking at the y field, so we look up that field in the value at loc 1000 and find 0.

Updates (1)

How about this one?

>>> char_coord.x = 1

We first look up char_coord in the directory and find loc 1000. We then change the x field of the Coord at loc 1000 to 1. So, our state is now

Program directory  
char_coord loc 1000
hat_coord loc 1001
sword_coord loc 1000
Memory  
loc 1000 Coord(x=1, y=0)
loc 1001 Coord(x=0, y=0)

Updates (2)

>>> char_coord = Coord(0, 4)

We’re building a new Coord, so we’ll need to add it to memory at loc 1002. Then, we change the directory entry of char_coord to point at this new location. We’re not changing the contents of memory at loc 1000. We’re changing what the char_coord name points to!

Program directory  
char_coord loc 1002
hat_coord loc 1001
sword_coord loc 1000
Memory  
loc 1000 Coord(x=1, y=0)
loc 1001 Coord(x=0, y=0)
loc 1002 Coord(x=0, y=4)
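The two kinds of update can be checked side by side. This is a small sketch of the same steps as a script; sword_after_field_update and sword_after_rebinding are names invented here to capture what the sword sees after each step.

```python
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

char_coord = Coord(0, 0)
hat_coord = Coord(0, 0)
sword_coord = char_coord

# Updates (1): change a field of the data that char_coord points at.
char_coord.x = 1
sword_after_field_update = Coord(sword_coord.x, sword_coord.y)  # snapshot copy

# Updates (2): point the name char_coord at a brand new Coord.
char_coord = Coord(0, 4)
sword_after_rebinding = Coord(sword_coord.x, sword_coord.y)     # snapshot copy

# sword_after_field_update == Coord(x=1, y=0): the sword moved with the character
# sword_after_rebinding    == Coord(x=1, y=0): the sword stayed where it was
# char_coord               == Coord(x=0, y=4)
```

The first update is visible through sword_coord because it changes shared data. The second is not, because it only changes which location char_coord points at.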

Update rules

  • We add to memory when a data constructor is used
  • We update memory when a field of existing data is reassigned
  • We add to the directory when a name is used for the first time (this includes parameters and internal variables when a function is called)
  • We update the directory when a name that is already in the directory is reassigned to a different value
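To make the rules concrete, here is a toy simulation of the model using two ordinary dicts. It is only a sketch of the bookkeeping described above, not how Python is actually implemented, and the helper names construct, bind, and set_field are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

memory = {}      # location -> data
directory = {}   # name -> location

def construct(value):
    """A data constructor adds to memory at the next free location."""
    loc = 1000 + len(memory)   # locations start at 1000, as in the tables
    memory[loc] = value
    return loc

def bind(name, loc):
    """Add a new name to the directory, or repoint an existing name."""
    directory[name] = loc

def set_field(name, field, value):
    """Follow the name to its location and change the data stored there."""
    setattr(memory[directory[name]], field, value)

# Replay the program from the earlier sections.
bind("char_coord", construct(Coord(0, 0)))     # char_coord = Coord(0, 0)
bind("hat_coord", construct(Coord(0, 0)))      # hat_coord = Coord(0, 0)
bind("sword_coord", directory["char_coord"])   # sword_coord = char_coord
set_field("char_coord", "x", 1)                # char_coord.x = 1
bind("char_coord", construct(Coord(0, 4)))     # char_coord = Coord(0, 4)
```

After these steps, directory and memory hold the same contents as the tables at the end of Updates (2).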

Atomic values

Some values are atomic: they don’t have components, and can’t be modified in place. These values include numbers, booleans, and strings. When variables are bound to these values, we record them directly in the program directory.

>>> a = 2
>>> b = 3
>>> char2_coord = Coord(a, b)

Program directory
char_coord loc 1002
hat_coord loc 1001
sword_coord loc 1000
a 2
b 3
char2_coord loc 1003
Memory
loc 1000 Coord(x=1, y=0)
loc 1001 Coord(x=0, y=0)
loc 1002 Coord(x=0, y=4)
loc 1003 Coord(x=2, y=3)
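One consequence of recording atomic values directly is that char2_coord holds the values 2 and 3 themselves, with no connection back to the names a and b. A short sketch of that, where the later reassignment of a and the name x_after are additions for illustration:

```python
from dataclasses import dataclass

@dataclass
class Coord:
    x: int
    y: int

a = 2
b = 3
char2_coord = Coord(a, b)

# Reassigning a only updates the directory entry for a.
a = 10

x_after = char2_coord.x   # still 2: the Coord stored the value, not the name
```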

The move function

Let’s write a function to change a Coord:

def move(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""

How would we write this function? Here are two possibilities:

def move1(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c.x = c.x + dx
    c.y = c.y + dy

def move2(c: Coord, dx: int, dy: int):
    """adds dx to c.x and dy to c.y"""
    c = Coord(c.x + dx, c.y + dy)

Are these two functions different? Which one is right?