Class summary: Memory and the Known Names
Copyright (c) 2017 Kathi Fisler
Last class, we looked at the following sequence of program operations, and at how subsequent changes to the contents of the ToDoItems were visible through the different names.
call1 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call2 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 = call1
We discussed how modifying a field of either call1 or call3 would be visible through the other, while modifications to fields of call2 aren’t visible to either call1 or call3. We said there was a relationship between call1 and call3 that isn’t visible if the known names are written as follows:
call1 --> ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call2 --> ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 --> ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
This suggests that the known names need additional information.
1 Introducing Memory
Every time you use a constructor to create data, the new data gets stored in a part of your computer called memory. Memory is a mapping from locations (or addresses) to values. Just as a street address refers to a specific building, a memory address refers to a specific piece of data.
In our example above, call1 and call3 are referring to the same location in memory, whereas call2 refers to similar data in a different location in memory (sort of like identical twins living at different addresses, though there is no analog for call1 and call3 with this metaphor).
So to really understand how updating variables and fields affects your program, you have to track two pieces of information: memory, which indicates which pieces of data you have, and the known names, which indicate which data value each name refers to. For compound data (like ToDoItems), we represent data values by their locations in memory.
Returning to our original code fragment (plus a different ToDoItem for illustration), here’s how memory and the known names appear.
milk = ToDoItem("buy milk", date(2018, 11, 15), ["shopping"])
call1 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call2 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 = call1
Known Names               Memory
-------------------------------------------------------------------
milk  --> (loc 1000)      (loc 1000) --> ToDoItem("buy milk" ...)
call1 --> (loc 1001)      (loc 1001) --> ToDoItem("call fred" ...)
call2 --> (loc 1002)      (loc 1002) --> ToDoItem("call fred" ...)
call3 --> (loc 1001)
The known names map each name to the location where its data is stored. Memory maps locations to values. call1 and call3 refer to the same memory location (the relationship that Python’s is operator tests), but call2 refers to a different location.
[Note that the initial address 1000 is arbitrary – you can start from any address number. By convention, I use four-digit numbers so we don’t confuse addresses with the smaller numeric data (like 0, 4, 12) that we often refer to in our programs.]
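To check the "same location" claim in Python itself, here is a minimal sketch. The ToDoItem below is a stand-in dataclass written for this example: only the descr field is named in these notes, so the field names due and tags are assumptions.

```python
from dataclasses import dataclass
from datetime import date


# Stand-in for the course's ToDoItem. Only descr appears in the notes;
# the names due and tags are assumed for illustration.
@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list


call1 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call2 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 = call1  # no constructor call, so no new location in memory

# "is" asks whether two names map to the same location.
same_location_1_3 = call1 is call3  # True
same_location_1_2 = call1 is call2  # False: two constructor calls, two locations

# "==" on a dataclass compares the field values instead.
same_contents_1_2 = call1 == call2  # True: twins at different addresses
```

So is corresponds to comparing the entries in the known names, while == compares what sits in memory at those locations.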
2 Updating Memory and the Known Names
So what happens if we write something like the following?
call3.descr = "call Tina"
Python finds call3 in the known names, then hops over to memory to find the descr part of the data. Within memory, it changes the description. So now our diagram looks like:
Known Names               Memory
-------------------------------------------------------------------
milk  --> (loc 1000)      (loc 1000) --> ToDoItem("buy milk" ...)
call1 --> (loc 1001)      (loc 1001) --> ToDoItem("call Tina" ...)
call2 --> (loc 1002)      (loc 1002) --> ToDoItem("call fred" ...)
call3 --> (loc 1001)
Following the arrows from call1 into memory, you can see how the change is visible to call1 as well as call3.
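The same update can be run as a short experiment. This is a sketch under the assumption that ToDoItem is a dataclass with a descr field (as used above); the other two field names, due and tags, are made up for the example.

```python
from dataclasses import dataclass
from datetime import date


# Stand-in ToDoItem; due and tags are assumed field names.
@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list


call1 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call2 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 = call1

# Field update: Python follows call3 to its location and changes memory there.
call3.descr = "call Tina"

seen_through_call1 = call1.descr  # "call Tina": same location as call3
seen_through_call2 = call2.descr  # "call fred": a different location
```

Nothing in the known names changed here; only the contents of one memory location did.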
3 What About Atomic Values, like Nums and Bools?
What if we also had names mapping to simple numbers, strings, or bools? Do they look the same across the known names and memory?
Not exactly. Simple data (called atomic data) just lives in the known names. For example, assume our original program also included the following two lines:
x = 4
y = 5
Then our known names/memory diagram would appear as follows:
Known Names               Memory
-------------------------------------------------------------------
milk  --> (loc 1000)      (loc 1000) --> ToDoItem("buy milk" ...)
call1 --> (loc 1001)      (loc 1001) --> ToDoItem("call Tina" ...)
call2 --> (loc 1002)      (loc 1002) --> ToDoItem("call fred" ...)
call3 --> (loc 1001)
x     --> 4
y     --> 5
Wait – why do atomics live in the known names rather than in memory? It’s partly a question of space consumption. Compound data has multiple fields, and hence needs multiple "spaces" in memory (one for each field). Atomic data only needs one "space". It’s also partly a question of what you can do to data: we can’t change the number 4 into the number 5 (4 will always be 4), whereas we can change the contents of fields in compound data. So since atomics only need one space and can’t be changed anyway, we leave them in the known names.
3.1 Updating Values of Names that Map to Atomics
So what if we now execute the following line?
x = y
Python looks up the value for y in the known names, and updates the value for x to match:
Known Names               Memory
-------------------------------------------------------------------
milk  --> (loc 1000)      (loc 1000) --> ToDoItem("buy milk" ...)
call1 --> (loc 1001)      (loc 1001) --> ToDoItem("call Tina" ...)
call2 --> (loc 1002)      (loc 1002) --> ToDoItem("call fred" ...)
call3 --> (loc 1001)
x     --> 5
y     --> 5
Note that this is exactly the same thing that happened when we ran call3 = call1 earlier: Python looked up the value for call1 in the known names and mapped call3 to it. In that case, the known names value for call3 became a reference to a memory location, but it was the same operation from the standpoint of the known names.
So if we now change x again, will y also change? For example, what if we wrote
x = 12
Follow our rule: the new value for x is 12. We update the known names entry for x to 12. But the entry for y is left alone:
Known Names               Memory
-------------------------------------------------------------------
milk  --> (loc 1000)      (loc 1000) --> ToDoItem("buy milk" ...)
call1 --> (loc 1001)      (loc 1001) --> ToDoItem("call Tina" ...)
call2 --> (loc 1002)      (loc 1002) --> ToDoItem("call fred" ...)
call3 --> (loc 1001)
x     --> 12
y     --> 5
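The three assignments to x and y can be replayed directly. The sketch below records the pair (x, y) after each step; the snapshot names are introduced here only for illustration.

```python
x = 4
y = 5
at_start = (x, y)      # (4, 5)

x = y                  # look up y's value, store it as x's entry
after_copy = (x, y)    # (5, 5)

x = 12                 # update only x's entry in the known names
after_update = (x, y)  # (12, 5): y's entry was never touched
```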
3.2 Wait – So Names Don’t Track Each Other???
Yesterday, we said that writing call3 = call1 meant that changes to one were visible to the other. But now we see that isn’t true for x = y. What’s the difference?
There was a very subtle distinction here: yesterday, we changed the contents of a component of call1. We didn’t change the value that call1 refers to. This difference is HUGE. Changing component contents updates memory, so names that refer to the same address both see the change. Reassigning what a name refers to, however, does not carry over across names.
Setting one name equal to another does not mean that the two names will always refer to the same value. Using name = new_value changes the value associated only with name. Setting one name equal to another when they refer to compound data means that changes within that compound data are visible through both names.
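Here is a small sketch of that distinction using compound data. As before, ToDoItem is a stand-in dataclass, and the field names due and tags (and the new date used below) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


# Stand-in ToDoItem; due and tags are assumed field names.
@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list


call1 = ToDoItem("call fred", date(2018, 11, 15), ["urgent"])
call3 = call1
shared_before = call1 is call3  # True: one location, two names

# Reassigning the name: the constructor adds a new location to memory,
# and only call3's entry in the known names is updated to point at it.
call3 = ToDoItem("call Tina", date(2018, 11, 16), ["urgent"])
shared_after = call1 is call3   # False: the names have gone separate ways

# A field update through call3 now lands in the new location only.
call3.descr = "email Tina"
call1_descr = call1.descr       # "call fred"
call3_descr = call3.descr       # "email Tina"
```

The name = new_value form never touches the data that the name used to refer to; it only redirects the name.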
This is a bit confusing at first – we know that. We will continue to work with these ideas in lecture, so you’ll have time to get the hang of it.
4 Summary: The Update Rules
We add to memory when a data constructor is used.
We update memory when a field of existing data is reassigned.
We add to the known names when a name is used for the first time (this includes parameters and internal variables when a function is called).
We update the known names when a name that is already in the known names is reassigned to a different value.
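The rules apply to function parameters in the same way. The sketch below is illustrative only: the functions retitle and swap_out are hypothetical, and ToDoItem is a stand-in dataclass whose due and tags field names are assumptions.

```python
from dataclasses import dataclass
from datetime import date


# Stand-in ToDoItem; due and tags are assumed field names.
@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list


def retitle(item, new_descr):
    # item is added to the known names, mapped to the caller's location.
    # Reassigning a field updates memory, so the caller sees the change.
    item.descr = new_descr


def swap_out(item, new_descr):
    # The constructor adds a new location to memory; reassigning item
    # updates only this function's known-names entry.
    item = ToDoItem(new_descr, item.due, item.tags)
    return item


milk = ToDoItem("buy milk", date(2018, 11, 15), ["shopping"])

retitle(milk, "buy oat milk")
after_retitle = milk.descr      # "buy oat milk"

eggs = swap_out(milk, "buy eggs")
after_swap = milk.descr         # still "buy oat milk"
separate = eggs is not milk     # True: eggs maps to the new location
```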