Class summary: Circular References
Copyright (c) 2017 Kathi Fisler
1 Circular References
One of the interesting things about our Book and Patron definitions is that they refer to each other: A patron refers to books, which refer to their borrowers, which refer to their books, and so on. Let’s look at these circular dependencies in data a bit more closely.
Here are versions of the Account and Customer classes that set up a circular reference: each Account refers to its owners, and each Customer refers to their account. To keep the example simple, we will assume that each Customer can have only one account.
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

Let’s now create a new customer and a new account for them:
new_acct = Account(5, 150, [Customer("Kathi", __________)])
How do we fill in the blank in the Customer? We’d like to say new_acct, but Python (and most other languages) will raise an error that new_acct isn’t defined. Why is that?
When given this assignment, Python first evaluates the right side, to get the value or memory location that should be stored in the dictionary for new_acct. If we filled in the blank with new_acct, Python would start by running:
Account(5, 150, [Customer("Kathi", new_acct)])
To do this, it needs to look up new_acct in the dictionary, but that name isn’t in the dictionary yet (it only goes in after we compute the value to store for that name). Hence the error.
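The failure described above can be reproduced directly. The sketch below is a minimal illustration, not part of the original notes: the variable caught is a hypothetical name, used only to record that the error happened.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

caught = False  # hypothetical flag: did the lookup of new_acct fail?
try:
    # The right side is evaluated first, so new_acct is looked up
    # before the assignment has put that name in the dictionary.
    new_acct = Account(5, 150, [Customer("Kathi", new_acct)])
except NameError as err:
    caught = True
    print("Python reported:", err)
```

Catching the error here is only for demonstration. In ordinary code the fix is to build the data in steps.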
1.1 Use assignment to create circular data
To get around this, we leverage the ability to update the contents of memory locations after names for data are in place. We first create the account with an empty list of owners. Then we create the customer, which references the new Account. Finally, we update the account’s owners to include the now-created customer:
new_acct = Account(5, 150, [])  # note the empty Customer list
new_cust = Customer("Kathi", new_acct)
new_acct.owners = [new_cust]
Note here that each part gets a spot in memory and an entry in the dictionary, but the data hasn’t been finished yet. Once we have the data set up in memory though, we can update the owners component to the correct value.
Here’s what this looks like at the level of memory and the dictionary after running the first two lines:
Dictionary                 Memory
-----------------------------------------------------------------------
                           (loc 1015) -> []
new_acct -> (loc 1016)     (loc 1016) -> Account(5, 150, loc 1015)
new_cust -> (loc 1017)     (loc 1017) -> Customer("Kathi", loc 1016)
Then, when we run the third line, we create a new list containing new_cust and update the owner list within new_acct:
Dictionary                 Memory
-----------------------------------------------------------------------
                           (loc 1015) -> []
new_acct -> (loc 1016)     (loc 1016) -> Account(5, 150, loc 1018)
new_cust -> (loc 1017)     (loc 1017) -> Customer("Kathi", loc 1016)
                           (loc 1018) -> [loc 1017]
Notice that the two owners lists each live in memory but aren’t associated with names in the dictionary. They are only reachable going through new_acct, and after the update, the empty list isn’t reachable at all.
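One way to convince yourself that the references really form a cycle is to follow them around and check, with Python's is operator, that you arrive back at the same object in memory. This is a small sketch using the same three lines as above. The checks are an addition for illustration.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

new_acct = Account(5, 150, [])  # note the empty Customer list
new_cust = Customer("Kathi", new_acct)
new_acct.owners = [new_cust]

# "is" asks whether two expressions refer to the same memory location
print(new_cust.acct is new_acct)        # customer -> account
print(new_acct.owners[0] is new_cust)   # account -> customer
# Going around the cycle twice still lands on the same objects
print(new_acct.owners[0].acct.owners[0] is new_cust)
```

All three checks print True, since each step follows a stored memory location rather than a copy of the data.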
If we had instead done the owner update using append, as in:
new_acct = Account(5, 150, [])  # note the empty Customer list
new_cust = Customer("Kathi", new_acct)
new_acct.owners.append(new_cust)
we would have updated the list at location 1015 instead of creating a new list at a new location, as follows:
Dictionary                 Memory
-----------------------------------------------------------------------
                           (loc 1015) -> [loc 1017]
new_acct -> (loc 1016)     (loc 1016) -> Account(5, 150, loc 1015)
new_cust -> (loc 1017)     (loc 1017) -> Customer("Kathi", loc 1016)
Either approach (append or a new list) works fine. The only difference is whether a new list gets created, as shown in these two memory examples.
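The difference between the two approaches can be observed by keeping an extra name for the original empty list before the update. The names original_a, original_b, and the a/b suffixes below are hypothetical and exist only so that both approaches can run side by side.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

# Approach 1: assign a brand-new list to owners
acct_a = Account(5, 150, [])
original_a = acct_a.owners      # extra name for the original empty list
cust_a = Customer("Kathi", acct_a)
acct_a.owners = [cust_a]

# Approach 2: append to the list that is already there
acct_b = Account(6, 150, [])
original_b = acct_b.owners      # extra name for the original empty list
cust_b = Customer("Kathi", acct_b)
acct_b.owners.append(cust_b)

print(acct_a.owners is original_a)  # False: owners now refers to a new list
print(len(original_a))              # 0: the original list is still empty
print(acct_b.owners is original_b)  # True: same list, updated in place
print(len(original_b))              # 1: the original list gained the customer
```

Either way, the account ends up with a one-element owners list containing the customer.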
1.2 Testing Circular Data
When you want to write a test involving circular data, you can’t write out the circular data manually. For example, imagine that we wanted to write out new_acct from the previous examples:
test("data test", new_acct,
     Account(5, 150, [Customer("Kathi", Account(5, 150, ...))]))
Because of the circularity, you can’t finish writing down the data. You have two options: write tests in terms of the names of data, or write tests on the components of the data.
Here’s an example that illustrates both. After setting up the account, we might want to check that the owner of the new account is the new customer:
test("new owner", new_acct.owners, [new_cust])
Here, rather than write out the Customer by hand, we drop in the name of the existing item in memory. This doesn’t require you to write ellipses. We also focused on just the owners component, the part of the Account value that we expected to change.
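Both testing options can be sketched in runnable form. The course's test function is not shown in these notes, so plain assert statements stand in for it here as an assumption. The variable shown_text is a hypothetical name added for illustration.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

new_acct = Account(5, 150, [])  # note the empty Customer list
new_cust = Customer("Kathi", new_acct)
new_acct.owners.append(new_cust)

# Option 1: test in terms of the names of existing data
assert new_acct.owners == [new_cust]

# Option 2: test individual components of the data
assert new_acct.id == 5
assert new_acct.balance == 150
assert new_acct.owners[0].name == "Kathi"
assert new_cust.acct is new_acct

# Python cannot finish writing out the data either: when asked to
# display a circular dataclass value, it abbreviates the repeated part.
shown_text = repr(new_acct)
print(shown_text)
```

If every assert passes, the program runs silently up to the final print, which shows the account with the repeated part replaced by an ellipsis.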
1.3 Bonus: A Function to Create Accounts for New Customers
What if we turned the sequence for creating dependencies between customers and their accounts into a function? We might get something like the following:
def create_acct(new_id: int, init_bal: int, cust_name: str) -> Account:
    new_acct = Account(new_id, init_bal, [])  # note the empty Customer list
    new_cust = Customer(cust_name, new_acct)
    new_acct.owners.append(new_cust)
    return new_acct
This looks useful, but it does have a flaw – we could accidentally create two accounts with the same id number. It would be better for us to maintain a variable containing the next account id to use, to guarantee that the same id gets used only once. How might we augment the code to do this?
next_id = 1  # stores the next available id number

def create_acct(init_bal: int, cust_name: str) -> Account:
    new_acct = Account(next_id, init_bal, [])  # note the empty Customer list
    next_id = next_id + 1
    new_cust = Customer(cust_name, new_acct)
    new_acct.owners.append(new_cust)
    return new_acct
Here, we create the next_id variable to hold the next id number to use. When we create an Account, we update next_id to the next unused number. Problem solved!
Well, almost. Now we’re at something that is specific to Python.
When we were still working in Pyret, we talked about what happens to the dictionary when we call a function: we make a separate area of the dictionary for that function, and we put the values of the parameters in that area of the dictionary. When the function ends, its piece of dictionary goes away.
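In our code above, we have a variable next_id set up outside the create_acct function. Inside the function, we are assigning to a variable next_id. Is this the same variable from outside the function, or are we trying to create a new variable (as we do for new_acct or new_cust)? Python can’t tell which one we want, so it assumes that any name assigned inside a function is a new variable that belongs to that function. As a result, the code above fails with an UnboundLocalError: the function tries to read next_id before its own next_id has been given a value.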
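To see the problem concretely, here is the version without any extra annotation, wrapped in a try block so that the error can be observed rather than stopping the program. The variable caught is a hypothetical name used only to record the error.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

next_id = 1  # stores the next available id number

def create_acct(init_bal: int, cust_name: str) -> Account:
    # Because next_id is assigned below, Python treats it as a variable
    # belonging to this function, so this first read fails.
    new_acct = Account(next_id, init_bal, [])
    next_id = next_id + 1
    new_cust = Customer(cust_name, new_acct)
    new_acct.owners.append(new_cust)
    return new_acct

caught = None  # hypothetical: holds the error if one is raised
try:
    create_acct(150, "Kathi")
except UnboundLocalError as err:
    caught = err
    print("UnboundLocalError:", err)
```

Note that the outer next_id is untouched: the failure happens before any update is made.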
To tell Python that we want the next_id variable defined outside the function, rather than a new one, we have to add a line saying so. We do that with a global declaration inside the function. Here’s the final code:
next_id = 1  # stores the next available id number

def create_acct(init_bal: int, cust_name: str) -> Account:
    # next line says to use the next_id from outside the function
    global next_id
    new_acct = Account(next_id, init_bal, [])  # note the empty Customer list
    next_id = next_id + 1
    new_cust = Customer(cust_name, new_acct)
    new_acct.owners.append(new_cust)
    return new_acct
This isn’t something you’ll be tested on – it is just here to show you how to do this in case you are interested.
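As a quick check of the final version, the sketch below calls create_acct twice and confirms that each account gets its own id and that the circular references are in place. The names first and second, and the customer name "Ada", are hypothetical additions for illustration. The example assumes the code is run as a top-level script, so that next_id is a global variable.

```python
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    balance: int
    owners: list  # of Customer

@dataclass
class Customer:
    name: str
    acct: Account

next_id = 1  # stores the next available id number

def create_acct(init_bal: int, cust_name: str) -> Account:
    # next line says to use the next_id from outside the function
    global next_id
    new_acct = Account(next_id, init_bal, [])  # note the empty Customer list
    next_id = next_id + 1
    new_cust = Customer(cust_name, new_acct)
    new_acct.owners.append(new_cust)
    return new_acct

first = create_acct(150, "Kathi")
second = create_acct(75, "Ada")

print(first.id, second.id)              # 1 2
print(next_id)                          # 3
print(first.owners[0].acct is first)    # True: the cycle is in place
```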