Circular references and hash tables
An example: ecology simulator
Imagine we’re doing a simulation of animal behavior patterns. We have a couple of classes:
from dataclasses import dataclass

@dataclass
class Habitat:
    food: int
    # could have other fields here--location, temperature, etc.

@dataclass
class Animal:
    time_since_food: int
    food_consumption: int
    habitat: Habitat
Each habitat is going to support multiple animals, who will consume the food in that habitat via this function:
def maybe_eat(animal: Animal):
    if animal.time_since_food > 8:
        animal.habitat.food = animal.habitat.food - animal.food_consumption
        animal.time_since_food = 0
    else:
        animal.time_since_food = animal.time_since_food + 1
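To see how maybe_eat behaves over several time steps, here is a small sketch that repeats the definitions above and calls the function in a loop. The starting values (50 food, a consumption of 10) and the variable names food_before_meal and counter_before_meal are chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Habitat:
    food: int

@dataclass
class Animal:
    time_since_food: int
    food_consumption: int
    habitat: Habitat

def maybe_eat(animal: Animal):
    if animal.time_since_food > 8:
        animal.habitat.food = animal.habitat.food - animal.food_consumption
        animal.time_since_food = 0
    else:
        animal.time_since_food = animal.time_since_food + 1

habitat = Habitat(50)
animal = Animal(0, 10, habitat)

# The first nine calls only advance the counter (0 -> 9); no food is eaten.
for _ in range(9):
    maybe_eat(animal)
food_before_meal = habitat.food
counter_before_meal = animal.time_since_food

# On the tenth call the counter is above 8, so the animal eats
# and its counter resets.
maybe_eat(animal)
```

Note that the habitat's food changes even though we only passed the animal to maybe_eat: the function reaches the habitat through the animal's habitat field.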
Let’s say we wanted to create a couple of animals sharing the same habitat, which will start with 50 food. How would we do it?
Option 1:
> animal1 = Animal(0, 10, Habitat(50))
> animal2 = Animal(0, 20, Habitat(50))
Option 2:
> animal1 = Animal(0, 10, Habitat(50))
> animal2 = Animal(0, 20, animal1.habitat)
Option 3:
> habitat = Habitat(50)
> animal1 = Animal(0, 10, habitat)
> animal2 = Animal(0, 20, habitat)
Option 4:
> food = 50
> animal1 = Animal(0, 10, Habitat(food))
> animal2 = Animal(0, 20, Habitat(food))
See the lecture capture for the details.
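One way to check the four options yourself is to ask Python whether the two animals refer to the same Habitat object. The sketch below does this with the `is` operator. The helper shares_habitat and the variable names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Habitat:
    food: int

@dataclass
class Animal:
    time_since_food: int
    food_consumption: int
    habitat: Habitat

def shares_habitat(x: Animal, y: Animal) -> bool:
    # `is` tests object identity. Two separate Habitat(50) objects are
    # equal under ==, but they are not the same object in memory.
    return x.habitat is y.habitat

# Option 1: each animal gets its own freshly built Habitat.
o1_a = Animal(0, 10, Habitat(50))
o1_b = Animal(0, 20, Habitat(50))

# Option 2: the second animal reuses the first animal's habitat.
o2_a = Animal(0, 10, Habitat(50))
o2_b = Animal(0, 20, o2_a.habitat)

# Option 3: one Habitat is built first and handed to both animals.
o3_habitat = Habitat(50)
o3_a = Animal(0, 10, o3_habitat)
o3_b = Animal(0, 20, o3_habitat)

# Option 4: the number 50 is shared, but Habitat(...) is still called twice.
food = 50
o4_a = Animal(0, 10, Habitat(food))
o4_b = Animal(0, 20, Habitat(food))

shared = {
    1: shares_habitat(o1_a, o1_b),
    2: shares_habitat(o2_a, o2_b),
    3: shares_habitat(o3_a, o3_b),
    4: shares_habitat(o4_a, o4_b),
}

# Food eaten through one animal shows up for the other only when
# the habitat object is shared.
o1_a.habitat.food -= 10
o3_a.habitat.food -= 10
```

Under these definitions, only Options 2 and 3 give both animals the same habitat. Options 1 and 4 build two separate habitats that merely start with the same amount of food.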
Circular references
Let’s say we want a Habitat to have a list of all of the animals in that habitat:
@dataclass
class Habitat:
    food: int
    animals: list
We could try to set this up as follows:
habitat = Habitat(50, [Animal(0, 10,
We get stuck here–what should we list as the habitat of the animal we’re building?
We can do something like this:
habitat = Habitat(50, [])
habitat.animals.append(Animal(0, 10, habitat))
habitat.animals.append(Animal(0, 20, habitat))
We have circular references here–the habitat references each animal, and each animal references the habitat. It can be helpful to draw arrows in memory to keep track of this (see the lecture capture).
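As a rough stand-in for drawing the arrows, the sketch below follows the references around the cycle in code. It repeats the class definitions so that it runs on its own. The names first and back_again are chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Habitat:
    food: int
    animals: list

@dataclass
class Animal:
    time_since_food: int
    food_consumption: int
    habitat: Habitat

habitat = Habitat(50, [])
habitat.animals.append(Animal(0, 10, habitat))
habitat.animals.append(Animal(0, 20, habitat))

first = habitat.animals[0]

# habitat -> animal -> habitat -> animal -> habitat:
# following the arrows around the cycle ends at the same object.
back_again = first.habitat.animals[0].habitat

# A change made through one animal is visible from the habitat
# itself and from every other animal that lives there.
first.habitat.food -= first.food_consumption
```

Because there is only one Habitat object, there is only one food value to update, no matter which reference we use to reach it.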
We could write a function to do this:
def animal_in_habitat(habitat: Habitat, food_consumption: int) -> Animal:
    animal = Animal(0, food_consumption, habitat)
    habitat.animals.append(animal)
    return animal
How would we test this function?
def test_animal_in_habitat():
    h = Habitat(50, [])
    a = animal_in_habitat(h, 10)
    # assert h == Habitat(50, [Animal(0, 10, ...)])
    assert h.animals == [a]
    assert a.habitat == h
Hashtables
Let’s say we’re working on our ecology simulator and we want to track a number of animal species and how much food they consume from the habitat. We’ve seen ways we might structure this kind of data in both Python and Pyret. In both Pyret and Python, we might use a list of datatypes; in Pyret, we might instead use a Table. Let’s say we decide to use a list of datatypes, something like this:
@dataclass
class Species:
    name: str
    food: int

species = [Species("dog", 40), Species("cat", 30), Species("beetle", 5)]
We can use this list to find the food consumption for a given species:
def find_food_consumption(name: str) -> int:
    for s in species:
        if s.name == name:
            return s.food
    raise Exception("not found")
This solution works fine for our example.
However…as it turns out, there are a lot of species! The number of total animal species on the planet is hard to estimate, but it’s at least several million (the majority of which are insects and other invertebrates). What if our species list had millions of entries? Our function to find a given species’ food consumption has a running time that is linear in relation to the number of species; if that list has a million entries, find_food_consumption could take a while to run!
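To make the linear running time concrete, here is a minimal sketch that counts how many list entries the search looks at before it finds a match. The function count_steps and the made-up species names are assumptions for illustration; they are not part of the simulator.

```python
from dataclasses import dataclass

@dataclass
class Species:
    name: str
    food: int

def count_steps(species: list, name: str) -> int:
    """Return how many entries a linear search inspects before finding name."""
    steps = 0
    for s in species:
        steps += 1
        if s.name == name:
            return steps
    raise Exception("not found")

# Made-up species, used only to get a long list.
big_list = [Species(f"species{i}", i % 50) for i in range(100_000)]

# Best case: the match is the very first entry.
steps_for_first = count_steps(big_list, "species0")

# Worst case for a successful search: the match is the last entry,
# so every entry in the list is inspected.
steps_for_last = count_steps(big_list, "species99999")
```

The work done grows with the position of the species in the list, so doubling the length of the list doubles the worst-case number of steps.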
Luckily, there’s another option. Using a data structure called a hashtable, we can access any species’ food consumption in constant time on average, no matter how many species we store.
We’ll talk about hashtable usage, as well as how hashtables work behind the scenes, later this week. For now, here’s how we could rewrite the food consumption example using hashtables:
species = {"dog": 40, "cat": 30, "beetle": 5}

def find_food_consumption(name: str) -> int:
    return species[name]
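If we already have the data as a list of Species, we can build the hashtable from it once and then do lookups by name. The sketch below shows one way to do that, along with what happens when a name is missing. The names species_list and find_food_consumption_or_none are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass

@dataclass
class Species:
    name: str
    food: int

species_list = [Species("dog", 40), Species("cat", 30), Species("beetle", 5)]

# Build the hashtable (a Python dict) once, keyed by species name.
species = {s.name: s.food for s in species_list}

def find_food_consumption(name: str) -> int:
    # Indexing with a missing key raises KeyError.
    return species[name]

def find_food_consumption_or_none(name: str):
    # dict.get returns None instead of raising when the key is absent.
    return species.get(name)

try:
    find_food_consumption("unicorn")
    missing_raised = False
except KeyError:
    missing_raised = True
```

One difference from the list version: a missing name there raises the generic Exception("not found"), while indexing a dict with a missing key raises a KeyError. We can also check ahead of time with "unicorn" in species.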