Circular references and hash tables

An example: ecology simulator

Imagine we’re doing a simulation of animal behavior patterns. We have a couple of classes:

from dataclasses import dataclass

@dataclass
class Habitat:
  food: int
  # could have other fields here--location, temperature, etc.

@dataclass
class Animal:
  time_since_food: int
  food_consumption: int
  habitat: Habitat

Each habitat is going to support multiple animals, who will consume the food in that habitat via this function:

def maybe_eat(animal: Animal):
  if animal.time_since_food > 8:
    animal.habitat.food = animal.habitat.food - animal.food_consumption
    animal.time_since_food = 0
  else:
    animal.time_since_food = animal.time_since_food + 1

Let’s say we wanted to create a couple of animals sharing the same habitat, which will start with 50 food. How would we do it?

Option 1:

> animal1 = Animal(0, 10, Habitat(50))
> animal2 = Animal(0, 20, Habitat(50))

Option 2:

> animal1 = Animal(0, 10, Habitat(50))
> animal2 = Animal(0, 20, animal1.habitat)

Option 3:

> habitat = Habitat(50)
> animal1 = Animal(0, 10, habitat)
> animal2 = Animal(0, 20, habitat)

Option 4:

> food = 50
> animal1 = Animal(0, 10, Habitat(food))
> animal2 = Animal(0, 20, Habitat(food))

See the lecture capture for the details.
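
As a rough check on the options above, here is a small sketch that uses Python's `is` operator to ask whether two animals refer to the very same Habitat object. It repeats the class definitions so it runs on its own; the variable names `opt1_a`, `opt3_a`, and so on are made up for this illustration, and the starting `time_since_food` of 9 is chosen only so that one animal eats right away.

```python
from dataclasses import dataclass

@dataclass
class Habitat:
  food: int

@dataclass
class Animal:
  time_since_food: int
  food_consumption: int
  habitat: Habitat

def maybe_eat(animal: Animal):
  if animal.time_since_food > 8:
    animal.habitat.food = animal.habitat.food - animal.food_consumption
    animal.time_since_food = 0
  else:
    animal.time_since_food = animal.time_since_food + 1

# Option 1: each animal gets its own, separately constructed Habitat
opt1_a = Animal(0, 10, Habitat(50))
opt1_b = Animal(0, 20, Habitat(50))
separate = opt1_a.habitat is opt1_b.habitat   # False: two distinct objects

# Option 3: one Habitat object, referenced by both animals
habitat = Habitat(50)
opt3_a = Animal(9, 10, habitat)  # hungry enough to eat on the first call
opt3_b = Animal(0, 20, habitat)
shared = opt3_a.habitat is opt3_b.habitat     # True: the same object

maybe_eat(opt3_a)
# The food eaten by opt3_a is gone from opt3_b's point of view too
food_seen_by_b = opt3_b.habitat.food          # 40
```

Option 2 shares the habitat in the same way as Option 3, since `animal1.habitat` is a reference to the one Habitat that already exists. Option 4 behaves like Option 1: passing the same integer to two `Habitat(...)` calls still builds two habitats.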

Circular references

Let’s say we want a Habitat to have a list of all of the animals in that habitat:

@dataclass
class Habitat:
  food: int
  animals: list

We could try to set this up as follows:

habitat = Habitat(50, [Animal(0, 10,

We get stuck here: what should we list as the habitat of the animal we’re building?

We can do something like this:

habitat = Habitat(50, [])
habitat.animals.append(Animal(0, 10, habitat))
habitat.animals.append(Animal(0, 20, habitat))

We have circular references here: the habitat references each animal, and each animal references the habitat. It can be helpful to draw arrows in memory to keep track of this (see the lecture capture).

We could write a function to do this:

def animal_in_habitat(habitat: Habitat, food_consumption: int) -> Animal:
  animal = Animal(0, food_consumption, habitat)
  habitat.animals.append(animal)
  return animal

How would we test this function?

def test_animal_in_habitat():
  h = Habitat(50, [])
  a = animal_in_habitat(h, 10)
  # assert h == Habitat(50, [Animal(0, 10, ...)])
  assert h.animals == [a]
  assert a.habitat == h
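
To make the arrows in memory a little more concrete, here is a minimal sketch that builds the two-animal habitat from this section and follows the references in both directions. The names `first`, `second`, `round_trip`, and `remaining` are made up for this illustration; the classes are repeated so the block runs on its own.

```python
from dataclasses import dataclass

@dataclass
class Habitat:
  food: int
  animals: list

@dataclass
class Animal:
  time_since_food: int
  food_consumption: int
  habitat: Habitat

# Build the habitat first with an empty list, then add animals that point back at it
habitat = Habitat(50, [])
habitat.animals.append(Animal(0, 10, habitat))
habitat.animals.append(Animal(0, 20, habitat))

first = habitat.animals[0]
# Going from the habitat to an animal and back lands on the same object
round_trip = first.habitat is habitat

# The cycle can be followed as many times as we like
second = first.habitat.animals[1]
second_consumption = second.food_consumption  # 20

# Changing food through an animal's reference changes the one shared habitat
first.habitat.food = first.habitat.food - first.food_consumption
remaining = habitat.food  # 40
```

Because `is` compares object identity rather than field values, it checks exactly what the arrows show: both animals point at one habitat, not at two habitats that happen to look alike.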

Hashtables

Let’s say we’re working on our ecology simulator and we want to track a number of animal species and how much food they consume from the habitat. We’ve seen ways we might structure this kind of data in both Python and Pyret. In both Pyret and Python, we might use a list of datatypes; in Pyret, we might instead use a Table. Let’s say we decide to use a list of datatypes, something like this:

@dataclass
class Species:
  name: str
  food: int

species = [Species("dog", 40), Species("cat", 30), Species("beetle", 5)]

We can use this list to find the food consumption for a given species:

def find_food_consumption(name: str) -> int:
  for s in species:
    if s.name == name:
      return s.food
  raise Exception("not found")

This solution works fine for our example.

However…as it turns out, there are a lot of species! The total number of animal species on the planet is hard to estimate, but it’s at least several million (the majority of which are insects and other invertebrates). What if our species list had millions of entries? Our function to find a given species’ food consumption has a running time that is linear in the number of species: in the worst case it looks at every entry. If that list has a million entries, find_food_consumption could take a while to run!
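
To get a feel for that linear cost, here is a small sketch that counts how many name comparisons the list scan makes. The data is entirely made up (100,000 placeholder species named "species_0", "species_1", and so on), and `count_comparisons` is a hypothetical variant of find_food_consumption that returns the number of comparisons instead of the food value.

```python
from dataclasses import dataclass

@dataclass
class Species:
  name: str
  food: int

# Made-up data: 100,000 placeholder species with arbitrary food values
species = [Species("species_" + str(i), i % 50) for i in range(100_000)]

def count_comparisons(name: str) -> int:
  """Return how many name comparisons a linear scan makes before finding name."""
  comparisons = 0
  for s in species:
    comparisons = comparisons + 1
    if s.name == name:
      return comparisons
  raise Exception("not found")

first_cost = count_comparisons("species_0")      # found immediately: 1 comparison
last_cost = count_comparisons("species_99999")   # found last: 100,000 comparisons
```

The cost depends on where the species sits in the list, and a species near the end (or one that is missing) forces a scan of the entire list.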

Luckily, there’s another option. Using a data structure called a hashtable, we can look up any species’ food consumption in constant time on average, no matter how many species we store.

We’ll talk about hashtable usage, as well as how hashtables work behind the scenes, later this week. For now, here’s how we could rewrite the food consumption example using a hashtable (in Python, a dictionary):

species = {
  "dog": 40,
  "cat": 30,
  "beetle": 5
}

def find_food_consumption(name: str) -> int:
  return species[name]
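
If we already have the list of Species values, we don't need to retype the data. Here is a minimal sketch of one way to build the dictionary from the list, along with what happens when we look up a name that isn't there. The names `species_list`, `missing_raised`, and `has_cat` and the "unicorn" lookup are made up for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Species:
  name: str
  food: int

species_list = [Species("dog", 40), Species("cat", 30), Species("beetle", 5)]

# Build the dictionary from the list: species name -> food consumption
species = {s.name: s.food for s in species_list}

def find_food_consumption(name: str) -> int:
  return species[name]

dog_food = find_food_consumption("dog")  # 40

# A name that isn't a key raises KeyError, not our own "not found" exception
try:
  find_food_consumption("unicorn")
  missing_raised = False
except KeyError:
  missing_raised = True

# `in` checks whether a key is present without raising
has_cat = "cat" in species
```

Building the dictionary still takes one pass over the list, but we pay that cost once; every lookup afterwards avoids the scan.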