Hashtables

Last time, we ended with this example of a Python hashtable:

species = {
  "dog": 40,
  "cat": 30,
  "beetle": 5
}

def find_food_consumption(name: str) -> int:
  return species[name]

We can see that this data structure maps keys (such as “dog” and “cat”) to values (such as 40 and 30).

If you read the Python documentation, you might see hashtables referred to as “dictionaries,” and the name of the type (for instance, when you annotate a function’s parameters) is dict. We’ll use the term hashtable to avoid confusion with the program dictionary.

Hashtable lookups

Our find_food_consumption function demonstrates how to look up the value for a given key:

> species["dog"]
40

This works just like looking up a particular element of a list:

> lst = [1, 2, 3]
> lst[1]
2

What happens if we try to get the value for a key that isn’t in the table?

> species["lion"]
KeyError: 'lion'

Python raises a KeyError, complaining that the key isn’t present. Again, this is similar for lists:

> lst = [1, 2, 3]
> lst[3]
IndexError: list index out of range

What if we want to check if a key is present? We can use in:

> "dog" in species
True
> "lion" in species
False
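
Combining in with a lookup gives one way to avoid the error for missing keys. The following is a minimal sketch: the function name food_consumption_or_default and its default parameter are assumptions made for illustration, not part of the original example.

```python
species = {
  "dog": 40,
  "cat": 30,
  "beetle": 5
}

def food_consumption_or_default(name: str, default: int) -> int:
  # Hypothetical helper: only index into the table if the key is present
  if name in species:
    return species[name]
  return default

print(food_consumption_or_default("dog", 0))   # 40
print(food_consumption_or_default("lion", 0))  # 0
```

Checking with in first means the lookup on the next line can never fail, at the cost of deciding what value makes sense for an unknown species.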

Loops over hashtables

Let’s say we want to find the largest food consumption of any species. In order to do this, we need to look at the food consumption of every species.

We can do this with a for-loop:

def largest_food_consumption(species: dict) -> int:
  largest = 0
  for name in species:
    if species[name] > largest:
      largest = species[name]
  return largest

For hashtables, for-loops examine every key. So, in order to get the values, we have to index into the hashtable.
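
Because the loop visits keys, a small variation can report which species eats the most rather than how much it eats. This is a sketch under assumptions: the name hungriest_species is hypothetical, and returning an empty string for an empty table is just one possible choice.

```python
def hungriest_species(species: dict) -> str:
  # Hypothetical variant of largest_food_consumption:
  # remember the key that produced the largest value seen so far
  hungriest = ""
  largest = 0
  for name in species:
    if species[name] > largest:
      largest = species[name]
      hungriest = name
  return hungriest

species = {"dog": 40, "cat": 30, "beetle": 5}
print(hungriest_species(species))  # dog
```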

Updating hashtables

What if we want to change the value of a key?

> species["dog"] = 60
> species["dog"]
60

We can use the same syntax to add a new key:

> species["lion"] = 120
> species["lion"]
120
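
Since the same assignment syntax both updates and adds, a function that changes a hashtable may need to check which case it is in. Here is a minimal sketch; the helper add_food is a hypothetical name introduced for illustration.

```python
def add_food(species: dict, name: str, amount: int) -> None:
  # Hypothetical helper: increase the value if the key exists,
  # otherwise add the key with the given amount
  if name in species:
    species[name] = species[name] + amount
  else:
    species[name] = amount

species = {"dog": 40, "cat": 30, "beetle": 5}
add_food(species, "dog", 20)    # existing key: 40 becomes 60
add_food(species, "lion", 120)  # new key is added
print(species["dog"])   # 60
print(species["lion"])  # 120
```

Without the in check, species[name] + amount would raise an error for a species that is not yet in the table.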

Complex values

What if we wanted to have multiple values for a single key? For instance, different dog breeds probably consume different amounts of food.

species = {
  "dog": [40, 60],
  "cat": [30],
  "beetle": [5]
}

The values in a hashtable can be anything: numbers, strings, datatypes, lists, even other hashtables!
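
As a sketch of how list values might be used, the code below looks up a list and then works with it like any other list. The helper total_food_consumption and the extra cat value of 35 are assumptions made for illustration.

```python
species = {
  "dog": [40, 60],
  "cat": [30],
  "beetle": [5]
}

def total_food_consumption(species: dict, name: str) -> int:
  # Hypothetical helper: add up every amount recorded for one species
  total = 0
  for amount in species[name]:
    total = total + amount
  return total

species["cat"].append(35)  # record a second cat breed
print(total_food_consumption(species, "dog"))  # 100
print(total_food_consumption(species, "cat"))  # 65
```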

What does memory look like after we create this new species hashtable?

Note: why is it important that every value in the hashtable be a list?

The keys in a hashtable are more restricted: for reasons we will discuss next time, they must be atomic values.

Design example: a song database

See the lecture capture for details.