Lists, hashtables, sets

Today we’ll review how lists and hashtables work in Python, and discuss a new data structure: the set.

Lists

Lists are used to represent sequences of items in a particular order. We can build and add to lists like this:

> story = ["It", "was", "a", "dark", "and", "stormy"]
> story.append("night")

We can access and modify particular elements:

> story[1]
"was"
> story[1] = "is"
> story
["It", "is", "a", "dark", "and", "stormy", "night"]

We can add lists together:

> ["NARRATOR:"] + story
["NARRATOR:", "It", "is", "a", "dark", "and", "stormy", "night"]

And we can loop over the elements:

> for word in story:
    print(word.upper())
IT
IS
A
DARK
AND
STORMY
NIGHT

Hashtables

Hashtables (often called dictionaries in Python) are used to represent mappings between keys and values.

> status = {"brightness": "dark", "weather": "stormy"}
> status["time"] = "night"

We access elements by key rather than by index:

> status["weather"]
"stormy"
> status["weather"] = "pleasant"
> status
{"brightness": "dark", "weather": "pleasant", "time": "night"}

We can check whether the hashtable contains a key:

> "weather" in status
True

We can loop over the keys:

> for attribute in status:
    print(status[attribute].upper())
DARK
PLEASANT
NIGHT

Would we ever want a hashtable where the keys are numbers?
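
One possible answer, as a sketch: number keys are handy when we group things by a numeric property, especially when the numbers have gaps. The example below groups the words of the story list by their length. The name words_by_length is made up for illustration.

```python
story = ["It", "is", "a", "dark", "and", "stormy", "night"]

# Map each word length (a number) to the list of words with that length.
words_by_length = {}
for word in story:
    length = len(word)
    if length not in words_by_length:
        words_by_length[length] = []
    words_by_length[length].append(word)

print(words_by_length[2])     # the two-letter words
print(7 in words_by_length)   # are there any seven-letter words?
```

A list indexed by length would need a slot for every length up to the largest one, whereas the hashtable only stores the lengths that actually occur.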

Sets

Sets store unordered collections of distinct elements:

> night = {"dark", "stormy"}

We can add elements; adding an element that is already present has no effect:

> night.add("frightening")
> len(night)
3
> night.add("stormy")
> len(night)
3

We can test whether elements are present:

> "frightening" in night
True
> "inauspicious" in night
False

We can loop over the elements (the order is arbitrary):

> for quality in night:
    print(quality.upper())
FRIGHTENING
DARK
STORMY

We can combine sets:

> night | {"inauspicious"}
{"dark", "stormy", "frightening", "inauspicious"}
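
Besides | (union), Python sets support a few other combining operators. A short sketch, where the set day is a made-up example:

```python
night = {"dark", "stormy", "frightening"}
day = {"bright", "stormy"}   # hypothetical second set for illustration

both = night & day           # intersection: elements in both sets
only_night = night - day     # difference: in night but not in day
either = night | day         # union: elements in at least one set

print(both)
print(sorted(only_night))    # sorted() gives a list in a predictable order
print(sorted(either))
```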

We can convert a list to a set, and vice versa (duplicates are dropped, and the order of the result is not guaranteed):

> monster = ["very", "very", "scary"]
> set(monster)
{"very", "scary"}
> list(set(monster))
["very", "scary"]

Sets are very useful when we care about which elements are present, but not about their order.
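
When we care about order as well, a set and a list can work together. The sketch below removes duplicates while keeping the order of first appearance. The names seen and unique are chosen for illustration.

```python
monster = ["very", "very", "scary", "very", "big"]

seen = set()    # which words we have already met
unique = []     # the same words, in order of first appearance
for word in monster:
    if word not in seen:
        seen.add(word)
        unique.append(word)

print(unique)
```

The set answers "have we seen this word before?" and the list remembers the order.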

When to use lists, hashtables, and sets

Let’s say we’re looking at the text of Frankenstein again and want to answer a few questions. Which data structure would we use in order to compute each of the following?

  1. the number of unique non-capitalized words in Frankenstein
  2. all of the characters in Frankenstein, ordered by when they appear
  3. the longest word in Frankenstein
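
One way to approach these three questions, sketched on a made-up sentence rather than the full novel. The variable known_names is a hypothetical collection of character names that we would have to supply ourselves.

```python
# A tiny stand-in for the full text (not a quotation from the novel).
text = "Victor met Elizabeth and then Victor met the wretched creature"
words = text.split()
known_names = {"Victor", "Elizabeth"}   # hypothetical set of character names

# 1. Unique non-capitalized words: a set ignores repeats.
lowercase_words = set()
for word in words:
    if word == word.lower():
        lowercase_words.add(word)

# 2. Characters ordered by first appearance: a list keeps the order.
characters = []
for word in words:
    if word in known_names and word not in characters:
        characters.append(word)

# 3. Longest word: a single variable is enough, no collection needed.
longest = ""
for word in words:
    if len(word) > len(longest):
        longest = word

print(len(lowercase_words))
print(characters)
print(longest)
```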