Sets, memory, and testing

Sets and memory

In terms of memory, Python’s sets work like lists: a set is not an atomic value but a collection stored in memory, and variables refer to it. To review how that works, let’s look at this Python program.

recipes = {'pbj': {'peanut butter', 'jelly', 'bread'},
           'smoothie': {'peanut butter', 'banana', 'oat milk'}}
kale_smoothie = recipes['smoothie']
kale_smoothie.add('kale')
spinach_smoothie = recipes['smoothie'] | {"spinach"}

What does kale_smoothie look like after the program is executed? What does spinach_smoothie look like?
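One way to check your answers is to rerun the program with a few checks added. The sketch below repeats the program above; the assert lines are additions for illustration. It relies on two facts about Python sets: add changes a set in place, and | builds a new set.

```python
recipes = {'pbj': {'peanut butter', 'jelly', 'bread'},
           'smoothie': {'peanut butter', 'banana', 'oat milk'}}
kale_smoothie = recipes['smoothie']
kale_smoothie.add('kale')
spinach_smoothie = recipes['smoothie'] | {"spinach"}

# kale_smoothie is a second name for the set stored in recipes,
# so adding 'kale' through one name is visible through the other.
assert kale_smoothie is recipes['smoothie']
assert recipes['smoothie'] == {'peanut butter', 'banana', 'oat milk', 'kale'}

# The | operator builds a new set. It includes 'kale' because the add
# happened first, but 'spinach' never reaches the set inside recipes.
assert spinach_smoothie is not recipes['smoothie']
assert spinach_smoothie == {'peanut butter', 'banana', 'oat milk',
                            'kale', 'spinach'}
assert 'spinach' not in recipes['smoothie']
```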

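One more restriction on sets in Python: their elements must be hashable. In practice, that means atomic values such as strings and numbers, or immutable collections such as tuples and frozensets whose contents are themselves hashable. Lists, dictionaries, and other sets are not allowed. This is because of the way sets are implemented (think hashtable keys).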
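Here is a small sketch of what that restriction looks like in practice. The variable names are made up for illustration. Python checks hashability when the set is built, so putting a list inside a set raises a TypeError, while a tuple or a frozenset of strings is accepted.

```python
error_message = None
try:
    # A list is mutable and unhashable, so it cannot be a set element.
    bad_recipes = {['peanut butter', 'jelly']}
except TypeError as error:
    error_message = str(error)

# Strings, tuples of strings, and frozensets of strings are all hashable.
ok_recipes = {'bread',
              ('peanut butter', 'jelly'),
              frozenset({'banana', 'kale'})}

print(error_message)
print(len(ok_recipes))
```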

Testing

Let’s test our word count code, as a review of testing in Python.

Since our Python code lives in wordcount.py, we’ll put testing code in wordcount_test.py. At the top of that file, we can put:

from wordcount import *

We’ll see more examples of this later. For now, the only thing to remember is that this lets us refer to all of the functions we have written in wordcount.py.

We can now test our actual functions. The convention we’ll use for this class (and which is often used in practice) is that each function in the file we’re testing (wordcount.py) corresponds to a function in the test file whose name starts with test_. So, we’ll write tests like this:

def test_count_words():
    pass

What would be a good set of test cases?

def test_count_words():
    assert count_words("") == {}
    assert count_words("word") == {"word": 1}
    assert count_words("word word") == {"word": 2}
    assert count_words("word other word") == {"word": 2, "other": 1}

If we run pytest in the terminal, we see that our tests pass. Hooray!
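To try this without wordcount.py at hand, here is a self-contained sketch. The count_words below is a hypothetical stand-in, not the version from class: it assumes words are separated by whitespace and that the result is a dictionary from each word to its count. With that assumption, the same test function runs in a single file.

```python
def count_words(text: str) -> dict:
    """Hypothetical stand-in for the count_words in wordcount.py.

    Assumes words are separated by whitespace.
    """
    counts = {}
    for word in text.split():
        # get returns 0 the first time we see a word
        counts[word] = counts.get(word, 0) + 1
    return counts


def test_count_words():
    assert count_words("") == {}
    assert count_words("word") == {"word": 1}
    assert count_words("word word") == {"word": 2}
    assert count_words("word other word") == {"word": 2, "other": 1}
```

pytest collects functions whose names start with test_ from files named like wordcount_test.py, so saving this sketch under such a name and running pytest would run test_count_words.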