Class summary: Dataclasses and Testing

1 Dataclasses

The analog to a Pyret data definition is called a dataclass in Python. Here’s an example of creating a dataclass to capture a todo-list item:

from dataclasses import dataclass  # to allow use of dataclasses
from datetime import date          # to allow dates as a type in the todoitem

''' A ToDoItem datatype in Pyret

data ToDoItemData:
  | todoItem(descr :: String,
             due :: Date,
             tags :: List<String>)
end
'''

@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list

# a sample list of ToDoItem
MyTD = [ToDoItem("buy milk", date(2019, 7, 27), ["shopping", "home"]),
        ToDoItem("grade hwk", date(2019, 7, 27), ["teaching"]),
        ToDoItem("meet students", date(2019, 7, 26), ["research"])
       ]

Things to note:

There is a single name for the type and the constructor, rather than separate names as we had in Pyret.
There are no commas between field names (but each has to be on its own line in Python).
We have not specified the type of the contents of the list. Python can express this (for example, list[str] in Python 3.9 and later, or List[str] from the typing module), but we leave it out for now.
The @dataclass annotation (Python calls it a decorator) must appear directly above the class definition.

Other than these notational differences, this concept carries over nicely from Pyret to Python. Making a datatype with multiple cases is a bit harder in Python – we’ll get back to that in a few days.
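To make the notes above concrete, here is a small sketch that repeats the ToDoItem definition, builds two items, and reads their fields. The variable names milk and same_milk are made up for this illustration. The last line relies on the fact that @dataclass also generates field-by-field equality.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ToDoItem:
    descr: str
    due: date
    tags: list

# the class name doubles as the constructor
milk = ToDoItem("buy milk", date(2019, 7, 27), ["shopping", "home"])
same_milk = ToDoItem("buy milk", date(2019, 7, 27), ["shopping", "home"])

# fields are read with dot notation
print(milk.descr)         # buy milk
print(milk.due.year)      # 2019

# two items with the same field values compare as equal
print(milk == same_milk)  # True
```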

2 Functions for ToDoLists

Assume we’re writing programs to manage todo lists. Typical operations might be to add items, remove items, and search for items. Here are functions for each of these.

2.1 Adding Items

The function to add an item to a todo list is similar to what we wrote last week to register for courses. Notice that this function simply modifies the list, but does not return anything.

def add_item(tdi: ToDoItem, TDL: list):
    "add given item to the given todo list"
    TDL.append(tdi)
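The following sketch shows what "modifies the list, but does not return anything" looks like when run. To keep it short, plain strings stand in for ToDoItem values, and the names chores and result are made up for the example.

```python
def add_item(tdi, TDL: list):
    "add given item to the given todo list"
    TDL.append(tdi)

# strings stand in for ToDoItem values in this sketch
chores = ["buy milk"]
result = add_item("grade hwk", chores)

print(result)  # None: the function has no return statement
print(chores)  # ['buy milk', 'grade hwk']: the list itself changed
```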

2.2 Finding Items

Here are two different versions of a function for finding items in a todo list. The first uses filter, while the second uses a for loop; otherwise they do the same thing. There is no reason to use one over the other – it’s more a matter of preference (though the former will likely be more common in professional settings).

Both versions show that we access fields within a piece of data using the dot-notation (e.g., item.descr), just as we did in Pyret.

def find_items(term: str, TDL: list) -> list:
    "get the list of items that have the term in their description"
    def has_descr(item: ToDoItem):
        return term in item.descr
    return list(filter(has_descr, TDL))

def find_items2(term: str, TDL: list) -> list:
    "get the list of items that have the term in their description"
    matches = []
    for item in TDL:
        if term in item.descr:
            matches.append(item)
    return matches

Notice that find_items2 looks very similar to the loop programs we wrote last week. There is a variable (here matches) in which we build up the running result, a for-loop that traverses the items and builds the result, then a return of the result after the for-loop. The matches variable is initially set to what the function would return if given the empty list as input.
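One detail of find_items is worth seeing in isolation: Python's filter does not hand back a list directly, which is why its result is wrapped in list(...). The sketch below uses a made-up list of strings and a made-up helper has_milk to show this.

```python
words = ["buy milk", "grade hwk", "milk the cow"]

def has_milk(s: str) -> bool:
    return "milk" in s

lazy = filter(has_milk, words)   # a filter object, not yet a list
print(type(lazy).__name__)       # filter

as_list = list(lazy)             # list(...) collects the kept elements
print(as_list)                   # ['buy milk', 'milk the cow']
```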

2.3 Removing Items

For removing items, we will rely on a Python operation for removing items from lists. Here’s an example with numbers:

>>> nums = [6, 2, 8, 1, 5, 1]
>>> nums.remove(1)
>>> nums
[6, 2, 8, 5, 1]

The remove operation modifies the list to remove the first occurrence of the given item from the list.
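It also helps to know what remove does when the item is not in the list at all: it raises a ValueError rather than silently doing nothing. A short sketch (the flag missing_reported is made up for the example):

```python
nums = [6, 2, 8, 1, 5, 1]
nums.remove(1)             # removes only the first 1
print(nums)                # [6, 2, 8, 5, 1]

missing_reported = False
try:
    nums.remove(42)        # 42 is not in the list
except ValueError:
    missing_reported = True
    print("42 was not in the list, so nothing was removed")
```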

Let’s say we want to be able to remove items from the todo list based on strings that are part of their descriptions. For example, we want to be able to write:

rem_item("milk", MyTD)
rem_item_all("hwk", MyTD)

where the first will remove the first item that mentions "milk", while the second will remove all items that mention "hwk".

In order to do this, we have to get from a piece of the description (like "milk") to the ToDoItem(s) that should be removed from the list (since Python’s remove needs to be given the full element to remove from a list). Luckily, we wrote find_items for that task, so we can use that as part of rem_item:

def rem_item(des: str, TDL: list) -> list:
    '''remove the first item whose description contains des, if there is one'''
    # find item(s) that match des in the todolist
    matches = find_items(des, TDL)
    # remove the first match from TDL (do nothing if there are no matches)
    if len(matches) > 0:
        TDL.remove(matches[0])
    return TDL

Here, we call find_items, which gives us a list of all items that match the description. If we want to remove only the first one, we take the list of found items, then use [0] to tell Python to extract the first item from the list (Python lists, like those in Pyret and most other programming languages, start counting from 0).
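Indexing with [0] only works when the list has at least one element. The sketch below, using made-up lists of strings, shows what Python does when there are no matches and one way to check first.

```python
matches = ["buy milk", "milk the cow"]
print(matches[0])          # buy milk

no_matches = []
index_failed = False
try:
    no_matches[0]          # there is no element at position 0
except IndexError:
    index_failed = True
    print("nothing to extract from an empty list")

# checking the length first avoids the error
first = no_matches[0] if len(no_matches) > 0 else None
print(first)               # None
```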

If we want to remove all the matching items, we go through the list of found items with a for loop, removing each in turn:

def rem_item_all(des: str, TDL: list) -> list:
    '''remove all items with given descr from the todolist'''
    list_of_matches = find_items(des, TDL)
    for item_to_remove in list_of_matches:
        TDL.remove(item_to_remove)
    return TDL
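Note that rem_item_all loops over the separate list of matches, not over TDL itself. The sketch below, using made-up lists of numbers, suggests why this matters: removing from a list while looping over that same list can skip elements.

```python
# Looping over a separate list of matches (the approach in rem_item_all)
nums = [1, 1, 2, 1]
to_remove = [1, 1, 1]      # the matches, collected beforehand
for n in to_remove:
    nums.remove(n)
print(nums)                # [2]

# Looping over the very list being modified
risky = [1, 1, 2, 1]
for n in risky:
    if n == 1:
        risky.remove(n)
print(risky)               # [2, 1]: one of the 1s was skipped
```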

3 Testing Todo Lists

Let’s talk a bit about testing. We start by writing tests for each of these functions.

For add_item: how do we test a function that doesn’t return anything? We end up testing that the function had the effect that we intended. In this case, if we add something then try to find it, it should be in the list. We write this as follows:

def test_add():
    newItem = ToDoItem("exercise", date(2021, 4, 12), ["home"])
    add_item(newItem, MyTD)
    test("add added exercise", len(find_items("exercise", MyTD)) > 0, True)

First, we create an item to add as part of our test. Since add_item doesn’t return anything, we can’t compare the result of the add_item call within the test. Instead, we call add_item on its own, then test that find_items finds the new item.

Why are we comparing to True? Because the expression we want to run as part of the test, len(find_items("exercise", MyTD)) > 0, returns a Boolean. We write True to make sure we got the expected Boolean. We could also have tested for the exact number of found items. We’re just showing this as an example of how to test.

The tests for finding and removing items are similar:

def test_find():
    test("find milk", len(find_items("milk", MyTD)) > 0, True)
    test("don't find sleep", find_items("sleep", MyTD), [])

def test_rem():
    rem_item_all("milk", MyTD)
    test("don't find milk", find_items("milk", MyTD), [])

Things get interesting if we run the tests. Let’s say I tried the following sequence:

>>> test_add() # will pass
>>> test_find() # will pass
>>> test_rem() # will pass
>>> test_find() # will FAIL!!!

Whoa – the test for find sometimes passes and sometimes fails depending on WHEN we run it relative to the other tests. Why? Because the functions are all modifying the overall todo-list, and the tests are making certain assumptions about the contents of the list before the tests are run. Some tests break the assumptions made by other tests.
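The same interaction can be reproduced in a few self-contained lines. In this sketch, strings stand in for ToDoItem values, and the names todo, find, and milk_is_found are made up for the illustration. The same check gives different answers depending on what happened to the shared list in between.

```python
todo = ["buy milk", "grade hwk"]   # shared by every check below

def find(term: str, TDL: list) -> list:
    "simplified find_items over a list of strings"
    matches = []
    for descr in TDL:
        if term in descr:
            matches.append(descr)
    return matches

def milk_is_found() -> bool:
    return len(find("milk", todo)) > 0

before = milk_is_found()    # milk is still in the list
todo.remove("buy milk")     # the effect a removal test leaves behind
after = milk_is_found()     # same check, different answer

print(before, after)        # True False
```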

Welcome to the subtlety of working with data that can be changed while your program is running. Now, your functions and tests will often have assumptions about what the data look like, but other functions may have broken those assumptions. This needs some new discipline and practices beyond what we’ve seen so far (but hey, that’s why we waited until later in the course to bring this up!).

Think about how we might have set up the tests differently to protect against these interactions. Next class, we’ll talk about ways to safely test functions that modify data.