Python: dataclasses
HW 8 preview
Review of last time
Last time, we ended with this function:
def zs_in_list(lst: list) -> int:
    count = 0
    for s in lst:
        for c in s:
            if c == 'z':
                count = count + 1
    return count
Let’s step through this call to the function:
> zs_in_list(["pizza", "dog", "adze"])
What would happen if we indented the return statement once? Twice? Thrice?
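One way to explore the indentation question is to write each variant out and run it on the same call. The sketch below does that; the names return_once, return_twice, and return_thrice are made up for illustration and are not from lecture.

```python
def zs_in_list(lst: list) -> int:
    count = 0
    for s in lst:
        for c in s:
            if c == 'z':
                count = count + 1
    return count  # runs after every string has been scanned


def return_once(lst: list) -> int:
    count = 0
    for s in lst:
        for c in s:
            if c == 'z':
                count = count + 1
        return count  # inside the outer loop: stops after the first string


def return_twice(lst: list) -> int:
    count = 0
    for s in lst:
        for c in s:
            if c == 'z':
                count = count + 1
            return count  # inside the inner loop: stops after the first character


def return_thrice(lst: list) -> int:
    count = 0
    for s in lst:
        for c in s:
            if c == 'z':
                count = count + 1
                return count  # inside the if: stops at the first 'z'


words = ["pizza", "dog", "adze"]
results = [zs_in_list(words), return_once(words), return_twice(words), return_thrice(words)]
print(results)  # [3, 2, 0, 1]
```

Note that the indented variants can also fall off the end of the function without reaching a return at all (for example, return_thrice on a list with no 'z' in it), in which case Python returns None rather than an int.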
Dataclasses
We spent a while talking about Pyret’s datatypes. Datatypes give us a way of storing related data together. How can we do the same thing in Python? We’ll use something called dataclasses.
from dataclasses import dataclass
from datetime import date

# Pyret:
# data TodoItem:
#   | todo(deadline :: Date, tags :: List<String>, description :: String)
# end

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool
Don’t worry too much about the class keyword. We’ll talk much more about what it means in 112 and 18.
Here we’re defining a dataclass called TodoItem with four components: a deadline, a list of tags, a description, and a flag recording whether the item is done. Unlike in Pyret, there’s no distinction between the name of the dataclass and the name of its constructor; we can build a TodoItem like so:
> TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False)
This means that we can’t have a dataclass with multiple constructors in the same way we could in Pyret. Python has other idioms for data with multiple shapes, which we’ll see in future CS classes.
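As a small sketch of what the single constructor gives us, the example below builds the same item two ways. The variable names a and b are made up for illustration; the keyword-argument form is standard Python rather than something covered above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool


# Positional arguments follow the order in which the fields are declared.
a = TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False)

# Keyword arguments name the fields explicitly and can appear in any order.
b = TodoItem(description="Prepare for CSCI 0111", done=False,
             deadline=date(2019, 11, 8), tags=["class"])

print(a == b)  # True: dataclasses compare field by field
print(a)       # the generated printout lists each field name with its value
```

Leaving out a field (for example, calling TodoItem with only a deadline) raises a TypeError, since every field declared here is required.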
Let’s build some TODO items:
class_item = TodoItem(date(2019, 11, 8), ["school", "class"], "Prepare for CSCI 0111", False)
avocado_item = TodoItem(date(2019, 11, 13), ["home", "consumption"], "Eat avocado", False)
birthday_item = TodoItem(date(2019, 11, 20), ["home", "friends"], "Buy present for friend", False)
todo_list = [class_item, avocado_item, birthday_item]
We can look at the members of our TODO list:
> todo_list[0]
> todo_list[0].description
> todo_list[2].deadline
> todo_list[3]
> todo_list[2].abc
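The last two lookups above fail in different ways. The sketch below reproduces them in a script, catching each error so the program keeps running; the names index_error and attribute_error are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool


todo_list = [
    TodoItem(date(2019, 11, 8), ["school", "class"], "Prepare for CSCI 0111", False),
    TodoItem(date(2019, 11, 13), ["home", "consumption"], "Eat avocado", False),
    TodoItem(date(2019, 11, 20), ["home", "friends"], "Buy present for friend", False),
]

print(todo_list[0].description)  # Prepare for CSCI 0111
print(todo_list[2].deadline)     # 2019-11-20

# The list has three items, so the valid indices are 0, 1, and 2.
try:
    todo_list[3]
except IndexError as err:
    index_error = err
    print("IndexError:", err)

# TodoItem has no field called abc.
try:
    todo_list[2].abc
except AttributeError as err:
    attribute_error = err
    print("AttributeError:", err)
```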
We can write a function to see if a TODO item is past due, meaning its deadline has already passed and it isn’t done:
def past_due(item: TodoItem, today: date) -> bool:
    return item.deadline < today and not item.done
We can test this function:
# in test_todo.py
from todo import *
import pytest

def test_past_due():
    # the deadline is Nov 8 and "today" is Nov 9, so this unfinished item is past due
    assert past_due(TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False), date(2019, 11, 9))
So we can access the components of a dataclass with dot-notation, just like we did in Pyret.
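A single test leaves the edge cases of past_due unchecked. The sketch below assumes that "past due" means the deadline is strictly before today and the item is not done, so an item due today is not yet past due. The make_item helper and the test names are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool


def past_due(item: TodoItem, today: date) -> bool:
    # Assumption: past due means the deadline is strictly before today.
    return item.deadline < today and not item.done


def make_item(done: bool) -> TodoItem:
    # Hypothetical helper: an item due on Nov 8, 2019.
    return TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", done)


def test_deadline_passed():
    assert past_due(make_item(False), date(2019, 11, 9))


def test_due_today():
    assert not past_due(make_item(False), date(2019, 11, 8))


def test_deadline_in_future():
    assert not past_due(make_item(False), date(2019, 11, 7))


def test_finished_item_is_never_past_due():
    assert not past_due(make_item(True), date(2019, 11, 9))
```

Each test reads one component of the item through dot-notation (deadline or done) by way of past_due, so a mistake in either comparison should show up as a failing test.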
Functions on Todo lists
def find_items_by_description(todo_list: list, descr: str) -> list:
    """return all items whose description contains descr"""
    return list(filter(lambda item: descr in item.description, todo_list))

def find_items_by_tag(todo_list: list, tag: str) -> list:
    """return all items tagged with tag"""
    return list(filter(lambda item: tag in item.tags, todo_list))
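Notice that the in operator is doing a different thing in each of these functions: on a string (the description) it checks for a substring, and on a list (the tags) it checks for an equal element.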
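A short sketch of the two behaviors of in, using values borrowed from the avocado item; the variable names are made up for illustration.

```python
description = "Eat avocado"
tags = ["home", "consumption"]

# On a string, `in` checks whether the left side is a substring.
print("avocado" in description)  # True
print("voca" in description)     # True: any run of characters counts

# On a list, `in` checks whether some element equals the left side.
print("home" in tags)            # True
print("hom" in tags)             # False: no element is exactly "hom"
```

So find_items_by_description matches on any fragment of the description, while find_items_by_tag only matches a complete tag.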
We can modify our TODO list:
def remove_finished(todo: list):
    """remove completed items from the TODO list"""
    completed_items = list(filter(lambda item: item.done, todo))
    for item in completed_items:
        # remove from the list that was passed in, not from the global todo_list
        todo.remove(item)
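remove_finished collects the completed items first and only then removes them. One plausible reason (an assumption; the notes don't state it) is that removing from a list while looping over that same list can skip elements. The sketch below compares the two approaches; make_todo, remove_while_iterating, and remove_after_collecting are made-up names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool


def make_todo() -> list:
    # Hypothetical sample data: the first two items are done.
    return [
        TodoItem(date(2019, 11, 8), [], "a", True),
        TodoItem(date(2019, 11, 13), [], "b", True),
        TodoItem(date(2019, 11, 20), [], "c", False),
    ]


def remove_while_iterating(todo: list) -> None:
    # Risky: the list shrinks while the loop is walking over it.
    for item in todo:
        if item.done:
            todo.remove(item)


def remove_after_collecting(todo: list) -> None:
    # Collect the completed items first, then remove them.
    completed_items = list(filter(lambda item: item.done, todo))
    for item in completed_items:
        todo.remove(item)


risky = make_todo()
remove_while_iterating(risky)
print([item.description for item in risky])  # ['b', 'c']: "b" was skipped

safe = make_todo()
remove_after_collecting(safe)
print([item.description for item in safe])   # ['c']
```

In the risky version, removing "a" shifts "b" into the position the loop has already visited, so the loop moves straight on to "c".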
We can test this function:
def test_remove_finished():
    lst = [TodoItem(date(2019, 11, 8), [], "a", False),
           TodoItem(date(2019, 11, 20), ["a"], "b", True)]
    remove_finished(lst)
    assert lst == [TodoItem(date(2019, 11, 8), [], "a", False)]
We’ve defined a todo_list variable in todo.py. Why not use that variable in our test?
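One way to think about this question: a function that changes the list it is given also changes that list for everything that runs afterwards. The sketch below uses a stand-in for a shared module-level list; shared_list and the two check functions are made up for illustration, and remove_finished is re-declared here so the example runs on its own.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool


def remove_finished(todo: list) -> None:
    """remove completed items from the TODO list"""
    completed_items = list(filter(lambda item: item.done, todo))
    for item in completed_items:
        todo.remove(item)


# Stand-in for a module-level list such as todo_list in todo.py.
shared_list = [
    TodoItem(date(2019, 11, 8), [], "a", False),
    TodoItem(date(2019, 11, 20), ["a"], "b", True),
]


def check_remove_finished() -> bool:
    remove_finished(shared_list)   # changes the shared list
    return len(shared_list) == 1


def check_list_has_two_items() -> bool:
    return len(shared_list) == 2   # only true if nothing has changed the list


first = check_remove_finished()
second = check_list_has_two_items()
print(first, second)  # True False: the second check depends on what ran before it
```

Building a fresh list inside each test, as test_remove_finished does with lst, keeps every test independent of the others.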