List operations and dataclasses
A for-loop that builds a list
Let’s modify our courses function to return a list of courses in a given department:
# courses_in_dept(["CSCI0111"], "CSCI") == ["CSCI0111"]
def courses_in_dept(courses: list, dept: str) -> list:
    """returns a list of courses whose names start with dept"""
    found_courses = []
    for course in courses:
        if course.startswith(dept):
            found_courses.append(course)
    return found_courses
In Pyret, we wrote list processing functions using both cases expressions (which, as we’ve seen, we will replace with for-loops when we write Python code) and the built-in list operations such as filter, map, etc. Python also has built-in list operations; for example, the above loop could be rewritten as a filter expression:
def courses_in_dept(courses: list, dept: str) -> list:
    """returns a list of courses whose names start with dept"""
    return list(filter(lambda course: course.startswith(dept), courses))
Things to notice here:
- Python has lambda-expressions, just like Pyret does. The syntax is slightly different, but they are doing the same thing.
- We need to call list on the result of filter; we won’t talk about why in this course.
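To make the two points above concrete, here is a minimal sketch. The course names are made-up sample data, and is_cs is a hypothetical helper added only to show that a lambda and a named function can be passed to filter in the same way.

```python
def is_cs(course: str) -> bool:
    """named function doing the same job as the lambda below"""
    return course.startswith("CSCI")

courses = ["CSCI0111", "ENGL0100", "CSCI0150"]  # made-up sample data

# filter keeps the elements for which the function returns True
with_lambda = list(filter(lambda course: course.startswith("CSCI"), courses))
with_named = list(filter(is_cs, courses))

# map applies a function to every element
lowercase = list(map(lambda course: course.lower(), courses))

print(with_lambda)  # ['CSCI0111', 'CSCI0150']
print(with_named)   # ['CSCI0111', 'CSCI0150']
print(lowercase)    # ['csci0111', 'engl0100', 'csci0150']
```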
Dataclasses
We spent a while earlier this semester talking about Pyret’s datatypes. Datatypes give us a way of storing related data together. How can we do the same thing in Python? We’ll use something called dataclasses.
from dataclasses import dataclass
from datetime import date

# Pyret:
# data TodoItem:
#   | todo(deadline :: Date, tags :: List<String>, description :: String)
# end

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool
Don’t worry too much about the class keyword. We’ll talk much more about what it means in 112 and 18.
Here we’re defining a dataclass called TodoItem with four components: a deadline, a list of tags, a description, and a done flag. Unlike in Pyret, there’s no distinction between the name of the dataclass and the name of its constructor; we can build a TodoItem like so:
> TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False)
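As a small illustration of the constructor, the sketch below builds the same item twice: once with positional arguments, as above, and once with keyword arguments. The variable names positional and keyword are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

# positional arguments follow the order of the fields in the class
positional = TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False)

# keyword arguments name each field explicitly
keyword = TodoItem(deadline=date(2019, 11, 8), tags=["class"],
                   description="Prepare for CSCI 0111", done=False)

# dataclasses compare field by field, so these two items are equal
print(positional == keyword)  # True

# printing a dataclass shows its name and all of its fields
print(positional)
```

Naming the fields is more typing, but it makes it harder to mix up the order of the arguments.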
This means that we can’t have a dataclass with multiple constructors in the same way we could in Pyret. Python has other idioms for data with multiple shapes, which we’ll see in future CS classes.
Let’s build some TODO items:
class_item = TodoItem(date(2020, 11, 11), ["school", "class"],
                      "Prepare for CSCI 0111", False)
avocado_item = TodoItem(date(2020, 11, 16), ["home", "consumption"],
                        "Eat avocado", False)
birthday_item = TodoItem(date(2020, 11, 20), ["home", "friends"],
                         "Buy present for friend", False)
todo_list = [class_item, avocado_item, birthday_item]
We can look at the members of our TODO list:
> todo_list[0]
> todo_list[0].description
> todo_list[2].deadline
> todo_list[3]
> todo_list[2].abc
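The last two expressions fail: the list has only three items (indices 0 through 2), and TodoItem has no field named abc. The sketch below reproduces this in a script. It uses try/except, which has not been introduced here, purely so that the script keeps running after each error; the names index_message and attribute_message are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

todo_list = [
    TodoItem(date(2020, 11, 11), ["school", "class"], "Prepare for CSCI 0111", False),
    TodoItem(date(2020, 11, 16), ["home", "consumption"], "Eat avocado", False),
    TodoItem(date(2020, 11, 20), ["home", "friends"], "Buy present for friend", False),
]

print(todo_list[0].description)  # Prepare for CSCI 0111
print(todo_list[2].deadline)     # 2020-11-20

# index 3 is past the end of a three-element list
try:
    todo_list[3]
except IndexError as err:
    index_message = str(err)
    print("IndexError:", index_message)

# abc is not one of the fields declared in TodoItem
try:
    todo_list[2].abc
except AttributeError as err:
    attribute_message = str(err)
    print("AttributeError:", attribute_message)
```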
We can write a function to see if a TODO item is past due:
def past_due(item: TodoItem, today: date) -> bool:
    return item.deadline < today and not item.done
We can test this function:
def test_past_due():
    todo = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", False)
    test("old TODO", past_due(todo, date(2020, 11, 9)), True)
So we can access the components of a dataclass with dot-notation, just like we did in Pyret.
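For readers without the course’s test function at hand, here is a self-contained sketch of the same check using Python’s built-in assert statement instead. The extra cases (an item due today and an item already done) and the names todo and finished are assumptions added for illustration; they are not part of the original test.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

def past_due(item: TodoItem, today: date) -> bool:
    return item.deadline < today and not item.done

todo = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", False)
finished = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", True)

# the deadline was yesterday and the item is not done: past due
assert past_due(todo, date(2020, 11, 9)) == True
# the deadline is today: < is strict, so the item is not yet past due
assert past_due(todo, date(2020, 11, 8)) == False
# a finished item is never past due, even after its deadline
assert past_due(finished, date(2020, 11, 9)) == False

print("all past_due checks passed")
```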
As we’ve seen, Python allows us to modify data. Dataclasses are no exception:
> avocado_item.done = True
> avocado_item
> todo_list
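The reason to look at todo_list after the update is that the list holds the very same item, not a copy, so the change shows up there too. A minimal sketch of this, with a one-element list for brevity:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

avocado_item = TodoItem(date(2020, 11, 16), ["home", "consumption"],
                        "Eat avocado", False)
todo_list = [avocado_item]

print(todo_list[0].done)  # False

# modify the item through the variable...
avocado_item.done = True

# ...and the change is visible through the list as well
print(todo_list[0].done)  # True

# "is" checks whether two names refer to the same object in memory
print(todo_list[0] is avocado_item)  # True
```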
Functions on Todo lists
def find_items_by_description(todo_list: list, descr: str) -> list:
    """return all items whose description contains descr"""
    return list(filter(lambda item: descr in item.description, todo_list))

def find_items_by_tag(todo_list: list, tag: str) -> list:
    """return all items tagged with tag"""
    return list(filter(lambda item: tag in item.tags, todo_list))
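Notice that in is doing a different thing in each of these functions. In the first, item.description is a string, so in checks whether descr appears anywhere inside it. In the second, item.tags is a list, so in checks whether some element of the list is equal to tag.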
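The two uses of in can be compared directly. The sketch below uses made-up values (and hypothetical variable names) to show that a partial word counts as a match inside a string but not inside a list.

```python
description = "Eat avocado"
tags = ["home", "consumption"]

# on a string, in looks for a substring
whole_word_in_string = "avocado" in description
partial_word_in_string = "avo" in description

# on a list, in looks for an element equal to the left-hand side
whole_tag_in_list = "home" in tags
partial_tag_in_list = "hom" in tags

print(whole_word_in_string)    # True
print(partial_word_in_string)  # True: "avo" is part of "avocado"
print(whole_tag_in_list)       # True
print(partial_tag_in_list)     # False: no tag is exactly "hom"
```

This means find_items_by_description accepts a fragment of a description, while find_items_by_tag needs the complete tag.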