List operations and dataclasses

A for-loop that builds a list

Let’s modify our courses function to return a list of courses in a given department:

# courses_in_dept(["CSCI0111"], "CSCI") == ["CSCI0111"]
def courses_in_dept(courses: list, dept: str) -> list:
    """returns a list of courses whose names start with dept"""
    found_courses = []
    for course in courses:
        if course.startswith(dept):
            found_courses.append(course)
    return found_courses

In Pyret, we wrote list-processing functions using both cases expressions (which, as we’ve seen, we will replace with for-loops when we write Python code) and built-in list operations such as filter and map. Python also has built-in list operations; for example, the loop above could be rewritten as a filter expression:

def courses_in_dept(courses: list, dept: str) -> list:
    """returns a list of courses whose names start with dept"""
    return list(filter(lambda course: course.startswith(dept), courses))

Things to notice here:

  • Python has lambda-expressions, just like Pyret does. The syntax is slightly different, but they are doing the same thing.
  • We need to call list on the result of filter, because filter on its own does not hand back a list. We won’t talk about why in this course.
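For the curious, here is a minimal sketch of both points. The sample list and the helper name is_csci are made up for illustration. It shows a named function doing the same job as the lambda, and it shows that filter produces a filter object that list then turns into an ordinary list.

```python
courses = ["CSCI0111", "CSCI0180", "MATH0100"]  # hypothetical sample data

# A named function equivalent to: lambda course: course.startswith("CSCI")
def is_csci(course: str) -> bool:
    return course.startswith("CSCI")

# filter gives back a filter object, not a list
filtered = filter(is_csci, courses)
filtered_is_list = isinstance(filtered, list)

# list(...) collects the matching elements into a real list
csci_courses = list(filtered)

# the lambda version produces the same result
same_with_lambda = list(filter(lambda course: course.startswith("CSCI"), courses))
```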

Dataclasses

We spent a while earlier this semester talking about Pyret’s datatypes. Datatypes give us a way of storing related data together. How can we do the same thing in Python? We’ll use something called dataclasses.

from dataclasses import dataclass
from datetime import date

# Pyret:
# data TodoItem:
#   | todo(deadline :: Date, tags :: List<String>, description :: String, done :: Boolean)
# end

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

Don’t worry too much about the class keyword. We’ll talk much more about what it means in 112 and 18.

Here we’re defining a dataclass called TodoItem with four components: a deadline, a list of tags, a description, and a flag recording whether the item is done. Unlike in Pyret, there’s no distinction between the name of the dataclass and the name of its constructor; we can build a TodoItem like so:

> TodoItem(date(2019, 11, 8), ["class"], "Prepare for CSCI 0111", False)

This means that we can’t have a dataclass with multiple constructors in the same way we could in Pyret. Python has other idioms for data with multiple shapes, which we’ll see in future CS classes.

Let’s build some TODO items:

class_item = TodoItem(date(2020, 11, 11), ["school", "class"], "Prepare for CSCI 0111", False)
avocado_item = TodoItem(date(2020, 11, 16), ["home", "consumption"], "Eat avocado", False)
birthday_item = TodoItem(date(2020, 11, 20), ["home", "friends"], "Buy present for friend", False)

todo_list = [class_item, avocado_item, birthday_item]

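We can look at the members of our TODO list. The last two expressions below produce errors: the list has no element at index 3, and a TodoItem has no component named abc.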

> todo_list[0]
> todo_list[0].description
> todo_list[2].deadline
> todo_list[3]
> todo_list[2].abc
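As a rough sketch of what those expressions do, the script below repeats the definitions from above and records each outcome in a variable. The try/except blocks are only there so the script keeps running after an error; they are an addition for illustration, not something these notes rely on.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

todo_list = [
    TodoItem(date(2020, 11, 11), ["school", "class"], "Prepare for CSCI 0111", False),
    TodoItem(date(2020, 11, 16), ["home", "consumption"], "Eat avocado", False),
    TodoItem(date(2020, 11, 20), ["home", "friends"], "Buy present for friend", False),
]

first_description = todo_list[0].description  # dot-notation reads one component
last_deadline = todo_list[2].deadline

# Valid indices are 0, 1, and 2, so index 3 raises IndexError
try:
    todo_list[3]
    got_index_error = False
except IndexError:
    got_index_error = True

# TodoItem has no component called abc, so this raises AttributeError
try:
    todo_list[2].abc
    got_attribute_error = False
except AttributeError:
    got_attribute_error = True
```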

We can write a function to see if a TODO item is past due:

def past_due(item: TodoItem, today: date) -> bool:
    return item.deadline < today and not item.done

We can test this function:

def test_past_due():
    todo = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", False)
    test("old TODO", past_due(todo, date(2020, 11, 9)), True)
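The test function used above is not defined in these notes, so here is a self-contained sketch of the same checks using Python's built-in assert statement instead. The extra cases (checking on the deadline itself, and checking an item that is already done) are added for illustration.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

def past_due(item: TodoItem, today: date) -> bool:
    return item.deadline < today and not item.done

unfinished = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", False)
finished = TodoItem(date(2020, 11, 8), ["class"], "Prepare for CSCI 0111", True)

# The deadline has passed and the item is not done
assert past_due(unfinished, date(2020, 11, 9))
# On the deadline itself, deadline < today is False, so the item is not past due yet
assert not past_due(unfinished, date(2020, 11, 8))
# A finished item is never past due, however old its deadline
assert not past_due(finished, date(2020, 11, 9))
```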

So we can access the components of a dataclass with dot-notation, just like we did in Pyret.

As we’ve seen, Python allows us to modify data. Dataclasses are no exception:

> avocado_item.done = True
> avocado_item
> todo_list
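A small sketch of why todo_list changes too: the list holds the very same item that avocado_item names, not a copy, so a modification made through one name is visible through the other. The one-element list here is a simplified stand-in for the list built earlier.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TodoItem:
    deadline: date
    tags: list
    description: str
    done: bool

avocado_item = TodoItem(date(2020, 11, 16), ["home", "consumption"], "Eat avocado", False)
todo_list = [avocado_item]  # simplified stand-in for the full list

done_before = todo_list[0].done

# Modify the item through the name avocado_item...
avocado_item.done = True

# ...and the list sees the change, because it holds the same object
done_after = todo_list[0].done
same_object = todo_list[0] is avocado_item
```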

Functions on Todo lists

def find_items_by_description(todo_list: list, descr: str) -> list:
    """return all items whose description matches descr"""
    return list(filter(lambda item: descr in item.description, todo_list))

def find_items_by_tag(todo_list: list, tag: str) -> list:
    """return all items tagged with tag"""
    return list(filter(lambda item: tag in item.tags, todo_list))

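Notice that in is doing a different thing in each of these functions. When the right-hand side is a string (item.description), in checks whether descr appears anywhere inside that string. When the right-hand side is a list (item.tags), in checks whether tag is equal to one of the list’s elements.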
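A quick sketch of the two behaviors of in, using made-up sample values:

```python
description = "Eat avocado"
tags = ["home", "consumption"]

# On a string, in looks for a substring
word_in_description = "avocado" in description
part_in_description = "avo" in description

# On a list, in looks for an element equal to the left-hand side
tag_in_tags = "home" in tags
# "hom" is part of the string "home", but it is not an element of the list
part_in_tags = "hom" in tags
```

So find_items_by_description matches on partial descriptions, while find_items_by_tag only matches a tag that is spelled out in full.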