Class summary: Datatypes with Multiple Constructors
Copyright (c) 2017 Kathi Fisler
1 Data Design Problem – Multiple Media for Library
The local library wants you to help them manage information on the items they have available for checkout. They want you to focus on books and movies.
books have an author, title, and year of publication
movies have a name and a format (DVD or Blu-ray)
Propose data structures for managing information about library items. What combination of lists and tables would you use?
1.1 Using Tables
We could try to put all of the data into one table, but the result is messy: every row carries cells that do not apply to it (a movie has no author or year of publication, a book has no format), and that irrelevant data could accidentally get misused. Books and movies are too different to fit into the same table.
catalog1 = table: author, name-title, year, format
  row: "George R. Martin", "Game of Thrones", 1996, "book"
  row: "who knows", "Hunger Games", 2015, "dvd"
end
Why not use two tables?
books = table: author, title, year
  # insert rows
end

movies = table: name, format
  # insert rows
end
With two tables, we no longer have the idea of a single catalog. What if we wanted the whole catalog sorted by name of item, or a search that returned both books and movies?
Key idea in CS: Tables are a good data structure for many pieces of data that all have the same attributes.
1.2 Using Datatypes
Alternative: let’s create a datatype that allows multiple kinds of library items:
data LibItem:
  | book(author :: String, title :: String, year :: Number)
  | movie(name :: String, format :: String)
end
This creates a new type (LibItem), and two functions (constructors) for creating LibItem values (book and movie).
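As a quick illustration of what the definition provides, the sketch below builds one value with each constructor and inspects it. The names got and hunger-games are placeholders chosen for this example; is-book and is-movie are the predicates Pyret generates for each variant, and fields are read with dot notation.

```pyret
data LibItem:
  | book(author :: String, title :: String, year :: Number)
  | movie(name :: String, format :: String)
end

# placeholder names, chosen only for this example
got = book("George R. Martin", "Game of Thrones", 1996)
hunger-games = movie("Hunger Games", "dvd")

check "constructors, predicates, and field access":
  # each variant gets an is-<variant> predicate
  is-book(got) is true
  is-movie(got) is false
  # fields are available with dot notation
  got.title is "Game of Thrones"
  hunger-games.format is "dvd"
end
```

Note that got.format would be an error, since a book has no format field. That is one reason to prefer cases when a function must handle both variants.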
With this we can define a catalog as a List<LibItem>:
catalogL =
  [list:
    book("George R. Martin", "Game of Thrones", 1996),
    movie("Hunger Games", "dvd")
  ]
How would we search the catalog for items whose title/name matched (contained) a given search string? Let’s start by writing a function that takes a LibItem and determines whether its title or name contains a search term.
This function will have to do a different computation depending on whether the item is a book or a movie. Thus, we should figure out what kind of LibItem we have using cases:
fun item-matches(item :: LibItem, term :: String) -> Boolean:
  cases (LibItem) item:
    | book(aut, title, year) => string-contains(title, term)
    | movie(name, fmt) => string-contains(name, term)
  end
where:
  item-matches(book("George R. Martin", "Game of Thrones", 1996),
    "hunger") is false
  item-matches(book("George R. Martin", "Game of Thrones", 1996),
    "Game") is true
  item-matches(movie("Hunger Games", "dvd"), "Game") is true
end
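One detail worth noticing: string-contains is case-sensitive, so a search for "hunger" would also miss the movie "Hunger Games". If that is not the behavior the library wants, one possible variation (not part of the design above) lowercases both strings before comparing. The name item-matches-ci is made up for this sketch; it relies on Pyret's built-in string-to-lower.

```pyret
data LibItem:
  | book(author :: String, title :: String, year :: Number)
  | movie(name :: String, format :: String)
end

# hypothetical variant of item-matches; the name is an assumption
fun item-matches-ci(item :: LibItem, term :: String) -> Boolean:
  doc: "like item-matches, but ignores upper/lower case"
  lower-term = string-to-lower(term)
  cases (LibItem) item:
    | book(aut, title, year) =>
      string-contains(string-to-lower(title), lower-term)
    | movie(name, fmt) =>
      string-contains(string-to-lower(name), lower-term)
  end
where:
  item-matches-ci(movie("Hunger Games", "dvd"), "hunger") is true
  item-matches-ci(book("George R. Martin", "Game of Thrones", 1996),
    "GAME") is true
  item-matches-ci(book("George R. Martin", "Game of Thrones", 1996),
    "hunger") is false
end
```

The structure is the same as before: one cases branch per variant, with only the comparison changed.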
With item-matches in hand, we can search the catalog:
fun search-catalog(catalog :: List<LibItem>, term :: String) -> List<LibItem>:
  doc: "return the list of items that have term as part of their title or name"
  cases (List) catalog:
    | empty => empty
    | link(item, rst) =>
      if item-matches(item, term):
        # keep this item and continue with the rest of the list
        link(item, search-catalog(rst, term))
      else:
        # skip this item and continue with the rest of the list
        search-catalog(rst, term)
      end
  end
where:
  search-catalog(catalogL, "Hunger") is [list: movie("Hunger Games", "dvd")]
  search-catalog(catalogL, "Game") is catalogL
  search-catalog(catalogL, "Pyret") is empty
  search-catalog(empty, "Game") is empty
end
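Finally, we could have searched the catalog using L.filter (from the lists library, imported as L):
fun search-catalog3(catalog :: List<LibItem>, term :: String) -> List<LibItem>:
  # assumes `import lists as L` at the top of the file
  # L.filter takes the predicate first, then the list
  L.filter(lam(i :: LibItem): item-matches(i, term) end, catalog)
end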
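The earlier question about sorting the whole catalog by item name can be approached with the same helper-plus-list pattern. The following is a minimal sketch, not part of the original notes: item-name and sort-catalog are hypothetical names, and the sketch assumes the list method sort-by, which takes a less-than function and an equality function.

```pyret
data LibItem:
  | book(author :: String, title :: String, year :: Number)
  | movie(name :: String, format :: String)
end

# hypothetical helper: one branch per variant
fun item-name(item :: LibItem) -> String:
  doc: "return a book's title or a movie's name"
  cases (LibItem) item:
    | book(aut, title, year) => title
    | movie(name, fmt) => name
  end
end

# hypothetical function: orders a mixed catalog by title/name
fun sort-catalog(catalog :: List<LibItem>) -> List<LibItem>:
  doc: "return the catalog ordered alphabetically by title or name"
  catalog.sort-by(
    lam(a, b): item-name(a) < item-name(b) end,
    lam(a, b): item-name(a) == item-name(b) end)
where:
  sort-catalog(
    [list: movie("Hunger Games", "dvd"),
      book("George R. Martin", "Game of Thrones", 1996)])
    is [list: book("George R. Martin", "Game of Thrones", 1996),
      movie("Hunger Games", "dvd")]
  sort-catalog(empty) is empty
end
```

Because books and movies live in one list, a single sort covers the whole catalog, which the two-table design could not do directly.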
2 Key Takeaways
Tables have their limits. They work best for data with uniform attributes. When your data comes in multiple kinds, you are better off with a list of values from a datatype that has multiple variants (one constructor per kind).
When you have to process a list containing data with variants, write a helper function that processes a single item, and use it in the main function that processes the list.
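To see the helper-then-list pattern on a second task, here is a small sketch that produces a one-line description of every item. The names item-label and catalog-labels are invented for this illustration and the label format is arbitrary. The helper handles one item with cases, and the main function hands the helper to L.map.

```pyret
import lists as L

data LibItem:
  | book(author :: String, title :: String, year :: Number)
  | movie(name :: String, format :: String)
end

# helper: processes a single item (hypothetical name and label format)
fun item-label(item :: LibItem) -> String:
  doc: "produce a one-line description of a single item"
  cases (LibItem) item:
    | book(aut, title, year) =>
      title + " by " + aut + " (" + num-to-string(year) + ")"
    | movie(name, fmt) => name + " [" + fmt + "]"
  end
where:
  item-label(book("George R. Martin", "Game of Thrones", 1996))
    is "Game of Thrones by George R. Martin (1996)"
  item-label(movie("Hunger Games", "dvd")) is "Hunger Games [dvd]"
end

# main function: processes the list by applying the helper to each item
fun catalog-labels(catalog :: List<LibItem>) -> List<String>:
  doc: "describe every item in the catalog using item-label"
  L.map(item-label, catalog)
where:
  catalog-labels(empty) is empty
  catalog-labels([list: movie("Hunger Games", "dvd")])
    is [list: "Hunger Games [dvd]"]
end
```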