Homework 10: Book Bandit (Part Two)

Note: Hashtables and dictionaries refer to the same to the same data structure in Python.

The Assignment

Now that the book club members have been using your website to access contraband books, they have been complaining at how inefficient and clunky the website is, and several of your tech savvy cowboy friends decided to investigate your code under the hood (or under the saddle, we might say), and have noticed that you use a ridiculous number of lists, they suggest that there is a more efficient way of storing book data: hashtables!

Getting Started

Download the support code here!

It will contain all of the support code needed to run the website (same as HW9), testlight.py, hw10_test.py (where you will write your tests), and the two new files described below. Please make sure the project you’re working in has all of these files.

You might notice that we don’t provide a hw10_code.py stencil file! We’ll be providing this on Canvas, since it contains our solutions to some of the functions from HW9. This assignment will be building off of these implementations.

If you haven’t yet turned in hw9, please don’t get the hw10 stencil file from Canvas before you do (unless you aren’t going to turn in hw9 at all). Bear in mind that Canvas can show us when people access files.

The Stencil

We’ll be providing the hw10_code.py stencil on Canvas. Please copy this file into the same folder where you downloaded the hw10 source code, so that the stencil, along with any files that were in hw10.zip (including the subfolders) all live in the same folder.

In the stencil, we’ve clarified which functions you should edit with TODOs, and have provided our implementations of the list-versions of all of the functions. Because the main goal of this homework is to edit these functions, please make sure to understand them before getting started, and feel free to ask at hours/Campuswire if any of them are unclear!

New Files

    # loads list of genres to display on main page into list of strings
    display_genres = list(map(lambda line: line[0], csv.reader(f, delimiter="\n

This will read in all our genres from genres.txt as a list of strings called display_genres.


New Global Variables

You should have three hashtables. We’ve defined these at the top of the stencil:

  1. id_to_book_dict Hashtable that maps a ID number to a Book
    Key: ID
    Value: Book

  2. genre_to_books_dict Hashtable that maps a genre (a string) to a list of all the books in that genre
    Key: Genre
    Value: A list of all the books that are in that genre
    Note: This global variable is replacing the functionality of the list created by HW9’s make_genre_list() function. We will be filling it out in setup(), just like the id_to_book dict. For more details, read about the setup() function below.

  3. book_to_score_dict Hashtable that maps a Book to its recommendation score
    Key: Book
    Value: Recommendation score for that book

    Note: In this class, in order to tell Python that we can use the Book data type as a key for our hashmap, we need to make sure we can’t alter a Book’s fields once its created. To do so, we’re adding the frozen=True annotation to our dataclass. This change is present in the stencil.

Note: Data structures dealing with carts and purchases will stay as lists. Reference the tasks below for the things you do need to change.


Coding Tasks

Your primary task will be updating our old implementation to use hashtables instead of lists for various features.

Instead of using lists to keep track of our books, genres, and recommendations, we’ll be using hash tables!

Functions to Change from HW 9 (required for passing)

The stencil code from Canvas will provide detailed TODOs in the comments, so be sure to check those out.

  1. setup():
    Reads through the full books.csv CSV file and populates a global book_dict, which is a hashtable with integer book IDs as keys and Books (the dataclass) as values.
    Another important change: We’ll also be using this function to populate the global genre_to_books_dict. This is different from HW9, in which we used the make_genre_list() function to organize our books by genre. We no longer have this function, so we will do this in setup().
    Note: books.csv contains 32 different genres, while genres.txt contains 25! genre_to_books_dict should contain all of the books from books.csv.

  2. get_book(ident: int) -> Book
    Returns the book with ID ident. If a book with the IDident is not found, Python will throw an error for you (since we’re looking in a hashtable), so no worries about raising your own error here.

  3. search_books(query: str) -> list:
    Given a query, return up to 50 of the books that contain the query in the title, author, or genre (case-insensitive in both directions). If there are fewer than 50 books that match, return all of those books. (We no longer have a master list of books to search through, so think about how you can access the books from id_to_book_dict)

New Function to (Re)Implement (required for passing)

The functions below will be ones we made entirely for you in HW9! To account for our de-listifying of HW9, they’re going to need a few changes.

  1. get_book_dict() -> dict: will return a hashtable mapping each genre in genres.txt to a list of 25 randomly selected books in that genre. To do this, we’ll need a list of genres to display, which we stored in display_genres.
    • Hint: Your function will much different from HW9’s get_book_dict! It will be way more concise and will allow for more genres other than the 6 we specified in HW9.
    • Note: In HW9, this function sampled 10 books for each genre, but you will be sampling 25, since there is more data to work with!
    • Note: genres.txt does not contain all of the genres in books.csv! This is okay. You are only responsible for displaying the books in the genres in genres.txt.

Recommendation system (for full credit; not needed for passing):

For Homework 10, you’ll be taking charge of Homework 9’s recommendation scoring system! In HW9, we implemented the recommendation system by using the dataclass BookScorePair, and keeping a list of those pairs. For this homework, you’ll be going into the functions we wrote and editing them as necessary so that they only use the book_to_score hashtable. In other words, we shouldn’t need the BookScorePair dataclass anymore, or any of the functions that work with them (i.e., find_book_score_pair)! It might be a good idea to look back at our list-heavy implementations of these functions, understand how the code is working, and then get to de-listifying. If you have any questions about what our code was doing, feel free to ask at hours/Campuswire!

Here are the rules for the recommendation system, and the functions that apply those rules.

Initially, there should be no books recommended for the user. This means that there are no keys and values in the book_to_score hashmap. When
(1) the user buys a book (which calls buy_books_in_cart),
(2) clicks on a book (which will call get_book), or
(3) searches for something (which calls search_books),

the recommendation scores for books should be updated:

(1) and (2):

Function to implement for rules 1 and 2:
update_recs(book: book):
This function will be called when get_book or buy_books_in_cart is called. It will update the recommendation score of the book passed in accordingly. You will be editing update_recs to work with the recommendations hashtable, instead of relying on a list of BookScorePairs.

Correction to Stencil: In buy_books_in_cart, the purchases.append(book) should happen before update_recs(book) in the loop.

(3):

Function to implement for rules 3:
update_recs_given_query_words(queries: list): will be called in the function search_books. You will take in a query in a list form and update the books’ scores accordingly based on each element in queries. Similar to update_recs, you will be editing our implementation to work with the recommendations hashtable.

Note: Even if the entire search query does not match a book, if part of it does, the book’s recommendation will still be updated accordingly. For example, if "Harry Potter and the Chamber" is searched, the recommendation score for "Harry Potter and the Sorcerer's Stone" will be updated because it partially matches (though it will not be returned in the search results because "Harry Potter and the Chamber" is not part of "Harry Potter and the Sorcerer's Stone").

Clarification: the only keys that should live in the book_to_score hashmap are books that are recommended: in other words, books with scores greater than zero. This is currently how the listified versions of update_recs and update_recs_given_query_words work. In other words, books are only added as keys to the dictionary when they become a recommended book.

Note: Books that have been purchased will not be recommended again.

get_recommendations:
Last but not least, re-implement/edit get_recommendations() which returns a list of books should be sorted by their scores. It should contain no more than 50 books.

Running the code

It’s the same as Homework 9, so refer to that handout for more detailed instructions.


Written Portion

For this homework, you will also be submitting a written reflection on what you have learned. In a block comment underneath your code in hw10_test.py, submit roughly a paragraph on each of the following questions:

  1. Contrast working with lists vs dictionaries from the programming perspective. When would you recommend using one vs the other? Which one do you prefer to use if given a choice?
  2. What are some of the main lessons you are taking away from the course about how to write working programs? What steps or habits have you found most helpful in doing programming assignments?

You will get credit for the written portion for (a) taking it seriously, and (b) making accurate statements about lists and dictionaries for the first part of question 1 (the last question in 1—about your preference—has no correct answer. You’ll get credit for that just for providing an answer).


Testing

Test the following functions in hw10_test.py:

You do not have to test setup(), get_book_dict(), or any of the functions dealing with carts/purchases.


Grading Requirements

Minimum Functionality:

Note: you will have to change portions of the recommendation functions to get the associated data from corresponding book in the data structures that you created, but you will not need to change the implementation details (i.e. you can still use the BookScorePair dataclass that is provided if you’re aiming for minimum functionality.)

Full Functionality:

Correctly implementing minimum functionality will warrant a passing grade, assuming some attempt at testing was made. This is a reminder that your code does not have to be functional to get testing points, so make sure to write tests for your code!


Yee-haw! Once you’ve completed this homework, you’ve successfully implemented what was previously Project 3. Also, congratulations on finishing your last CS111 homework assignment! It’s been one heck of a rodeo :)

Theme Song

The Last Rodeo by Kix Brooks