Project 3: Books for Days - Fall 2019

The 99th Precinct is up to something. As a part of their annual shenanigans, Detective Peralta, Boyle, Santiago, Diaz, Lieutenant Jeffords, and Gina are organizing a Secret Santa gift exchange; they are being quite nerdy this year and decide that they want to give each other books. However, on top of them having zero clue what books to buy, they are also too busy with their detective work to enjoy the pleasure of actually going to a bookstore.

You and your friend are good friends with Terry, and you know that you will both be among those Terry loves to the moon and back if you can build a system so that (1) they can get recommendations for what books to buy, and (2) they can check out and buy those books!

Deadlines

Mandatory Partners Signup (here): Wednesday, Nov 20th, 4.30PM
Optional Design Check Calendar Signup (here): Wednesday, Nov 20th, 4.30PM

Design Checks: Friday, Nov 22nd - Tuesday, Nov 26th
Design Check handin (Gradescope): Wednesday, Nov 27th, 12PM.

Final handin deadline: Monday Dec 9th 9:00 PM
Late policy: Normal late policy applies, with the late days being applied to both partners.

Note: Design check meetings are optional because of possible conflicts with Thanksgiving travel plans. However, all partners are still required to submit a design check handin on Gradescope. If possible, we highly recommend you sign up for a design check meeting to get feedback on your ideas. What will be graded for the design check is your final submission on Gradescope, which is due 12pm on Wednesday, November 27th. You are free to re-submit your design check document as many times as you’d like up until the deadline.

Handin Files

Design Check:

Final Handin:

Summary

What you need to do

You and a friend want to make a website for people to be able to view and purchase books. You think you have devised a clever algorithm for determining books to recommend to the user that will make your site unique.

Your friend knows a little bit about web development, so they have handled all of the “frontend” work of the website. That includes writing everything required to display the website and make it so that when the user clicks on things, the right functions are run.

However, your friend has no idea how to handle the backend of the site. The backend includes the logic for users searching, adding and removing from their cart, purchasing books, and viewing book recommendations. You will fill in the necessary functions to make the website run!

To prototype your idea, rather than build a massive site supporting many people buying books, you will instead create an example site that supports only one user. This will be enough to let you work out some key data structures and functions (so you can get buy-in from other people, including potential funders!).

You will work with a dataset of best sellers from amazon.com (we have already downloaded this as a CSV file, books.csv, for you; you can read more about this file at the end of this document!). The dataset gives information about the title, author, genre, number of reviews, and book rating (out of 5 stars). Not all books have review and rating information, but many do. We have already read all the information from the CSV file into a variable named lines in the code provided for you.

For the design phase, you will come up with a plan for the data structures required to handle the website logic (which we describe below), and write short dummy data examples that will make the website display a few books.

For the implementation phase, you will be filling in the functions to make the website actually run.

Overview: The website (provided)

The website:

A few more details about these pages:

You are not being asked to build these pages. That code will be provided. You just need to compute the data that gets displayed on the pages.

The home page

The home page shows a collection of books, organized by genre. There is no particular order to these books; they should be chosen randomly from the dataset every time the page is loaded.

Recommendations page

The recommendations page shows a list of books, sorted by how highly the system ranks those books for the user (for example, a book with 7 points comes before a book with 3 points; the point system is detailed below).

Search page

The search page shows a list of books matching the user’s search query, sorted by the rating of the books in descending order (for example, 4.8 stars comes before 3.5 stars).

Book viewing page

The book viewing page gives you more information about a book, and allows you to add the book to your cart.

Cart page

The cart displays all the books in the user’s cart, allowing them to remove individual books and purchase all the books in the cart.

Each page has a header with links to useful pages and a search bar to look for books, and a footer that displays the book covers of the user’s previously purchased books.


Overview: Your portion

When your friend implemented all of these pages, they made some assumptions about how your code would communicate with theirs. In particular, your friend has already defined the Book dataclass and the names (and input/output types) of several functions that the visual part of the website depends on.

Everything your friend defined already is in a file named books.py in the directory we are giving to you. You’ll have to complete that file to make the website work. In summary, you’ll need to provide code for the following operations (headers for which are already in the books.py starter file):

In order to write these functions, you’ll have to make some decisions about how to organize some of the required data. In particular, you’ll need to figure out how to represent the collection of all books from the dataset, the user’s shopping cart, the user’s prior purchases, recommendations for the user, and any additional internal datatypes that may be required.

Remember that even though lists (or hashtables) are the required output of many of these functions, your internal datatypes do not necessarily have to be lists (or hashtables).

Read the implementation details section carefully for more information about how to implement these functions. This section contains specific details about how we want you to implement the recommendation system and find search results based on the query.

Running the web application

The zip file we provide you (see Design Check section) contains all the files required for this assignment. Your code will go into books.py and test_books.py.

To run the web app, you need to run app.py (also in that folder). app.py contains the code to create what is called a “Flask application.” Flask is a Python package (similar to how datetime and dataclasses are Python packages) which allows you to make web servers.

In PyCharm, run app.py. You should see something like this:

While this is running, you can open your web browser, go to the URL http://127.0.0.1:5000, and see the web app! This will only work on the computer that is running the application.

You can only have one version of the web app running; if you try to run app.py twice, you will see an error message. You should be able to have app.py running constantly; if you then make changes to books.py, the app will automatically reload itself and you can refresh the website.

If you are having a hard time running the web app, let the TAs know!

Design Check

For questions that ask you to figure out data structures, you should provide a combination of prose and code that indicates the main data structure with the types of all required components of the data (i.e., contents for lists, keys and values for hashtables, fields for dataclasses, etc.). You should write your prose answers and explanations of your code in block comments (""" block comments go in between triple quotes""").

For example, if we asked you to set up data structures for a zoo, your answer might say “the zoo is a list of animals, where animals are a dataclass”, with the actual dataclass for an Animal written out in code.
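As a rough sketch of what such an answer could look like for the zoo example (the Animal fields below are invented for illustration and are not part of this project):

```python
from dataclasses import dataclass


@dataclass
class Animal:
    # Hypothetical fields, chosen only for this example.
    name: str
    species: str
    age: int


"""
The zoo is a list of Animals. Each Animal records its name,
its species, and its age in years.
"""
zoo = [
    Animal("Gerald", "giraffe", 7),
    Animal("Penny", "penguin", 3),
]
```

Note how the block comment states the overall structure in prose, while the dataclass spells out the type of each field.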

You are not required to meet with a TA in person, so in-person design checks will not be graded. Rather, in-person design checks will be used like optional personal check-in hours so that you and your partner can share your design / ideas with a TA. This is intended to help you get feedback on what you have completed by the time of the design check.

What will be graded is your final design check submission on Gradescope, which is due 12pm on Wednesday, November 27th. You are free to re-submit your design check document as many times as you’d like up until the deadline.

Requirements

  1. Find a partner. You should work with someone other than your partners for the first two projects. Ideally, you and your partner should have similar goals for this project (for example, those heading for CS18 may want to think about the project in more depth, those aiming for less functionality should pair with others intending the same, etc.). If you don’t sign up on this form with your partner by Wednesday, Nov 20th at 4.30PM, we will randomly assign you a partner in the class. Only one partner needs to fill out this form.

    Because of the timing of this project and Thanksgiving, design check meetings are optional. However, submitting a books.py file to Gradescope with answers to the following design check questions is mandatory. We recommend you sign up for a design check meeting if you can.

    You will be graded on the file that you hand in by Wednesday, November 27th at 12PM, not on the optional meeting with your design check TA. However, you are still strongly encouraged to finish the design check tasks before meeting with your project TA: you will have a much more productive meeting, with better questions to ask, once you have attempted to answer the design check questions.

    Sign up for an optional design check appointment time here. Make sure to invite your project partner to the event!

  2. Setup: Download our starter zip file from this link. Instructions for the rest of the setup are in a video at the bottom; please watch it and follow the steps accordingly! Let us know if you have any trouble setting up the project.

  3. Coding:

    • Manually create a simple hashtable that maps two genres (for example, Fantasy and History) each to a list of two books.
    • Make the get_book_dict() function return this hard-coded dictionary, and run the app. You should see that your website will have books displayed associated with the genres that you defined in the hard-coded dictionary.
    • When you hand in your code for the design check, make sure that you are returning this hard-coded dictionary for the function get_book_dict(). None of the other features need to work for this stage. This hard-coded dictionary that you made serves as a smaller example to the data structure that this function is supposed to return.
  4. Coding: Figure out which data structures you need to store the collection of books (that you will load from the provided csv file). This data structure must support efficiently finding the details of individual books. More specifically,

    • Instantiate this data structure in your books.py that you are going to hand in.
    • With this representation in mind, write the get_book function (in books.py).
  5. Coding: Figure out which data structures you need for the information about the user: their shopping cart, purchases, and recommendations. Remember that your data structures only need to support one user (rather than multiple users as a full bookstore site would have). More specifically,

    • Instantiate a data structure to store the user’s shopping cart.
    • Instantiate a data structure to store the things that the user has bought.
    • Instantiate a data structure to store information for the recommendation system. Specifically, the data structure should be able to somehow associate a Book with a recommendation score (int).
  6. Design: Given the data structures that you have provided above, skim through the implementation phase section (below) to get a general understanding of what you need to do for the project. Then, for each of the data structures, provide the following information:

    • What information will be carried by this data structure?
    • When will information be loaded to/changed in/deleted from this data structure? Will the data structure need to be loaded once (in the setup function) and remain unchanged, or will it change as the different functions in our program are being called?
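To make the coding items above more concrete, here is one possible shape for these data structures. It is a sketch under assumptions, not the required design: the Book class below is a stand-in for the one in books.py (whose real fields may differ), the integer IDs, titles, and the names example_book_dict, books_by_id, cart, purchased, and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Stand-in for the Book dataclass in books.py; the real fields may differ.
    id: int
    title: str
    genre: str


# Hard-coded example: two genres, each mapped to a list of two books.
# All titles and IDs are made up.
example_book_dict = {
    "Fantasy": [Book(0, "A Made-Up Quest", "Fantasy"),
                Book(1, "Dragons, Probably", "Fantasy")],
    "History": [Book(2, "An Invented Empire", "History"),
                Book(3, "Old Things", "History")],
}

# One option for fast lookup of a single book: a hashtable keyed by book ID.
books_by_id = {}
for genre_books in example_book_dict.values():
    for book in genre_books:
        books_by_id[book.id] = book

cart = []        # IDs of books currently in the cart
purchased = []   # IDs of books the user has bought
scores = {}      # book ID -> recommendation score (int)
```

The sketch stores IDs rather than Book objects in the per-user structures. One reason is that a plain @dataclass like the stand-in above is not hashable by default, so its instances cannot be used directly as hashtable keys; whether that applies to the provided Book depends on how it is declared.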

The Implementation Phase

For this phase, you will fill in the rest of the functionality that’s missing from the books.py starter file. We have included some Python hints as a support code link on the course website; they may or may not be useful to you.

As with project 2, this project will go more smoothly if you implement it in stages, rather than all at once. The three levels of functionality in the grading section suggest how to break this down into more manageable chunks.

The following sections provide additional details on specific features or functions:

Details: Setup Function

In books.py, you will need to set up several global variables for any data that will be modified as the web app runs. For example, if the cart is a list of books, that list of books will be appended to and removed from when appropriate.

The setup() function should populate all of your global variables.

Note: You will need to define each global variable outside of any function. A variable defined inside a function is local to that function and will not be visible to your other functions.

lst = [] # our global variable
def setup():
    lst.append(3)
    lst.append(4)

setup()
print(lst) # prints [3, 4]

You only need to initialize your global variables in the setup() function. The Flask application will call setup() before any of the other functions. As a result, you do not need to repeat the setup tasks (i.e. converting the lines of the csv file into the right data structures) in subsequent functions, or call setup() in any of the functions that you write.

Details: Homepage

When the homepage is opened, the get_book_dict function is run. This should return a hashtable that maps genres to lists of up to 20 Books. To keep the website loading quickly, you should not return a hashtable with one key for every genre in the dataset; rather, there should be one key for each of the genres in display_genres, which is a list of genres loaded on line 12 of books.py.

The books in the list for each genre should be chosen randomly so each time the homepage loads, the user sees 20 different, say, Fantasy books. You can use the random module to accomplish this (Google it!).
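As a small illustration of the random module, random.sample picks items without repeats, but it raises a ValueError if asked for more items than the list holds. The helper below (pick_random is a hypothetical name, and the strings are placeholder data rather than real Books) caps the sample size to handle genres with fewer than 20 books.

```python
import random

MAX_PER_GENRE = 20


def pick_random(books, limit=MAX_PER_GENRE):
    """Return up to `limit` items from `books`, chosen at random without repeats."""
    # Cap the sample size at the length of the list so that
    # random.sample never asks for more items than exist.
    return random.sample(books, min(limit, len(books)))


fantasy = ["book " + str(i) for i in range(35)]   # placeholder data
history = ["book A", "book B", "book C"]

print(len(pick_random(fantasy)))  # prints 20
print(len(pick_random(history)))  # prints 3
```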

Details: Search

When the user searches in the search bar, your code needs to produce a list of all books from the dataset that match the search query. A search query matches if the query is part of the book’s title, author, or genre. This should be case insensitive in both directions.
Examples:

The returned list should be sorted by the book’s rating (with highest rating first), and should contain no more than 50 books (so as to not overwhelm the website).
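The matching and sorting rules can be sketched as follows. This is an illustration under assumptions: the Book class is a stand-in for the one in books.py, a missing rating is assumed to be stored as None and sorted last, and the names matches and search are hypothetical (use the function headers already in books.py).

```python
from dataclasses import dataclass
from typing import Optional

MAX_RESULTS = 50


@dataclass
class Book:
    # Stand-in for the Book dataclass in books.py; the real fields may differ.
    title: str
    author: str
    genre: str
    rating: Optional[float]   # assumed None when the CSV has no rating


def matches(book, query):
    """True if `query` appears in the title, author, or genre, ignoring case."""
    q = query.lower()
    return (q in book.title.lower()
            or q in book.author.lower()
            or q in book.genre.lower())


def search(all_books, query):
    found = [b for b in all_books if matches(b, query)]
    # Treat a missing rating as lower than any real rating (an assumption).
    found.sort(key=lambda b: b.rating if b.rating is not None else -1.0,
               reverse=True)
    return found[:MAX_RESULTS]


catalog = [
    Book("Sea Stories", "A. Writer", "Fiction", 3.5),
    Book("The Deep SEA", "B. Author", "Science", 4.8),
    Book("Gardening", "C. Seagrave", "Hobbies", None),
    Book("Cooking", "D. Chef", "Food", 4.9),
]
print([b.title for b in search(catalog, "sea")])
# prints ['The Deep SEA', 'Sea Stories', 'Gardening']
```

Lowercasing both the query and the book fields is what makes the match case insensitive in both directions.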

Details: Recommendations

Initially, there should be no books recommended for the user. The recommendations should change when (1) the user buys a book, (2) the user searches for something, or (3) the user clicks on a book (which will call the function get_book). The function get_book should take in the ID of the book that is clicked on, update the recommendations based on that book, and return the Book with that ID.

The recommendation information should update as follows:

(1) and (3)

(2)

You will need some way of internally tracking points such that, when the user goes to the recommendations page (and the get_recommendations() function is called), a list of Books can be returned. The returned list should be sorted from most points to least points, should contain no more than 50 books (again so as to not overwhelm the website), and should not contain any books that have zero points.
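The ordering, the cap of 50, and the zero-point filter can be sketched as below. This assumes points are tracked in a hashtable from book ID to score; the IDs, scores, and the name top_ids are made up, and how points are earned is left to the project's scoring rules.

```python
MAX_RECOMMENDATIONS = 50

# Hypothetical state: book ID -> points. How points are earned is
# defined by the project's scoring rules and is not shown here.
points = {"b1": 3, "b2": 7, "b3": 0, "b4": 5}


def top_ids(points_by_id, limit=MAX_RECOMMENDATIONS):
    """Return IDs with more than zero points, highest score first."""
    scored = [book_id for book_id in points_by_id if points_by_id[book_id] > 0]
    scored.sort(key=lambda book_id: points_by_id[book_id], reverse=True)
    return scored[:limit]


print(top_ids(points))  # prints ['b2', 'b4', 'b1']
```

Since get_recommendations() must return Books rather than IDs, a final step would look up each ID in whatever structure holds the full collection of books.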

Grading

Functionality:

The three levels of functionality demonstrate how to break this project down into more manageable chunks.

Minimum functionality

Mid-tier functionality

Full functionality

Testing

For this project, we are most interested in how you test functions that update data structures. Which function we will focus on for grading depends on how far you got:

You do not need to write tests for functions that use the random module. If you want to, you’ll have to think of non-explicit ways to test those functions (for example, looking at the length of a list rather than its contents).
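For instance, a test for a function with random output can check properties that hold on every run instead of exact contents. The helper pick_random below is a hypothetical stand-in for your own function.

```python
import random


def pick_random(items, limit):
    # Hypothetical helper under test: up to `limit` random items, no repeats.
    return random.sample(items, min(limit, len(items)))


def test_pick_random():
    items = list(range(100))
    result = pick_random(items, 20)
    # The contents change on every run, so check properties instead.
    assert len(result) == 20
    assert len(set(result)) == 20           # no duplicates
    assert all(x in items for x in result)  # only items from the input
    assert len(pick_random([1, 2, 3], 20)) == 3


test_pick_random()
```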

Design and Clarity

We will grade design and clarity in the same manner as previous assignments. You should write helper functions where necessary to abstract your code, and you should use any previously written functions where you can rather than rewriting their functionality.

Reflection

This project was designed to give you practice with organizing data for both updating and fast access. Answer the following questions about these topics in a text file called reflection.txt.

  1. Describe one key insight that each partner gained about programming or data organization from working on this project.

  2. Describe one or two misconceptions or mistakes that you had to work through while doing the project.

  3. State one or two followup questions that you have about programming or data organization after working on this project.

  4. In lecture, we discussed (or soon will) the idea that different approaches to features like recommendations can yield very different results. As a result, the underlying models of what to recommend change over time. If we asked you to replace the current method for determining recommendations with a new one, what are all the places in your code that would need to change? What places would NOT need to change? Do you feel that any of your code should have been isolated differently so that less code would need to be changed? (This last part will depend entirely on your data structures and code – there is no single right/wrong answer.)

  5. We gave you a dataclass for Books that did not include recommendation score. Why is that? There are a few reasons for this; try to think of at least one.

  6. Consider different designs for how websites with recommendation systems can handle data about user search queries and what the user clicks on. Describe one design choice for a recommendation system that could minimize ethical issues related to privacy. Think about different methods of data storage or transparency.

  7. Propose another feature or scoring rule that could be implemented to improve the recommendation system. Identify a potential ethical issue that could arise with its addition. Consider issues surrounding data storage and privacy or issues related to the readings in the last homework assignment.

Final notes

Support

If you have any questions, feel free to come to TA Hours or post on Campuswire!

As this project is out during Thanksgiving, we know that some of you might be out of town and will not be able to access TA help at the times you want to – so we will try to be of as much help on Campuswire as possible. However, as most of us will also be home for Thanksgiving, please be patient with us as we try to get back to you as fast as we can! :-)

You got this!

Feedback

Have any feedback about this assignment, or about the course in general? Submit your feedback here!

Conclusion

Hooray, you did it! Thanks to you and your friend, Detective Santiago finally bought a great gift for Captain Holt.

Appendix 1: Project Setup Instructions

If you are having trouble accessing the video player above, the video can be viewed here.

Text instructions at this link: https://cs.brown.edu/courses/csci0111/projects/project-3-installation.html

Appendix 2: Structure of the folder we provided to you

This project contains several default files that will be working in tandem with each other:

app.py: This is the file that runs the web app. It takes the values returned by the functions you wrote in books.py and displays them in the web app.

books.csv: A large comma-separated file containing all of the books that the website could display. From left to right, its columns are: Title, Author, Genre, Image, Rating, and Reviews.
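To illustrate that column order, the sketch below parses two made-up rows with Python's csv module. The starter code already reads the file into lines for you, so this is only meant to show how the columns line up. Whether the real file has a header row, and how missing ratings and reviews are written, are assumptions you should check against the file itself.

```python
import csv
import io

# Two made-up rows in the column order described above.
# Whether the real file has a header row is an assumption to check.
sample = io.StringIO(
    'Title,Author,Genre,Image,Rating,Reviews\n'
    '"Stars, Explained",A. Writer,Science,cover1.jpg,4.5,120\n'
    'Quiet Hills,B. Author,Fiction,cover2.jpg,,\n'
)

rows = list(csv.reader(sample))
header, data = rows[0], rows[1:]

for row in data:
    title, author, genre, image, rating, reviews = row
    # Empty strings stand for missing values here (an assumption).
    rating = float(rating) if rating else None
    reviews = int(reviews) if reviews else None
    print(title, rating, reviews)
```

The csv module handles quoted fields, so a title containing a comma stays in one column instead of being split in two.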

books.py: A Python file that allows the web app to (1) get the books as the user interacts with the website, (2) keep track of the books that have been purchased/added to the cart, (3) keep track of the recommendations we want to give to the user.

genres.txt: A text file listing the genres that we are displaying (there are many other genres in the books.csv file, but we are not displaying every single one of them!)

static: A folder that contains .css and .js files that mostly allow us to style our snazzy web-app (in addition to displaying important information on the pages that you see, such as displaying the buttons as added to cart or not)!

templates: A folder that contains the structures of the pages in your web app

reflection.txt: A text file containing your answers to the reflection questions.

test_books.py: A test file where all of the test functions for books.py should be located.