Lab 11: Hashtables

As you may recall from Lab 10, Sherlock and Watson have recently graduated from Brown. The students and faculty missed them so much that the administration has asked them to return in case there are any more Muffin thefts. Sherlock and Watson have accepted offers to work on the team of registrars, since they had so much fun dealing with their own course selections.

For their first project, Sherlock and Watson need to rebuild the Courses@Brown system to keep track of courses, students, and registration. And they need YOUR help!

Part 1: Design

Sherlock and Watson will first have to consider the different elements necessary to keep track of information in a basic cataloguing system.

  1. Start by brainstorming with your partner: if you were to build a basic system for keeping track of courses and student registration, what information would you need to keep track of? Make a list of dataclasses and their respective attributes that would help group the information together.

  2. Next, consider the tasks you’d expect such a registration system to be able to handle. You might want:

    • A way to add or remove students or courses from the system.
    • A way to update information about a student or course.
    • An easy way for students to browse through the courses they can register for.
    • An easy way for the registrar to check how many courses are offered this semester and what the enrollment is for each.

    Use these examples to guide your thinking about data organization. How might you organize information about course registration to account for tasks like those above?

    Once again, brainstorm some data organization strategies with your partner. What data types would it make sense to use? Come up with at least two different strategies and briefly weigh their pros and cons.

Once you’re ready, take a look at the TA suggestions for how to organize this data. If you’re not sure why we chose something, call over a TA!

Part 2: Modeling the Data

With the results of your brainstorming and our suggestions in mind, Sherlock and Watson are now ready to start implementing the system! Fire up Pycharm, create a new Python file (you can put this in your Labs folder or wherever you’ve been saving your lab files), and name it cab.py.

Types

  1. The first thing you’ll need to do is create two separate dataclasses: one to represent students and one to represent courses. Below, we have outlined the minimum features that each dataclass should have. However, feel free to add any additional fields you want (including any that you brainstormed in Part 1).

    NOTE: Make sure to import dataclasses with from dataclasses import dataclass

    Course

    Each course should have at least:

    1. A unique course code (eg: “CSCI0111”)
    2. A list of registered students
    3. An indication as to whether or not the course is offered in the current registration period

    Student

    Each student should have at least:

    1. A unique identifying id
    2. A list of courses they’re registered for

The Model

  1. Now that the dataclasses have been created, Sherlock and Watson need a way to keep track of the multiple students and multiple courses. Create two hashtables:

    • course_directory - a hashtable that stores all courses.
    • student_directory - a hashtable that stores all students.

    You should initialize them to be empty.

  2. Rather than manually declaring the directories of students and courses, Sherlock and Watson would like to use functions that can streamline the process of adding students and courses. Write the following functions to add courses and students to their respective directories. Don’t worry about deletion for the purposes of this lab.

    • create_student - takes in any information necessary to create a new instance of your student dataclass, and then adds that student to student_directory. New students should not be registered for any courses.

    • create_course - takes in any information necessary to create a new instance of your course dataclass, and then adds that course to course_directory. New courses should not have any registered students.

      HINT:
      Note that while we need to artificially create unique identifiers to use as keys for students, courses already have unique identifiers: course codes. You will need to account for this difference when creating students and courses as follows:

      • For student identifiers, consider using an incrementing global variable (e.g., make the first id 1, the second id 2, and so on, so that each student can have a unique id). This choice allows a user to input a student without worrying about generating a unique id on their own.
      • Courses already have course codes, so the desired code should be one of the inputs to the function. Because they are not generated programmatically, you will need to check to ensure that the code is not already in the dictionary before adding the course. If it is, raise an error that tells the user that their desired course code is already in use.
  3. Sherlock and Watson find it important for students to be able to register for classes without having to wait forever for the cart to register and without the site crashing. Write the function register_student which takes in a student id and course code and either registers the student for that course, or fails informatively.

    (In this case, “failing informatively” means raising an error letting the user know why the student couldn’t be registered for the course.)

    The function should fail if:

    • The course code is not in the course directory
    • The student id is not in the student directory
    • The student is already registered for the course
    • The student is registered for more than 5 courses
    • The course is not offered in the current registration period

    If you included other information in your model that might be relevant (maximum class enrollment, semester level, etc), feel free to also write fail cases based on that information.

  4. Sherlock and Watson need to find appropriate classrooms for each course based on how many students they need to seat. Write the function check_enrollment which takes in a course code and returns an integer representing the number of students enrolled in that course.

  5. CPax has asked for a master list of all the courses arranged by enrollment numbers to make sure that Sherlock and Watson are doing a good job with their new registrar positions. Write the function sort_by_enrollment which does not have any inputs and produces a list of all of the course ids in course_directory, sorted in descending order by number of enrolled students.

    NOTE: By now, you’ve seen custom sorting functions on lists a few times before. For a refresher, check out Lab 9.

    Hashtables are a little trickier to handle because they are an unordered datatype, which means that they can’t be sorted. In order to sort them, we will need to convert them into some sort of list while still maintaining the relationship between keys and values.

    We can solve this problem by using a datatype called a tuple. In Python, a tuple is a collection of values that are a) grouped together, b) ordered, and c) immutable. Below are some examples of tuples:

    (3,4)                    # an (int, int) tuple
    (1, "hello")             # an (int, string) tuple
    ("John", "A.", "Doe")    # a (string, string, string) tuple
    ("T1", [0, 1, 5], True)  # a (string, list, bool) tuple
    

    Note that tuples can consist of any number of values and any combination of types. However, once you’ve created a tuple, the number of values in that tuple is fixed.

    (An interesting property of mutability happens if one of the items in a tuple its itself a mutable datatype, like a list. If you’re curious, try altering the 1st, 2nd, and 3rd values in the last tuple example to see what happens!)

    To access specific values in a tuple, use bracket notation to specify an index:

    >>> x = (1,3,4)            # create a variable x and assign its value
    >>> x[0]                   # get the value of x's first element
    1
    >>> x[2]                   # get the value of x's third element
    4
    

    Now, let’s circle back to our original problem - we have a hashtable consisting of keys and values, and we want to figure out how to sort it as a list. Tuples provide a natural way to represent a single key-value pair: we can use a tuple with the following structure:(key, value).

    In Python, hashmaps come with an items() method that converts the hashmap to a list of (key, value) tuples. As a result, in order to sort a hashmap you can write the following:

    example_hash_map = {1: "detectives", 2: "Frank Hardy", 3: "Nancy Drew"}
    sorted(example_hash_map.items(), key=lambda... reverse=...))
    

    It’s up to you to fill in key and reverse, depending on how you want the dictionary to be sorted.

  6. Sherlock and Watson want students to be able to graduate (aka not be undercredited at the end of their 4 years). Write the function underenrolled_students which does not have any inputs and returns a list of student ids. These ids should represent all of the students in student_directory that are registered for fewer than 3 courses.

Flying Colors

Of course, Sherlock and Watson did phenomenally well (thanks to you!). To celebrate, they’re planning a fun ski trip for winter break!