Tables🔗

    1 What is This Assignment’s Purpose?

    2 Theme Song

    3 Programming with Tables

    4 Starter

    5 Virtual Art Store

    6 Titanic

    7 Assigning Students to Labs

1 What is This Assignment’s Purpose?🔗

We live in a world awash with data. It’s important that we develop facility with being able to write small programs that can find useful information for us. This assignment gives you practice with that over an interesting, real-world data set.

In addition, a lot of the world’s data are structured as tables. This assignment therefore also gets you familiar with tables and writing programs over them.

2 Theme Song🔗

Turning Tables by Adele

3 Programming with Tables🔗

Tables are built into Pyret, so we can program with them directly. The Pyret documentation lists Pyret’s features for working with tables. You will find it helpful to read through this to familiarize yourself with the available support. You can also read chapter 4 of DCIC for a more pedagogic introduction.

4 Starter🔗

Template

5 Virtual Art Store🔗

An online store sells rights to digital content created by artists. Artists come from all over the world, as do clients. This requires currency conversions between the two.

The store has two tables. One is for artwork:

table: id :: Number, cost :: Number, currency :: String

This tracks, for each artwork (which has a unique id), its cost and the currency in which that cost is quoted. The other is for currency conversion:

table: from-c :: String, to-c :: String, conv-rate :: Number

This holds the multiplicative factor (conversion rate) to convert from the first currency to the second.

  1. Write the function

    get-art-in-1 :: ArtworkTable, ConversionTable, Number, String -> Number

    that takes an artwork’s id and a desired currency. It gets the price in the listed currency and, if that is not the desired one, uses the conversion table to find the conversion rate and translates it. You can assume that every artwork queried is listed exactly once in the artwork table, and that every pair of currencies you need is listed exactly once in the currency conversion table.

    In all these problems, both a conversion and its inverse may appear in the table. In this case, the currency conversion rate will be consistent across these two table entries (i.e., one will be the inverse of the other).

  2. Unfortunately, in practice, errors creep in. Either table may be faulty: entries may be deleted by accident, or there may be duplicate entries. If there are missing or duplicate entries for either the input artwork id or the necessary conversion factor, you should raise an exception. You should only compute and return a numeric answer if none of these happen. Call this function get-art-in-2 (with the same signature). Are there any abstractions you can write that can help you clean up your code?

  3. Sometimes, the currency conversion table may not list the conversion from A to B, but it may list that from B to A. If that happens, use the inverse of the conversion table’s ratio. Call this function get-art-in-3 (with the same signature). Make sure that get-art-in-3 also handles errors appropriately.

  4. Of course, sometimes you can get a conversion ratio by composing several: e.g., the table may have neither A-to-C nor C-to-A, but it may have a chain of conversions (e.g., A to B to D to C, potentially including inverses).

  5. In this problem, we’ve represented currency as strings. Another option is to represent it as a datatype: e.g.,

    data Currency: USD | EUR | CHF | INR | … end

    What are the strengths and weaknesses of each representation? In a separate document, explain the trade-offs between these two representations.

6 Titanic🔗

There are a few different versions of databases of passengers who sailed on the Titanic. (There is some ambiguity in this term, because the Titanic made a few (planned) stops on its maiden voyage.) Here is one such database.

The following code enables you to load this into Pyret:

titanic-raw-loader =

  GS.load-spreadsheet("1ZqZWMY_p8rvv44_z7MaKJxLUI82oaOSkClwW057lr3Q")

 

titanic-raw = load-table:

  survived :: Number,

  pclass :: Number,

  raw-name :: String,

  sex :: String,

  age :: Number,

  sib-sp :: Number,

  par-chil :: Number,

  fare :: Number

  source: titanic-raw-loader.sheet-by-name("titanic", true)

end

When submitting, comment out titanic-raw-loader = GS.load-spreadsheet(…) definition and any other lines importing google sheets. These statements are not compatible with the autograder and will raise errors if present.

Relative to the dataset, we define the following concepts:
  • A male has sex field "male", female has it as "female". (We are reflecting the standards of that time, not claiming any form of normativity.)

  • A title is the part of the raw-name field up to but not including the first period.

  • A first name is the part of raw-name between the first and second spaces, skipping over any leading parenthesis.

Your task is to determine the following:
  • The seven most popular male first names.

  • The seven most popular female first names.

  • The frequencies of the titles.

If there’s a tie, include all the tied names.

What to submit:
  1. Write a program that computes these values. Use sensible variable names and/or comments to make clear which parts of your program compute what.

  2. Answer the following questions on Gradescope:

    1. Write in descending order the 7 most common male first names.

    2. Do the same for the 7 most common female first names.

    3. Describe any observations you have about the above two answers.

    4. Describe any observations you have about the titles.

7 Assigning Students to Labs🔗

Imagine you are running a course that has labs on three days: Monday, Wednesday (two labs), and Thursday. Each lab has a capacity of 20 students.

You ask students to indicate their lab preferences. This results in a table such as this:

Timestamp

 

Name

 

Mon

 

Wed

 

Thu

9/10/2024

 

Amina

 

Second

 

First

 

Third

9/10/2024

 

Bindu

 

Can't

 

First

 

Second

9/11/2024

 

Chen

 

Third

 

Second

 

First

9/12/2024

 

Dalia

 

Can't

 

First

 

Second

The problem is to figure out how to assign students to labs. A solution is a lab assignment: i.e., a mapping of students to days (you can think of this as another table…).

Before you start programming, you should figure out what properties a solution should have. These can be split into two categories:
  • Properties that a solution must have.

  • Propeties that it would be nice for a solution to have, but are not essential. In the absence of a perfect solution (i.e., the property cannot be fully satisfied), what would a “good” solution look like?

Please list the properties you can think of in each category. Use blank lines to separate them. You will submit each category separately on Gradescope.

We are emphatically not asking you to write code for this! However, you may have some thoughts on how to go about solving it. We’ve provided another box on Gradescope where you can provide your thoughts. If you really have no idea how to solve the problem, that’s also okay, because we haven’t covered anything like this in the class. Just say so, no judgment!