Tables🔗

    1 What is This Assignment’s Purpose?

    2 Theme Song

    3 Programming with Tables

    4 Starter

    5 Virtual Art Store

    6 Titanic

1 What is This Assignment’s Purpose?🔗

We live in a world awash with data. It’s important that we develop facility with being able to write small programs that can find useful information for us. This assignment gives you practice with that over an interesting, real-world data set.

In addition, a lot of the world’s data are structured as tables. This assignment therefore also gets you familiar with tables and writing programs over them.

2 Theme Song🔗

Upside Down by Jack Johnson

3 Programming with Tables🔗

Tables are built into Pyret, so we can program with them directly. The Pyret documentation lists Pyret’s features for working with tables. You will find it helpful to read through this to familiarize yourself with the available support. You can also read chapter 4 of DCIC for a more pedagogic introduction.

4 Starter🔗

Template

5 Virtual Art Store🔗

An online store sells rights to digital content created by artists. Artists come from all over the world, as do clients. This requires currency conversions between the two.

The store has two tables. One is for artwork:

table: id :: Number, cost :: Number, currency :: String

This tracks, for each artwork (which has a unique id), its cost and the currency in which that cost is quoted. The other is for currency conversion:

table: from-c :: String, to-c :: String, conv-rate :: Number

This holds the multiplicative factor (conversion rate) to convert from the first currency to the second.

  1. Write the function

    get-art-in-1 :: ArtworkTable, ConversionTable, Number, String -> Number

    that takes an artwork’s id and a desired currency. It gets the price in the listed currency and, if that is not the desired one, uses the conversion table to find the conversion rate and translates it. You can assume that every artwork queried is listed exactly once in the artwork table, and that every pair of currencies you need is listed exactly once in the currency conversion table.

    In all these problems, both a conversion and its inverse may appear in the table. In this case, the currency conversion rate will be consistent across these two table entries (i.e., one will be the inverse of the other).

  2. Unfortunately, in practice, errors creep in. Either table may be faulty: entries may be deleted by accident, or there may be duplicate entries. If there are missing or duplicate entries for either the input artwork id or the necessary conversion factor, you should raise an exception. You should only compute and return a numeric answer if none of these happen. Call this function get-art-in-2 (with the same signature). Are there any abstractions you can write that can help you clean up your code?

  3. Sometimes, the currency conversion table may not list the conversion from A to B, but it may list that from B to A. If that happens, use the inverse of the conversion table’s ratio. Call this function get-art-in-3 (with the same signature). Make sure that get-art-in-3 also handles errors appropriately.

  4. Of course, sometimes you can get a conversion ratio by composing several: e.g., the table may have neither A-to-C nor C-to-A, but it may have a chain of conversions (e.g., A to B to D to C, potentially including inverses). We are not asking you to implement this.

  5. In this problem, we’ve represented currency as strings. Another option is to represent it as a datatype: e.g.,

    data Currency: USD | EUR | CHF | INR | … end

    What are the strengths and weaknesses of each representation? In a separate document, explain the trade-offs between these two representations.

6 Titanic🔗

There are a few different versions of databases of passengers who sailed on the Titanic. (There is some ambiguity in this term, because the Titanic made a few (planned) stops on its maiden voyage.) Here is one such database.

The following code enables you to load this into Pyret:

titanic-raw-loader =

  GS.load-spreadsheet("1ZqZWMY_p8rvv44_z7MaKJxLUI82oaOSkClwW057lr3Q")

 

titanic-raw = load-table:

  survived :: Number,

  pclass :: Number,

  raw-name :: String,

  sex :: String,

  age :: Number,

  sib-sp :: Number,

  par-chil :: Number,

  fare :: Number

  source: titanic-raw-loader.sheet-by-name("titanic", true)

end

When submitting, comment out titanic-raw-loader = GS.load-spreadsheet(…) definition and any other lines importing google sheets. These statements are not compatible with the autograder and will raise errors if present.

Relative to the dataset, we define the following concepts:
  • A male has sex field "male", female has it as "female". (We are reflecting the standards of that time, not claiming any form of normativity.)

  • A title is the part of the raw-name field up to but not including the first period.

  • A first name is the part of raw-name between the first and second spaces, skipping over any leading (not trailing) parenthesis.

Your task is to determine the following:
  • The seven most popular male first names.

  • The seven most popular female first names.

  • The frequencies of the titles.

If there’s a tie, include all the tied names.

Consider using spy instead of print to view the names.

What to submit:
  1. Write a program that computes these values. Use sensible variable names and/or comments to make clear which parts of your program compute what.

  2. Answer the following questions on Gradescope:

    1. Write in descending order the 7 most common male first names.

    2. Do the same for the 7 most common female first names.

    3. Describe any observations you have about the above two answers. Are these really the most common male and female names? What can you infer based on how the names are presented?

    4. Describe any observations you have about the titles. How do these titles contrast to what we might see today? What does that say about how society has evolved?