CSCI0050 Python Programming and Data
CSCI0050 Homework #8: Python Programming with Variables and Data

Due: Monday July 29, 6pm

Hand in as usual through the Google form.

Everybody should do the first two sets of exercises (Data Design and Program Tracing), then one of the two Programming options (Standard or Advanced). Pick whichever programming option matches your level and/or interests. You don’t get extra points for trying the advanced exercise. At the same time, don’t shy away from the advanced problems for fear of hurting your grade – there’s some grading leeway on the advanced problems, as we want you to work on what’s best for your learning. If you are interested in the challenge and think you are up to it, give it a try.

Collaboration Policy: You may work on this assignment with others. Include a collaboration statement describing how you did the work.

Data Design Exercise (No Programming)

Put your answers to this in a plain text file named email.txt.

Imagine that you were asked to build an email system. The system has users, each of whom has a real name, a username, and a password to the system. Each email message has a sender, one or more recipients, the time it was sent, a subject, and its contents (ignore attachments for this assignment). For simplicity, we’ll assume that all senders and recipients are part of the same system (i.e., that everyone has a gmail address, for example, rather than some in gmail and some in yahoo).

The system tracks which messages each recipient has read. A user can ask to see just their unread messages, or all of their messages.

The system needs to support people sending new messages, reading messages, and deleting messages. (We will ignore replies for now, treating each "reply" as a new message.)

What data structures would you propose for maintaining information on users, email messages, and which messages have been read (by which recipients)? Your answer should specify which kind of data (lists, tables, data blocks, tuples, numbers, etc), including the types of any subcomponents or columns. You can think through this in either Pyret or Python (your choice). We just want to see how you envision organizing the data, with an eye towards supporting the operations that people want to do.

You are not being asked to write the code for the email system. This is only a question about the data structures you would use.

Tracing Program Execution (No Programming)

Put your answers to this in a plain text file named tracing.txt.

For the following program, show the contents of memory and of the environment (known names) every time execution reaches a line marked with a "memory point" comment (the same point may be reached more than once). Present the contents as a series of mappings from names to values (written in comments), in the form:

  ENVIRONMENT/KNOWN NAMES
  x --> 5
  y --> (loc 1001)
  z --> (loc 1001)

  MEMORY
  (loc 1001) --> [1, 2, 3]
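To show where a picture like this comes from, here is a minimal sketch (separate from the program you are asked to trace) that would produce the sample environment and memory above. The names x, y, and z come from the sample. The location number 1001 is only a label used in the notation, not something Python reports.

```python
x = 5            # x maps directly to the number 5
y = [1, 2, 3]    # a new list is created in memory: (loc 1001) in the sample
z = y            # z maps to the same location as y; no second list is created

# The sample picture above describes the state at this point.
same_list = y is z
print(same_list)  # True

# Because y and z share one location, a change made through z shows up through y.
z.append(4)
print(y)          # [1, 2, 3, 4]
```

After the append, the environment is unchanged (y and z still map to the same location) while the list stored at that location has a fourth element.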

The program uses a simple for loop:

  def concat_short(word_list : list):
      word = ""
      for elem in word_list:
          if len(elem) <= 3:
              # memory point #2
              word = word + elem
      return word

  concat_short(["a", "list", "has", "words"])
  # memory point 1

Python Programming: Standard

Put your work on these problems in a file called readings.py

An environmental agency gathers and tracks data on air quality. The agency collects readings of key air-quality factors from various locations, combines the data, and uses it to issue reports about which locations have concerning air-quality data. In the USA, air quality is tracked through 5 measurements. This assignment will work with two of them: ozone levels and sulfur dioxide (so2) levels.

Here is a dataclass for air-quality readings. Use this as part of solving the problems that follow.

  from dataclasses import dataclass
  from datetime import date

  @dataclass
  class Reading:
    type: str         # what kind of reading (ozone, so2, etc)
    when: date        # what date was it taken
    level: float      # the numeric reading
    location: str     # where was the reading taken
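As a rough sketch of how this dataclass might be used, the example below builds two readings and accesses their fields. The specific dates, levels, and locations are made-up sample values for illustration, not data from the assignment.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Reading:
    type: str         # what kind of reading (ozone, so2, etc)
    when: date        # what date was it taken
    level: float      # the numeric reading
    location: str     # where was the reading taken

# Hypothetical sample readings (values invented for this example)
r1 = Reading("ozone", date(2019, 7, 22), 0.061, "Providence")
r2 = Reading("so2", date(2019, 7, 23), 12.5, "Boston")

print(r1.level)        # 0.061
print(r2.when.month)   # 7

# A dataclass compares field by field, so two readings with equal fields are equal.
print(r1 == Reading("ozone", date(2019, 7, 22), 0.061, "Providence"))  # True
```

Field-by-field equality is convenient when writing tests, since an expected Reading can be compared directly against a computed one.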

Python Programming: Optional Challenge

If you choose to do these problems, put your work in a file called folders.py

If you don’t feel comfortable writing loops or assigning variables, do the standard problems instead. This problem set is more a test of your ability to write functions over complex data structures. These problems are NOT required.

An IT company wants to be able to write programs that explore directories of files. Think of the directories (folders) on your own computer: you have folders, each of which may contain files and other folders. This is again a form of tree-shaped data.

More concretely:

For example, my laptop has a folder named "CS50", which has a subfolder named "Python", which has files "todolist.py" and "testlight.py". The contents of the files are irrelevant (any string will do).

You will create data for files and folders, then write some programs to process them.

Warning: Think about recursion here! That pattern provides a natural way to organize your code for these problems.

Tips and Hints

Sorting Lists

Python has a built-in method for sorting lists. If you write

  L = [2, 3, 1]
  L.sort()
  print(L)

Python displays [1, 2, 3] – in other words, Python modifies the list to put its elements in sorted order. When you have something more complicated than numbers, you have to tell Python which parts of the data to use for sorting, and in what order to consider the parts. Here is a function that sorts a list of dates into earliest to latest order (year most important, then month, then day):

  def sort_by_date(date_list : list) -> None:
      date_list.sort(key = lambda d: (d.year, d.month, d.day))

To get the list sorted from latest to earliest order, we add an argument to sort telling it to use reverse order:

  def sort_by_date(date_list) -> None:
      date_list.sort(key = lambda d: (d.year, d.month, d.day),
                     reverse = True)
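The short sketch below tries both sort orders on a small list of dates. The sample dates and the helper name by_parts are invented for this example. It also shows that sort changes the list in place and returns None, which is why the functions above are annotated with -> None.

```python
from datetime import date

# Hypothetical sample dates, deliberately out of order
dates = [date(2019, 7, 4), date(2018, 12, 25), date(2019, 1, 15)]

# Same key as in the functions above: year first, then month, then day
by_parts = lambda d: (d.year, d.month, d.day)

dates.sort(key=by_parts)                  # earliest to latest
earliest = dates[0]
print(earliest)                           # 2018-12-25

result = dates.sort(key=by_parts, reverse=True)   # latest to earliest
latest = dates[0]
print(latest)                             # 2019-07-04
print(result)                             # None: sort modifies the list itself
```

Since sort returns None, writing something like dates = dates.sort(...) would lose the list. Call sort on its own line and keep using the original name.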

You should be able to modify these procedures as needed to sort a list of readings by date, as required in this problem.

Grading Expectations

For the data design exercise, we want to see whether the data organization you choose captures all the relevant information, and does so in suitable data structures. There isn’t a single correct answer to that problem. We are looking for you to have a reasonable answer, as judged by those two criteria.

For the programming problems, we’ll be looking at your functionality (does your code pass our test cases), design (did you pick appropriate constructs and break your work down properly into separate functions), and testing (do you have tests for all functions, and cover more than just the trivial cases). Review your feedback on previous assignments to see what we have been looking for in grading.

Use the testlight.py file from class for writing tests.

What to Turn In

Hand in as usual through the Google form.