Due: Wednesday, July 12 at 6pm (submit through Canvas)
To practice developing and testing functions
This assignment combines tables, functions, and lists, with functions at its core. While you could complete it by writing few functions (and creating a lot of definitions), your goal is to write a collection of functions that capture the computations you need to solve the following problems. We don’t tell you exactly which functions to write (deciding that is part of what you are thinking about here), but as a general rule you should create functions for computations that you might reuse across problems similar to those here.
Collaboration Policy: Your work on this assignment must be entirely your own. Include a collaboration statement attesting that this was your own work.
Put your answers to these questions in a file named schools.arr.
For this set of problems, you will work with a real dataset about student demographics and test scores from schools across Rhode Island. The data is in a Google Sheet.
Examples are a key part of this assignment. You should be writing examples/tests for all of the functions you write as part of this assignment, unless a question explicitly states otherwise.
Develop a function called num-rows that consumes a table and produces the number of rows in it. Assume the table has a column named school that you can use in this computation.
Develop a function called low-english that consumes a table and produces a list of the names of schools for which the passing_english score is strictly below 10%.
The school board is currently concerned about two issues: math scores and the impact of charter schools. They want to explore whether schools with strong math scores are more likely to be charter schools. For purposes of this problem, we define "strong math scores" as at least one standard deviation above the mean (explanation follows). Write a query that computes the percentage of charter schools that have strong math scores. Name the results of your query math-charters-percent.
What is Standard Deviation? Imagine that you had a set of numbers and you placed them along a number line. Standard deviation indicates the spread of values along that line: if most values are close to the mean (a.k.a., average), standard deviation is low; as more values are farther from the mean, standard deviation increases. You can use standard deviation to identify values that are farther from the average (according to pre-defined distributions; read up separately on standard deviation if you want more information). Pyret’s statistics library provides operators mean and stdev, each of which takes a list and returns a number. Given a list mathL of math scores, the strong ones would be those that are larger than
mean(mathL) + stdev(mathL)
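As a small illustration of this cutoff, here is a sketch on a made-up list of scores. The names sample-scores, threshold, and strong-scores are hypothetical and chosen only for this example; it assumes the statistics and lists libraries are imported under the names S and L.

```pyret
import statistics as S
import lists as L

# Made-up scores for illustration only (not taken from the Rhode Island data)
sample-scores = [list: 10, 20, 30, 40, 100]

# The mean is 40 and the spread is wide, so the cutoff falls between 70 and 80
threshold = S.mean(sample-scores) + S.stdev(sample-scores)

# Keep only the scores that are larger than the cutoff
strong-scores = L.filter(lam(s): s > threshold end, sample-scores)

check:
  strong-scores is [list: 100]
end
```

Only the outlier 100 clears the cutoff here; the other four scores sit at or below the mean.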
Are strong math schools also strong in english? Define a table named strong-me that contains schools that have both strong math scores and strong english scores. (This doesn’t fully answer the question at the start of this problem, but it is part of that computation.)
Write a query that computes the percentage of schools with strong math and english scores that are charter schools. Name the results of your query both-charters-percent.
Now the school board wonders whether strong math scores relate to student poverty levels. Write a query that produces a table of schools with poverty levels of at least 50% that also have strong math scores. The poverty level is defined by the percentage of total students who are eligible for either free or reduced price lunch. Name the result of your query poverty-strong-math.
What percentage of charter schools have student poverty rates above 70%? Name the result of your query charter-poverty-percent.
The school board’s data analyst is often asked to report the results of queries broken out by the levels of schools (elementary, middle, and high school). The school names in the school column end with one of ES, MS, or HS to indicate the school level.
Develop a function called count-by-level that consumes a table and produces a table. The resulting table should have exactly 2 columns named level and count. It should have three rows, whose levels are (in order) "Elementary", "Middle", and "High".
For an example of the format, the following table summarizes how many schools are in the "Providence" district:
table: level, count
  row: "Elementary", 21
  row: "Middle", 7
  row: "High", 8
end
Write a query that computes the summary table of how many schools are in Providence. Use a check block to confirm that your computed table is the same as the one we gave above (this will make sure that you are producing tables in the form that our automated grading expects). For example:
providence-count-manual =
  table: level, count
    row: "Elementary", 21
    row: "Middle", 7
    row: "High", 8
  end
check:
  <YOUR QUERY GOES HERE>
    is providence-count-manual
end
Compute the summary table of how many schools at each grade level have strong math scores. Name your table strong-math-levels.
For the school-count summaries to be accurate, (a) every row in the original table must have a level as part of the school name, and (b) your programs should detect a unique level for every school name. This is an example of a sanity check that we should do on the data before we compute with it.
Check whether your method for computing the count-by-level tables counts each school exactly once across the rows in the table. How you perform the check is up to you. Include a descriptive comment alongside your code that explains your approach. If your original approach does not count each row exactly once, modify it so that it does. Explain any modifications you had to make in a comment.
- How do we say that a parameter is a table?
Use Table as the type (as in fun f(t :: Table) ...).
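For instance, a function with a Table parameter might look like the sketch below. has-column and people are hypothetical names used only for illustration (neither is part of the assignment), and the sketch assumes your version of Pyret provides the column-names method on tables.

```pyret
# A tiny made-up table, used only to exercise the example
people =
  table: name, age
    row: "Ana", 30
    row: "Raj", 25
  end

# has-column is a hypothetical helper, not one of the assigned functions
fun has-column(t :: Table, col :: String) -> Boolean:
  doc: "Reports whether the table t has a column named col"
  t.column-names().member(col)
where:
  has-column(people, "age") is true
  has-column(people, "school") is false
end
```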
- What is a check block?
We have been writing examples in a where block. check blocks let you write tests outside of a function. Here’s an example:
fun add1(n :: Number):
  n + 1
where:
  add1(5) is 6
end
check:
  add1(10) is 11
end
check blocks are useful when you want to test the value of something without being inside a function. For example, perhaps you computed a table and associated it with a name, then want to check something about the table. Here’s a concrete example using our running gradebook table from lecture:
snc-table = sieve gradebook using SNC: SNC end
check:
  num-rows(snc-table) is 2
end
Feel free to use check blocks in your code as well as where blocks. I tend to think of where blocks as containing a handful of descriptive examples, while check blocks capture more thorough testing of programs.
In grading, we will look for whether you created functions when appropriate and whether you showed good testing practice on the functions you created. Concretely, we will check:
Did you create functions for similar computations across your program?
Did you test your functions appropriately, especially in light of the large dataset you are working with?
We will grade your work both on the correctness of output you produce and on the structure of your code. We grade correctness by computer, and style by human inspection. For structure on this assignment, we want to see you using names, comments, and newlines to make your code readable to others.
Remember to include the collaboration statement.
If you want to check whether your file has the same names as our grading scripts will look for, insert the following code at the bottom of your file. This simply looks for the names and types that we stipulated in the assignment. If one of these checks fails, fix your code, not these checks.
check:
  is-function(num-rows) is true
  is-function(low-english) is true
  is-number(math-charters-percent) is true
  (num-rows(strong-me) >= 0) is true
  is-number(both-charters-percent) is true
  (num-rows(poverty-strong-math) >= 0) is true
  is-number(charter-poverty-percent) is true
  is-function(count-by-level) is true
  strong-math-levels
end
Create a directory named functions-hwk. Inside the directory, put a single Pyret file named schools.arr. Submit a zip of the functions-hwk directory. Even though you are submitting only one file, we need the directory to set up the scripts that will run some correctness checks against your homework.
Make sure you follow these directory and file names exactly, or we won’t be able to grade your work.