Data Scripting 2
For each problem, write two solutions, each using a different approach. Also decide which solution structure you prefer, and state your preference with a brief explanation of why. You will be given a form to provide your answers.
Approaches count as different if they cluster at least some subtasks of the problems differently (like we saw for the Rainfall solutions); merely syntactic differences, such as replacing an element-based for-loop with an index-based one, don’t count as different. It has to be a different decomposition of the tasks (i.e., compositions of different plans).
In the end, if after racking your brain you simply can’t think of two truly different ways of doing one of these problems, submit the two most different versions you can.
1 Programming Problems
A personal health record (PHR) contains four pieces of information on a patient: their name, height (in meters), weight (in kilograms), and last recorded heart rate (in beats per minute). A doctor’s office maintains a list of the personal health records of all its patients.
data PHR:
  | phr(name :: String,
      height :: Number,
      weight :: Number,
      heart-rate :: Number)
end
1.1 The BMI Sorter
BMI = weight / (height * height)
fun bmi-report(phrs :: List<PHR>) -> Report
data Report:
  | bmi-summary(under :: List<String>,
      healthy :: List<String>,
      over :: List<String>,
      obese :: List<String>)
end
Sample output: Given the following input list of PHRs:
[list: phr("eugene", 2, 60, 77),
  phr("matty", 1.55, 58.17, 56),
  phr("ray", 1.8, 55, 84),
  phr("mike", 1.5, 100, 64)]
The output of bmi-report should be:
bmi-summary(
  [list: "eugene", "ray"], # under
  [list: "matty"],         # healthy
  [list: ],                # over
  [list: "mike"]           # obese
)
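As a quick check on the sample above, the formula can be evaluated for each record. The sketch below is only an illustration and not a solution to bmi-report: the helper name bmi is hypothetical (it does not appear in the assignment), and the PHR definition is repeated so the snippet stands alone.

```pyret
data PHR:
  | phr(name :: String,
      height :: Number,
      weight :: Number,
      heart-rate :: Number)
end

# Hypothetical helper, not part of the assignment's required interface.
fun bmi(p :: PHR) -> Number:
  doc: "Weight in kilograms divided by the square of height in meters"
  p.weight / (p.height * p.height)
where:
  # Pyret keeps these as exact rationals, so the first two checks are exact.
  bmi(phr("eugene", 2, 60, 77)) is 15
  bmi(phr("mike", 1.5, 100, 64)) is 400/9
  # The remaining values are compared against two-decimal roundings.
  bmi(phr("matty", 1.55, 58.17, 56)) is%(within-abs(0.01)) 24.21
  bmi(phr("ray", 1.8, 55, 84)) is%(within-abs(0.01)) 16.98
end
```

The four sample patients come out at 15, roughly 24.21, roughly 16.98, and roughly 44.44, which is consistent with the categories shown in the expected output.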
1.2 Data Smoothing
In data analysis, smoothing a data set means approximating it to capture important patterns in the data while eliding noise or other fine-scale structures and phenomena. One simple smoothing technique is to replace each (internal) element of a sequence of values with the average of that element and its predecessor and successor. Assuming that extreme outlier values are an aberration caused, perhaps, by poor measurement, this averaging process replaces them with a more plausible value in the context of that sequence.
For example, consider this sequence of heart-rate values taken from a list of personal health records (defined above):
95 102 98 88 105
The resulting smoothed sequence should be
95 98.33 96 97 105
102 was substituted by 98.33: (95 + 102 + 98) / 3
98 was substituted by 96: (102 + 98 + 88) / 3
88 was substituted by 97: (98 + 88 + 105) / 3
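The three substitutions above can be confirmed with a small sketch. The helper name avg3 is hypothetical and is used here only to check the arithmetic; it is not a suggested structure for data-smooth.

```pyret
# Hypothetical helper: the average of an element and its two neighbors.
fun avg3(prev :: Number, curr :: Number, next :: Number) -> Number:
  doc: "Average of three consecutive values"
  (prev + curr + next) / 3
where:
  avg3(95, 102, 98) is 295/3   # shown as 98.33 in the example
  avg3(102, 98, 88) is 96
  avg3(98, 88, 105) is 97
end
```

Because Pyret numbers are exact rationals by default, the first average is 295/3 rather than the rounded 98.33 used in the prose, so tests against rounded values may need a tolerance.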
This information can be plotted in a graph such as below, with the smoothed graph superimposed over the original values.
fun data-smooth(phrs :: List<PHR>) -> List<Number>
Sample output: The descriptive example above gives the expected result, assuming the input is instead a list of PHRs whose heart rates are the given values.
1.3 Most Frequent Words
fun frequent-words(words :: List<String>) -> List<String>
You may assume the following about the input:
the input will have at least three different words
all characters are lowercase letters (there will be no numbers, punctuation, or white space)
multiple words with the same frequency will have different lengths
Sample output: Given the input list:
[list: "silver", "james", "james", "silver",
  "howlett", "silver", "loganne", "james", "loganne"]
The result of frequent-words should be
[list: "james", "silver", "loganne"]
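To see where the sample result comes from, it can help to count the occurrences of each word. The sketch below is a sanity check on the sample only, not a solution: the names count-of and sample-words are hypothetical.

```pyret
# Hypothetical name for the sample input list from the problem statement.
sample-words = [list: "silver", "james", "james", "silver",
  "howlett", "silver", "loganne", "james", "loganne"]

# Hypothetical helper: how many times does w occur in words?
fun count-of(w :: String, words :: List<String>) -> Number:
  doc: "Number of elements of words equal to w"
  words.filter(lam(x): x == w end).length()
end

check:
  count-of("james", sample-words) is 3
  count-of("silver", sample-words) is 3
  count-of("loganne", sample-words) is 2
  count-of("howlett", sample-words) is 1
end
```

Note that "james" and "silver" both occur three times, and the expected output lists "james" (five letters) before "silver" (six letters).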
1.4 Earthquake Monitoring
Geologists want to monitor a local mountain for potential earthquake activity. They have installed a sensor to track seismic (vibration of the earth) activity. The sensor sends measurements one at a time over the network to a computer at a research lab. The sensor inserts markers among the measurements to indicate the date of the measurement. The sequence of values coming from the sensor looks as follows:
20151004 150 200 175 20151005 0.002 0.03 20151007 130 0.54 20151101 78
The 8-digit numbers are dates (in year-month-day format). For example, the first number 20151004 above is October 4th, 2015.
Numbers between 0 and 500 are vibration frequencies (in Hz). This example shows readings of 150, 200, and 175 on October 4th, 2015 and readings of 0.002 and 0.03 on October 5th, 2015. There are no data for October 6th (sometimes there are problems with the network, so data go missing).
Assume that the data are in order by dates (so a later date never appears before an earlier one in the sequence) and that all data are from the same year. Also, assume that every date that appears has at least one measurement.
fun daily-max-for-month(sensor-data :: List<Number>, month :: Number) -> List<Report>
data Report:
  | max-hz(date :: Number, max-reading :: Number)
end
Sample output: Given the following input list (repeated from above)
[list: 20151004, 150, 200, 175, 20151005, 0.002, 0.03,
  20151007, 130, 0.54, 20151101, 78]
and the month 10 (for October), the result of daily-max-for-month should be
[list: max-hz(20151004, 200),
  max-hz(20151005, 0.03),
  max-hz(20151007, 130)]
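The encoding described above mixes date markers and readings in one list of numbers, so any solution has to tell the two apart and read the month out of a date. The sketch below shows one possible way to do that arithmetic. The helper names is-date-marker and month-of are hypothetical, and the test for a date marker assumes, as the problem states, that dates are 8-digit numbers and readings lie between 0 and 500.

```pyret
# Hypothetical helper: readings are at most 500, date markers have 8 digits.
fun is-date-marker(n :: Number) -> Boolean:
  doc: "True when n is an 8-digit date marker rather than a reading"
  n >= 10000000
end

# Hypothetical helper: pull MM out of a date encoded as YYYYMMDD.
fun month-of(date :: Number) -> Number:
  doc: "Drop the two day digits, then keep the last two remaining digits"
  num-modulo(num-floor(date / 100), 100)
end

check:
  is-date-marker(20151004) is true
  is-date-marker(200) is false
  is-date-marker(0.002) is false
  month-of(20151004) is 10
  month-of(20151101) is 11
end
```

With these two pieces, the sample input splits into three October dates and one November date, which matches the three entries in the expected result for month 10.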
2 Submission Guidelines
Please create eight files, two per problem. Name them bmi-1.arr, bmi-2.arr, datasmooth-1.arr, datasmooth-2.arr, frequentwords-1.arr, frequentwords-2.arr, earthquake-1.arr, and earthquake-2.arr.
Use this form to upload your code, and this one to upload your reviews.