7 The Shell
7.1 What is This Assignment’s Purpose?
The Unix shell is one of the most widely-used programming languages, and has a powerful notion of laziness central to its evaluation model. In this assignment we will relate the shell to other things we have studied this semester.
7.2 Reading
Read this paper, at least to the end of section 2.1.Local interest stories: first-author Greenberg is a Brown CS alum, and third author Vasilakis is now a professor here.
The paper references this famous exchange (use this version to get around the paywall from Brown) in the journal Communications of the ACM, where Donald Knuth presented a “literate” program for a task and then Doug McIlroy wrote a response in which he showed the same program written in terms of Unix shell commands. McIlroy’s review begins on page 478 and his solution is on page 480. You are not expected to read this document in whole! It’s here as a historical reference (and a fascinating contrast between different styles of programming, and also a statement of the power of programming languages). But it is very useful to read the critique starting on page 480 starting from “Knuth’s purpose was” to the end of page 481.
If you have not seen the shell before, these slides do a very nice job of introducing it (especially through slide 41). Note that slide 4 of that deck echoes the Knuth-McIlroy exchange.
7.3 Program
tr -cs A-Za-z ' |
' | |
tr A-Z a-z | |
sort | |
uniq -c | |
sort -rn |
cat <filename> | |
tr -cs A-Za-z ' |
' | |
tr A-Z a-z | |
sort | |
uniq -c | |
sort -rn | |
less |
Note that it is important that you have a newline inside the quote: see McIlroy’s explanation of the program.
7.4 Task
Let us now think about this from the perspective of retrofitted type systems.
One of the reasons the Unix shell is so powerful is because its commands compose very easily: to a first approximation, everything consumes a stream of strings and returns a stream of strings.If you pay really close attention below, you may notice that perhaps some things make more sense on finite lists than on infinite streams. But you can ignore this distinction unless you really want to. However, this perspective is useless from the perspective of debugging programs: it means there is only one type, and everything consumes and produces values of that type. That means “composition” is automatic, but we cannot tell with any more refinement what kind of answer a program would produce.
But obviously, that is not what is going on here! Each line is doing
something interesting and producing a much richer kind of
stream-of-strings—
(1) tr -cs A-Za-z ' |
' | |
(2) tr A-Z a-z | |
(3) sort | |
(4) uniq -c | |
(5) sort -rn |
For each line, indicate the “deep” structure—
i.e., beyond the fact that both are strings— both before and after the line. Of course, each line’s after is the next line’s before, so you can think of this as one before, four intermediates, and one after, for a total of six structures. Generalize from this concrete example, and your general understanding of these operations, to give a “type” to each operation that explains how it changes the input structure. Note that each of these commands, in Unix, has a gazillion options, but this is emphatically not a test of your Unix knowledge. Only consider the options listed here (e.g., uniq -c, not uniq in general) as if they were the whole command. We would expect to see three, four, or five type specifications (you can choose whether or not to condense the two uses of tr and sort into one).
From the point of view of the structure of the lines, assuming there was no structure to the input (beyond being a stream of strings), we know that in the output, the strings all contain "CSCI 1730".
From the perspective of grep:
It strengthens whatever per-line structure the input had with the constraint that every line contains the argument string.
If the input is of finite length, the output has no more lines, and may well have fewer lines, than the input.
In each case, be concise, crisp, and as precise as possible. Your answers do not need to be any longer than those above.
7.5 What to Turn In
Please respond to each part above. Because there are 9–11 responses (six structures and 3–5 types), please make very clear which response goes with which question.
The example above said that it strengthens “whatever per-line structure the input had”. Why is the phrase “per-line” in there?