Accept No Substitutions |
|
|
|
|
I couldn't find time to write a journal entry yesterday but I did think about what I wanted to say and so I'm a little more prepared had I not had a day to ponder. I also had a chance to play with a couple of versions of Scheme and looked around a bit on the internet for additional resources for learning to program in Scheme.
I'm finally going to talk a bit about programming. But I'll repeat here what I've said earlier, namely that I'm not going to teach you programming; we're just going to talk about programming, conversationally as it were, and either you already have some experience programming in some language, in which case hopefully this will be a useful meditation for you, or you don't have any prior programming experience and eventually you'll feel motivated to try your hand at one of the programming languages that we talk about.
Today we'll consider Scheme, the dialect of Lisp that we introduced a couple of days back. If you find yourself sufficiently interested to try your hand at programming in Scheme, I suggest you download the DrScheme programming environment and then either purchase or look at the on-line version of [Felleisen et al., 2001]. My colleague at Brown, Shriram Krishnamurthi, is one of the authors of HTDP and has certainly influenced my thinking about programming languages and computation.
I downloaded and played with several implementations of Scheme and in the end I settled on DrScheme in part because I thought it was nicely designed and offered both text-based and graphics-based interfaces. The Download DrScheme web site offers both binary and source code for several operating systems including Apple Mac OS (several flavors), Linux, Solaris and Microsoft Windows (several flavors). DrScheme provides feature-rich programming environments for beginning and advanced programmers. In "Beginning Student" mode, the "stepper" provides a useful graphical interface for watching the interpreter "step through" the evaluation of a program; I particularly liked this last feature as it allowed me to step through programs and copy the "steps" into this journal entry to illustrate the algebraic "plug and chug" model of computation. (I could have done this by hand, but why chance making mistakes when you have a computer that can perform a task faster and more reliably.)
In the spirit of in-line acknowledgements, Paul Hudak taught me about programming language theory, operating systems and compilers when I was a graduate student at Yale University and I very nearly changed fields I found it so interesting. Paul forced us to write a compiler for an obscure language called Haskell (after the logician, Haskell Curry) which at first I thought was cruel and unusual punishment and then later learned it was just cruel (professors who conduct research in programming languages often inflict their language preferences on their students). Now I see that Haskell has its own web page. I also got to witness a new dialect of Lisp being born and participate in its birthing pains by being required to program in it. The language was called T and it is (or was) similar in many respects to Scheme. Jonathan Rees was the person whom I most closely associated with T but there were several other people working on the project. When Jonathan took off for MIT to work on Scheme write his doctoral dissertation with Gerald Sussman (Hal Abelson also of SICP fame was on his thesis committee) those of us who were building programs in T felt abandoned.
I've heard computer scientists say that the most important prerequisite for learning how to program is knowing a little algebra. I think this is largely true but it is rarely a comfort to students contemplating taking a class in computer science. For many people, learning algebra was painful and made all the more so because it wouldn't go away - you needed it for all sorts of other things later on down the road. Fortunately, the sort of algebra required to carry out basic programming tasks is of a particularly benign sort. And for those of you who aced algebra, don't get too excited because being good at algebra is not the only skill you need to become a good computer scientist or programmer.
To follow the connection from algebra to programming, consider the following simple "word" problem which you might have encountered in grade school.
Anne wants to buy carpeting to cover the floor of her living room. Anne's living room is 4 meters long. Along the other side of the room there is a large fireplace that is 3 meters wide flanked by two identical bookcases which together with the fireplace take up the entire width of the room. The bookcases are 2 meters wide each. The carpeting that Anne wants for her living room costs $5 per square meter. How much will Anne have to pay for her new carpet?
How would you solve this problem? First you might recall the
mathematical relationship involving area and the dimensions, length
and width, of a rectangle. Algebraically, you would write this down
as a formula involving three variables or terms which we specify using
easily remembered (mnemonic) names: living_room_area,
living_room_length, living_room_width.
living_room_area = living_room_length * living_room_width
Following the usual conventions, the * in the above
formula stands for multiplication. You'd also write down the the
specified quantities and relationships in the word problem introducing
additional terms as needed.
living_room_length = 4 bookcase_width = 2 fireplace_width = 3 cost_per_square_meter = 5 living_room_width = fireplace_width + (2 * bookcase_width)
And the formula for answering the posed question.
total_cost = cost_per_square_meter * living_room_area
Now to calculate the answer, it's just "plug and chug" as they say. Starting with the basic formula:
total_cost = cost_per_square_meter * living_room_area
You would then proceed by substituting (the "plug" part) the appropriate formulas and values for the terms that appear in the formula and simplifying (the "chug" part) by performing arithmetic operations (addition, subtraction, multiplication) as appropriate:
total_cost = cost_per_square_meter * (living_room_length * living_room_width)
total_cost = cost_per_square_meter *
(living_room_length * (fireplace_width + (2 * bookcase_width)))
total_cost = cost_per_square_meter * (living_room_length * (fireplace_width + (2 * 2)))
total_cost = cost_per_square_meter * (living_room_length * (fireplace_width + 4))
total_cost = cost_per_square_meter * (living_room_length * (3 + 4))
total_cost = cost_per_square_meter * (living_room_length * 7)
total_cost = cost_per_square_meter * (4 * 7)
total_cost = cost_per_square_meter * 28
total_cost = 5 * 28
total_cost = 140
Note how our original formula,
total_cost = cost_per_square_meter * living_room_area
underwent expansions and contractions as we made substitutions and simplified subformulas.
In a language like Scheme, we'd specify the entities and relationships in a manner very much like what we did above, though in the case of Scheme we have to adhere to the syntax of Scheme which takes a little getting used to. In Scheme, an expression like
(define living_room_length 4)
tells the Scheme interpreter that we are defining the term
living_room_length to be 4. If we were to subsequently
ask Scheme to "evaluate" living_room_length it would look
up this definition and return the "value" 4. Here's what it looks
like when you interact with the Scheme interpreter.
> (define living_room_length 4) > living_room_length 4
The first typed expression was interpreted as a request to make the indicated definition. Scheme didn't have anything to say about this. The second typed expression was interpreted as a request to "evaluate" the expression which Scheme did printing by out the result. This form of interaction is called a "read evaluate and print loop" for obvious reasons. We can continue to define the other terms in our word problem.
> (define bookcase_width 2) > (define fireplace_width 3) > (define cost_per_square_meter 5) > (define living_room_width (+ fireplace_width (* 2 bookcase_width)))
This last one is bit different. The definition is a compound
expression that involves previously defined terms as well as
mathematical operators employed in reverse polish notation. When
Scheme sees an expression of this form it evaluates the compound
expression by substituting the values of the terms which appear in the
expression with their values as defined previously and then
simplifying the expression to obtain a value to use as the definition
of the new term, living_room_width, in this case. If we
were to submit this term to the interpreter now, it would look up the
newly minted definition as you might expect.
> living_room_width 7
The next definition is a very different animal.
> (define (total_cost unit_cost quantity) (* unit_cost quantity))
Scheme behaves very differently when it encounters an attempt to
define an expression of the form: (some-term args) where
some-term is a place holder for the name of a "function" and
args is a place holder for one or more terms called "formal
parameters" (also called the "arguments" to the function). A function
is just a way of specifying a computation. Computer scientists often
use the words "function" and "procedure" interchangeably. We can use
any terms we like for both the function name and the names of the
formal parameters. For example, in the above function definition, we
define a function called total_cost which takes two
formal parameters unit_cost and quantity).
The definition specifies how the function is to be called.
Unlike the earlier definition in which Scheme evaluated the compound
expression, in the case of defining a function, Scheme will store both
the calling pattern, (total_cost unit_cost quantity), and
the function definition, (* unit_cost quantity) in this
case. It doesn't evaluate the function definition as it did
some-expression in a definition of the form (define
name some-expression). A definition of the form
(define (name args) some-expression)
is very special in that it provides a general recipe for performing a
computation; a recipe that will be carried out whenever Scheme is
asked to evaluate an expression of form (name
args).
> (define (living_room_area width length) (* width length))
Now we've defined all of the relevant terms and formulas we can ask Scheme to evaluate an expression which corresponds to the answer to the word problem.
> (total_cost cost_per_square_meter
(living_room_area living_room_width
living_room_length))
140
Here are the steps that Scheme takes to evaluate this expression.
(total_cost cost_per_square_meter
(living_room_area living_room_width
living_room_length))
(total_cost 5
(living_room_area living_room_width
living_room_length))
(total_cost 5
(living_room_area 7
living_room_length))
(total_cost 5 (living_room_area 7 4))
(total_cost 5 (* 7 4))
(total_cost 5 28)
(* 5 28)
140
To be precise, I should say that this is just one possible sequence of
steps that a Scheme interpreter might take in order to evaluate the
expression. Keep in mind that the Scheme interpreter is just another
program. In writing the code for a Scheme interpreter, a programmer
has two primary goals in mine. First, to make sure that the
interpreter evaluates expressions according to the semantics of the
language and, second, to make sure that those evaluations take place
quickly. The semantics
of a language specifies what the expressions formed in accord with the
syntax of that language mean. Scheme has a formal and an informal or
intuitive semantics and I'll rely entirely on the latter in this
journal. As with a spoken or so-called natural language, an
intuitive semantics for a programming language explains the meaning of
more complex expressions in terms of the meaning of simpler ones,
e.g., the meaning of (+ 1 2) is 3. In
general, we'll borrow from our understanding of arithmetic and algebra
in deciphering the meaning of Scheme expressions.
As an alternative to the above specification and to make the Scheme
calculations look a little more like our algebraic manipulations, we
can define a special-purpose function thereby replacing our previous
definition of living_room_width.
> (define (living_room_width bookcase_width fireplace_width)
(+ fireplace_width (* 2 bookcase_width)))
Note that in defining this function as in invoking total_cost
earlier we used more than one line. When functions become
longer and more complicated, Lisp programmers use indentation and
formatting to make reading them a little clearer. For example, some
programmers like to line up the arguments to functions as you see in
the following alternative formatting of
living_room_width. Formatting is just one way in which
programmers strive to make code more readable, easier to understand
and hence easier to write and modify both for the original programmer
and for other programmers who might want to use or adapt the code at a
later stage.
> (define (living_room_width bookcase_width fireplace_width)
(+ fireplace_width
(* 2
bookcase_width)))
With this new function, the computation looks more like the algebraic
substitutions and simplifications that we described earlier. (The
model of computation as a sequence of substitutions and
simplifications is sometimes referred to as the substitution
model.) You'll note that when we first defined
living_room_width we assigned it the number 7 which was
the result of a computation. The second time we defined
living_room_width we defined it as a function which when
invoked with the proper arguments computes the width of the living
room from a formula (specified in the function definition). Each time
living_room_width is invoked with the correct number of
arguments it will substitute the arguments into the formula and
evaluate the result. You might have been puzzled by the fact that the
formal parameters to living_room_width have the same
names as two terms which we defined earlier:
bookcase_width and fireplace_width. And
indeed in the following invocation we call
living_room_width with arguments corresponding to
bookcase_width and fireplace_width.
The whole business of deciphering the meaning of terms and establishing different contexts in which terms can have different meanings is important in designing programming languages that can be used to build large programs involving many different programmers. Even a single programmer is likely to use the same term in multiple independent contexts.
The term "x" must be one of the most used variable names in mathematics. In the following exchange, it should be pretty clear what I mean even though I used "x" twice as a formal parameter, once in assigning it a value, and once in referring to it as the argument to a function.
> (define (square x) (* x x)) > (define (double x) (+ x x)) > (define x 3) > (define y (double (square x))) > y 18
Here's the new method of calculating the answer to our word program.
> (total_cost cost_per_square_meter
(living_room_area (living_room_width bookcase_width
fireplace_width)
living_room_length))
140
And here is series of substitutions and simplifications which the Scheme interpreter performed.
(total_cost cost_per_square_meter
(living_room_area (living_room_width bookcase_width
fireplace_width)
living_room_length))
(total_cost 5
(living_room_area (living_room_width bookcase_width
fireplace_width)
living_room_length))
(total_cost 5
(living_room_area (living_room_width 2
fireplace_width)
living_room_length))
(total_cost 5
(living_room_area (living_room_width 2 3)
living_room_length))
(total_cost 5
(living_room_area (+ 3 (* 2 2))
living_room_length))
(total_cost 5
(living_room_area (+ 3 4) living_room_length))
(total_cost 5 (living_room_area 7 living_room_length))
(total_cost 5 (living_room_area 7 4))
(total_cost 5 (* 7 4))
(total_cost 5 28)
(* 5 28)
140
Could we do this same trick of substitution to explain the calculations performed by the factorial function that we ran so quickly by you a couple of days ago. Sure.
> (define (factorial n) (if (= n 1) 1 (* n (factorial (- n 1))))) > (factorial 3) 6
Here's the sequence of substitutions and simplifications. Note that in the following we don't simplify any subformulas unless we have to.
(factorial 3) (if (= 3 1) 1 (* 3 (factorial (- 3 1)))) (if false 1 (* 3 (factorial (- 3 1)))) (* 3 (factorial (- 3 1))) (* 3 (factorial 2)) (* 3 (if (= 2 1) 1 (* 2 (factorial (- 2 1))))) (* 3 (if false 1 (* 2 (factorial (- 2 1))))) (* 3 (* 2 (factorial (- 2 1)))) (* 3 (* 2 (factorial 1))) (* 3 (* 2 (if (= 1 1) 1 (* 1 (factorial (- 1 1)))))) (* 3 (* 2 (if true 1 (* 1 (factorial (- 1 1)))))) (* 3 (* 2 1)) (* 3 2) 6
You may have stumbled over how to interpret an expression of the form
(if (= 3 1) 1 (factorial (- 3 1))). Scheme interprets
this expression as saying if 3 is equal to 1 then this expression
simplifies to 1 and otherwise it simplifies to (factorial (- 3
1)). A so-called "conditional" statement of the form (if
test then else) where test,
then and else are expressions is interpreted as follows:
if the test expression evaluates to true, evaluate the
then expression and substitute its value for the conditional
statement, but if the test expression evaluates to false, don't
mess with the then expression at all and rather evaluate the
else expression substitute its value for the conditional
statement. The test statements can be quite complex in general
combining simpler tests using boolean operators such as "or" and "and"
and "not". For example,
(if (and test1 (or test2 (not (test3)))) (+ 1 2) (* 1 2))
will evaluate to 3 if test1 is true and either test2 is true or test3 is false and it will evaluate to 2 otherwise.
The exact sequence of substitutions and simplifications used in
calculating (factorial 3) may not be apparent in the
above listing; so I've used the DrScheme stepper to produce a sequence
of transformations in which the substitutions are highlighted by
enclosing the pre and post expressions in square brackets:
[(factorial 3)] => [(if (= 3 1) 1 (* 3 (factorial (- 3 1))))]
(if [(= 3 1)] 1 (* 3 (factorial (- 3 1)))) => (if [false] 1 (* 3 (factorial (- 3 1))))
[(if false 1 (* 3 (factorial (- 3 1))))] => [(* 3 (factorial (- 3 1)))]
(* 3 (factorial [(- 3 1)])) => (* 3 (factorial [2]))
(* 3 [(factorial 2)]) => (* 3 [(if (= 2 1) 1 (* 2 (factorial (- 2 1))))])
(* 3 (if [(= 2 1)] 1 (* 2 (factorial (- 2 1))))) => (* 3 (if [false] 1 (* 2 (factorial (- 2 1)))))
(* 3 [(if false 1 (* 2 (factorial (- 2 1))))]) => (* 3 [(* 2 (factorial (- 2 1)))])
(* 3 (* 2 (factorial [(- 2 1)]))) => (* 3 (* 2 (factorial [1])))
(* 3 (* 2 [(factorial 1)])) => (* 3 (* 2 [(if (= 1 1) 1 (* 1 (factorial (- 1 1))))]))
(* 3 (* 2 (if [(= 1 1)] 1 (* 1 (factorial (- 1 1)))))) => (* 3 (* 2 (if [true] 1 (* 1 (factorial (- 1 1))))))
(* 3 (* 2 [(if true 1 (* 1 (factorial (- 1 1))))])) => (* 3 (* 2 [1]))
(* 3 [(* 2 1)]) => (* 3 [2])
[(* 3 2)] => [6]
I should mention that I introduced without any fanfare an idea that is very powerful and sometimes throws students for a loop (pun intended as you'll soon discover). The definition of factorial is recursive, meaning that I used factorial to define factorial, the term being defined appears in or "recurs" in the definition.
(define (factorial n)
(if (= n 1)
1
(* n
(factorial (- n 1)))))
I promise you that it really isn't a big deal and it seems perfectly natural to lots of students, particularly when they see how it works and they aren't scared off by a teacher who feels recursion is somehow "abnormal." Some folks think that the "normal" way of computing factorial runs something like the following.
factorial(n) {
x = 1 ;
for (i = 2 to n) {
x = x * i ;
} ;
return x
}
In prose form, this definition would read something like: in order to compute the function n factorial, first assign the variable x to be equal to 1, then begin a loop in which the "iteration" variable i is first set to 2 and then incremented by 1 on each iteration until i is equal to n; on each iteration in which the body of the loop is executed (those iterations in which i is assigned one of 2, 3, ... n - 1) reassign the variable x to be equal to its old value times the value of the iteration variable, i. Finally after you've completed the loop return the last assigned value of x.
Asked to calculate factorial(3) (this is C-like or
ALGOL-like syntax), this program first sets the variable x to
be 1, then it starts through the "for loop" setting x to be
x times 2 (x is now equal to 2) on the first iteration,
and x to be x times 3 (x is now equal to 6) on
the second and last iteration, before returning the final value of
x, which is 6 in this case. Each time through the loop, the
program checks to see if i is equal to 3 and if so, then it
exits the loop without executing the body. In C, the body of a loop
is set off with "curly brackets" { and }
following the keyword for and the iteration variable
specification (i = 2 to n) and so, in the above
definition, the body is just the single statement x = x *
i. This is a perfectly good way of calculating factorial and I
have no problem with it but the recursive definition is also fine and,
to my mind, simple and elegant.
You might worry that a recursive definition might keep calling itself
indefinitely and indeed this is a concern when you're writing
recursive function definitions. The trick is to arrange things so
that some argument (assignment to a formal parameter) in the recursive
call steadily gets "closer to" the terminating condition, (= n
1) in the case of factorial. Once you get the hang of it,
designing recursive definitions with appropriate terminating
conditions becomes pretty easy. But if you really have a thing about
recursion, then there are always other languages that are less elegant
and less cool ... just kidding most dialects of Lisp support other
methods of iteration besides recursion.
It's late afternoon and my back is killing me. I want to take a swim and get out of the house for a while. Tomorrow I have to go into the office for most of the day and so I probably won't return to this journal until Thursday at the earliest. Tomorrow I'm going to see about getting a cast-off robot arm for the undergraduate robotics lab, eat lunch with the Artemis students, give a talk about robots to a bunch of high school students from Saudi Arabia visiting Brown, meet with Joel, one of my graduate students, to talk about his progress in writing his dissertation, and talk with Eliot, one of my UTRA students, who is working with me this summer on robots and, more recently, planning and search algorithms which have somehow captured his attention despite my best efforts to divert him elsewhere. I mention these bits of minutiae in the spirit of Gertrude Stein's admonition to "use everything" not because I believe in the value of her suggestion generally (neither did GS) but rather because I thought some readers might wonder what an academic does on a typical day in the middle of the summer.