Structured Hacking


Driving into the department this morning, I passed a sign near a grade school in Tiverton advertising a "Bug Safari". When I got to work, I found out from the web that "Butterfly Walks", "Spider Safaris", "Pond Dips" and similar excursions are regular fare at summer camps and neighborhood activity centers. This intrigued me as I wondered who would lead such guided tours. Being an academic and believing in the value of consulting an expert, I of course immediately thought that the most likely candidates would be trained entomologists or, as appropriate, specialists such as lepidopterists. An entomologist studies bugs, beetles, butterflies, spiders and the like (not to be confused - as I usually do - with the similarly sounding etymologist, someone who studies the derivation of words) while lepidopterists specialize in lepidoptera, butterflies and moths.

But then it occurred to me that it's a rare specialist who can give a talk to a general audience without talking over the heads of most of the audience. I thought about what this tendency might portend for my own attempt to speak to a general audience about computer science through this journal. By most accounts, I'm an expert in computer science. And certainly it's the case that I am an expert in certain narrow areas of computer science. But just as a trained and practicing physicist is not an expert in all matters spanning the width and breadth of physics, from quantum electrodynamics to cosmology, so I'm not current on all areas of computer science. And realizing the seed of truth in my broad generalization about specialists, I'll steer clear of getting too much into those areas where I do research. It's not entirely clear I'll accomplish this, since I probably won't be able to help myself, but I'll try to contain myself. As your recreational director for the day, if on our "Computational Safari" I spot some rare, but plainly colored, computational fauna, I'll point it out but then move quickly on to the more lively and colorful species.

In Tuesday's journal, we encountered with little fanfare our first model of computation, and a powerful one at that. It's called the substitution model and, in a theoretical sense, it provides all the power or "expressiveness" you will ever need. Models are characterized in terms of their expressiveness, the extent to which they allow you to express or specify computer programs. One model, call it the vanilla model, would be less expressive than a second model, vanilla with savory chocolate bits, if, say, you could specify how to add numbers of any size in the vanilla-with-chocolate-bits model but only how to add numbers less than 1,000,000 in the vanilla model.

We'll get back to what we mean by expressiveness and the power of computational models later on, but for now, suffice it to say that the substitution model using the sort of simple Lisp (Scheme) syntax that we described earlier is as powerful as any computational model can be. There are characteristics apart from power that support the use of a given computational model and paramount among these characteristics are simplicity, comprehensibility, and the ability to sensibly organize (or structure) large programs. The substitution model is certainly simple to learn and easy to comprehend but it doesn't particularly help in thinking about how to structure large programs.

This notion of structuring large programs is important enough that it might be instructive (and even interesting) to think about for a few minutes. The first thing you should realize when considering a large program is that if the program is large enough, e.g., several tens of thousands of lines of code, then it's likely to have been written by more than one programmer. Even if it was written by one person, most mere mortal programmers can't keep all of the details that comprise a large software project in their head at one time; so it's almost as though there are several programmers. The way that most professional programmers deal with large amounts of code is to break the code down into reasonable chunks, called variously libraries, packages, classes, and modules.

Each chunk is designed to provide a particular service or solve a particular part of the overall problem. The Application Programming Interface (or API) for a given chunk specifies how programs that want to use the functionality supported by the chunk can do so. Schematically, the API for a library that supports drawing geometric diagrams might include instructions on how to call procedures, say, to draw rectangles or circles on your screen; these would be the parts of the geometry library that are made public so that other programmers might use the library.

In order to implement a geometry library you might also need a lot of support functions, e.g., functions that use trigonometry to rotate and redraw rectangles, but that aren't part of the library's interface. These trig functions are "private" in the following sense. Generally the programmer who uses a library knows only the public procedures specified in its API. If a programmer does know about the private procedures, she doesn't allow this to influence her design. And the author of a library should feel free to rewrite the internals of the library and, in particular, to rewrite or rename its private functions.

Aside from hiding unnecessary detail from the programmer, this kind of organization allows a group of programmers to work together, share code (APIs at least) and improve existing libraries without interrupting one another. It's even possible for a team of programmers to partition a large project into a bunch of interacting modules, publish the interfaces and then set to work even though none of the modules are yet working. Libraries and modules also establish a discipline for naming things. Within a library you can use any name you like as long as you're consistent within that library. Someone else writing the internals of another library can use the same names found in the internals of other libraries with no fear of confusion.

Two libraries could use the same public name; for example, both the geometry library and the euclid library might have functions called circle that perform very differently. If you want to use both libraries in the same program, you'd have to disambiguate your use by indicating the context somehow, syntactically for example, by prefixing the library name, so that geometry::circle is unambiguously different from euclid::circle. All of this might seem a little tedious and picayune when you're just starting out but such common-sense conventions are the foundation for building large programs.

This is an example of what happens when you hire a myrmecologist (someone who studies ants) to lead a bug safari and then run into a bunch of interesting moths - hopefully you'll get some useful information about moths, but not the same sort of information you'd get had you hired a lepidopterist to lead your expedition. I no longer write large programs, though I commonly write programs that use several libraries. Occasionally I write a library myself and you get a sense after a while of when it makes sense to take a chunk of code and turn it into a separate library. I've never worked as part of a team of programmers to develop a large project as a set of loosely coupled modules whose interaction is completely specified by their respective APIs, but it seems a perfectly reasonable way of proceeding and my students and professional programmer friends tell me it works. Now let's move on to this fascinating little colony of ants over here.

My cursory overview of the issues involved in structuring large programs emphasized techniques for avoiding cognitive overload in programmers, avoiding conflicts in naming and a discipline for separating the public and private parts of larger software components. The substitution model does little to support our thinking about such issues in part because it assumes such a simple model of memory. Where does memory come up in thinking about the evolution of a computational process?

If you type a set of define statements to a Scheme interpreter you essentially create a set of associations in memory. For example, in the following interaction I define a function for converting inches to centimeters, specify a conversion rate and invoke the function on my approximate height in inches to obtain my height in centimeters.

 
> (define (inches_to_centimeters x) (* x conversion_rate))
> (define conversion_rate 2.54)
> (inches_to_centimeters 73)
185.42000000000002

The function name, inches_to_centimeters, and the constant, conversion_rate, are said to be "global" in the sense that they can be seen (referred to) by any other function that I might define. What if I wanted conversion_rate to be "local" to inches_to_centimeters, in the sense that it was only seen by the inches_to_centimeters function? I could do this by defining inches_to_centimeters as follows.

 
> (define inches_to_centimeters 
    (let ((conversion_rate 2.54))
       (lambda (x) (* x conversion_rate))))

As usual, please squint at the syntax and don't sweat the details. (I'll explain the lambda stuff in just a minute; for the time being, think of lambda as introducing a format for defining a function that doesn't have a name.) This expression can be paraphrased as follows: I'm going to define a function called inches_to_centimeters and in defining this function I'm going to introduce some special terminology so that only in the context of this definition we'll let the variable conversion_rate refer to the number 2.54. Other functions won't be able to see inches_to_centimeters's local conversion_rate and indeed they should feel free to define their own local variables with the same name if they like. Another way of thinking about this is that I'm going to allocate some space in my memory for the definition of inches_to_centimeters and along with the space for the definition I'm going to set aside space for a constant that will be seen only by inches_to_centimeters. Given the way conversion_rate is used here, I could just as well have defined the function as:

 
> (define (inches_to_centimeters x) (* x 2.54))
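
Incidentally, the two styles of define we've now used are interchangeable: (define (name args) body) is just shorthand for (define name (lambda (args) body)). Here's a quick illustration, with triple and triple_too as throwaway names invented for this example:

 
> (define (triple x) (* 3 x))
> (define triple_too (lambda (x) (* 3 x)))
> (triple 5)
15
> (triple_too 5)
15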

But we can do more once we have the means to change the contents of memory. We want to be able to forget as well as remember, to add to as well as delete from memory. Consider the following.

 
> (define behind_the_times
     (let ((x 0) (y 0) (z 0))
       (lambda (a b)
          (set! z (* x y))
          (set! x a)
          (set! y b)
          z)))

The (set! term expression) allows us to change the contents of memory (the specific location in memory in this case is referenced by term) to refer to the result of evaluating expression. And so we get the following behavior:

 
> (behind_the_times 1 3)
0
> (behind_the_times 2 3)
3
> (behind_the_times 3 4)
6

It's not terribly useful to have a multiplication function that returns the product of the arguments it was given the previous time it was invoked, but I'll bet you can figure out something more useful to do with this basic capability of remembering the past. For example, if you computed a result that took a lot of time to compute, you might want to remember it in case you're asked to compute it again. The function could have its own memory set aside so it can remember information from one invocation to the next. The Fibonacci sequence provides a good example of why this might prove useful.
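
Before turning to Fibonacci, here is that caching idea in miniature, a sketch using only the constructs we've seen plus begin, which simply evaluates a sequence of expressions in order and returns the value of the last one. The name remembering_square is made up for this illustration; the function remembers its most recent argument and result, so repeating the same call skips the (here trivial) computation:

 
> (define remembering_square
     (let ((last_input 0) (last_result 0))
       (lambda (x)
         (if (= x last_input)
             last_result
             (begin (set! last_input x)
                    (set! last_result (* x x))
                    last_result)))))
> (remembering_square 12)
144
> (remembering_square 12)
144

The second call finds 12 already stored in last_input and just hands back the remembered 144.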

The Fibonacci sequence is 0, 1, 1, 2, 3, 5, 8, 13, 21, ... so that the nth term in the Fibonacci sequence for n greater than 1 is the sum of the n-2 and n-1 terms in the sequence.

 
> (define (fibonacci n) 
     (if (= n 1) 0
         (if (= n 2) 1
             (+ (fibonacci (- n 2)) 
                (fibonacci (- n 1))))))
> (fibonacci 8)
13

Think about how many times the function fibonacci is called in order to compute the nth term in the Fibonacci sequence. It's called once in computing the first and second terms, three times in computing the third term, five times for the fourth term, and so on. Extending this sequence out a bit further we get a new sequence: 1, 1, 3, 5, 9, 15, 25, 41, 67, 109, 177, 287, 465, ..., well, you can see the pattern; the nth term in this sequence is one plus the sum of the (n - 2) and (n - 1) terms. This sequence is upsetting to a computer scientist in that it tells us that the work required to compute the nth term of the Fibonacci sequence approximately doubles each time we increase n by 1; such a sequence is said to grow exponentially in n (think 2^n, in which n is the exponent or power to which 2 is raised). In order to compute the 100th term using the above procedure, we would have to call fibonacci approximately 2^100 or 1,267,650,600,228,229,401,496,703,205,376 times. That would take a long time even on a very fast computer. But if you think about it, this just isn't necessary; if you compute the sequence in order, remembering all the earlier terms (or at least the two most recent ones), then computing the next term requires simply adding together the previous two terms; the only trick is that you have to remember them.
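
Here is one way, a sketch using just let, lambda, and set!, to build a function that carries that memory with it. The name next_fibonacci is invented for this example; each invocation returns the next term of the sequence, computed with a single addition, because the two most recent terms are remembered between calls:

 
> (define next_fibonacci
     (let ((a 0) (b 1))
       (lambda ()
         (let ((result a))
           (set! a b)
           (set! b (+ result b))
           result))))
> (next_fibonacci)
0
> (next_fibonacci)
1
> (next_fibonacci)
1
> (next_fibonacci)
2
> (next_fibonacci)
3

Reaching the 100th term this way takes 99 additions rather than roughly 2^100 procedure calls.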

So let allows us to allocate or lay aside memory and associate it with a function definition and set! allows us to modify the contents of memory. The lambda syntax is just another way of providing the definition for a procedure, (lambda (formal-parameters) body) where body is the full description of the function. But it also provides us with a powerful way of extending functions. What do you think the following does?

 
> (define (your_head initial_contents)
    (let ((your_memory initial_contents))
      (lambda (y) 
        (if (equal? y 0) your_memory
           (set! your_memory y)))))

It would be very hard for you to guess unless you remembered some odd facts about "define" that I mentioned earlier, namely that (define name body) evaluates body and assigns the result to name while (define (name args) body) stores body along with the calling pattern (name args) in the definition of name and trots out body as a recipe for the computation assigned to name whenever Scheme is asked to evaluate something of the form (name args). This distinction, evaluate now or store in order to evaluate later, is important and I admit that it would be nice if Scheme made the distinction more syntactically obvious. Note that (define (name args) body) is semantically the same (means the same) as (define name (lambda (args) body)) and if you think of lambda as saying, "Stop evaluating! I'm a recipe for a procedure; store me away for when you want to apply this recipe to an appropriate set of arguments", then perhaps this double use of define makes sense. Perhaps the following example will shed some light. (I slipped in equal? in the definition of your_head so I could use something more interesting than numbers - some mathematicians and computer scientists might protest, "What could be more interesting than numbers?" - as arguments to functions. The function equal? is like = except that it checks whether two items are the same in cases in which the items are other than numbers. The notion of equality is complicated even in the case of numbers - Is the integer 0 equal to the "real" number 0.0? - but we won't sweat the details here. Using equal? we can compare strings, which in Lisp are entered as illustrated in the following.)

 
> (define freds_head (your_head "blank"))
> (define annes_head (your_head "empty"))
> (freds_head 0)
"blank"
> (annes_head 0)
"empty"
> (freds_head "swelled")
> (freds_head 0)
"swelled"
> (annes_head "level")
> (annes_head 0)
"level"
> (freds_head 0)
"swelled"

We created freds_head and annes_head to be distinct functions with their own, separate associated memory. It's as though we created two separate computational spirits which we can invoke and which retain a memory of their past invocations. We could use your_head to create as many of these spirits as we like.

 
> (define two_heads 
     (lambda (right left)
        (let ((right_head (your_head right))
              (left_head (your_head left)))
          (lambda (which what)
            (if (equal? which "right")
                (right_head what)
                (left_head what))))))
> (define cerberus_minus_one (two_heads "sleepy" "dopey"))
> (cerberus_minus_one "right" 0)
"sleepy"
> (cerberus_minus_one "right" "awake")
> (cerberus_minus_one "right" 0)
"awake"
> (cerberus_minus_one "left" "angry")
> (cerberus_minus_one "left" 0)
"angry"

Well, you get the idea; we just used two_heads to create a procedure which we assigned to the name cerberus_minus_one, equipped with two newly minted personal procedures, each of which has its own personal memory. We could write a variant of two_heads to create a fully headed Cerberus (according to Greek mythology, Cerberus was a three-headed dog belonging to Hades, the god of the underworld; Cerberus guarded the entrance to the underworld) or even a nine-headed Hydra. Better yet we could use lambda and let to create a nine-headed Hydra which, like the Hydra in the tale of Hercules, would grow two heads to replace any one head that was injured.
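
For instance, here is one way such a variant might look, a sketch that reuses your_head (three_heads and cerberus are names made up for this example):

 
> (define three_heads
     (lambda (right middle left)
        (let ((right_head (your_head right))
              (middle_head (your_head middle))
              (left_head (your_head left)))
          (lambda (which what)
            (if (equal? which "right")
                (right_head what)
                (if (equal? which "middle")
                    (middle_head what)
                    (left_head what)))))))
> (define cerberus (three_heads "sleepy" "dopey" "grumpy"))
> (cerberus "middle" 0)
"dopey"
> (cerberus "middle" "growling")
> (cerberus "middle" 0)
"growling"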

Using lambda and let you can create functions that are said to be "first-class objects" - objects with their own mutable internal state and their own varied set of capabilities embodied in their associated functions. Such objects form the basis for a powerful model of programming called object-oriented programming which supports software engineering in the large, the name given to the art and science of building large programs. But more than just a boon for engineering, functions, or objects, with state and associated procedures, often called methods, make for much more interesting spirits to conjure up and with which to cast powerful spells. Now procedures can give rise to spirits that persist over time, have memory and are even capable of casting their own spells by creating and invoking new procedures.
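
To make this notion of objects with state and methods a bit more concrete, here is a classic sketch: a bank account object that dispatches on a message to choose between its two methods. The names make_account, my_account, "deposit" and "withdraw" are all invented for this illustration, and begin simply evaluates its expressions in order and returns the value of the last one:

 
> (define (make_account balance)
     (lambda (message amount)
       (if (equal? message "deposit")
           (begin (set! balance (+ balance amount)) balance)
           (begin (set! balance (- balance amount)) balance))))
> (define my_account (make_account 100))
> (my_account "deposit" 50)
150
> (my_account "withdraw" 30)
120

Each call to make_account conjures up a fresh spirit with its own private balance; two accounts made this way can't interfere with one another's memory.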