Analyze This
As usual, right after I finished Friday's entry and right up until now as I'm about to start the next one, I've been thinking about what to write about next. The content of the journal has been planned in the sense that I've thought about what I want to write, but it hasn't been scripted in the sense that I know when to write what, nor has it been designed in the sense that I know what ideas or concepts will follow or precede what other ideas and concepts. The journal is growing naturally, taking advantage of serendipity and fitting my current mood and interests. That said, each morning I have to figure out what to write about next. And whatever sequence of ideas and concepts I might have thought appropriate last week will likely have been altered by what I thought about in writing the last entry.
In addition to a few anecdotes about computing and some ice-cold plunges into the arcane syntax of half a dozen programming languages, we've touched on a bunch of pretty important ideas. We talked about the idea of interacting with computers and looked at a bunch of different ways of doing this, including command-line interpreters for communicating with a computer's operating system, scripting languages for more complicated communication, specialized interpreters for programming languages like Scheme and Prolog, and interactions with database and web servers.
We investigated the basic notion of a function, alternatively called a procedure, as a set of rules or recipe for carrying out a computation. A computer program consists of a set of function definitions together with data in some form. We distinguished the program itself from the running of the program, which we identified with the notion of a process, thereby capturing the dynamic, evolutionary characteristics of running programs. We enhanced the notion of a static procedure, defined once and for all and oblivious to its past, to allow procedures to have mutable state (memory) and to accept procedures as input, invoke them at their whim, and indeed generate and return newly minted procedures of their own design. Data and procedures are interchangeable and for that reason are replaced by the general notion of objects, which perform computations and have computations performed on them.
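To make the mutable-state and procedure-minting ideas a little more concrete, here is a minimal sketch. I've written it in Python rather than the Scheme used elsewhere in this journal, purely as a matter of convenience; the names are mine, invented for illustration.

```python
# A procedure with mutable state ("memory") and a procedure that
# generates and returns a newly minted procedure of its own design.

def make_counter():
    count = 0                  # private state, remembered between calls
    def counter():
        nonlocal count         # update the enclosing procedure's state
        count += 1
        return count
    return counter             # return the newly minted procedure

tick = make_counter()
print(tick(), tick(), tick())  # each call remembers the previous ones: 1 2 3
```

Each call to make_counter mints a fresh counter with its own private memory, so two counters tick independently of one another.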
We looked at several informal models of computation, including the client/server model for thinking about services such as those offered by database and web servers, the substitution model for thinking about computation algebraically, and object-oriented programming for thinking about software engineering in the large by dividing a large project into smaller entities, each of which provides a particular set of services, keeping private the irrelevant and distracting details and making public only the interface. But these notions just begin to scratch the surface of computer science. Here are some of the "biggies" that have been left largely untouched and that haunt my dreams when I get overly invested in this project.
First, there is the idea of a "formal" computational model, a mathematical framework for investigating what is possible to compute and how hard (how much time and memory they require) such computations are likely to be. With a little work, the substitution model can be made precise and used as a formal model. Formal computational models try to answer questions such as the following: Is one model of computation more expressive than another in the sense that there are things you can compute in one but not the other? Is there a program, specified in a given formal model, that can determine whether another program specified in the same or another formal model will ever halt on a given input? It would be nice if you could tell whether a procedure you wrote will ever stop running and return you an answer. Can you think of a reason why you'd want to adopt a less expressive, more restrictive model of computation, i.e., write in a programming language that wasn't able to specify some kinds of computations? Suppose with the less expressive model/language you could get some nice properties, like the ability to check whether a program would halt or a guarantee that every procedure would take only a relatively short amount of time?
As much as I like this stuff, I plan to steer clear of the formal issues or at least only bring them up when they impinge directly on more practical issues at hand. In keeping with our Bug Safari metaphors from Thursday, this is less because I'm a lepidopterist (in this case, a theoretical or mathematically-inclined computer scientist) and you don't want to get me started on butterflies but rather because I hang around (and occasionally work) with a lot of lepidopterists and so I can easily get caught up speaking the argot of the lepidopterists. Besides, there are some excellent texts by computer scientists with a knack for making the mathematically complex accessible, e.g., [Davis, 2000] and [Harel, 2000], and I prefer to leave the more careful treatment of these issues to those with the requisite gift.
Next there is the area broadly called "the design and analysis of algorithms and data structures" which has theoretical as well as practical consequences. A data structure refers to a particular method for storing, retrieving and, more generally, manipulating data. An algorithm refers to a bare-bones description of a procedure, usually specified in terms of a simplified programming language. Usually any associated data structures used by the algorithm are considered part of the recipe. Such a recipe can be implemented by translating from the simplified programming language to a specific programming language. For example, consider the very simple procedure to square a number. You could use a specification in natural language, e.g., "To square an integer, multiply the number by itself", but typically algorithm designers use some agreed-upon standard language such as ALGOL (for ALGO(rithmic) L(anguage)). Here's how to specify the square function in ALGOL:
integer procedure square(n);
integer n;
begin
square := n * n
end;
Squint at the notation as usual, but note that the integer statements are "type declarations" and can safely be ignored, := indicates assignment (x := 2 assigns the variable x the value 2) whereas = (:= without the :) indicates an equality test (x = 2 evaluates to true if x is equal to 2 and evaluates to false otherwise), and, internal to the definition of a procedure, assigning a value to the procedure name (square := n * n) is the way to specify the return value of the procedure. Way too briefly, type declarations are annotations made by the programmer to help the programmer and other programmers understand programs, and to help various programs that interpret, compile and otherwise try to make sense of programs do a better job.
You could then translate this ALGOL specification into Scheme as:
(define (square n) (* n n))
or into the C programming language as
int square(int n) { return n * n ; }
The touted advantage of using an agreed-upon standard like ALGOL is that everybody knows ALGOL (not true!) and that it doesn't favor any particular language (well, ok, ignoring dialects of ALGOL itself, in some sense this is true but it does favor languages with similar syntax to ALGOL). But this is one of the least interesting observations about the design and analysis of algorithms. You write algorithms in a general specification language to be clear about what the algorithms mean, to make it easy to implement algorithms in a variety of languages, and to enable computer scientists to analyze the characteristics of algorithms independent of the particular language in which they are written.
To get a taste of what it's like to analyze algorithms, let's return to the problem of computing the nth element of the Fibonacci sequence. Here is an algorithm written in ALGOL. It doesn't look too different from the implementation in Scheme that we looked at on Friday.
integer procedure fibonacci(n);
integer n;
if n = 1
then fibonacci := 0
else if n = 2
then fibonacci := 1
else fibonacci := fibonacci(n - 1) + fibonacci(n - 2);
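If you'd like to run this recursive recipe yourself, here is a direct translation into Python (my language choice for convenience; the journal's examples use ALGOL and Scheme). It follows the same convention as the ALGOL above: the first Fibonacci number is 0 and the second is 1.

```python
# Recursive Fibonacci, translated from the ALGOL recipe above.
# It recomputes the same subproblems over and over, which is what
# makes it scale exponentially in n.
def fibonacci(n):
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

print([fibonacci(n) for n in range(1, 11)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Try timing this for n around 30 or 35 and you'll feel the exponential growth firsthand; fibonacci(100) is hopeless, for reasons the analysis below makes precise.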
Now here's an alternative (and, incidentally, more efficient) algorithm also written in ALGOL.
integer procedure fibonacci(n);
integer n;
if n = 1
then fibonacci := 0
else if n = 2
then fibonacci := 1
else begin
integer penultimate; penultimate := 0;
integer ultimate; ultimate := 1;
integer temporary;
for i := 3 step 1 until n do
begin
temporary := penultimate + ultimate;
penultimate := ultimate;
ultimate := temporary;
end ;
fibonacci := ultimate
end
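Here is the same iterative recipe in Python (again my translation, not the journal's), which makes it easier to see why it is linear: the loop body runs exactly once for each value of i from 3 to n.

```python
# Iterative Fibonacci: same convention (the first number is 0, the
# second is 1), but each intermediate value is computed exactly once.
def fibonacci(n):
    if n == 1:
        return 0
    if n == 2:
        return 1
    penultimate, ultimate = 0, 1
    for _ in range(3, n + 1):
        # Shift the window: the old ultimate becomes the new penultimate,
        # and their sum becomes the new ultimate.
        penultimate, ultimate = ultimate, penultimate + ultimate
    return ultimate

print(fibonacci(10))   # 34, matching the recursive version
print(fibonacci(100))  # answered immediately, unlike the recursive version
```

The simultaneous assignment in Python also sidesteps the need for the temporary variable in the ALGOL version.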
In analyzing these two algorithms, a computer scientist would say that the first scales exponentially in n while the second scales linearly in n. Practically speaking, if each addition (application of the + operator) takes on the order of one millionth of a second and ignoring all other aspects of the computation, computing fibonacci(100) with the second algorithm would take considerably less than a second (10^2/10^6 = 1/10^4 seconds) and computing fibonacci(100) with the first algorithm would take ... well, let's see, there are how many seconds in a year?
> (* 60 60 24 365)
31536000
or around 32 million not worrying about leap years and such. And about how many seconds will our calculation take with the first algorithm?
> (define (raise m n) (if (= n 0) 1 (* m (raise m (- n 1)))))
> (* (raise 2 100) (/ 1 (raise 10 6)))
19807040628566084398385987584/15625
Oops, this version of Scheme is trying to be clever by returning the number I requested as a rational number. It does this so it doesn't have to return me an approximation of what I asked for. How would you print out 1/3 as a "real" number? 0.33 or perhaps 0.3333333? Both of these answers are approximations to 1/3. If I include a "real" number in the calculation then Scheme will try to print the result as a real.
> (* (raise 2 100) (/ 1.0 (raise 10 6)))
1.2676506002282293e+24
Well, I guess that's progress! Now Scheme is offering me an approximation, but it displays the approximation in scientific notation with a lot more precision (the digits on the right of the decimal point before e+24) than I can make use of. (Don't believe anyone who tells you that computer arithmetic is simple; what goes on behind the scenes when you multiply and divide numbers in a calculator or general-purpose computer is fascinating and comprises an important subfield of computer science called scientific computing.) Let's just say 1.27 * 10^24 which is a pretty big number. Since 31536000 is 3.1536000e+7 in scientific notation or approximately 3.15 * 10^7 we can estimate that it would take 4 * 10^16 years.
Asking Scheme to confirm this we get the following answer.
> (/ (* (raise 2 100) (/ 1.0 (raise 10 6)))
(* 60 60 24 365))
4.019693684133147e+16
That's an incomprehensibly long amount of time and that's just to compute fibonacci(100). Remember that things get exponentially worse as n gets larger, so fibonacci(101) takes twice as long as fibonacci(100).
> (/ (* (raise 2 101) (/ 1.0 (raise 10 6)))
(* 60 60 24 365))
8.039387368266294e+16
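The same back-of-the-envelope arithmetic can be reproduced in Python (my restatement of the estimate above, assuming, as the text does, roughly 2^n additions and one addition per millionth of a second; the function name is mine):

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365   # about 32 million, ignoring leap years
SECONDS_PER_ADDITION = 1 / 10**6        # one addition per millionth of a second

def years_for_exponential_fib(n):
    # Estimated running time of the recursive algorithm, in years,
    # using the rough count of 2**n additions from the text.
    return 2**n * SECONDS_PER_ADDITION / SECONDS_PER_YEAR

print(years_for_exponential_fib(100))  # about 4.02e16 years
# Each increment of n doubles the estimate:
print(years_for_exponential_fib(101) / years_for_exponential_fib(100))  # 2.0
```

This matches the Scheme transcripts above: roughly 4 * 10^16 years for fibonacci(100), and twice that for fibonacci(101).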
This sort of analysis is obviously quite useful and every student of computer science needs some exposure to this part of the discipline which includes coverage of a range of special topics in mathematics. I won't spend a lot of time performing formal analyses but from time to time I'll refer to various results from this area of computer science. The text by Cormen, Leiserson, Rivest and Stein provides a comprehensive introduction to algorithms and their analysis [Cormen et al., 2001].
After formal models and analyzing algorithms and data structures (these are both on the theoretical side of computer science - I'm showing the bias that results from my hanging around with lepidopterists), there is the general area of "systems". By some accounts systems includes the field of databases but, by other accounts, "core" systems emphasizes operating systems, programming languages, computer architectures, networks, distributed computing and a bunch of other specialty areas. Here are some areas of systems that I'd like to delve into, albeit briefly.
Finally, after systems there is my area of expertise, artificial intelligence, about which I'm sure I'll have a few things to say. I'm particularly looking forward to writing a bit about robotics which I think of as a wonderful application for motivating a wide range of computer science topics and one of the most exciting direct applications of the technologies coming out of artificial intelligence.
I have no wish to be exhaustive (or exhausting) in my survey of the field, but I suppose I should say in deference to some of my colleagues and friends that I'm leaving out graphics and computer animation, scientific visualization, human computer interaction, and a host of other areas that I don't know enough about to talk about coherently (but I'm sure are quite fine areas to be working in).
That felt great. I feel a little more organized now, like I've got a better idea of where I want to go in the next couple of weeks. I hope to rest a little more easily tonight (actually, I hope to lighten up a little on several fronts, so I won't get upset or lose sleep worrying about a silly journal entry).