Getting Oriented

Sample Lecture Material

Each chapter in Talking With Computers attempts to illuminate a single thread in a larger cloth constituting an area of computer science. You can unravel such a thread in myriad ways. A well versed computer scientist is likely to explain the thread in a different fashion scrutinizing some nearby threads, citing interested details about some aspects and glossing over others.

Chapter 6 is my attempt to illuminate how procedures and memory can be used to build powerful models of computation and the object oriented programming (OOP) model in particular. Objects in OOP have internal state (they can remember stuff) and internal procedures (they can do stuff). Objects can also inherit the properties of other objects and encourage a style of programming that lends itself to developing large software systems, but I don't talk about these aspects of OOP until Chapter 7. In this chapter, I concentrate on objects with memory and procedural capabilities.

Organizing Memory

When you set a shell variable in a shell like csh, that variable is accessible to any command you type to that shell.

% set tmp="~/email/"
% ls $tmp
INBOX OUTBOX

It's not accessible to other users nor is it accessible to you if you start up another shell in another window. The operating system partitions memory so that variables (and command names) can have different meanings (assignments) in different portions of memory. C, C++, Java and Scheme also organize memory to give programmers the ability to reuse the same names in different contexts. You've seen this already this in Chapter 5 where we define a constant and then use the same name as a variable in a procedure definition:

(define width 17)
(define (area length width)
  (* length width))

In this case, the instance of width used in defining area is a formal parameter that defines the calling pattern for area and allows us to substitute the values of area's arguments into its definition to evaluate a procedure invocation. This view is influenced by the substitution model, but there are other ways of thinking about memory and the role of arguments that reveal different perspectives on organizing memory. In Scheme, we use set! to modify the value of variables introduced using define.

> (define width 17)
> width
17
> (set! width 31)
> width
31

But consider the next exchange:

> (define width -17)
> (define (area length width)
     (set! length (* length 1.0936133))
     (set! width (* width 1.0936133))
     (* length width))
> (area 1 1)
1.1959900499368898
> width
-17

For some demented reason, this version of area assumes that the input is specified in meters and it produces an output in square yards (1.0936133 yards equals one meter). Can you explain the above behavior in terms of the simple substitution model that we described in Chapter 5?

Let's take a glimpse at another model called the environment model. In this model, memory is structured as a set of hierarchically organized tables. Each table keeps track of a set of variable names and their currently assigned values. New tables are created on the fly during evaluation and linked to other tables. An environment is a sequence of nested tables. At any time during evaluation, there is a current environment available to look up the value of variables.

When a statement of the form (define name value) is evaluated, it causes a new entry to be made in the innermost table of the current environment and this entry is assigned value. If an entry with the same name exists, then Scheme simply uses the old entry and overwrites the existing value. Here we depict what happens when you define a variable in an environment consisting a single, initially empty table:

[table:
  (define width 17) ] =>
[table: width = 17 ]

Similarly, (set! name value) assigns value to name in the innermost table of the current environment if the entry exists; if the entry doesn't exist then Scheme attempts to assign the value in the next innermost table if an entry exists there; if an entry doesn't exist in the next innermost table, Scheme keeps working its way out until it succeeds or fails at the outermost nested table, in which case Scheme gets upset and whines about an error:

[table:
  (set! height 47) ] =>
set!: cannot set undefined identifier: width

Whenever Scheme encounters a variable during evaluation, it looks in the innermost nested table to see if the variable is defined there. If so it gets the value from that table, otherwise it looks in the next table out and so on. In the environment model, the outermost table is called the global table and the environment consisting of just the global variable is called the global environment. If Scheme exhausts all the nested tables in looking for the value of a variable and can't find the variable listed in the global table then Scheme gives up and claims there's an error.

[table: width = 17,
  [table: length = 1.0936133, width = 1.0936133, 
    (* length width) ]

Now let's get back to explaining the performance of our odd little area procedure. Suppose that each time you invoke a procedure you create a new table associated with that procedure invocation and containing an entry for each variable corresponding to a formal parameter in the procedure definition. This new table is nested inside whatever table defines the current context for evaluation (it isn't obvious what this context should be, but we'll return to this issue shortly). In this model, the values of arguments are assigned to their respective formal parameter names in the associated environment table. Now instead of substituting values into the function definition as in the substitution model, Scheme looks up the value of variables in the current environment and we imagine evaluation proceeding something like this:

[table: width = 17,
  (area 1 1) ] =>
[table: width = 17,
  [table: length = 1, width = 1,
    (set! length (* length 1.0936133))
    (set! width (* width 1.0936133))
    (* length width) ] =>
[table: width = 17,
  [table: length = 1.0936133, width = 1,
    (set! width (* width 1.0936133))
    (* length width) ]
[table: width = 17,
  [table: length = 1.0936133, width = 1.0936133,
    (* length width) ] =>
[table: width = 17,
  [table: length = 1.0936133, width = 1.0936133,
    (* 1.0936133 1.0936133) ] =>
[table: width = 17,
  [table: length = 1.0936133, width = 1.0936133,
     1.1959900499368898 ] =>
[table: width = 17,
  1.1959900499368898 ]

This is a crude and incomplete description of the environment model but it should give you a little better idea of how memory plays a role in the evaluation of procedures that involve the use of set!. As an exercise, think about the structure of tables that results from nested procedure calls:

> (define (area length width)
    (* length width))
> (define (cost area rate)
    (* area rate))
> (cost (area 12 17) 31)
6324

The main question to resolve is whether a procedure inherits the environment from the context in which the procedure was defined (this context is determined when the procedure is defined and hence this strategy is said to be static) or the context in which it is evaluated (this context is determined at evaluation time and hence the strategy is said to be dynamic). There's no right answer to this question; different dialects of Lisp employ different strategies. To determine Scheme's strategy, you might conduct experiments such as the following:

> (define z 0)
> (define (one z)
    (printf "~A~%" z)
    (two))
> (define (two)
    (printf "~A~%" z))
> (one 1)
1
0

During the ``life'' of a procedure invocation, a procedure with formal parameters has a little ``local'' memory to play with. By controlling the context in which functions are defined, we can arrange for that memory to persist over multiple invocations. (Actually, every procedure has an associated environment and every environment includes the global environment; so in this sense every procedure has persistent state. We're interested, however, in ``private'' procedures that don't want to share their state with just anybody.)

As an aside, SICP starts with the substitution model but then abandons it for the environment model part way through the book citing problems with the substitution model. HTDP maintains a variant of the substitution model throughout by extending the substitution model to handle mutable state. Each model has its advantages and invites different perspectives on computing.

Procedures With State

In Chapter 2 we demonstrated how to construct a program under the control of another program and then execute it using eval or by piping it to a shell using echo:

% ls -l
-rw-r--r--  1 tld  staff   770 Jun 29 2002 email.txt
-rw-r--r--  1 tld  staff  1812 Nov 17 2002 music.sql
% set command="ls" 
% set options="-l" 
% eval "$command $options" 
-rw-r--r--  1 tld  staff   770 Jun 29 2002 email.txt
-rw-r--r--  1 tld  staff  1812 Nov 17 2002 music.sql
% echo "$command $options" | sh
-rw-r--r--  1 tld  staff   770 Jun 29 2002 email.txt
-rw-r--r--  1 tld  staff  1812 Nov 17 2002 music.sql

Scheme has lots of interesting tools for building computations and controlling their evaluations:

> (define nums (list 1 2 3))
> (apply + nums)
6
> (define sums (cons + nums))
> (eval sums)
6
> (define funs (list + *)
> ((first funs) 1 2 3 4)
10
> ((first (rest funs)) 1 2 3 4)
24

Some of the most interesting ways of manipulating procedures involve the use of lambda which is similar in some respects to quoting mechanisms in shell languages but lambda is much more powerful; lambda enables us to create procedures with persistent state. lambda is used to create procedures; the next two definitions are functionally equivalent:

(define (square x) (* x x))

(define square (lambda (x) (* x x)))

Use the environment model to make sense of the following exchange:

> (define procs
    ((lambda (x)
       (list (lambda ()
               (set! x (+ x 1)))
             (lambda ()
               (printf "~A~%" x)))) 0))
> ((first procs))
> ((first procs))
> ((first (rest procs)))
2

Note that the body of the define consists of a procedure invocation involving a procedure created by lamda applied to a single argument whose value is 0. This procedure invocation created two brand new procedures and packaged them in a list. The two procedures share the same environment: an environment consisting of two tables, the global table for the outermost table and an inner table corresponding to the table that was created when we evaluated the body of the define statement. The shared environment looks something like the following:

[table: 
  [table: x = 0 ] ]

Now when the first procedure in the list increments the variable x, the second procedure sees the result reflected in the shared environment. This may seem like so much hocus pocus but the ability to keep and modify internal state is crucial in building objects of the sort that are central in object oriented programming.

And if procedures can share private names for data, there's no reason they can't share private names for procedures; procedures and data are, after all, just different perspectives that we can switch between by controlling evaluation.

As an exercise, write down a set of applications for which you think persistent state and private data and functions would be useful. Try to imagine how these aspects might play a role in the constraint propagation program presented at the end of Chapter 6. It might help to anthropomorphize the components of the constraint propagation program. Think of wires as knowing the values they carry and keeping track of the components to which they're connected. Think of adders and multipliers as being able to carry out generic procedures like ``update'' yourself.

Code Fragments

Download all the code fragments in this chapter as a file in zip format.