Saturday, July 6, 2002

Computational Muddles

When I finished working around 5PM yesterday, Jo and I took a walk and a short swim, the ocean water is still pretty cool for July so we didn't swim for long. I was a little tired and cramped from sitting at the computer most of the day and I was preoccupied thinking about what should go next, in today's journal entry.

Toward the end of yesterday's entry, I was starting into a familiar groove (or perhaps rut), fondly misremembering my first exposure to programming in Lisp and thinking I should cover some basic programming techniques and models. At this point in my life, when I start thinking "should" and "ought to", warning flares go up and I begin questioning why I feel compelled to think one way or another. Usually, my compulsions, cast in statements of the form "you should do this or think that", are due to my being lazy or gutless (in the sense of "lacking conviction" or "blindly following the status quo") and following a comfortable or externally sanctioned track.

I've taught at the college level for over sixteen years and for most of that time my lectures were top-down, beginning with basic principles and techniques and building to more difficult ones. I tended to introduce so-called models of computation and then apply these models to particular problems using various technologies and tools. Luckily for the students, most of the time I got to some concrete problems and instances of real tools pretty quickly. Much later I'd find in talking with students returning to Brown for reunions or just traveling through Providence, that while they really enjoyed the concrete stuff, the models were always a bit vague and typically didn't stick (or at least they couldn't recall any models well enough to articulate them in a manner I could recognize).

For the most part, these experiences with returning students didn't dissuade me from continuing to introduce them in class, but even I could tell that there were a bunch of students who usually perked up when we got to a real problem that they could sink their teeth into. Reading Richard Felder [Felder, 1993] I came to appreciate what I'd always dimly perceived, namely that students come in many different sorts with different learning styles and, not surprisingly, they respond differently to different teaching styles.

In the last two years serving as the chair of the department, I taught my classes and in particular my introductory robotics course in a very different style. I wasn't doing this consciously to improve my effectiveness as a teacher, rather I was doing it to survive.

When I was just a professor and not chairing the department, I'd use the weekend to get my notes and assignments together and the evening prior and particularly the hour or so right before class to think through the material and prepare my lecture. Then I'd walk into class, brimming with ideas and material which I'd try to cram down the yawning gullets of the waiting students. I often had this ideal student in mind, hanging on my every word, efficiently filing away each abstract model as I introduced it and then trundling them out again when appropriate to solve a particular problem or understand a particular method. This ideal student would go back to his or her dorm room to pore over the lecture slides and ponder the deeper implications of the material. Well, this just wasn't happening (there were some exceptions) and if I'd only thought a little more about how I learned new material, I would have realized that it wouldn't have even worked for me, hypothetically speaking.

I generally first need to be motivated by something that I want to do or something that I think would generally be cool though I might have no particular use for it at the time. Once I'm motivated I can bear a certain amount of pain if that's what it takes, but I generally prefer to do something that's interactive. This interactive part of learning can take a couple of forms: There might be a neat computer program that would stimulate me on several levels with interesting graphics, and the ability to learn at my own pace or in my own idiosyncratic and spur-of-the-moment order. For example, I like programming environments that allow you to "converse" with the computer. The other form of interaction is through talking with and working through problems with others. When you're working with others and your enthusiasm flags or you get stuck, generally there's someone else who can pick up the slack, spur you into action, or fine a way around the impasse. Also when you're working with a bunch of people, there's usually one or two who think (learn) more like you (this is good because you don't feel odd or stupid) and a few who think differently (this is good because solving problems often takes multiple perspectives and even if a particular way of thinking doesn't come naturally, that doesn't mean it isn't a useful way of thinking or that you're incapable of learning to think in that way).

When I taught robotics during the last two of my tenure as department chair, I'd do most of my preparation during the weekend and then, because I was fighting fires in my weekday job as chair, come into class either cold or with only minimum preparation. I'd prepare notes which I'd make available on line but for the most part I spoke conversationally and at a fairly high level. This only worked so so but I was too distracted by my duties as chair to think much about it and I was blessed with a couple of advantages.

First, robots are just plain fun. They're easy to motivate and its simple to explain the potential applications. And this was a course about building robots. Second, it was hands on; the robots were built from Lego parts and everyone loves Legos, and the labs were all about learning new stuff that directly contributed to building better robots. There were all sorts of options for exercising your own particular skills: some students were more mechanical and loved designing and building new robot bodies, others liked the more esoteric parts of programming the robots, and still others liked thinking up interesting things for robots to do either individually or as teams of interacting and communicating robots. Finally and this advantage is less easy to replicate in other courses, most of the students were highly motivated juniors and seniors who'd been waiting to take this course for years.

I'd teach for an hour and twenty minutes on Monday and the rest of the week the class consisted of labs and students working in pairs and larger groups. Oh and I almost forgot, both years I had a team of skilled and enthusiastic undergraduate teaching assistants who'd worked with me before and knew an extraordinary lot about robots.

The good thing about my lectures is that they were unmistakably about robots and even though I'd occasionally dive into some arcane bit of control theory or signal processing usually I'd surface pretty quickly with some useful tidbit that would eventually and manifestly turn out to be relevant to programming robots. The lectures were generally tied to the labs and the labs would often, after the fact, explain why I spent so much time belaboring some point a couple of days prior. The bad thing about my lectures is that they were too long and for the most part failed to engage the students sufficiently that they asked questions and took a more active part.

I think I know how to fix the problem and perhaps I would have had I had more time to divert from being chair. The important thing however is that the class worked despite my lectures because the students were motivated by the content (can't claim credit for this), the course structure provided opportunities for several different learning styles to shine and social structure of the class offered many different ways for students to learn: from one another, from the teaching assistants, from the professor and from traditional materials such as textbooks and lecture notes.

What lesson did get from this experience that I'm applying now? Only that I don't want to get ahead of motivating you nor do I want to make the mistake of thinking that what motivated me when I first learned about computers is going to motivate you now. And I want to make sure that we approach this from a range of perspectives trying out different motivating examples and inviting you to exercise your different methods of engagement.

In June of this year, I met two or three times a week with four students who are working for me as UTRAs (for Undergraduate Teaching and Research Assistant) this summer. The four women, Erika, Leah, Susannah and Jennifer, are all rising Juniors at Brown majoring in computer science and very open minded about what areas of computer science they might want to specialize in. Part of their summer experience will be to run a program for high school girls interested in computer science. The program is called Artemis, after the Greek goddess of the hunt, and the goal of the program is to encourage the Artemis participants in their interests, accelerate their education in computer science and introduce them to opportunities and careers in the field.

We decided back in April that for the research part of their UTRAs we'd talk about a wide range of computer topics and perhaps about this book which I mentioned I'd been thinking about. When June rolled around we thought everyone would be laid back, with all the time in world to think about computer science. However, it soon dawned on Erika, Jen, Leah and Susannah, that they'd taken on a huge responsibility in running Artemis and they had little time to plan a syllabus, arrange for field trips, experiment with various software packages and carry out a host of other planning and preparation tasks before the Artemis students arrived July 1. Our meetings during June then weren't much different from regular classes during school year, smaller perhaps, but, as during the school year, there was competition for their time and energy and the needs of Artemis generally won out.

At first I assigned some readings but that quickly became an imposition as they had a million things to do to prepare for Artemis. So mainly we just talked and occasionally played with some software or experimented with tools that might be relevant to Artemis. After each of our meetings, I'd think of other things to talk about based on what seemed interesting during our meeting; I'd send them little snippets of follow-up explanation or code fragments (shell scripts or bits of Perl code) to try out. And I'd puzzle about what things I had said that they got or didn't get.

Even as rising juniors they didn't seem to have a useful repertoire of sophisticated models of computation or at least they couldn't articulate the same models that I was aware of (which is clearly not the same thing as not having a useful repertoire). But I knew that all four were accomplished programmers and had served as teaching assistants and so had taught other students how to become good programmers. They hadn't, however, taken any of the more advanced and theoretical courses that were required for students concentrating in Computer Science.

So I used a lunch out and a wind surfing safari to nearby Fogland beach, to quiz two rising seniors, Albert and Luke, about what models they had in their head. Albert and Luke are very talented robot hackers. Al made significant improvements in the packet-based infrared communications package that students used in CS148 so that their robots could talk with one another and with other computers equipped with infrared ports. Luke's robotic inventions were always elegant and inspired; he has an intuitive feeling for robotics which was always apparent in his designs and in how he handled other students' robots as a teaching assistant for my class.

Al and Luke, or Sonic as he likes to be called, weren't much better at articulating models though they could clearly answer questions that I would have expected required the use of a computational model. I asked a colleague, Shriram Krishnamurthi, who does research on programming languages with a strong theoretical component, if he'd found a similar lack of awareness of explicit, classical computational models in his students. Shriram said that he'd found students often unable to articulate such models but that he was convinced that at least some of his students possessed a working knowledge of some of the most important models.

Thinking a little more about this I realized that I didn't acquire my explicit models until I was required to produce formal proofs in proving properties of algorithms (an algorithm is a step-by-step recipe for performing a computation, similar to a program but specified in a precise and general mathematical notation rather than in the notation of a specific programming language). Then it was important not only to use suitably powerful abstractions but to employ a standard language for expressing these abstractions so that other computer scientists could understand my proofs and appreciate my results.

The upshot of all this worrying about articulating models of computation is that I ended up not worrying about them so much and not forcing them on students until I could find a suitable motivation, like the need to communicate a complex idea to a colleague or a whole community with a shared vocabulary and set of models.

What does all this have to do with what I'm going to write about today? Only that I want to think of this journal less like a top-down textbook and more like an ongoing conversation. My model for the book is that each day, I have a conversation like the ones I had with Erika, Jen, Leah and Susannah in June. And each evening I think about what we talked about and figure out where to go next being careful to exercise different ways of thinking and indulge my penchant for abstraction only sparingly.

Part of what got me thinking about Erika, Jen, Leah and Susannah was an early exchange we had about databases. Modern database systems, as a field of study, is a beautiful example of theory and practice working hand in hand to produce an extraordinarily useful technology. Most businesses, banks and industries could not function without databases to keep track of clients, control orders inventory, generate payrolls, and handle billing and sales. As soon as you have more than half a dozen employees, a few hundred customers or reasonable variety in your merchandise you'll need a database to keep things straight.

Now I was originally motivated to bring up databases because I wanted talk about the relational model which is an elegant and to my mind simple theoretical model explaining how modern databases work. Let's try an experiment. I'm going to introduce you to databases by playing with a popular open-source database that you can get for free and install on your computer. We'll pretend that we're using the database to set up a web-based application of some sort and I'll try to explain the role of the database without reference to any fancy theories. Later I'll point you to an explanation of the relational model and you can see for yourself whether you think it would have made any difference if I had explained the relational model first.

Here's the way I tried to motivate the application of databases for Erika, Jen, Leah and Susannah. And for you to understand this, you should know that Artemis has been running since 1996 and the instructors and Artemis students for the last few years have kept in touch. The Artemis instructors along with members of Brown WiCS (Women in Computer Science) serve as mentors for the Artemis students. Mentors keep in touch with their "mentees" (the students that they mentor) through email and by attending reunions and Artemis get-togethers during the school year. With twenty or so new Artemis students each year, four or five new instructors plus additional mentors from WiCS, pretty soon you've got a bunch of people to keep track of. How do you connect Artemis students with mentors, keep track of email addresses and phone numbers, send out email when there's a reunion for a particular year being planned, handle email forwarding and answer questions from prospective students and their parents interested in Artemis?

The answer is to use a database to keep track of all these things and then hook it up to a web-based front-end so that Artemis instructors and students can gain access to the information while at the same time being careful to make the information secure and appropriately private. The web page would provide general information for parents and prospective students. Instructors and students could log into instructor and student accounts using personal passwords and a designated database administrator could be assigned to a special account that would enable her to manage the database, whether the administrator is still at Brown, spending a summer abroad, or off at graduate school. Students could access information about their year and could update their personal information. Instructors could access the information on mentors and mentees and update any information for their year. And the database administrator would have access to all of the information in the database and could change anything including adding new tables and programming.

You'd need a similar arrangement to set up an on-line service or commercial web site. And while you could buy a commercial database product, you can pick up several very good database free. I downloaded MySQL and installed it on my machine for the following demonstration. SQL (computer scientist go nuts defining acronyms) stands for Structured Query Language and is a general-purpose language for making queries to database programs. Many database programs come with their own languages but most of these languages include SQL or a major subset of SQL.

According to their site, "MySQL is the world's most popular Open Source Database, designed for speed, power and precision in mission critical, heavy load use." PostreSQL is another open-source database that you can get for free and that some of the technical staff and faculty in the department use. The PostgreSQL site boasts that, "PostgreSQL is a sophisticated Object-Relational DBMS, supporting almost all SQL constructs, including subselects, transactions, and user-defined types and functions. It is the most advanced open-source database available anywhere." I don't expect you to believe the hype but after playing with each of these for a bit I have to admit they're pretty nice databases free or not.

To complete the Artemis web-based database application, you'd need to set up a server and here again you can appeal to the open-source community. The system available from the Apache Project is a good choice; from their web site we're told that the "Apache Project is a collaborative software development effort aimed at creating a robust, commercial-grade, featureful, and freely-available source code implementation of an HTTP (Web) server. The project is jointly managed by a group of volunteers located around the world, using the Internet and the Web to communicate, plan, and develop the server and its related documentation." (Note the implied distributed development model for this open-source project and, if this sounds interesting to you, search the web to learn more about the "open source movement".)

You'll also need some glue to connect the web server to the database. Think of the web server and the database as programs that speak different languages. If you get serious about this kind of a project you might want to check out PHP. Among other things, PHP will allow a browser on a remote computer to access a MySQL database through an Apache web server. The PHP web page tells us that "PHP is a tool that lets you create dynamic web pages. PHP-enabled web pages are treated just like regular HTML pages and you can create and edit them the same way you normally create regular HTML pages."

PHP is one of several so-called server-side scripting languages used web servers to perform computations on the server at the request of web surfers requesting pages from the server. PHP stands for "PHP Hypertext Preprocessor" which is a a recursive acronym (see the July 5, 2002 entry for a brief discussion of recursive rules). One of the most famous recursive acronyms in computer science is GNU which stands for "GNU's Not Unix". There are other popular server-side scripting languages including ASP (for "Active Server Page" from Microsoft) and JSP (for "Java Server Pages" from Sun Microsystems).

Insert a suitably edited version of the email that I sent Erika, Jen, Leah and Susannah on June 8, 2002, replacing the names and email addresses of the Artemis instructors and students with made-up names and email addresses to protect the student's privacy. I thought about including a summary of the relational model but this is exactly the sort of more advanced material that is readily available on the web and interested readers should be encouraged to go out and learn on their own. I will include a link to a good source for this material however.

As per the above instruction, here's the excerpted and edited email inserted on July 30, 2002:

BEGIN INSERT

In the following exchange, MySQL is run as a server and this next incantation starts the server running as a separate process. A process in this context refers to a computation managed by the operating system; more generally, the word "process" refers to any on-going or evolving computation. In the so-called client-server model processes called "servers" provide ("serve up") services to processes called "clients" who request the services. Once the server process is up and running, anyone can start up a client process and connect to the server.

% /usr/local/mysql/bin/safe_mysqld &

The next command starts a very simple client process that allows a user (me in this case) to type SQL queries that are run on the server. In response, the server sends data back to the simple client which then displays the using the following simple "teletype" format.

% /usr/local/mysql/bin/mysql

The client just sits there waiting for input from the user and displaying the "mysql>" prompt. The text following this prompt, e.g., SHOW databases, is the SQL query that I typed followed by a semicolon which tells MySQL that I've finished the command and that it should try to interpret what I just typed. I've been very careful typing in the following but if I had mistyped anything, MySQL would issue a complaint and try to describe to me what I did wrong or what it things I did wrong. Usually the error messages are pretty hard to decipher unless you've been working with MySQL for some time.

mysql> SHOW databases ;
+----------+
| Database |
+----------+
| mysql    |
+----------+
1 row in set (0.00 sec)

I'm going to create a new database just to record information about Artemis.

mysql> CREATE DATABASE artemis ;

The next command confirms that MySQL did indeed create the database as I requested.

mysql> SHOW databases ;
+----------+
| Database |
+----------+
| mysql    |
| artemis  |
+----------+
2 rows in set (0.00 sec)

Now I tell MySQL that I want all of my subsequent commands to be executed with respect to the Artemis database.

mysql> USE DATABASE artemis ;

Finally, I'm ready to get on with setting up a database of students and instructors. Databases consist of tables, tables consist of records, and records consist of fields. I'm going to create separate tables for students and instructors. Within the instructor table, I'll create separate records for each instructor. For instructors, I'm going to have fields for their first, middle and last names, the year in which they served as an instructor, and their email address. I'm also going to include an identifier field which uniquely identifies each record which will come in handy when we want to refer a particular instructor in the midst of a query. The identifiers are just the integers 1 through however many instructors we have.

mysql> CREATE TABLE instructors (
    ->  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    ->  first VARCHAR(20),
    ->  middle VARCHAR(20),
    ->  last VARCHAR(20),
    ->  year VARCHAR(4),
    ->  email VARCHAR(50),
    ->  PRIMARY KEY (id) ) ;
Query OK, 0 rows affected (0.01 sec)

The DESCRIBE command allows us to look at the fields of a specified table. Ignore all the specialized type information; I did this just to make sure that I didn't mistype anything.

mysql> DESCRIBE instructors ;
+--------+----------------------+------+-----+---------+----------------+
| Field  | Type                 | Null | Key | Default | Extra          |
+--------+----------------------+------+-----+---------+----------------+
| id     | smallint(5) unsigned |      | PRI | NULL    | auto_increment |
| first  | varchar(20)          | YES  |     | NULL    |                |
| middle | varchar(20)          | YES  |     | NULL    |                |
| last   | varchar(20)          | YES  |     | NULL    |                |
| year   | varchar(4)           | YES  |     | NULL    |                |
| email  | varchar(50)          | YES  |     | NULL    |                |
+--------+----------------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

Next we want to INSERT a bunch of records into the instructors table. I use NULL for anything I don't know or anything that I want MySQL to fill in for me. In the following, I let MySQL provide the identifiers using its "auto increment" feature.

mysql> INSERT INTO instructors VALUES
    ->  (NULL, 'Lena', 'Katherine', 'Perl', '2002', 'lperl@cs.brown.edu'),
    ->  (NULL, 'Jennet', 'Elizabeth', 'Bloom', '2002', 'jbloom@cs.brown.edu'),
    ->  (NULL, 'Alicia', 'Susan', 'Roberts', '2002', 'aroberts@cs.brown.edu'),
    ->  (NULL, 'Elyssa', 'Alice', 'Fathom', '2002', 'efathom@cs.brown.edu'),
    ->  (NULL, 'Michelle', 'Mary', 'Angle', '2001', 'mangle@cs.brown.edu'),
    ->  (NULL, 'Tracy', 'Nancy', 'Hepburn', '2001', 'thepburn@cs.brown.edu'),
    ->  (NULL, 'Anita', 'Natasha', 'Virgil', '2001', 'avirgil@cs.brown.edu'),
    ->  (NULL, 'Mary', 'Sally', 'Gauss', '2000', 'mgauss@cs.brown.edu'),
    ->  (NULL, 'Kristine', 'Sara', 'Karma', '2000', 'kkarma@cs.brown.edu'),
    ->  (NULL, 'Gloria', 'Amy', 'Galois', '2000', 'ggalois@cs.brown.edu'),
    ->  (NULL, 'Siri', 'Lila', 'Raman', '2000', 'sraman@cs.brown.edu') ;
Query OK, 11 rows affected (0.00 sec)
Records: 11  Duplicates: 0  Warnings: 0

The following SELECT command requests to MySQL to create a new table by selecting all of the fields (that's the *) from the instructors table and then order the records alphabetically using the last name field.

mysql> SELECT * FROM instructors ORDER BY last ;
+----+----------+-----------+---------+------+-----------------------+
| id | first    | middle    | last    | year | email                 |
+----+----------+-----------+---------+------+-----------------------+
|  5 | Michelle | Mary      | Angle   | 2001 | mangle@cs.brown.edu   |
|  2 | Jennet   | Elizabeth | Bloom   | 2002 | jbloom@cs.brown.edu   |
|  4 | Elyssa   | Alice     | Fathom  | 2002 | efathom@cs.brown.edu  |
| 10 | Gloria   | Amy       | Galois  | 2000 | ggalois@cs.brown.edu  |
|  8 | Mary     | Sally     | Gauss   | 2000 | mgauss@cs.brown.edu   |
|  6 | Tracy    | Nancy     | Hepburn | 2001 | thepburn@cs.brown.edu |
|  9 | Kristine | Sara      | Karma   | 2000 | kkarma@cs.brown.edu   |
|  1 | Lena     | Katherine | Perl    | 2002 | lperl@cs.brown.edu    |
| 11 | Siri     | Lila      | Raman   | 2000 | sraman@cs.brown.edu   |
|  3 | Alicia   | Susan     | Roberts | 2002 | aroberts@cs.brown.edu |
|  7 | Anita    | Natasha   | Virgil  | 2001 | avirgil@cs.brown.edu  |
+----+----------+-----------+---------+------+-----------------------+
11 rows in set (0.00 sec)

Next we need a table to store information about the students. The records in the student table have fields for first and last name, the year the students attended Artemis, their mentor (usually an instructor), their email address, and, again, an identifier field.

mysql> CREATE TABLE students (
    ->  id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    ->  first VARCHAR(20),
    ->  last VARCHAR(20),
    ->  year VARCHAR(4),
    ->  mentor SMALLINT UNSIGNED NOT NULL REFERENCES instructors,
    ->  email VARCHAR(50),
    ->  PRIMARY KEY (id) ) ;
Query OK, 0 rows affected (0.01 sec)

Let's check it out just to make sure that I typed everything right.

mysql> DESCRIBE students ;
+--------+----------------------+------+-----+---------+----------------+
| Field  | Type                 | Null | Key | Default | Extra          |
+--------+----------------------+------+-----+---------+----------------+
| id     | smallint(5) unsigned |      | PRI | NULL    | auto_increment |
| first  | varchar(20)          | YES  |     | NULL    |                |
| last   | varchar(20)          | YES  |     | NULL    |                |
| year   | varchar(4)           | YES  |     | NULL    |                |
| mentor | smallint(5) unsigned |      |     | 0       |                |
| email  | varchar(50)          | YES  |     | NULL    |                |
+--------+----------------------+------+-----+---------+----------------+
6 rows in set (0.01 sec)

I'll type in some student records for examples leaving the identifier, mentor and email address fields blank. Generally we don't know the students mentor or email address when we first enter them into the data base. The identifier field will get filled in automatically.

mysql> INSERT INTO students VALUES
    ->  (NULL, 'Lisa', 'Dante', '2001', NULL, NULL),
    ->  (NULL, 'Hana', 'Laplace', '2001', NULL, NULL),
    ->  (NULL, 'Vicky', 'Markov', '2001', NULL, NULL),
    ->  (NULL, 'Hariet', 'Curry', '2001', NULL, NULL),
    ->  (NULL, 'Julia', 'Cicero', '2001', NULL, NULL),
    ->  (NULL, 'Alexis', 'Galileo', '2001', NULL, NULL),
    ->  (NULL, 'Mariane', 'Solaris', '2001', NULL, NULL) ;
Query OK, 7 rows affected (0.00 sec)
Records: 7  Duplicates: 0  Warnings: 0

In the next query, I introduce a temporary variable (called @tmp) and assign it to be the identifier for the instructor whose last name is "Galois". The SELECT command always returns a table but in this case it isn't very interesting.

mysql> SELECT @tmp := id FROM instructors WHERE last = 'Galois' ;
+------------+
| @tmp := id |
+------------+
|         10 |
+------------+
1 row in set (0.03 sec)

Having set the temporary variable, I then use it to UPDATE a field in the student table. The net result of the following command is that Gloria Galois is now identified as the mentor for Hana Laplace.

mysql> UPDATE students SET mentor = @tmp WHERE last = 'Laplace' ;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

The next two commands establish Mary Gauss as Lisa Dante's mentor.

mysql> SELECT @tmp := id FROM instructors WHERE last = 'Gauss' ;
+------------+
| @tmp := id |
+------------+
|          8 |
+------------+
1 row in set (0.00 sec)

mysql> UPDATE students SET mentor = @tmp WHERE last = 'Dante' ;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

The whole interchange was pretty verbose but all of these commands would be processed in the background if we were using MySQL as part of a web interface for the Artemis program. Note that when we SELECT all fields from the student table we now see that the mentors for Lisa and and Hana are filled in as we requested.

mysql> SELECT * FROM students ORDER BY last ;
+----+---------+---------+------+--------+-------+
| id | first   | last    | year | mentor | email |
+----+---------+---------+------+--------+-------+
|  5 | Julia   | Cicero  | 2001 |      0 | NULL  |
|  4 | Hariet  | Curry   | 2001 |      0 | NULL  |
|  1 | Lisa    | Dante   | 2001 |      8 | NULL  |
|  6 | Alexis  | Galileo | 2001 |      0 | NULL  |
|  2 | Hana    | Laplace | 2001 |     10 | NULL  |
|  3 | Vicky   | Markov  | 2001 |      0 | NULL  |
|  7 | Mariane | Solaris | 2001 |      0 | NULL  |
+----+---------+---------+------+--------+-------+
7 rows in set (0.00 sec)

Note that now the two tables, instructor and student, are linked together through the mentor field. We can articulate that linkage in terms of a new table using the following query which you can interpret as follows: Create a new table by SELECTing two fields from the student table, first and last, and two fields from instructor table, first and last, and do your selecting by combining the instructor and student tables but only include those records where the mentor field of the student record is equal to the identifier field of the instructor record.

mysql> SELECT students.first, students.last, instructors.first, instructors.last
    ->  FROM  students, instructors
    ->  WHERE students.mentor = instructors.id ;
+-------+---------+--------+--------+
| first | last    | first  | last   |
+-------+---------+--------+--------+
| Lisa  | Dante   | Mary   | Gauss  |
| Hana  | Laplace | Gloria | Galois |
+-------+---------+--------+--------+
2 rows in set (0.00 sec)

This query is particularly interesting because it's an example of how an existing set of tables can be manipulated to allow you to view your data in new and useful ways. The above table might be thought of as the "mentor assignment" table but it only exists as a result of generating this table. That's enough of MySQL syntax but I'll say a couple of more things about SQL queries.

mysql> EXIT ;
Bye

SELECT always generates a table as output and the FROM part expects one or more tables as input. When there are two or more tables specified in the FROM part, these tables are JOINed together. JOINs can be the source of performance problems in databases; if you join two tables of size N and M then you'll have to construct a JOINed table of size N * M. In the above query, since there were 11 instructor records and 7 student records, the JOIN of the student and instructor tables contained 77 combined records. And MySQL had to at least consider this many records even though the final output table contained only two records. Figuring out how to handle JOINs efficiently is an important part of designing database systems.

You can use SELECT recursively so that SQL queries of the form

SELECT ... FROM (SELECT ... FROM ...) WHERE

are just fine and indeed quite common. You can do fairly complicated programming in most database languages. For example, it's pretty easy to write short programs to answer the following queries: Find all students who don't have a mentor. Find all instructors who don't have at least one mentee. List all of the instructors in reverse order of the number of their mentees, i.e., list instructors with fewest mentees first. Create a table with students grouped together who share the same mentor.

Later on we'll talk about abstract models of computation that make it easier to understand what's going on in complex programs. What sort of a model would provide you with the insights needed to write efficient queries for a particular database engine with an SQL-based front end? The model would probably have to include the basic objects and data structures used in implementing databases, e.g., tables, records, fields, keys, functions, and then include the basic operations on these structures including SELECTs and JOINs. The model should support the analysis and simplification of queries to improve performance. If you're interested in learning more about abstract models for database systems, search the web for descriptions of the standard relational model.

The standard relational model (due to E. F. Codd in 1970) along with various normal forms is the stuff of most database courses and database textbooks, e.g., see [Ullman, 1988]. The relational model and the entity-relationship (E-R) model often appear together but represent very different sorts of models. The E-R model is the more concrete of the two and is specific to a particular application. We were developing an E-R model by creating the student and instructor tables and defining their respective records. Developing E-R models is a particular case of the more general representation problem which is central to the subfield of knowledge representation in AI. Developing E-R models is the design part of database systems; it's actually quite hard to do well and the experts are paid very well for their services. Proceeding without an E-R model is tantamount to writing code without a clear idea of what problem you're trying to solve or building a house without a set of blueprints.

END INSERT

Unlike the gentle introduction to programming that I had thought I'd be working on today, I gave you a 40,000 feet up-in-the-clouds discussion of models and a crash course in databases with web servers and some sort of magic glue thrown in for good measure. We'll talk more about browsers and web servers and the like later on but it might be a good idea to give you high level, 5,000 feet up, description of how browsers and web servers and databases work apart and together.

Your browser can send and receive files which you can think of as containing code in a variety of different programming languages, GIF and JPEG for displaying images, HTML and XHTML for encoding data and formatting and displaying text and images, Java and JavaScript for running local programs that you can interact with, and PHP (or the CGI (Common Gateway Interface) or ASP protocols which allow similar capabilities) for handling all sorts of dynamic content. Your browser communicates with web servers using HTTP (Hyper Text Transfer Protocol), an interlingua that allows browsers and servers to understand one another.

A web page containing JavaScript or PHP tags can cause a computation to occur either inside your browser using its Java virtual machine (JVM), a special piece of software that knows how to interpret Java code, or on some other machine at the behest of the server.

The web server is set up to "serve up" web pages in response to requests from browsers and other programs that can communicate using the HTTP protocol. Serving up a web page means that the server sends HTML, GIF, JPEG, etc. code that comprises a given web page to the browser requesting the page. In some cases, there really isn't a fixed page that's being requested but rather the page is constructed on the fly using various sources of information, including, for example, information that you've sent to the server by filling out forms on your browser. This means that your requesting a web page often causes a computation to occur on the server.

Databases are typically also set up as servers. A server is just a program that runs continually with some means for clients, other programs that have need of the services offered by the server, to make requests for services. When I used the MySQL database above, I actually just started a simple little client (another program) that sent messages back and forth to the MySQL database server. Since the database server was running on the same machine as the little client program they communicated in a direct fashion. However, I could have almost as easily set up a client on one machine to communicate with server running on another machine.

Most database servers allow multiple clients and these clients have to use special accounts and passwords to make sure that only authorized people are able to gain access to the data. In case of a large databases, i.e., programs that manages a large amount of data say for corporation, the database server often run on a dedicated machine simply because the database server needs all of the resources of that machine to provide the services that are asked of it. You'll often also hear of machines referred to as a server; typically the machine is just an ordinary machine such as the one you have on your desk or in your school but beefed up with more memory, larger disks or more than one processor.

The main difference between the sort of machine that sits on your desk at home or school and a server is the software that runs on the machine - it's the software that distinguishes it as a server. It's useful to abstract away from particular machines and distinguish the inert software from the dynamic running of the software. Indeed a server is best thought of in terms of the notion of process as used in computer science, which emphasizes not machines or software but the dynamics of computation itself. As machine is to your brain and software is to what you've been taught, so process is to your thinking, the continually evolving twists and turns of your thoughts as they tumble over one another into the future.

If these abstract flights of fancy don't help, think concretely in terms of some business, say a restaurant, that provides some basic service, food in this case, to its clients, customers, in response to requests, orders, for those services. The main difference between a restaurant and server on a computer is that the services offered are computational; database servers store and organize data and answer questions concerning that data, web servers serve up and generate web pages that can be displayed on your web browser.

The difference between client and server can shift. At times the web server behaves like a server, when it is serving up web pages to browser clients, and at other times, the web server behaves like a client say when it is requesting data from a database server. There are even arrangements in which all of the software components are referred to as "servents", sometimes behaving as servers and sometimes behaving as clients.

When you go to the web page of, say, an on-line bookseller like Amazon as a frequent customer, the web server will likely look up your name in a database and generate a special web page on which it will provide personalized information such as status information on your last purchase which was just shipped from their warehouse. When you ask about a book, your browser will send a request to the web server that will consult the database again to see if there are any copies available and the price. If you decide to purchase the book, the database will record your purchase along with the necessary credit card information. Other software will access the database to arrange for shipping the book and keeping track of it while it is in transit.

The orchestrated dance involving the various pieces of software that serve as the basis for on-line business is quite complicated but each component and the languages and protocols that it uses to communicate with other components can be broken down into tractable pieces. The overall design and management of these systems is a major engineering achievement. What is truly amazing however is that the components and protocols and languages are available to anyone with the interest and intellect to use them. Some the available open-source systems constitute engineering artifacts every bit as complicated as a modern automobile or power plant but you can download them to your computer in a couple of minutes.

Well, that's enough "motivation" for now. Tomorrow we'll take a stab at computer programming 101 and I'll see how far I can go without losing your interest. My goal won't be to teach you the material in a good introduction to programming but rather to keep pelting you with little insights into how computers and programming languages work. I'd like you to keep returning for additional insights but I'll be quite happy if at some point you grab a copy of SICP, download a version of Scheme, and get serious about learning to program.