Have you ever wondered what’s going on when you click your mouse on some underlined text in a browser window and suddenly the screen fills with text and graphics that clearly come from some other faraway place? It’s as if you’ve been transported to another location, as if a window has opened up on another world. If you’re on a fast cable or DSL (“digital subscriber line”, the first of many acronyms in this chapter) connection, the transformation is almost instantaneous; if you’re using a slow modem, then updating the screen can take several seconds or even minutes, but in any case the behind-the-scenes machinations making this transformation possible leave little evidence. Occasionally, however, you’ll catch fleeting glimpses of the machinery through little cracks in the user interface.
If you use a dial-up connection and modem to connect with your Internet service provider, you may hear an awful squawking as your modem and the service provider’s modem initiate two-way communication. Similar noisy exchanges can occur when one fax machine attempts to communicate with a second. In both cases, computer programs at each end of a telephone connection are validating, handshaking, synchronizing and otherwise handling the transmission of information. As smarter modems and fax machines replace older technology, these noisy accompaniments are being silenced, since a human need no longer overhear them to check that things are proceeding as desired.
In surfing the web, however, we still often see the underlying machinery of the Internet peeking through. Many browsers display a line of text along the top or bottom of the browser window that shows the address of the web page you’re currently looking at and in some cases lets you edit the line in order to jump to another address. If you were looking at my web page, for example, you’d see some variant of http://www.cs.brown.edu/people/tld/. HTTP is the acronym for a protocol that the browser uses to communicate with a program running on a distant computer to fetch the web page contents.
A communication protocol specifies, among other things, how to request that information be transmitted, how to respond when information is requested, how to package information for transmission, and how to signal that you’ve received the information you requested. HTTP is one of several protocols that are necessary for you to surf the web. Protocols are just specifications, and in order to be useful they have to be implemented as parts of programs. The programs that implement protocols such as HTTP are often characterized as clients or servers or some combination of the two. In the client-server model of computing, server programs provide (“serve up”) services to client programs that request them. The services can be web pages in the case of a web server or answers to queries in the case of a database server. Communication protocols must run on both the client and server to conduct the transfer of information among running programs.
The word “server” has several computer-related meanings that are relevant to understanding what happens when you click on a link in a web page. A program that provides particular services or implements a particular protocol is sometimes referred to as server software. Once the program is running, the on-going computation managed by the operating system can be called a server process. Such a process often involves many separate threads of computation for such operations as processing database transactions and checking for clients requesting services. Finally, the computer on which such a process is running is often called a server. Such a server is usually a relatively powerful computer, with redundant power supplies, lots of disk storage and multiple processors to make sure it won’t crash, run out of memory, or bog down when many clients request services at the same time. Companies conducting business on the web rely on their web and database servers being available 24 hours a day, seven days a week, 365 days a year.
In order to request services, a client program needs to know the address of the server (the machine on which the server process is running), and to provide services, the server needs to know the address of the client machine. Different protocols are assigned specific ports on servers so that the server process implementing a particular protocol knows where to look for client requests.1 The operating system assists by routing incoming requests to the right ports. If the program implementing a given service crashes or was never started, then a client requesting this service won’t get a response. Similarly, if some part of the network linking the client to the server isn’t working, then no requests or services are communicated.
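To make ports concrete, here’s a minimal sketch of a server process in Python (the language I’ll use for the illustrations in this chapter). The port number 9999 and the one-line “service” are made up for the example; the point is simply that the server claims a port and waits there for client requests:

import socket

# Create a TCP socket and bind it to port 9999 on this machine.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", 9999))
server.listen()                           # wait for client requests on this port
while True:
    client, address = server.accept()     # block until some client connects
    client.sendall(b"hello, client\r\n")  # serve up our one-line "service"
    client.close()

Stop this program and any client that later connects to port 9999 simply gets no response — exactly the failure described above.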
A computer connected to a network is said to host web pages if that computer runs a server process implementing the HTTP protocol that responds to requests for these web pages by transferring the relevant content and dealing with related chores such as handling forms and processing the data that such forms produce. When you click on a link in your web browser window, your browser (an HTTP client) figures out the address of the machine hosting the web page associated with the link and sends a request to the HTTP port on the host machine. The server looks for the page and if possible sends the relevant data (HTTP stands for “hypertext transfer protocol”) back to the client, which formats the data and displays it on your monitor.
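Here’s roughly what your browser does when you click, sketched in Python with raw sockets (your browser’s actual implementation is far more elaborate). The request asks the machine hosting my web page for its contents over port 80, the port assigned to HTTP:

import socket

# Connect to the HTTP port (80) on the host machine and request a page.
sock = socket.create_connection(("www.cs.brown.edu", 80))
sock.sendall(b"GET /people/tld/ HTTP/1.1\r\n"
             b"Host: www.cs.brown.edu\r\n"
             b"Connection: close\r\n\r\n")
reply = b""
while True:
    chunk = sock.recv(4096)   # read the server's reply in chunks
    if not chunk:
        break                 # the server closed the connection
    reply += chunk
sock.close()
print(reply.decode("latin-1")[:300])  # some headers, then hypertext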
Millions of computers are connected to the Internet and many of these computers provide services by implementing various protocols. In addition to HTTP, FTP (“file transfer protocol”) servers enable users to transfer files from remote machines, SMTP (“simple mail transfer protocol”) makes it possible to send email messages from one computer to another, and TELNET2 lets users log on to remote computers and lets system administrators administer machines all over the world without leaving their offices. I won’t dwell on the negative aspects of such services, but the ports that traffic in the various protocols also offer targets for malicious hackers who try to find ways of redirecting these protocols to access protected data or coopt the servers for their own purposes.
Nowadays most users consider protocols like FTP and TELNET hopelessly low-level. Modern client applications implementing these protocols hide their details behind fancy user interfaces. Most programmers, however, know how to use simple programs that implement protocols like FTP or TELNET just in case the fancy clients or the display routines and graphical interfaces they rely on don’t work. In the following exchange, I type ftp 192.168.1.101 to invoke a text-based FTP client in a shell and connect to an FTP server running on a remote machine identified by network address 192.168.1.101. The FTP client tells me that I successfully logged on to the specified machine, and the next few commands I type at the ftp> prompt are interpreted by the client and then relayed to the server running on the remote machine. I use this indirect means to look around in the file system on the remote machine, using the cd (change directory) and ls (list directory) commands that my FTP client relays to the remote machine for processing by the server. The numbered statements, for instance, 226 Transfer complete, are diagnostic messages printed by the client that give me some idea what’s happening on the server side. Having found the file I was looking for, I download it with the get command and type exit to end the session.
% ftp 192.168.1.101
Connected to 192.168.1.101.
Name: tld
331 Password required for tld.
Password:
230 User tld logged in.
ftp> cd email/code/
250 CWD command successful.
ftp> pwd
257 "/u/tld/email/code/" is the current directory.
ftp> ls
150 Opening ASCII mode data connection for '/bin/ls'.
total 1
-rw-r--r--  1 tld  staff  11151 Nov 18 12:17 soup.txt
-rw-r--r--  1 tld  staff  37708 Nov 18 11:47 nuts.txt
226 Transfer complete.
ftp> get soup.txt
150 Opening BINARY mode data connection for 'soup.txt'.
226 Transfer complete.
11151 bytes received in 00:00 (26.35 KB/s)
ftp> exit
The FTP protocol is implemented in both the FTP client and server programs, thus providing the software equivalent of the winking, nodding and handshaking necessary to ensure that information is reliably passed back and forth between client and server and that both programs are aware of what has happened — for example, the protocol requires each program to acknowledge the receipt of any information passed to it by the other.
As low-level as HTTP, SMTP, FTP and TELNET may seem, they are just the tip of the iceberg; these protocols reside in one layer of a complex set of layered communication services. You probably don’t want to know all the details, but it’s useful (or at least interesting) to have some idea of what’s going on in the guts of the Internet. To that end, let’s delve a little deeper.
We’ll begin our descent into the lower realms of communications software purgatory with the layer immediately below HTTP, SMTP, FTP and TELNET. Protocols such as HTTP exist in what’s called the application layer, though you might consider the primitive clients implementing these protocols too low-level to be considered applications. Just remember that HTTP and FTP implementations are part of your web browser and that your email application relies on an implementation of SMTP to send messages (and on implementations of other application-layer protocols such as POP for “post office protocol” and IMAP for “Internet message access protocol”, to fetch messages).
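To see how thin a wrapper an application-layer client can be, here’s a sketch using Python’s standard smtplib module, which implements the client side of SMTP. The mail server’s name and both addresses are made up for the example:

import smtplib
from email.message import EmailMessage

# Compose a message and hand it to an SMTP server for delivery.
msg = EmailMessage()
msg["From"] = "tld@example.com"
msg["To"] = "friend@example.org"
msg["Subject"] = "hello"
msg.set_content("Sent via the simple mail transfer protocol.")

with smtplib.SMTP("mail.example.com") as server:  # made-up mail server
    server.send_message(msg)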
Immediately below the application layer is the transport layer. Protocols in the application layer rely on sending messages from one machine to another. These protocols don’t want to worry about how their messages are packaged or how they’re transported. They simply want their messages to get to their intended destinations, in one piece and without error — just do it and spare me the details. You might think that the application-layer protocols would also require some guarantees of security — for instance, don’t let anyone read my love letters. But these services are typically handled at the application layer by using encryption methods to render messages unreadable should they find their way into other hands. In presenting itself to a protocol in the transport layer, an application-layer protocol provides a destination address and a stream of bits and expects the bits to be delivered quickly and in their original order, or if necessary to be told that this can’t be done.
The most commonly used protocols at the transport layer are TCP (“transmission control protocol”) and UDP (“user datagram protocol”). UDP is often used to exchange data over a network where high speed is important and occasional transmission failures are acceptable — for example, in a teleconferencing application where you’re willing to lose a few frames of a video signal. TCP, the workhorse of the Internet, is the transport-layer protocol responsible for transmitting the data used by HTTP clients and servers to support web surfing.
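UDP’s send-and-forget style takes only a couple of lines in Python. The address and port below are made up, and if the datagram is lost in transit nobody is told — that’s the price paid for speed:

import socket

# UDP: fire off a datagram with no connection, no acknowledgment,
# and no retransmission if it's lost along the way.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"video frame 42", ("192.168.1.101", 5000))
sock.close()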
Given the address of a destination machine and a message, TCP begins by repackaging the message into a sequence of fixed-size packages. If a message is smaller than the fixed-size packages, that’s fine: it’ll rattle around in its package, but it still ends up in a one-size-fits-all container. This repackaging enables the next-lower-layer protocols to avoid worrying about large messages. Just as parcel delivery services limit the size of the packages they’ll ship to make sure they’ll fit on their trucks, so the protocol that TCP relies on places limits on package size.
Digital data is great because it can always be broken up into smaller pieces that can be shipped individually and then reassembled at the other end. In fact, that’s pretty much TCP’s job: TCP repackages arbitrary-sized messages into one or more fixed-size packages, labels the individual packages, sends them separately to the destination address, if necessary resends packages that have taken too long and are presumed lost in transit, and then, when all the packages have arrived, uses the labels to assemble them in the proper order and delivers the reconstituted message to the intended recipient.
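The real TCP lives deep inside your operating system, but its basic bookkeeping can be sketched in a few lines of Python. The 512-byte package size is arbitrary, and the shuffle stands in for packages arriving out of order:

import random

PACKAGE_SIZE = 512  # an arbitrary fixed size for this sketch

def package(message):
    # Split the message into fixed-size packages, each labeled
    # with a sequence number.
    return [(seq, message[i:i + PACKAGE_SIZE])
            for seq, i in enumerate(range(0, len(message), PACKAGE_SIZE))]

def reassemble(packages):
    # Use the labels to put the packages back in order, then
    # reconstitute the original message.
    return b"".join(data for seq, data in sorted(packages))

message = b"x" * 2000               # an arbitrary-sized message
packages = package(message)
random.shuffle(packages)            # packages needn't arrive in order
assert reassemble(packages) == message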
To get its work done, TCP needs to be able to send relatively small fixed-size packages from one machine to another. This is the job of the network layer and TCP relies on IP (“internet protocol”) at this layer to handle such transmissions. The problem faced by IP is that there isn’t a wire connecting every possible pair of computers. Instead, there are a lot of computers out there, only some of which happen to be connected to one another. For its part, IP doesn’t want to worry about the individual connections — they could be implemented using smoke signals for all it cares. IP relies on the next lower layer of protocols to worry about transmissions between connected pairs of computers.
IP views the network as a system of roads linking towns: the towns are computers and the roads are data connections along which packages can be sent. IP’s job is to find a sequence of roads that starts at the town corresponding to the machine originating the request and ends at the town corresponding to the destination machine. Each package has a label that specifies its final destination. Since at any instant some roads are busier than others, IP uses frequently updated routing tables to select a route that’s likely to be congestion free. IP doesn’t plan out a route for each package in advance; rather, it relies on the other towns — other computers also running IP — to accept packages sent to them and then, using the information in the routing tables, to forward the packages to the next town.
Except for computing the routing tables (which is actually done by a separate protocol), IP operates in a completely decentralized fashion: individual computers running IP accept packages of data and forward them to the next computer along the route. Things can get complicated when, for example, individual computers get bogged down with lots of packages to forward or the networks connecting pairs of computers become less reliable. If a package doesn’t arrive at its intended destination within a specified time limit, TCP gives it up for lost and tells IP to resend it.
TCP relies entirely on IP to deliver packages to network addresses. It’s for this reason that the two protocols are typically referred to by the collective name TCP/IP. IP, however, has to depend on a variety of protocols at the next lower layer, the data-link layer. The data-link layer has more acronyms than you can shake a stick at and IP has to deal with all of them. From Ethernet-based LANs (“local area networks” — most computers come equipped to communicate on a LAN using Ethernet technology and its related family of protocols) to ISDN (“integrated services digital network”) networks realized by digital phone connections and ATM (“asynchronous transfer mode”) networks based on a different switching technology, IP has to figure out the right data-link-layer protocol to invoke to forward a package to the next computer in its multi-stage journey.
Below the data-link layer is the physical layer, consisting of fiber-optic cables, twisted pairs of copper wires, coaxial cables and radio frequencies, but I suspect you’re layered out. If all these acronyms, layers and protocols are confusing, don’t worry, it’s not just you. Modern computer networks are mind-bogglingly complex and my brief overview barely skims the surface. The remarkable thing is that programmers generally concern themselves with only one or two levels and don’t worry about the rest: the abstractions at one level shelter them from the complexity at lower levels. And these abstractions do more than simply hide detail; they also describe services and qualities of service at one level that are not available at lower levels. For example, TCP supports reliable communication by using an unreliable protocol, namely IP. It does so by using IP as a subroutine in a more complicated protocol, invoking IP services, monitoring IP performance, and terminating or re-invoking IP services as needed to ensure reliability.
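Here’s a toy version of that trick in Python. The made-up unreliable_send stands in for IP and loses packages at random; reliable_send, wrapped around it, does what TCP does — keep resending until an acknowledgment comes back:

import random

def deliver(package):
    # The receiving end acknowledges the package's sequence number.
    return ("ack", package[0])

def unreliable_send(package):
    # Stand-in for IP: delivers the package only some of the time.
    if random.random() < 0.7:
        return deliver(package)
    return None                     # lost in transit: no acknowledgment

def reliable_send(package):
    # TCP's trick: keep resending until an acknowledgment comes back.
    while unreliable_send(package) != ("ack", package[0]):
        pass                        # presumed lost; try again

for pkg in [(0, b"dear"), (1, b"reader")]:
    reliable_send(pkg)
print("all packages delivered and acknowledged")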
Whether you’re a professional, occasional or recreational programmer, you’re not likely to invoke TCP, UDP or any other communication protocol directly. Programmers generally exploit libraries that package protocols for different purposes. Whether you’re implementing a distributed game in Java, remotely controlling a robot over a wireless network using C, or building a web server in Scheme, you’ll very likely find a library in your favorite programming language that suits your purposes.
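In Python, for instance, the standard library hides the whole stack — HTTP riding on TCP riding on IP — behind a single call:

from urllib.request import urlopen

# One call rides on HTTP, which rides on TCP, which rides on IP.
page = urlopen("http://www.cs.brown.edu/people/tld/").read()
print(len(page), "bytes of hypertext fetched")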
If you’d like to learn more about the history of computer networking and the Internet in particular, you might enjoy M. Mitchell Waldrop’s The Dream Machine: J. C. R. Licklider and the Revolution That Made Computing Personal and Steven Segaller’s Nerds 2.0.1: A Brief History of the Internet. To learn more about the technical details behind networking software, I recommend Andrew Tanenbaum’s Computer Networks.
As long as we’re talking about the protocols and programs that power the World Wide Web, I might as well point out some other highlights of the acronym-laden vocabulary for web pages and the services they offer. As I mentioned earlier, HTTP stands for “hypertext transfer protocol” and the most common hypertext languages in use today are HTML (“hypertext markup language”) and XHTML (“extensible hypertext markup language”). Hypertext languages are used for formatting and displaying text and images, with GIF (“graphics interchange format”) and JPEG (“Joint Photographic Experts Group”) being the most common formats for encoding image data.
Hypertext languages also allow the embedding of hyperlinks, without which the World Wide Web wouldn’t be a web. Hyperlinks are essentially pointers to web pages; each pointer, called a URL (“uniform resource locator”), tells your browser how to request a web page from the web server hosting that web page and presumably implementing the HTTP protocol. We’ve already talked about that aspect of surfing the web, but there are lots of other goodies embedded in web pages worth knowing about.
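You can see the pieces of a URL by pulling one apart — here with Python’s standard urllib.parse module, using my web page’s address from earlier in the chapter:

from urllib.parse import urlparse

url = urlparse("http://www.cs.brown.edu/people/tld/")
print(url.scheme)   # 'http' -- which protocol to speak
print(url.netloc)   # 'www.cs.brown.edu' -- which host to contact
print(url.path)     # '/people/tld/' -- which page to request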
You may have heard that web pages can have dynamic content, and even if you haven’t, you’ve probably experienced them: web pages with pop-up advertisements, animations, images that change as you move your cursor over them, menus that seem to anticipate your choices, and text windows that beckon you to type something. This sort of dynamic content is powered in a variety of ways that illustrate new programming models.
One of the oldest but still quite common ways to create dynamic content uses various server-side scripting languages. It used to be that if you could program in Perl you could get a job for the summer writing CGI (“common gateway interface”) scripts. Here a “script” is a program written in a scripting language, and since a CGI script can be written in almost any language the server can invoke, many of these scripts were just Perl programs. CGI scripts are run on a web server (hence the designation as a “server-side” scripting language) when someone requests a page containing such a script. The script can create custom content and thus generate brand new, never-before-seen hypertext that’s sent back to the browser requesting the page. Many early online businesses displayed their catalogs and filled customer orders using CGI scripts.
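A complete CGI script can be very small. Here’s one in Python rather than Perl (any language works, which was the point of the gateway): the web server runs the program and relays whatever it prints back to the requesting browser:

#!/usr/bin/env python3
# A minimal CGI script: print a header, a blank line, then hypertext.
import datetime

print("Content-Type: text/html")   # tells the browser what's coming
print()                            # blank line ends the headers
print("<html><body>")
print("<p>Never-before-seen hypertext, generated at",
      datetime.datetime.now(), "</p>")
print("</body></html>")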
Today there are several alternatives for server-side computations. ASP (“active server page”) is part of a much larger collection of languages and tools provided by Microsoft and known as .NET (pronounced “dot net”). JSP (“java server pages”) from Sun Microsystems provides similar services. PHP (“PHP hypertext preprocessor”3) is another popular scripting language used for delivering dynamic content. All these server-side languages can create new HTML or XHTML content. They’re often used to generate SQL (“structured query language”) queries and submit them to database servers to add new information or access existing data. Nowadays when you click on a link in your browser, all sorts of computations can take place on the web server before it spits a page of hypertext back onto your screen. And usually all this takes place in the blink of an eye.
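Here’s a sketch of that pattern using Python’s built-in sqlite3 module as a stand-in for a database server; the catalog table and its contents are invented for the example:

import sqlite3

# A stand-in for a database server: build a tiny catalog, run an SQL
# query against it, and wrap the results in hypertext.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE catalog (item TEXT, price REAL)")
db.execute("INSERT INTO catalog VALUES ('soup', 1.25), ('nuts', 3.50)")

rows = db.execute("SELECT item, price FROM catalog WHERE price < ?", (2.0,))
print("<ul>")
for item, price in rows:
    print(f"<li>{item}: ${price:.2f}</li>")   # one list entry per row
print("</ul>")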
Sometimes, however, it makes sense for some computations to occur not on the web server, which could get bogged down running scripts and dynamically producing content, but on the client side. When you’re just sitting there browsing the web, your computer usually doesn’t have a lot to do. So, Internet content providers figure, why not put those spare computing cycles to good use to power all those cute little animations that make your screen jump and sparkle like a cinema marquee? Java and JavaScript are the languages of choice for embedding in web pages and running on your computer. For client-side computations, when you request a web page containing Java or JavaScript, the code or, in the case of Java, a platform-independent compiled version of the code called bytecodes is downloaded along with the rest of the web page. Many browsers can directly interpret JavaScript programs, while Java requires a special program called the Java Virtual Machine (JVM). Small Java programs called applets are often used to add effects and interactive components to dynamic web pages.
So when you click on a link in your browser you trigger a complicated sequence of computations. Your browser implements an HTTP client that uses TCP to send messages to a web server. The bits making up these messages are shuttled across the Internet in small packages using IP. Server-side scripting languages make computations happen on the web server and these computations can initiate exchanges with database servers or other remotely running programs. The web server uses TCP/IP to reply to your browser by sending back content as hypertext, graphics and embedded Java and JavaScript destined to run on your computer and provide a dynamic, interactive experience. All that just to find out if it’s going to rain tomorrow.
I can’t leave the subject of the web without a nod to the many people responsible for establishing, refining, adopting and then sticking to the standards without which no computer could talk to any other computer and the web would be impossible. Standards are everywhere in the world of programming. Standards organizations like ANSI (“American National Standards Institute”), IEEE (“Institute of Electrical and Electronics Engineers”) and ISO (“International Organization for Standardization”) have established standards for programming languages (e.g., ANSI C), communication protocols (e.g., IEEE 802), date and time notation (e.g., ISO 8601), and even character sets (e.g., ISO 10646).
I’ll bet you didn’t even realize that character sets need standards. Character codes are definitely the nuts and bolts of the computer world. And, just as engineers need standards for the pitch of threads so the nuts manufactured by one company will screw onto the bolts of another, so software companies need standards for how to encode the characters that appear on your keyboard and are displayed on your computer screen. For a long time, programmers got by with the 7-bit ASCII character code (ASCII stands for “American standard code for information interchange”), published in 1968 as ANSI X3.4. But a good portion of the world doesn’t want to be limited to what’s available on most keyboards, and so an 8-bit standard called ISO 8859 was developed in the 1980s. ISO 8859 defines several multilingual character sets, including Cyrillic, Arabic, Greek and Hebrew, with more on the way (talk about alphabet soup!). A more ambitious standard called Unicode (ISO 10646) promises to provide a unique code for every character no matter what language you’re using.
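Python makes it easy to peek at these codes; the Greek letter below is just one example of a character beyond ASCII’s 7-bit reach:

# Character codes at work: ASCII covers 'A', while Unicode assigns a
# code point to every character, whatever the alphabet.
print(ord("A"))             # 65 -- the 7-bit ASCII code for 'A'
print(ord("Ω"))             # 937 -- a Greek capital omega, beyond ASCII
print("Ω".encode("utf-8"))  # b'\xce\xa9' -- the same character as bytes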
When it makes sense to standardize so computers can talk with one another and programmers can understand one another’s code, then the standards organizations get the interested parties to sit down at the table and hash out a standard. There are times when economic interests and just plain stubbornness stick us with ungainly compromises. But more often than not, reason and good design win out over other interests and we get something we can live with. The fact of the matter is that we couldn’t have a World Wide Web were it not for standards.
Standards are making possible a host of new technologies that promise to free us from wires and cables while at the same time connecting us to everything imaginable. IEEE 802.11 is a standard for wireless LANs that lets people move from home to office to classroom to conference room, and even to suitably equipped cafes, restaurants and airport waiting lounges, without missing an email message. Bluetooth, another wireless technology, combines several standards and protocols and is used for short-range (a few meters) wireless communication. Bluetooth will allow digital cameras to exchange data with laptops, audio headsets and microphones to communicate with cell phones, and cell phones to talk with hand-held organizers, all without wires. A number of companies are scrambling to figure out which applications are likely to become commercially viable. Imagine that a carton of milk in your refrigerator senses it’s nearly empty or will soon pass its expiration date and then tells your hand-held organizer to add milk to your grocery list. Such applications are now technically possible with the new standards; it remains to be seen whether people will want or be willing to pay for them.
While these standards may make our lives a little simpler and less cluttered, the consequences for computing are even more striking. When I think about computation, I think of dynamic processes, evolving threads of control that sense and adapt to their environment. With the advent of networked computers, computations were no longer confined to individual computers and the network became the computer. With the advent of wireless LANs and Bluetooth-enabled devices, computations will literally take flight, freed from the constraints of wires. Scientists imagine tiny solar-powered sensors equipped with even smaller computers and wireless transceivers that could be sprinkled from aircraft to gather weather data, aid in searching for lost hikers, or gather military intelligence. Once you enable the most mundane of objects to perform computations and share information, the possibilities are extraordinary.
1 A port was originally a piece of circuitry used to transfer data into and out of a computer, but modern systems blur the boundaries between hardware and software and nowadays a port is simply a source of digital data that the operating system can be made to attend to using interrupts (interrupts are described in Chapter 10). The operating system can choose to ignore such data or can send it along to other programs waiting to process it. Ports have numbers — HTTP uses port number 80 — that give the operating system a hint about what program might be appropriate for processing the data.
2 TELNET originally stood for “teletypewriter network” but now refers to the “network virtual terminal” protocol — which accounts for all the letters if not their order.
3 PHP is called a recursive acronym. One of the most famous recursive acronyms in computing is GNU, which stands for “GNU’s not Unix” and is most closely associated with the Free Software Foundation and its collection of extraordinarily useful pieces of high-quality software.