
Lecture 13: Intro to Operating Systems

» Lecture Video (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 11:59pm Tuesday, March 15).

Disk I/O

Input and output (I/O) on a computer must generally happen through the operating system, so that it can mediate and ensure that only one process at a time uses the physical resources affected by the I/O (e.g., a hard disk, or your WiFi). This avoids chaos and helps with fair sharing of the computer's hardware. (There are some exceptions to this rule, notably memory-mapped I/O and recent fast datacenter networking, but most classic I/O goes through the operating system.)

File Descriptors

When a user-space process makes I/O system calls like read() or write(), it needs to tell the kernel what file it wants to do I/O on. This requires the kernel and the user-space process to have a shared way of referring to a file. On UNIX-like operating systems (such as macOS and Linux), this is done using file descriptors.

File descriptors are identifiers that the kernel uses to keep track of open resources (such as files) used by user-space processes. User-space processes refer to these resources using integer file descriptor (FD) numbers; in the kernel, the FD numbers index into an FD table maintained for each process, which may contain extra information like the filename, the offset into the file for the next I/O operation, or the amount of data read/written. For example, a user-space process may use the number 3 to refer to a file descriptor that the kernel knows corresponds to /home/malte/cats.txt.

To get a file descriptor number allocated, a process calls the open() syscall. open() causes the OS kernel to do permission checks, and if they pass, to allocate an FD number from the set of unused numbers for this process. The kernel sets up its metadata, and then returns the FD number to user-space. The FD number for the first file you open is usually 3, the next one 4, etc.

Why is the first file descriptor number usually 3?

On UNIX-like operating systems such as macOS and Linux, there are some standard file descriptor numbers. FD 0 normally refers to stdin (input from the terminal), 1 refers to stdout (output to the terminal), and 2 refers to stderr (output to the terminal, for errors). You can close these standard FDs; if you then open other files, they will reuse FD numbers 0 through 2, but your program will no longer be able to interact with the terminal.

Now that user-space has the FD number, it uses this number as a handle to pass into read() and write(). The full API for the read system call is: ssize_t read(int fd, void* buf, size_t count). The first argument indicates the FD to work with, the second is a pointer to the buffer (memory region) into which the kernel should put the data read, and the third is the number of bytes to read. read() returns the number of bytes actually read (0 if there are no more bytes in the file, or -1 if there was an error). write() has an analogous API, except that the kernel reads from the buffer pointed to and copies the data out.

One important aspect that is not part of the API of read() or write() is the current I/O offset into the file (sometimes referred to as the "read-write head" in man pages). In other words, when a user-space process calls read(), it fetches data from whatever offset the kernel currently knows for this FD. If the offset is 24, and read() wants to read 10 bytes, the kernel copies bytes 24-33 into the user-space buffer provided as an argument to the system call, and then sets the kernel offset for the FD to 34.

A user-space process can influence the kernel's offset via the lseek() system call, but is generally expected to remember on its own where in the file the kernel's offset currently is. In Project 3, you'll have to maintain such metadata for your caching in user-space memory. In particular, when reading data into the cache or writing cached data out to a file, you'll need to be mindful of the current offset that the I/O will happen at.

Operating Systems

Why are we covering this?

Operating systems are an important part of our computing landscape, and how they work impacts how other computer systems – including the applications you write, and the distributed systems that technology companies run – work. In this unit, we will try to understand the fundamental abstractions behind today's operating systems, and conclude from them how you can write efficient and safe software. We won't go into the deepest possible detail on exactly how an operating system implements these abstractions – if you're curious, consider taking CSCI 1670!

We are now entering the second block of the course. So far, we talked about our computers as if they only run a single program: we treated memory as if there is only one instance of each segment (e.g., stack, heap, etc.), and we treated the processor as though it only ever runs machine code instructions from one program. This was indeed true of early computers, but today's computers evidently run several programs – even (seemingly) at the same time!

In this block of the course, we will understand the concepts that make this safe sharing of a computer's hardware possible. Many of these concepts are implemented inside a special software program that every computer runs: the operating system.

Examples of operating systems in wide use today include Microsoft's Windows, Apple's iOS and Mac OS X, and the free, open-source operating systems Linux and BSD (of which several variants exist).

Why do we even need operating systems? For an analogy, consider why society has police and the courts: not every citizen always abides by the rules of society (i.e., the law), but in order to make life safe for others, we have people and structures that enforce the rules. The same applies with computer programs: not all computer programs are always inclined to play by the rules. Some programs are outright malicious (e.g., viruses or ransomware), others can become co-opted by malicious hackers (e.g., due to bugs such as unchecked buffer overflows), and others again are benign but resource-hungry and can crowd out other programs that run on the same computer.

Here is an example of a program that, depending on your viewpoint, could be seen as benign or as actively malicious (attack.cc):

int main() {
    while (true) {
    }
}
This program runs an infinite loop! When we disassemble its machine code, we find the following instructions in main():
00000000000005fa <main>:
 5fa:	55                   	push   %rbp
 5fb:	48 89 e5             	mov    %rsp,%rbp
 5fe:	eb fe                	jmp    5fe 
The first two instructions run only once, but the final jmp jumps to byte offset (address) 5fe within the program, which turns out to be ... that same instruction!

Based on our understanding of how processors work, as presented in the course so far, running this program should be potentially fatal to our computer: the processor only does what it's told, so it will infinitely keep executing the jmp instruction at 5fe. The processor gets stuck in an infinite loop of an instruction jumping back to itself, and no other program ever gets to use the processor. (If your computer has multiple processors, it could still make progress even when such an infinite loop is running, but an attacker might just run multiple infinite loop programs until all processors are busy.)

The role of the operating system is to make sure that all programs play by the rules, and also to make it easier for programs to use shared resources such as the computer's hardware.

Kernel and Processes

An operating system (OS) in practice consists of many components (including a whole bunch of preinstalled applications, desktop backgrounds, aesthetic and GUI elements), but for the purpose of this course, we care particularly about the most privileged core of the OS, the kernel.

The kernel is the operating system software that runs with full machine privilege, meaning full privilege over all resources on the computer. The kernel is all-powerful: it can access anything and do anything it wants without having to pass any checks on its actions.

Unprivileged processes (also called "user-level processes", where "user-level" is the opposite of "kernel"), by contrast, are software that runs without elevated machine privilege. A process is a program in execution. But processes can have bugs: they may access invalid memory, divide by zero, start running infinite loops, or run haywire in other ways (maliciously or accidentally). The kernel should prevent mistakes of an individual process from bringing down the whole system.

What's the difference between a "program" and a "process"?

A process is a program in active execution (e.g., a running web browser showing the course website), while the notion of program refers to the concept of code to achieve a specific purpose (e.g., a web browser, which generically serves the purpose of displaying webpages). In this course, we use "program" to mean the "dead" notion of compiled machine code on storage, while we use "process" to refer to an active, executing program managed by the OS.

In modern operating systems, much kernel code aims to provide protection for processes from other processes: protection ensures that no process can violate the operating system's sharing policies.

A kernel has three general goals:

  1. Ensure robustness and performance by isolating programs from each other.
  2. Share the computer's hardware resources fairly.
  3. Provide safe and convenient access to hardware resources by inventing abstractions for those resources.
In lectures, we will see several examples of resources that the OS needs to protect. In your lab and project for this unit of the course, you will add memory protection to a small, yet fully functional, OS called WeensyOS.

DemoOS and WeensyOS

WeensyOS is our teaching operating system. It includes a kernel and a userspace portion, and could, in principle, run on my or your computer, provided they use the x86-64 instruction set architecture. In lectures and assignments, we will instead run WeensyOS on an emulated computer, using a program called QEMU. QEMU faithfully fakes out an x86-64 processor, memory, and hardware devices (such as a screen, keyboard, etc.). If you run it inside your course VM, you effectively have an emulated computer (QEMU) running on another emulated computer (your course VM), which in turn runs on a real computer. You can think of this as some kind of Matryoshka doll of fake computers...

Why do you say "in principle"? Can I actually run WeensyOS on my computer?

The qualification in that sentence is there because WeensyOS would need to be extended with a fair amount of device driver code to run on a physical computer. Real computers use a huge variety of different hardware chips, and these chips need specific "driver" software for the OS to be able to use them. Commercial operating systems like Windows, Mac OS X, or even open-source Linux come with thousands of drivers for all sorts of weird chips and devices, but WeensyOS only has drivers for the devices QEMU emulates. Hence, it wouldn't actually manage to start up and show information on your computer's screen!

Another reason why you wouldn't want to run WeensyOS on your computer is that it would be rather painful to debug if you had made a mistake: you would need to restart your computer every time you hit a bug in the kernel. With QEMU, you can simply restart the emulator.

For the purpose of the course, we will run QEMU with a single emulated processor, and WeensyOS will use up to 2 MB of RAM. That is not a lot, but restricting the operating system to a small amount of memory makes it possible for you to keep track of exactly what's happening.

Note that operating system kernels are just programs, and writing a kernel is broadly similar to normal programming. For example, WeensyOS is written in C++, and much of the kernel code will read just like a normal C++ program. There are some differences, however:

This makes kernel programming more difficult, but also immensely fun: once it works, you'll get the cool feeling of having built part of your own operating system!

Keeping Bad Programs In Check

Let's consider a situation where two programs are running on a computer. We'll call the programs "Alice" and "Eve", and they're implemented as p-alice.cc and p-eve.cc in the lecture code.

When I run DemoOS (make run), Alice and Eve are alternately printing lines to the screen, saying "Hi, I'm Alice!" followed by a number, and likewise for Eve. A nice example of successful sharing of a computer!

Let's look at what p-alice.cc actually does. This is a user-space program, and its code is below:

// p-alice.cc
#include "u-lib.hh"

void process_main() {
    char buf[128];
    sys_getsysname(buf);

    console_printf(0x1D00, "This is %s.\n", buf);

    char msg[15];
    snprintf(msg, 15, "Hi, I'm Alice!");

    unsigned i = 0;
    while (true) {
        ++i;
        if (i % 1024 == 0) {
            console_printf(0x1D00, "%s #%d\n", msg, i / 512);
        }
        sys_yield();
    }
}
On WeensyOS, a userspace process starts execution in process_main(). p-alice.cc's implementation contains four lines that interact with the kernel and/or hardware: the sys_getsysname() and sys_yield() system calls, and the two console_printf() calls. Overall, what this program does is the following:
  1. It obtains the OS identifier ("DemoOS 1.31") via the sys_getsysname system call and stores it in buf.
  2. It then fills another buffer (msg) with the message "Hi, I'm Alice!".
  3. It finally starts an infinite loop, printing the message on every 1,024th iteration. Each loop iteration also invokes the sys_yield() system call that tells the kernel to let another process run.

Now, let's look at p-eve.cc, which is Eve's program.

// p-eve.cc
#include "u-lib.hh"

void process_main() {
    unsigned i = 0;
    while (true) {
        ++i;
        if (i % 1024 == 0) {
            console_printf(0x0E00, "Hi, I'm Eve! #%d\n", i / 512);
        }
        sys_yield();
    }
}
Initially, Eve's process runs code very similar to Alice's.

But Eve is evil, and her goal is to take over the computer from Alice. How can she do this? One possible plan is to monopolize one of the shared hardware resources of the computer and to deny Alice access to it.

Protected Resource – Processor Time

What can Eve do to monopolize the computer? One simple thing she can do is to stop playing nice and stop calling sys_yield() every time around the loop. If Eve does not give up the processor by calling sys_yield(), then Alice never gets to run! Indeed, Eve can just add an infinite loop to her program prior to the call to sys_yield() and rest assured, execution will never make it to sys_yield().

This phenomenon of depriving another process of access to a crucial resource is called starvation. The specific resource that Eve is attacking in this case is processor time: because our emulated computer in QEMU has only one processor, only one process can run on it at the same time. If that process never gives up the processor, it just keeps running.

The underlying reason for why this attack succeeds is because Eve simply never returned control to the OS kernel. What we need is a way for the OS kernel to wrest back control from a misbehaving process like Eve's.

Interrupts

The solution to our problem is to rely on a hardware mechanism, a timer, to put the kernel back in control at regular intervals. The timer hardware is configured to generate an "alarm" at a regular frequency (e.g., every millisecond). But what happens when an alarm goes off? Who should control the policy? The kernel! This means that when the alarm goes off, the kernel needs to take control: whatever program the processor was running before gets interrupted and the kernel gets to run. This kind of control transfer from a user process to the kernel is called an interrupt.

Currently, our DemoOS doesn't have timer interrupts configured. A small change to the kernel can help turn them on, however: in the kernel.cc file, we add a call to init_timer(1000) in the startup code to set up a timer that goes off every 1000 microseconds. But that alone isn't enough! The kernel doesn't know how to handle a timer interrupt yet, so if we run with this change, the first time a timer interrupt occurs, the kernel crashes with an error indicating an "Unhandled exception 32".

To handle the timer interrupt, we need to add a case to the code in the exception() function in kernel.cc. The code there handles different interrupts via a switch statement, and if we add a case for interrupt 32, we can handle it correctly. What does a correct handler do? We would like to give another process (specifically, Alice's process) a chance to run. The code needed for this purpose is this:

switch (regs->reg_intno) {
    // [...]

    case 32:
        lapicstate::get().ack(); // re-enable timer interrupt
        schedule(); // let something else run

    // [...]
}
The first line re-enables the timer interrupt, setting the next alarm. (We don't expect you to understand that line in detail.) The second line invokes the kernel process scheduler, which is a piece of code that decides what process should next get to run and eventually configures the processor to continue running that process.

With these changes in place, Eve's attack no longer succeeds: even though Eve's process is spinning in an infinite loop, it regularly gets interrupted by the timer interrupt, which causes the kernel to run and give Alice's process a chance to run; after Alice's process yields, Eve's process runs again.

We've solved the starvation problem by letting the privileged kernel take over control of the computer at regular intervals!

Summary

Today, we learned about file descriptors and system calls, and their role in I/O.

We then embarked on a new topic: Operating Systems. In particular, we learned that there is a privileged program on the computer – the kernel – whose job it is to ensure that all userspace processes play by the rules. We want processes to fairly share the computer's hardware resources, and we'd like them to be isolated from each other. Next time, we'll look more at how we can further isolate processes from each other.