Lecture 16: Virtual Memory and Page Tables

🎥 Lecture video (Brown ID required)
💻 Lecture code
❓ Post-Lecture Quiz (due 11:59pm, Monday, April 1).

Virtual Memory

We previously used memory protection to isolate kernel memory from user-space processes, preventing attacks where user-space processes write to kernel memory. But this is not enough – we also need to prevent user-space processes from accessing the memory of other user-space processes!

In DemoOS as we have seen it in lectures so far (and also in WeensyOS at the start of Project 3), there is no isolation between processes. The only thing that prevents utter chaos is that the processes happen to use non-overlapping memory addresses; for example, p-alice starts at address 0x10'0000 and ends at 0x13'FFFF, while p-eve starts at 0x14'0000 and ends at 0x17'FFFF (and likewise with the first and second process on WeensyOS).

There are a couple of problems with this approach:

  1. If Eve successfully guesses an address within Alice's memory, she can read and modify Alice's data.
  2. If Alice and Eve's processes ever accidentally use the same address, they will corrupt each other's memory; programmers need to carefully choose non-overlapping memory regions for their processes.
  3. The processes' memory regions are of fixed size, and a process that needs more than, say, the 0x40000 bytes of memory between the bottom and top address (= 256 KB) either cannot run or needs to carefully avoid any memory used by other processes.
  4. If we have many processes, there may not be enough memory to run them all, as we need to pre-reserve a fixed amount of memory for each process (in the examples, 0x40000 bytes = 256 KB).

So we need something safer and more flexible! Virtual memory is a concept that achieves both these goals.
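
The region arithmetic above is easy to get wrong by one, so here is a small sketch that checks it at compile time. The highest and lowest address of each region differ by 0x3FFFF, which means the inclusive range holds 0x40000 bytes (256 KB). The constant and function names are made up for this illustration and do not come from DemoOS or WeensyOS.

```cpp
#include <cstdint>

// Address ranges quoted in the text for the two demo processes.
constexpr std::uintptr_t ALICE_START = 0x100000, ALICE_END = 0x13FFFF;
constexpr std::uintptr_t EVE_START   = 0x140000, EVE_END   = 0x17FFFF;

// Number of bytes in the inclusive address range [start, end].
constexpr std::uintptr_t region_size(std::uintptr_t start, std::uintptr_t end) {
    return end - start + 1;
}

// True if the inclusive ranges [a_start, a_end] and [b_start, b_end]
// share at least one address.
constexpr bool regions_overlap(std::uintptr_t a_start, std::uintptr_t a_end,
                               std::uintptr_t b_start, std::uintptr_t b_end) {
    return a_start <= b_end && b_start <= a_end;
}

// Each region is exactly 256 KB ...
static_assert(region_size(ALICE_START, ALICE_END) == 0x40000, "Alice: 0x40000 bytes");
static_assert(region_size(EVE_START, EVE_END) == 256 * 1024, "Eve: 256 KB");
// ... and the two regions do not overlap, but only because we chose them that way.
static_assert(!regions_overlap(ALICE_START, ALICE_END, EVE_START, EVE_END),
              "regions are disjoint");
```

Nothing in this scheme enforces the non-overlap: it holds only because the addresses were picked carefully by hand.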

Demo: Eve Attacking Alice's Memory

Two relatively simple attacks demonstrate the danger of giving Eve's process access to the memory of Alice's process. In the first, Eve might form a pointer into data stored, e.g., on Alice's stack and change that data, for example to make Alice print an attack message rather than her normal Hi, I'm Alice! message.

In the lecture demo as compiled, the address of Alice's message on her process's stack is 0x13'ff59. (You can obtain this by printing the address &msg from p-alice.cc, or by looking at the disassembly in obj/p-alice.asm). Eve may modify her program as follows to overwrite Alice's message:

--- p-eve.cc    2020-03-05 10:24:04.760050399 -0500
+++ p-eve.cc    2020-03-05 10:25:21.671907556 -0500
@@ -7,6 +7,8 @@
         if (i % 1024 == 0) {
             console_printf(0x0E00, "Hi, I'm Eve! #%d\n", i / 512);
         }
+        char* msg = (char*) 0x13ff59;
+        snprintf(msg, 15, "EVE ATTACK!");

         if (i % 2048 == 0) {
           char* syscall = (char*) 0x40ad6;

What's the syntax of the above listing?

This file is a unified diff, which is a format for expressing differences between text files (sometimes referred to as "patches"). It's the format that the git diff command produces its output in. Developers often use diffs to concisely show changes to code. The black lines starting with a space are context lines, which indicate where in the file the changes should occur; the green lines starting with a "+" sign indicate lines to be added; if there were lines starting with a "-" sign, they would indicate lines to be removed (typically shown in red). More about the diff format!

An even worse attack involves Eve writing to Alice's code in the static segment of her process. Recall that the code segment contains the machine instructions executed by Alice's process. If Eve finds a convenient place to sneakily insert new code, she can cause Alice to compute on her behalf, or worse, force Alice into an infinite loop, permanently disabling her process.

Remember that the two bytes 0xEB 0xFE correspond to an infinite loop in x86-64 machine code (a two-byte instruction encoding an unconditional, relative jump by -2 bytes). If Eve writes these bytes into the first two bytes of any instruction in the inner loop of Alice's code, Alice will enter an infinite loop the next time she executes the instructions at that address. For example, one such address is 0x10'0077, normally a mov instruction just after Alice's process returns from a system call (see obj/p-alice.asm). Eve might make the following change to her program to corrupt Alice's process:

--- p-eve.cc    2020-03-05 10:26:32.893535962 -0500
+++ p-eve.cc    2020-03-05 10:25:51.894988665 -0500
@@ -11,9 +11,9 @@
         snprintf(msg, 15, "EVE ATTACK!");

         if (i % 2048 == 0) {
-          char* syscall = (char*) 0x40ad6;
-          syscall[0] = 0xEB;
-          syscall[1] = 0xFE;
+          char* alicecode = (char*) 0x100077;
+          alicecode[0] = 0xEB;
+          alicecode[1] = 0xFE;

           console_printf(0x0D00, "MWAHAHAHAHAHAH EVE REIGNS SUPREME!\n");
         }

This replaces the prior attack on the kernel code (which got Eve's process killed, as we now protect the kernel memory) with an attack on Alice's code.
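
To see why the bytes 0xEB 0xFE loop forever, it helps to compute where the jump lands. The sketch below models how a short jump's target is derived: the second byte is read as a signed 8-bit displacement and added to the address of the next instruction, which is the jump's own address plus its two-byte length. The function name short_jmp_target is made up for this illustration.

```cpp
#include <cstdint>

// Target of an x86-64 short jump (opcode 0xEB followed by one displacement
// byte) located at address insn_addr. The displacement is signed and is
// relative to the address of the NEXT instruction, i.e. insn_addr + 2.
constexpr std::uint64_t short_jmp_target(std::uint64_t insn_addr, std::uint8_t rel8) {
    // Reinterpret the raw byte as a signed 8-bit value (0xFE becomes -2),
    // then widen it so the addition below is done in 64 bits.
    return insn_addr + 2
         + static_cast<std::uint64_t>(
               static_cast<std::int64_t>(static_cast<std::int8_t>(rel8)));
}

// 0xEB 0xFE placed at 0x100077 jumps by -2 from 0x100079, landing back on
// 0x100077: the instruction jumps to itself forever.
static_assert(short_jmp_target(0x100077, 0xFE) == 0x100077, "jump targets itself");
// For contrast, a displacement of 0 simply falls through to the next instruction.
static_assert(short_jmp_target(0x100077, 0x00) == 0x100079, "falls through");
```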

Back to Virtual Memory!

The basic idea behind virtual memory is to create, for each user-space process, the illusion that it runs alone on the computer and has access to the computer's full memory. In other words, we seek to give different processes different views of the actual memory.

Recall that the (physical) memory in DemoOS is roughly laid out as follows:

         0x0
            +--------------------------------------------------------------------+
null page ->|R                                                                   |
            +--------------------------------------------------------------------+
 0x40000 -->|KKKKKKKKKKKKKKKKKKKKKKKKKKKKKK                                     K| <-- kernel stack
(kernel mem)+--------------------------------------------------------------------+
            |                                    RRRRRRRRRRRRRRRRRRRRRCRRRRRRRRRR| console @ 0xB8000
            +--------------------------------------------------------------------+
            |RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR|
            +--------------------------------------------------------------------+
0x100000 -->|Code|Data|Heap ...       Alice's process memory            ... Stack| <-- 0x13ffff
            +--------------------------------------------------------------------+
0x140000 -->|Code|Data|Heap ...         Eve's process memory            ... Stack| <-- 0x17ffff
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
                                                                                 0x1fffff (MEMSIZE_PHYSICAL - 1)

What we'd like to achieve through virtual memory is that Alice's user-space process has the following view of this memory:

         0x0
            +--------------------------------------------------------------------+
null page ->| XXX NO ACCESS XXX                                                  |
            +--------------------------------------------------------------------+
 0x40000 -->| XXX NO ACCESS in userspace XXX                                     | <-- kernel stack
(kernel mem)+--------------------------------------------------------------------+
            | XXX NO ACCESS XXX                                       C          | console @ 0xB8000 (can access)
            +--------------------------------------------------------------------+
            | XXX NO ACCESS XXX                                                  |
            +--------------------------------------------------------------------+
0x100000 -->|Code|Data|Heap ...       Alice's process memory            ... Stack| <-- 0x13ffff
            +--------------------------------------------------------------------+
0x140000 -->| XXX NO ACCESS (Eve's memory) XXX                                   | <-- 0x17ffff
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
                                                                                 0x1fffff (MEMSIZE_PHYSICAL - 1)

... while Eve's user-space process should see this view:

         0x0
            +--------------------------------------------------------------------+
null page ->| XXX NO ACCESS XXX                                                  |
            +--------------------------------------------------------------------+
 0x40000 -->| XXX NO ACCESS in userspace XXX                                     | <-- kernel stack
(kernel mem)+--------------------------------------------------------------------+
            | XXX NO ACCESS XXX                                       C          | console @ 0xB8000 (can access)
            +--------------------------------------------------------------------+
            | XXX NO ACCESS XXX                                                  |
            +--------------------------------------------------------------------+
0x100000 -->| XXX NO ACCESS (Alice's memory) XXX                                 | <-- 0x13ffff
            +--------------------------------------------------------------------+
0x140000 -->|Code|Data|Heap ...         Eve's process memory            ... Stack| <-- 0x17ffff
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
            |                                                                    |
            +--------------------------------------------------------------------+
                                                                                 0x1fffff (MEMSIZE_PHYSICAL - 1)

The memory labeled XXX NO ACCESS XXX here should behave as if it didn't exist: in particular, any access to a memory page in these regions should cause an exception (this is called a "page fault") that transfers control into the kernel.
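
The two views can be summarized as a rule for which addresses each process may touch. The sketch below models only the desired behavior, not the mechanism that enforces it. It assumes that the console occupies a single 4 KiB page starting at 0xB8000; the type and function names (Proc, can_access) are hypothetical.

```cpp
#include <cstdint>

enum class Proc { Alice, Eve };

constexpr std::uintptr_t kPageSize    = 4096;     // assumption: 4 KiB pages
constexpr std::uintptr_t kConsoleAddr = 0xB8000;  // console address from the diagrams

// True if addr lies in the inclusive range [lo, hi].
constexpr bool in_range(std::uintptr_t addr, std::uintptr_t lo, std::uintptr_t hi) {
    return addr >= lo && addr <= hi;
}

// Returns true if process `who` may access `addr` in its view of memory.
// A result of false stands for "this access raises a page fault".
constexpr bool can_access(Proc who, std::uintptr_t addr) {
    return in_range(addr, kConsoleAddr, kConsoleAddr + kPageSize - 1)  // shared console
        || (who == Proc::Alice ? in_range(addr, 0x100000, 0x13FFFF)    // Alice's memory
                               : in_range(addr, 0x140000, 0x17FFFF));  // Eve's memory
}

// Alice can reach her own message on her stack; Eve's two attacks now fault.
static_assert(can_access(Proc::Alice, 0x13ff59), "Alice reads her own stack");
static_assert(!can_access(Proc::Eve, 0x13ff59), "Eve cannot overwrite Alice's message");
static_assert(!can_access(Proc::Eve, 0x100077), "Eve cannot patch Alice's code");
```

Under this rule, both of Eve's attacks from the demo would transfer control to the kernel instead of modifying Alice's memory.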

Note that neither Alice nor Eve needs to know here that they're not looking at the real picture of memory, but rather at a specific fiction created for their process. From the perspective of each process, it may as well be running on a computer with the physical memory laid out as shown in these views. (This is the power of virtualization: the notion of faking out an equivalent abstraction over some hardware without changing the interface.)

To achieve this, we need a layer of indirection between the addresses user-space processes use and the physical memory addresses that correspond to actual locations in RAM. We achieve this indirection by mapping virtual pages to physical pages through page tables.

Page Tables: Intro

Page tables are what let us actually convert a virtual address to a physical address. Each user-space process gets its own page table, which the processor consults to perform that conversion whenever the process accesses memory.

Making page tables work efficiently requires some non-trivial data structure design. We'll work towards the actual design real computers use by considering a set of "strawman" designs that don't quite work.
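
To make the idea of translation concrete, here is a deliberately naive sketch: one flat array per process, indexed by virtual page number, that stores the physical page number for each page. It is not the design real computers use. It assumes 4 KiB pages and a toy 2 MiB address space, and all names (ToyPageTable, map_page, translate, kNoMapping) are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uintptr_t kPageSize  = 4096;                // assumption: 4 KiB pages
constexpr std::size_t    kNumPages  = 512;                 // assumption: toy 2 MiB space
constexpr std::uintptr_t kNoMapping = ~std::uintptr_t{0};  // stands for "page fault"

// One flat table per process: ppn[v] is the physical page number that
// virtual page v maps to, or kNoMapping if the page is inaccessible.
struct ToyPageTable {
    std::uintptr_t ppn[kNumPages];
};

// A table in which every page is unmapped.
constexpr ToyPageTable empty_table() {
    ToyPageTable t{};
    for (std::size_t i = 0; i < kNumPages; ++i) {
        t.ppn[i] = kNoMapping;
    }
    return t;
}

// Returns a copy of t in which the page containing va maps to the page containing pa.
constexpr ToyPageTable map_page(ToyPageTable t, std::uintptr_t va, std::uintptr_t pa) {
    if (va / kPageSize < kNumPages) {
        t.ppn[va / kPageSize] = pa / kPageSize;
    }
    return t;
}

// Translates va by splitting it into (page number, offset), swapping the
// page number via the table, and keeping the offset unchanged.
constexpr std::uintptr_t translate(const ToyPageTable& t, std::uintptr_t va) {
    std::uintptr_t vpn = va / kPageSize;
    std::uintptr_t offset = va % kPageSize;
    if (vpn >= kNumPages || t.ppn[vpn] == kNoMapping) {
        return kNoMapping;  // no mapping: the access would fault
    }
    return t.ppn[vpn] * kPageSize + offset;
}

// Virtual page 0x100000 mapped to physical page 0x180000: offset 0x59 is preserved.
static_assert(translate(map_page(empty_table(), 0x100000, 0x180000), 0x100059) == 0x180059,
              "page number is swapped, offset is kept");
// Any page without an entry is inaccessible.
static_assert(translate(empty_table(), 0x13ff59) == kNoMapping, "unmapped page faults");
```

Giving each process its own table is what produces the different views: Alice's table would contain entries for her pages only, so Eve's addresses simply have no translation in it.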

Summary

Today, we started looking at how we can protect memory. Specifically, we saw how we can use page permissions to protect a user-space program from just writing over kernel memory by changing the memory mappings such that kernel memory is not available for user-space processes. Next time, we'll dive deeper into how to protect user-space processes from each other and how we can have multiple programs share the same memory safely through a notion called virtual memory.

Computers realize virtual memory using page tables, which are mapping tables that help translate virtual addresses (which user-space programs work with) to physical addresses (which refer to real memory addresses in the computer's DRAM chips).