Exercises: Operating Systems
These exercises will help you prepare for quiz questions you may see for Block 2: Fundamentals of Operating Systems.
Acknowledgements: These exercises were originally developed for Harvard's CS 61 course and were kindly shared by Eddie Kohler.
OS-1. Virtual memory
QUESTION OS-1A. What is the x86-64 page size? Circle all that apply.
- 4096 bytes
- 64 cache lines
- 256 words
0x1000
bytes- 216 bits
- None of the above
#1, #2, #4. The most common x86-64 cache line size is 64 = 26 bytes, and 26×26 = 212, but there may have been some x86-64 processors with 128-byte cache lines. The word size is 8; 256×8 = 2048, not 4096. There are 8 bits per byte; 216/8 = 213, not 212.
The following questions concern the sizes of page tables. Answer the questions in units of pages. For instance, the page tables in WeensyOS each contained one level-4 page table page (the highest level, corresponding to address bits 39-47); one level-3 page table page; one level-2 page table page; and two level-1 page table pages, for a total size of 5 pages per page table.
QUESTION OS-1B. What is the maximum size (in pages) of an x86-64 page table (page tables only, not destination pages)? You may write an expression rather than a number.
1 level-4 page table page + 512 level-3 page table pages + 512 * 512 level-2 page table pages + 512 * 512 * 512 level-1 page table pages ----------------------- 2^27 + 2^18 + 2^9 + 1 = 0x8040201 = 134480385 page table pages
QUESTION OS-1C. What is the minimum size (in pages) of an x86-64 page table that would allow a process to access 221 distinct physical addresses?
4 is a good answer—x86-64 page tables have four levels—but the best answer is one.
Whaaat?! Consider a level-4 page table whose first entry refers to the level-4 page table page itself, and the other entries referred to different pages. Like this:
Physical address Index Contents 0x1000
0 0x1007
0x1008
1 0x2007
0x1010
2 0x3007
0x1018
3 0x4007
… … … 0x1ff8
511 0x200007
With this page table in force, the 221 virtual addresses
0x0
through0x1FFFFF
access the 221 distinct physical addresses0x1000
through0x200FFF
.
The 64-bit x86-64 architecture is an extension of the 32-bit x86 architecture, which used 32-bit virtual addresses and 32-bit physical addresses. But before 64 bits came along, Intel extended 32-bit x86 in a more limited way called Physical Address Extension (PAE). Here’s how they differ.
- PAE allows 32-bit machines to access up to 252 bytes of physical memory (which is about 4000000 GB). That is, virtual addresses are 32 bits, and physical addresses are 52 bits.
- The x86-64 architecture evolves the x86 architecture to a 64-bit word size. x86-64 pointers are 64 bits wide instead of 32. However, only 48 of those bits are meaningful: the upper 16 bits of each virtual address are ignored. Thus, virtual addresses are 48 bits. As with PAE, physical addresses are 52 bits.
QUESTION OS-1D. Which of these two machines would support a higher number of concurrent processes?
- x86-32 with PAE with 100 GB of physical memory.
- x86-64 with 20 GB of physical memory.
#1 x86-32 with PAE. Each concurrent process occupies some space in physical memory, and #1 has more physical memory.
(Real operating systems swap, so either machine could support more processes than fit in virtual memory, but this would cause thrashing. #1 supports more processes before it starts thrashing.)
QUESTION OS-1E. Which of these two machines would support a higher maximum number of threads per process?
- x86-32 with PAE with 100 GB of physical memory.
- x86-64 with 20 GB of physical memory.
#2 x86-64. Each thread in a process needs some address space for its stack, and an x86-64 process address space is much bigger than an x86-32’s.
OS-2. Virtual memory and kernel programming
The WeensyOS kernel occupies virtual addresses 0 through 0xFFFFF; the
process address space starts at PROC_START_ADDR
== 0x100000 and goes
up to (but not including) MEMSIZE_VIRTUAL
== 0x300000.
QUESTION OS-2A. True or false: On x86-64 Linux, like on WeensyOS, the kernel occupies low virtual addresses.
False
QUESTION OS-2B. On WeensyOS, which region of a process’s address space is closest to the kernel’s address space? Choose from code, data, stack, and heap.
Code
QUESTION OS-2C. On Linux on an x86-64 machine, which region of a process’s address space is closest to the kernel’s address space? Choose from code, data, stack, and heap.
Stack
The next problems consider implementations of virtual memory features in a
WeensyOS-like operating system. Recall that the WeensyOS
sys_page_alloc(addr)
system call allocates a new physical page at the given
virtual address. Here’s an example kernel implementation of sys_page_alloc
,
taken from the WeensyOS syscall
function:
case SYSCALL_PAGE_ALLOC: {
uintptr_t addr = current->regs.reg_rdi;
/* [A] */
void* pg = kalloc(PAGESIZE);
if (!pg) { // no free physical pages
console_printf(CPOS(24, 0), 0x0C00, "Out of physical memory!\n");
return -1;
}
/* [B] */
// and map it into the user’s address space
vmiter(current->pagetable, addr).map((uintptr_t) pg, PTE_P | PTE_W | PTE_U);
/* [C] */
return 0;
}
(Assume that kalloc
and kfree
are correctly implemented.)
QUESTION OS-2D. Thanks to insufficient checking, this implementation allows a WeensyOS process to crash the operating system or even take it over. This kernel is not isolated. What the kernel should do is return −1 when the calling process supplies bad arguments. Write code that, if executed at slot [A], would preserve kernel isolation and handle bad arguments correctly.
if (addr % PAGESIZE != 0 || addr < PROC_START_ADDR || addr >= MEMSIZE_VIRTUAL) { return -1; }
QUESTION OS-2E. This implementation has another problem, which the following process would trigger:
void process_main() {
heap_top = /* ... first address in heap region ... */;
while (1) {
sys_page_alloc(heap_top);
sys_yield();
}
}
This process code repeatedly allocates a page at the same address. What should happen is that the kernel should repeatedly deallocate the old page and replace it with a newly-allocated zeroed-out page. But that’s not what will happen given the example implementation.
What will happen instead? And what is the name of this kind of problem?
Eventually the OS will run out of physical memory. At least it will print “
Out of physical memory!
” (that was in the code we provided). This is a memory leak.
QUESTION OS-2F. Write code that would fix the problem, and name the slot
in the SYSCALL_PAGE_ALLOC
implementation where your code should go. (You may
assume that this version of WeensyOS never shares process pages among
processes.)
if (vmiter(current, addr).user()) { kfree((void*) vmiter(current, addr).pa()); }
This goes in slot A or slot B. Slot C is too late; it would free the newly mapped page.
OS-3. Kernel programming
WeensyOS processes are quite isolated: the only way they can communicate is by using the console. Let’s design some system calls that will allow processes to explicitly share pages of memory. Then the processes can communicate by writing and reading the shared memory region. Here are two new WeensyOS system calls that allow minimal page sharing; they return 0 on success and –1 on error.
int share(pid_t p, void* addr)
Allow processp
to access the page at addressaddr
.int attach(pid_t p, void* remote_addr, void* local_addr)
Access the page in processp
’s address space at addressremote_addr
. That physical page is added to the calling process’s address space at addresslocal_addr
, replacing any page that was previously mapped there. It is an error ifp
has not shared the page atremote_addr
with the calling process.
Here’s an initial implementation of these system calls, written as clauses in
the WeensyOS kernel’s syscall
function.
case SYSCALL_SHARE: {
pid_t p = current->regs.reg_rdi;
uintptr_t addr = current->regs.reg_rsi;
/* [A] */
int shindex = current->nshared;
if (shindex >= MAX_NSHARED) {
return -1;
}
/* [B] */
++current->nshared;
current->shared[shindex].sh_addr = addr;
current->shared[shindex].sh_partner = p;
return 0;
}
case SYSCALL_ATTACH: {
pid_t p = current->regs.reg_rdi;
uintptr_t remote_addr = current->regs.reg_rsi;
uintptr_t local_addr = current->regs.reg_rdx;
/* [C] */
int shindex = -1;
for (int i = 0; i < processes[p].nshared; ++i) {
if (processes[p].shared[i].sh_addr == remote_addr
&& processes[p].shared[i].sh_partner == current->p_pid) {
shindex = i;
}
}
if (shindex == -1) {
return -1;
}
/* [D] */
vmiter it(processes[p].pagetable, remote_addr);
/* [E] */
vmiter(current->pagetable, local_addr).map(it.pa(), PTE_P | PTE_W | PTE_U);
/* [F] */
return 0;
}
Some notes:
- The implementation stores sharing records in an array. A process may
call
share
successfully at mostMAX_NSHARED
times. After that, its futureshare
calls will return an error. processes[p].nshared
is initialized to 0 for all processes.- Assume that WeensyOS has been implemented as in Problem Set 4 up through step 6 (shared read-only memory).
QUESTION OS-3A. True or false: Given this implementation, a single
WeensyOS process can cause the kernel to crash simply by calling share
one or more times (with no process ever calling attach
). If true, give
an example of a call or calls that would likely crash the kernel.
False
QUESTION OS-3B. True or false: Given this implementation, a single
WeensyOS process can cause the kernel to crash simply by calling
attach
one or more times (with no process ever calling share
). If
true, give an example of a call or calls that would likely crash the
kernel.
True. If the user supplies an out-of-range process ID argument, the kernel will try to read out of bounds of the
processes
array. Example call:attach(0x1000000, 0, 0)
.
QUESTION OS-3C. True or false: Given this implementation, WeensyOS
processes 2 and 3 could work together to obtain write access to the
kernel code located at address KERNEL_START_ADDR
. If true, give an
example of calls that would obtain this access.
True, since the
attach
andshare
code don’t check whether the user process is allowed to access its memory. An example:#2: share(3, KERNEL_START_ADDR) #3: attach(2, KERNEL_START_ADDR, 0x110000)
QUESTION OS-3D. True or false: Given this implementation, WeensyOS processes 2 and 3 could work together to obtain write access to any memory, without crashing or modifying kernel code or data. If true, give an example of calls that would obtain access to a page mapped at address 0x110000 in process 5.
The best answer here is false. Processes are able to gain access to any page mapped in one of their page tables. But it’s not clear whether 5’s 0x110000 is mapped in either of the current process’s page tables. Now, 2 and 3 could first read the
processes
array (via share/attach) to find the physical address of 5’s page table; then, if 2 and 3 are in luck and the page table itself is mapped in their page table, they could read that page table to find the physical address of 0x110000; and then, if 2 and 3 are in luck again, map that page using the VA accessible in one of their page tables (which would differ from 0x110000). But that might not work.
QUESTION OS-3E. True or false: Given this implementation, WeensyOS
child processes 2 and 3 could work together to modify the code run by a
their shared parent, process 1, without crashing or modifying kernel
code or data. If true, give an example of calls that would obtain write
access to process 1’s code, which is mapped at address
PROC_START_ADDR
.
True; since process code is shared after step 6, the children can map their own code read/write, and this is the same code as the parent’s.
#2: share(3, PROC_START_ADDR) #3: attach(2, PROC_START_ADDR, PROC_START_ADDR)
QUESTION OS-3F. Every “true” answer to the preceding questions is a bug in WeensyOS’s process isolation. Fix these bugs. Write code snippets that address these problems, and say where they go in the WeensyOS code (for instance, you could refer to bracketed letters to place your snippets); or for partial credit describe what your code should do.
Here’s one possibility.
Prevent
share
from sharing an invalid address in[A]
:if ((addr & 0xFFF) || addr < PROC_START_ADDR) { return -1; }
NB don’t need to check
addr < MEMSIZE_VIRTUAL
as long as we check the permissions fromvmiter
below (but that doesn’t hurt).Prevent
attach
from accessing an invalid process or mapping at an invalid address in[B]
:if (p >= NPROC || (local_addr & 0xFFF) || local_addr < PROC_START_ADDR || local_addr >= MEMSIZE_VIRTUAL) { return -1; }
We do need to check
MEMSIZE_VIRTUAL
here.Check the mapping at
remote_addr
before installing it in[E]
:if (!it.user()) { return -1; }
Also, in the
….map
call, useit.perm()
instead ofPTE_P|PTE_W|PTE_U
.For greatest justice we would also fix a potential memory leak caused by
attach
ing over an address that already had a page, but this isn’t necessary.
OS-4. Teensy OS VM System
The folks at Teensy Computers, Inc, need your help with their VM system. The hardware team that developed the VM system abruptly left and the folks remaining aren't quite sure how VM works. I volunteered you to help them.
The Teensy machine has a 16-bit virtual address space with 4 KB pages. The Teensy hardware specifies a single-level page table. Each entry in the page table is 16-bits. Eight of those bits are reserved for the physical page number and 8 of the bits are reserved for flag values. Sadly, the hardware designers did not document what the bits do!
QUESTION OS-4A. How many pages are in the Teensy virtual address space?
16 (24)
QUESTION OS-4B. How many bits comprise a physical address?
20 (8 bits of physical page number + 12 bits of page offset)
QUESTION OS-4C. Is the physical address space larger or smaller than the virtual address space?
Larger!
QUESTION OS-4D. Write, in hex, a PAGE_OFFSET_MASK
(the value
that when anded with an address returns the offset of the address on a
page).
0xFFF
QUESTION OS-4E. Write a C expression that takes a virtual address,
in the variable vaddr
, and returns the virtual page number.
(vaddr >> 12)
OR((vaddr) & 0xF000 >> 12)
You are now going to work with the Teensy engineers to figure out what those other bits in the page table entries mean! Fortunately, they have some engineering notes from the hardware team—they need your help in making sense of them. Each letter below has the contents of a note, state what you can conclude from that note about the lower 8 bits of the page table entries.
QUESTION OS-4F. “Robin, I ran 8 tests using a kernel that did nothing other than loop infinitely -- for each test I set a different bit in all the PTEs of the page table. All of them ended up in the exception handler except for the one where I set bit 4. Any idea what this means?”
Bit 4 is the “present/valid bit”, the equivalent of x86 PTE_P.
QUESTION OS-4G. “Lynn, I'm writing a memory test that iterates over all of memory making sure that I can read back the same pattern I write into memory. If I don't set bit 7 of the page table entries to 1, I get permission faults. Do you know what might be happening?”
Bit 1 is the “writable bit”, the equivalent of x86 PTE_W.
QUESTION OS-4H. “Pat, I almost have user level processes running! It seems that the user processes take permission faults unless I have both bit 4 and bit 3 set. Do you know why?”
Bit 3 is the “user/unprivileged bit”, the equivalent of x86 PTE_U.
OS-5. Teensy OS Page Tables
The Teensy engineers are well on their way now, but they do have a few bugs and they need your help debugging the VM system. They hand you the following page table, using x86-64 notation for permissions, and need your help specifying correct behavior for the operations that follow.
Index |
Entry contents: |
|
---|---|---|
Page number of |
Permissions |
|
0 |
0x00 |
PTE_U |
1 |
0x01 |
PTE_P |
2 |
0x02 |
PTE_P|PTE_W |
3 |
0x03 |
PTE_P|PTE_W|PTE_U |
4 |
0xFF |
PTE_W|PTE_U |
5 |
0xFE |
PTE_U |
6 |
0x80 |
PTE_W |
7 |
0x92 |
PTE_P|PTE_W|PTE_U |
8 |
0xAB |
PTE_P|PTE_W|PTE_U |
9 |
0x09 |
PTE_P|PTE_U |
10 |
0xFE |
PTE_P|PTE_U |
11 |
0x00 |
PTE_W |
12 |
0x11 |
PTE_U |
All others |
(Invalid) |
0 |
For each problem below, write either the physical address of the given virtual address or identify what fault would be produced. The fault types should be one of:
- Missing page (there is no mapping for the requested page)
- Privilege violation (user level process trying to access a supervisor page)
- Permission violation (attempt to write a read-only page)
QUESTION OS-5A. The kernel dereferences a null pointer
Missing page
QUESTION OS-5B. A user process dereferences a null pointer
Missing page
QUESTION OS-5C. The kernel writes to the address 0x8432
0xAB432
QUESTION OS-5D. A user process writes to the address 0xB123
Missing page (when both PTE_P and PTE_U are missing, it's PTE_P that counts)
QUESTION OS-5E. The kernel reads from the address 0x9876
0x09876
QUESTION OS-5F. A user process reads from the address 0x7654
0x92654
QUESTION OS-5G. A user process writes to the address 0xABCD
Permission violation
QUESTION OS-5H. A user process writes to the address 0x2321
Privilege violation
OS-6. Virtual memory
QUESTION OS-6A. What kind of address is stored in x86-64 register
%cr3
, virtual or physical?
physical
QUESTION OS-6B. What kind of address is stored in x86-64 register
%rip
, virtual or physical?
virtual
QUESTION OS-6C. What kind of address is stored in an x86-64 page table entry, virtual or physical?
physical
QUESTION OS-6D. What is the x86-64 word size in bits?
64
Many paged-virtual-memory architectures can be characterized in terms of the PLX constants:
- P = the length of the page offset, in bits.
- L = the number of different page indexes (equivalently, the number of page table levels).
- X = the length of each page index, in bits.
QUESTION OS-6E. What are the numeric values for P, L, and X for x86-64?
P=12, L=4, X=9
Assume for the remaining parts that, as in x86-64, each page table page fits within a single page, and each page table entry holds an address and some flags, including a Present flag.
QUESTION OS-6F. Write a PLX formula for the number of bytes per page, using both mathematical and C notation.
Mathematical notation: _____________________
C notation: _____________________________
2P,
1 << P
QUESTION OS-6G. Write a PLX formula for the number of meaningful bits in a virtual address.
P + L*X
QUESTION OS-6H. Write a PLX formula that is an upper bound on the number of bits in a physical address. (Your upper bound should be relatively tight; PX100L is a bad answer.)
There are several good answers. The best answer uses the fact that a physical address fits inside a page table entry, so the number of bits in a physical address is limited by the number of bits in an entry. So what’s an entry’s size? A page table contains 2X entries, and a page table fits in 2P bytes. That leaves the number of bits as:
8 *
sizeof(entry)
= 8 * 2P/2X = 8 * 2P–XAnother reasonable answer assumes that virtual addresses aren’t likely to be smaller than physical addresses, giving the answer P + L*X. (On some machines, physical addresses have been larger than virtual addresses, but as a general rule, virtual addresses are in fact bigger.)
QUESTION OS-8I. Write a PLX formula for the minimum number of pages it would take to store a page table that allows access to 2X distinct destination physical pages.
L (for a well-formed page table with distinct pages for each level), or 1 (for a “trick” page table that reuses the top-level page for all subsequent levels; see question OS-1C).
OS-7. Weensy threads
⚠️ Threads are covered in Block 3 (Concurrency).
Betsy Ross is changing her WeensyOS to support threads. There are many
ways to implement threads, but Betsy wants to implement threads using
the ptable
array. “After all,” she says, “a thread is just like a
process, except it shares memory with some other process!”
Betsy has defined a new system call, sys_create_thread
, that starts a new
thread running a given thread function, with a given argument, and a given
stack pointer:
typedef void* (*thread_function)(void*);
pid_t sys_create_thread(thread_function f, void* arg, void* stack_top);
The system call’s return value is the ID of the new thread.
Betsy’s kernel contains the following code for her sys_fork
implementation.
// in syscall()
case SYSCALL_FORK:
return handle_fork(current);
...
uint64_t handle_fork(proc* p) {
proc* new_p = find_unused_process();
if (!new_p)
return -1;
new_p->pagetable = copy_pagetable(p->pagetable);
if (!new_p->pagetable)
return -1;
new_p->regs = p->regs;
new_p->regs.reg_rax = 0;
new_p->state = P_RUNNABLE;
return 0;
}
And here’s the start of her sys_create_thread
implementation.
// in syscall()
case SYSCALL_CREATE_THREAD:
return handle_create_thread(current);
...
uint64_t handle_create_thread(proc* p) {
// Whoops! Got a revolution to run, back later
return -1;
}
QUESTION OS-7A. Complete her handle_create_thread
implementation. Assume for now that the thread function never exits. You
may use these helper functions if you need them (you may not):
proc* find_unused_process()
Return a usableproc*
that has stateP_FREE
, ornullptr
if no unused process exists.x86_64_pagetable* copy_pagetable(x86_64_pagetable* pgtbl)
Return a copy of pagetablepgtbl
, with all unprivileged writable pages copied. Returnsnullptr
if any allocation fails.void* kalloc(size_t)
Allocates a new physical page, zeros it, and returns its physical address.
Recall that system call arguments are passed according to the x86-64
calling convention: first argument in %rdi
, second in %rsi
,
third in %rdx
, etc.
uint64_t handle_create_thread(proc* p) { proc* np = find_unused_process(); if (!np) { return (uint64_t) -1; } np->regs = p->regs; np->regs.reg_rip = p->regs.reg_rdi; np->regs.reg_rdi = p->regs.reg_rsi; np->regs.reg_rsp = p->regs.reg_rdx; np->state = P_RUNNABLE; np->pagetable = p->pagetable; return np; }
QUESTION OS-7B. Betsy’s friend Prince Dimitri Galitzin thinks
Betsy should give processes even more flexibility. He suggests that
sys_create_thread
take a full set of registers, rather than just a
new instruction pointer and a new stack pointer. That way, the creating
thread can supply all registers to the new thread, rather than just a
single argument.
pid_t sys_create_thread(x86_64_registers* new_registers);
The kernel will simply copy *new_registers
into the proc
structure for the new thread. Easy!
Which of the following properties of x86_64_registers
would allow
Dimitri’s plan to violate kernel isolation? List all that apply.
reg_rax
contains the thread’s%rax
register.reg_rip
contains the thread’s instruction pointer.reg_cs
contains the thread’s privilege level, which is 3 for unprivileged.reg_intno
contains the number of the last interrupt the thread caused.reg_rflags
contains theEFLAGS_IF
flag, which indicates that the thread runs with interrupts enabled.reg_rsp
contains the thread’s stack pointer.
#3, #5 only, though it is OK to list #4.
Now Betsy wants to handle thread exit. She introduces two new system
calls, sys_exit_thread
and sys_join_thread
:
void sys_exit_thread(void* exit_value);
void* sys_join_thread(pid_t thread);
sys_exit_thread
causes the thread to exit with the given exit
value; it does not return. sys_join_thread
behaves like
pthread_join
or waitpid
. If thread
corresponds is a thread
of the same process, and thread
has exited, sys_join_thread
cleans up the thread and returns its exit value; otherwise,
sys_join_thread
returns (void*) -1
.
QUESTION OS-7C. Is the sys_join_thread
specification
blocking or polling?
Polling
QUESTION OS-7D. Betsy makes the following changes to WeensyOS internal structures to support thread exit.
- She adds a
void* exit_value
member tostruct proc
. - She adds a new process state,
P_EXITED
, that corresponds to exited threads.
Complete the case for SYSCALL_EXIT_THREAD
in syscall()
. Don’t worry about
the case where the last thread in a process calls sys_exit_thread
instead of
sys_exit
.
case SYSCALL_EXIT_THREAD:
current->state = P_EXITED; current->exit_value = p->regs.reg_rdi; schedule();
Note that we don’t even need
exit_value
, since we could useregs.reg_rdi
to look up the exit value elsewhere!If you wanted to clean up the last thread in a process, you might do something like this:
int nlive = 0; for (int i = 1; i < NPROC && !nlive; ++i) { if (is_thread_in(i, current) && processes[i].state != P_EXITED) { ++nlive; } } if (!nlive) { for (int i = 1; i < NPROC; ++i) { if (is_thread_in(i, current)) { do_sys_exit(&processes[i]); } } }
QUESTION OS-7E. Complete the following helper function.
// Test whether `test_pid` is the PID of a thread in the same process as `p`.
// Return 1 if it is; return 0 if `test_pid` is an illegal PID, it corresponds to
// a freed process, or it corresponds to a thread in a different process.
int is_thread_in(pid_t test_pid, proc* p) {
return test_pid >= 0 && test_pid < NPROC && processes[test_pid].state != P_FREE && processes[test_pid].pagetable == p->pagetable;
QUESTION OS-7F. Complete the case for SYSCALL_JOIN_THREAD
in syscall()
. Remember that a thread may be successfully joined at
most once: after it is joined, its PID is made available for
reallocation.
case SYSCALL_JOIN_THREAD:
int pid = current->regs.reg_rdi; if (pid != current->regs.reg_rdi || !is_thread_in(pid, current) || processes[pid].state != P_EXITED) { return -1; } else { processes[pid].state = P_FREE; return processes[pid].exit_value; }
Note that we can’t distinguish a –1 return value from error from a –1 return value from
sys_exit_thread(-1)
.
QUESTION OS-7G. In pthreads, a thread can exit by returning from its thread function; the return value is used as an exit value. So far, that’s not true in Weensy threads: a thread returning from its thread function will execute random code, depending on what random garbage was stored in its initial stack in the return address position. But Betsy thinks she can implement pthread-style behavior entirely at user level, with two changes:
- She’ll write a two-instruction function called
thread_exit_vector
. - Her
create_thread
library function will write a single 8-byte value to the thread’s new stack before callingsys_create_thread
.
Explain how this will work. What instructions will
thread_exit_vector
contain? What 8-byte value will
create_thread
write to the thread’s new stack? And where will that
value be written relative to sys_create_thread
’s stack_top
argument?
thread_exit_vector: movq %rax, %rdi jmp sys_exit_thread
The 8-byte value will equal the address of
thread_exit_vector
, and it will be placed in the return address slot of the thread’s new stack. So it will be written starting at addressstack_top
.
OS-8. Virtual memory cache
x86-64 processors maintain a cache of page table entries called the TLB (translation lookaside buffer). This cache maps virtual page addresses to the corresponding page table entries in the current page table. TLB lookup ignores page offset: two virtual addresses with the same L1–L4 page indexes use the same TLB entry.
QUESTION OS-8A. A typical x86-64 TLB has 64 distinct entries. How many virtual addresses are covered by the entries in a full TLB? (You can use hexadecimal or exponential notation.)
212 * 64 = 218
QUESTION OS-8B. What should happen to the TLB when the processor changes
%cr3
?
It should be emptied because the current page table changed.
4pts
QUESTION OS-8C. Multi-level page table designs can naturally support multiple sizes of physical page. Which of the following is the most sensible set of sizes for x86-64 physical pages (in bytes)? Explain briefly.
- 212, 214, 216
- 212, 224, 236
- 212, 221, 230
- 212, 239
3. These sizes match up to levels in the page table.