CS 300/1310 Spring 2021 Final Quiz
Welcome to our final quiz! We’re hoping this quiz provides you with an opportunity to review your knowledge and what you've learned in CS 300 so far. Our questions are designed to check your conceptual understanding, rather than asking you to memorize details.
Please make sure you read the rules below before you get started.
Rules
The quiz is open book, open notes, open computer. This means you can use the following resources:
- You may access a browser and a PDF reader.
- You may access your own notes and your assignment code electronically.
- You may access an internet site on which your own notes and assignment code are stored.
- You may access this year’s course website and notes on it.
- You may access pages directly linked from the course site, including our lecture materials and links therein.
- You may run a C/C++ compiler, including an assembler and linker, or a calculator.
- You may access manual pages.
- You may access Piazza.
- You may access Google or Wikipedia or other online resources.
- You absolutely may not contact other humans via chat, phone calls, or anything like it.
- You cannot ask quiz questions on online platforms like Stackoverflow, etc.
- You may not access solutions from any previous exam, by paper or computer, including exams from other courses.
Completing Your Quiz
You will complete the final quiz using the same Git repository you already used for the midterm
quiz. Before beginning
the quiz, merge your local cs300-s21-quizzes
repository with our handout to get the
template for the final:
$ git pull git://github.com/csci0300/cs300-s21-quizzes.git masterYou will enter your answers in the
final/final.md
file. Do not place your name in this
file. (We will grade your quiz anonymously.)
When you have completed the quiz, edit the file final/policy.txt
to sign your name. This
is your promise that you have obeyed the quiz rules in letter and spirit. (We won’t look at this file in
grading.)
Then commit your changes and push them to your repository on GitHub:
$ git commit -a -m "Final Quiz Completed" $ git pushIf you get an error message that you do not have access to push to
github.com/csci0300/cs300-s21-quizzes.git
, this means that you are trying to push to our
handout repository. Instead, push explicitly to your own repository:
$ git push https://github.com/csci0300/cs300-s21-quizzes-YOURGITHUBUSERNAME masterMake sure that you have entered your quiz repository URL on the grading server for the final.
Notes
Assume an x86-64 architecture and a Linux operating system unless noted otherwise in the question.
Virtual Memory (14 pts)
QUESTION 1A (5 pts).
CS300 student Bruno is working on WeensyOS, and observes that the project handout asks him to write code
in process_setup
to copy the kernel pagetable mappings into the pagetable of the newly
created process. Bruno believes that this is unnecessary, arguing that WeensyOS will be more secure if
user-space processes' page tables don't contain mappings for kernel memory. However, when he changes his
code accordingly, Bruno finds that his WeensyOS kernel crashes and QEMU keeps rebooting.
You are a CS300 TA seeing Bruno at hours! Explain why the kernel mappings must be copied.
When system calls are invoked (or traps, interrupts, or faults occur), the process must transfer control
to the kernel. The syscall
instruction changes protection level and jumps into the syscall
handler, but it does not change the active page table. Hence, the kernel code that runs next must be
mapped in the (still-active) process page table.
QUESTION 1B (4 pts). Identify whether each of the following statements about the kernel and virtual memory are true or false.
- Virtual memory protects the kernel from uncontrolled access by user-space processes.
- Virtual memory is utilized to protect user-space processes from each other.
- Shells, browsers, and metadata about open files (e.g., the current file offset) all exist in userspace memory.
- 3 in the
%cs
register means that processor is running in privileged mode.
- True
- True
- False (file metadata is in kernel memory).
- False (0 means privileged, 3 unprivileged)
QUESTION 1C (5 pts).
Some older computers supported segmentation, an alternative to page tables. In this
scheme, contiguous segments of physical memory (of any size) are mapped into same-size, contiguous virtual memory
address regions. For example, virtual addresses 0x20'0000
to 0x20'ffff
might map to physical
addresses 0x1'0000
to 0x1'ffff
. For address translation, the processor has special hardware that
understands the known segments and translates virtual addresses within a segment into the corresponding physical
addresses via arithmetic (e.g., subtracting 0x19'0000
in the example).
Why are page tables preferable to this segmentation scheme? For each reason, state if it is true or false and give a brief explanation.
- Page tables use less memory.
- Page tables are more flexible because they can map use any physical page for any (page-aligned) virtual address.
- The segmentation scheme can lead to memory fragmentation, which page tables do not suffer from.
- Page tables allow for faster address translation.
- False (need memory for each page's mapping; segment tables are smaller)
- True
- True (can have free memory partitioned into unusually small segments)
- False (PTs need to do up to 3 memory accesses for each translation)
Processes (15 pts)
QUESTION 2A (4 pts). Consider the common pattern of using the fork()
and
execv()
system calls together to answer the following questions:
- What would happen if
execv()
was called withoutfork()
being called first? - Why would other interfaces, such as
posix_spawn()
or Windows'sCreateProcess()
be more efficient thanfork()
+exec()
?
- The calling process would be replaced with the new program image, and would continue execution in that program's
main()
function.- A potential tradeoff of using
fork()
andexecv()
is the inefficiency of copying the original process just to destroy it, although copy-on-write optimization can makefork()
efficient. Other APIs don't require this copy.
QUESTION 2B (11 pts). For each of the following statements, determine whether each is true or false. If the statement is false, explain why.
- When a pipe is created using the
pipe()
system call, the kernel allocates memory for the array storing the file descriptor numbers for each end of the pipe. - When writing to a pipe, the data stays in the user-address space and is passed to other processes on demand.
- Calling
pipe()
after callingfork()
enables two processes to communicate. - The
execv()
family of system calls never returns. - The
waitpid()
system call can only wait for child processes of the calling process. - Zombie processes are a result of the kernel retaining the heap memory of an exited child process.
- False, The user must allocate memory for the array and pass it into the pipe.
- False. The data is stored in a memory buffer located in the kernel address space.
- False. The
pipe()
system call must be called beforefork()
to allow both the parent and child processes to have access to the pipe.- False. It will return on error.
- True.
- False. The kernel retains the process id and exit status of the child process.
Threads (8 pts)
QUESTION 3A (4 pts). Consider the segment of code below, which is run by multiple threads.
int sum = 0;
// A
for (int i = 0; i < 10000000; i++) {
// B
sum += i;
// C
}
// D
- A =
mtx.lock()
, D =mtx.unlock()
- B =
mtx.lock()
, C =mtx.unlock()
Two valid answers given credit for:
- 2. is more efficient, since it has a smaller critical section.
- 1. is more efficient because the atomic instructions for locking/unlocking the mutex inside the loop body are far more expensive than the additions/loop control flow instructions (jumps, condition checks).
QUESTION 3B (4 pts). The code below creates 4 threads that each run threadfunc
on
a shared variable. Why does this code sometimes fail to produce the expected result of 20?
std::mutex mutex;
void threadfunc(unsigned* x) {
for (int i = 0; i != 5; ++i) {
mutex.lock();
*x += 1;
mutex.unlock();
}
}
int main() {
std::thread th[4];
unsigned n = 0;
for (int i = 0; i != 4; ++i) {
th[i] = std::thread(threadfunc, &n);
}
printf("%u\n", n);
// process exits, all threads terminate
exit(0);
}
The main thread does not call join()
, so it’s possible that the main thread finishes before the
other threads do.
Synchronization (10 pts)
QUESTION 4A (2 pts). In which of the following situations could we get a potential deadlock? (Select one.)
- A thread locks a mutex, and while holding the lock, tries to acquire the same lock again.
- Thread A tries to acquire the mutex associated with thread B first, and then its own mutex (B's mutex -> A's mutex). At the same time, Thread B tries to acquire the mutex associated with Thread A first, and then its own mutex (A's mutex -> B's mutex).
- A thread locks a mutex, encounters an error while running, and the thread exits without unlocking the mutex. Other threads are still running and waiting to lock the mutex.
- All of the above.
All of the above
QUESTION 4B (8 pts). Alice, a CS300 TA, is looking at a student's implementation of a
synchronized bounded buffer. The implementation of the student's write()
method looks like this:
ssize_t bbuffer::write(const char* buf, size_t sz) {
std::unique_lock guard(this->mutex_);
assert(!this->write_closed_);
while (this->blen_ == bcapacity) {
// wait for space to free up
}
size_t pos = 0;
while (pos < sz && this->blen_ < bcapacity) {
this->bbuf_[this->blen_] = buf[pos];
++this->blen_;
++pos;
}
if (pos == 0 && sz > 0) {
return -1;
} else {
return pos;
}
}
First bug: iterating in a
while
loop waiting for space to free up in the buffer with the mutex lock held will result in deadlock, because the reader thread can never acquire the mutex (which is required for it to remove data from the buffer). Using a condition variable and callingwait()
on it fixes this.Second bug: the student's implementation isn't properly wrapping around in the circular buffer; specifically, the write line needs to be
this->bbuf_[this->blen_ % this->bcapacity_]
.
Networking (10 pts)
QUESTION 5A (6 pts). Consider TCP (the network protocol we discussed in the course) compared to another networking protocol, the Unified Datagram Protocol (UDP). UDP has no notion of a "connection" (and thus, no handshake at connection setup) and sends packets through network with only a "best-effort" guarantee that they will arrive at the destination (i.e., the protocol does not check that the packet was actually delivered).
Which of the following statements about the benefits of using TCP compared to UDP to send data across a network are true and which are false? Give a brief justification for each answer.
- TCP achieves better performance (higher throughput and lower connection setup latency) than UDP.
- TCP is more secure than UDP.
- TCP is more reliable than UDP.
- False. TCP must do a three-way handshake (adding latency to connection setup) and sends acknowledgement packets (reducing throughput).
- False. Security (such as encryption) is an application-level propery implemented atop the network protocols.
- True. TCP resends lost packets, increasing reliability.
QUESTION 5B (4 pts). The folks at Banana Computer are expanding their OS with
networking capabilities. New CEO Brad Boardpleaser suggests that pipes and network connections are very
similar, and that the kernel dev team can save some engineering time by supporting both with similar
APIs. In particular, he requests that the team add a net_pipe()
syscall with signature
int net_pipe(int pipefds[2])
, which will allow processes to communicate via read and
write ends of a network connection.
Describe two reasons why Brad's suggested API will not work and/or is a bad idea.
Many possible answers. Some examples:
- There is no way to indicate the IP address or port of the remote computer in this API, so the kernel doesn't know where to connect to.
- The other computer might not be using
net_pipe()
, but instead use normal network sockets. It would not understand this API.- Pipes are used with
fork()
to get the FDs into the other process, butfork()
can't start processes on remote computers.- The notion of a "read end" and "write end" is not meaningful for network connections, which must often send data in both directions (e.g., TCP's acknowledgement packets).
Distributed Systems (20 pts)
QUESTION 6A (5 pts). Which of the following statements about vertical and horizontal scaling are true and which are false?
- Vertical scaling is limited by the number of processors that a single computer can have.
- Horizontal scaling is simpler (in terms of code/implementation) than vertial scaling.
- Vertical scaling may not add linear performance benefits because of contention on mutex locks.
- Requests in a horizontally-scaled system achieve lower latency than in a vertically-scaled system (assuming neither setup is overloaded).
- Horizontal scaling increases flexiblity because it allows adding and removing machines at a finer granularity.
- True
- False (horizontal scaling doesn't require dealing with failures)
- True
- False (network/RPC latency)
- True
QUESTION 6B (2 pts). Consider this statement: "Sharding a key-value dataset across several machines reduces the likelihood of data loss."
Is the statement true or false? Explain your answer.
False. Sharding is a technique for distributed load, but does not affect reliability.
QUESTION 6C (2 pts). Consider this statement: "Storing a dataset and its replica on the same machine, but in different databases, is recommended because it allows for fast restoration from backup in case of a failure."
Is the statement true or false? Explain your answer.
False. If the machine fails, both primary and backup replica will be lost, so this scheme adds no reliability over storing a single copy.
QUESTION 6D (8 pts). Suppose you have a distributed key-value store with N servers. 2/3 of these servers are new and functional. However, 1/3 of the servers are old and their disks are worn. As such, they may sometimes read and return the wrong value for any given request! How might you design a key-value store that utilizes all servers (functional and faulty), while still maintaining confidence that you are returning the correct values to your users?
[Hint: The solution will involve both explaining how you store data and how your distributed system handles requests.]
Replicate each key-value pair on three servers, making sure that only one copy is on an old server. Clients read from all three server, and take a majority vote over the results. Since at most 1/3 of the servers have broken disks, the correct servers can always outvote the broken ones.
QUESTION 6E (3 pts). Consider the following transactions on a key value store,
executed in order 1 -> 2 -> 3 with the initial values a
= 250, b
= 100,
c
= 50. (Assume that all locks are released at the end of the transaction.)
BEGIN TX { LOCK(a); LOCK(c); v = GET(a); if (v >= 200) { x = GET(c); SET(a = v - 200); SET(c = x + 200); } else { abort; } } END TX;
BEGIN TX {
LOCK(a);
LOCK(b);
v = GET(a);
if (v > 100) {
x = GET(b);
SET(a = v - 100);
SET(b = x + 100);
} else {
ABORT;
}
} END TX;
BEGIN TX {
LOCK(b);
v = GET(b);
if (v < 150) {
DELETE(b);
} else {
ABORT;
}
} END TX;
For each of the following, specify if the statement is true or false:
- Transaction 1 aborts.
- Transaction 2 aborts.
- Transaction 3 aborts.
- False
- True
- False
CS 300 Feedback (8 pts)
QUESTION 7A (3 pts). What is one aspect of CS300 that we should definitely retain for future years? Any answer but no answer will receive full credit.
- ALRs (12x)
- PLQs (12x)
- Instructor (9x)
- Everything! (8x)
- Projects (8x)
- Flexibility on deadlines (6x)
- Labs (5x)
- Lecture notes (5x)
- Lecture demos (3x)
- Late hours policy (3x)
- Collaboration policy (2x)
- Course website, lecture notes (2x)
- Material on threading (2x)
- Fast Piazza responses
- FileIO project, despite difficulty
- Grading server
- Interactive lectures
QUESTION 7B (3 pts). What is one aspect of CS300 (other than firing the instructor and dropping the exams) that we can improve for future years? Any answer but no answer will receive full credit.
- Make labs more balanced in time taken (5x)
- Hold synchronous collaboration hours (5x)
- Avoid time crunch at the end (5x)
- Improve handouts for FileIO and Distributed Store (4x)
- Require collaboration with other students (4x)
- PLQs were difficult to remember to do (3x)
- Hold project gear-ups (2x)
- Less time-consuming labs (2x)
- Faster pace at start, more time for material at end (2x)
- More TAs, more than one TA at hours (2x)
- More material on RPCs, including demos (3x)
- Don't do ALRs, do lectures + sections (2x)
- Make PLQ student question voluntary
- Publish preview material before class
- Less busywork-y SRC material
- Fewer course components
- Spend more time on stack
- More extensive Lab 8 (better project 6 preparation)
- Apple M1 support
- Tune time allocation for projects
- Make grades less based on passing tests
- Give more implementation freedom in projects, but also provide more specificity in handouts
- Vunmo was too directed, give more freedom in implementation
- Clarify where CS300 doesn't cover all aspects of a topic
- Faster grading and more feedback
- Discuss C++ smart pointers
- More WeensyOS hints
- Fix grading server links in handouts
QUESTION 7C (2 pts). Which CS300 projects were your most and least favorite ones? Any answer but no answer will receive full credit.
Most favorite:
- WeensyOS (24x)
- Vunmo (23x)
- Distributed Store (13x)
- FileIO (11x)
- DMalloc (10x)
- All! (3x)
- StrVec (2x)
Least favorite:
- FileIO (24x)
- Distributed Store (15x)
- WeensyOS (10x)
- StrVec (8x)
- DMalloc (3x)
- Vunmo (3x)
Handing in!
You're almost done!
When you have completed the quiz, edit the file final/policy.txt
to sign your name. This
is your promise that you have obeyed the quiz rules in letter and spirit. (We won’t look at this file in grading.)
To hand in your exam, please push to your cs300-s21-quizzes-YOURGITHUBUSERNAME
repository.
Important: You're welcome to push intermediate commits as you work on the quiz.
However, when you've completed the quiz, make sure to push a commit to your repository titled "Completed Final Quiz". This will indicate to us that your handin is ready to be graded. Once you push this commit, you can no longer edit your answers.
Finally, head to the grading server and add your quiz repository under in the "Final Quiz" section. Make sure to check that your answers show up in the diff displayed on the grading server.
Congrats on finishing your CS 300 Final, and thanks for taking CS300! 🥳