CS 131 Spring 2020 Final Quiz

Completion due: 12:00 noon on May 12 (no late hours!)

Welcome to our final quiz! We’re hoping this quiz provides you with an opportunity to review your knowledge and what you learned in CS 131. Our questions are designed to check your conceptual understanding, rather than asking you to memorize details.

Please make sure you read the rules below before you get started.

Rules

The quiz is open book, open notes, open computer. You may access your own notes in paper or digital form. You may also use a computer or equivalent to access your own class materials and public class materials. However, you may not access other materials except as explicitly allowed below. Specifically:

You may access a browser and a PDF reader.
You may access your own notes and your assignment code electronically.
You may access an internet site on which your own notes and assignment code are stored.
You may access this year’s course website and notes on it.
You may access pages directly linked from the course site, including our lecture materials and links therein.
You may run a C/C++ compiler, including an assembler and linker, or a calculator.
You may access manual pages.
You may access Piazza.
You may access Google or Wikipedia or other online resources.

But:

You absolutely may not contact other humans via chat, phone calls, or anything like it.
You may not access solutions from any previous exam, by paper or computer, including exams from other courses.

Any violations of this policy, or the spirit of this policy, are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Completing your quiz

Create a cs131-s20-quizzes repository if you have not already. Then, before beginning the quiz, merge your local cs131-s20-quizzes repository with our handout:

$ git pull git://github.com/csci1310/cs131-s20-quizzes.git master

You will enter your answers in the final/final.md file. Do not place your name in this file. (We will grade your quiz anonymously.)

When you have completed the quiz, edit the file final/policy.txt to sign your name. This is your promise that you have obeyed the quiz rules in letter and spirit. (We won’t look at this file in grading.)

Then commit your changes and push them to your repository on GitHub:

$ git commit -a -m "Final Quiz Answers"
$ git push

If you get an error message that you do not have access to push to github.com/csci1310/cs131-s20-quizzes.git, this means that you are trying to push to our handout repository. Instead, push explicitly to your own repository:

$ git push https://github.com/csci1310/cs131-s20-quizzes-YOURGITHUBUSERNAME master

Make sure that you have entered your quiz repository URL on the grading server for the midterm.

Notes

Assume an x86-64 architecture and a Linux operating system unless noted otherwise in the question.

Multiple Choice Questions

Answer these questions by entering the letters for your choices in the Multiple Choice section of final.md. For example, if your answer to Question N is “C” and “A”, you will fill in the relevant section like this:

Question N: A, C

Question 1 (2 points)

Which of the following is FALSE?

A. Freeing dynamically allocated memory within a running program requires explicit action by the programmer
B. Returning a pointer to a local variable from a function can produce undefined behavior
C. It’s possible to change the size of a stack allocated variable
D. Global variables are not stored on the stack or the heap

Question 2 (2 points)

You are working on a codebase that has defined two structs: a user_t, and a person_info_t (notice that person_info_t contains a user_t as one of its fields):

 typedef struct user {
    char access_mode;
    char username[8];
} user_t;

typedef struct person_info {
    user_t user;
    char *full_name;
    char first_initial;
    char last_initial;
    int favorite_number;
    char favorite_letter;
} person_info_t;

At one point you create a very large array of type person_info_t. Specifically,

int main() {
    ...
    person_info_t people[1000];
    ...
}

You notice that the fields of the person_info_t may not be laid out optimally in memory. This could be causing your array to take up more memory than necessary! If you rearrange the fields of the person_info_t struct into an optimal arrangement, roughly how much memory will you ultimately save? (1kB = 1024 bytes)

A. 7kB
B. 9kB
C. 16kB
D. None, the arrangement is already optimal.

Question 3 (2 points)

Which segment of memory are the machine instructions for a program located in?

A. stack/automatic
B. heap/dynamic
C. text/code
D. data/bss

Question 4 (3 points)

What can possibly happen as the result of a buffer overflow? Please select all that apply.

A. Segmentation Fault
B. Execution of an unexpected function in the program
C. Local variables in the vulnerable function are overwritten
D. The program works as expected

Question 5 (2 points)

What is the minimum number of physical 4 KB pages required on x86-64 to allocate 1 byte of memory? Assume that the page table on x86-64 is four levels deep, and is initially completely empty.

A. 1 page
B. 4 pages
C. 5 pages
D. None of the above

Question 6 (2 points)

In many architectures, including x86-64, multi-level page tables are used to translate virtual memory addresses into physical memory addresses. Which of the following is an advantage of using multi-level page tables instead of single-level page tables?

A. The table maps individual addresses instead of page-sized units
B. Large empty address space regions are more cheaply representable
C. The table is represented using a contiguous block of memory
D. None of the above

Question 7 (2 points)

Suppose you have three processes all currently running: A, B, and C. Process A currently shares a pipe with B and another pipe with C. Process A now wants process B to be able to write to process C directly. To do this, process A creates a new pipe, and then sends the new pipe’s file descriptor numbers to B and C through the existing pipes. Can B and C now use this pipe to communicate?

A. No, B could use this file descriptor to communicate with A but not C.
B. No, because B and C cannot access the new pipe.
C. No, because that would create a pipe cycle.
D. Yes, since B and C now share this new pipe.

Question 8 (3 points)

Consider the following C++ program. Which of the following are possible outputs of this program? Select all that apply.

#include <cstdio>
#include <thread>
#include <unistd.h>

int count = 0;

int child() {
   while(count != 100) {
       usleep(200000);  // sleep for 200 ms
       count += 1;
   }
   return 0;
}

int main() {
   std::thread ths[5];
   // start 5 child threads, which each call child()
   for (int i = 0; i < 5; i++) {
       ths[i] = std::thread(child);
   }
  
   // wait on each of the child threads to terminate
   for (int i = 0; i < 5; i++) {
       ths[i].join();
   }
   printf("Count %d\n", count);
}

A. The printed count might be 100.
B. The printed count might be less than 100.
C. The printed count might be greater than 100.
D. The program might never terminate.

Question 9 (2 points)

Recall the vector struct from the first project:

typedef struct my_vector {
    void *array;
    size_t ele_size;
    size_t capacity;
    size_t length;
}vector_t

If we were to make this vector struct thread-safe, access to which elements would have to be synchronized?

A. all of them
B. ele_size, capacity, and length
C. capacity and length
D. array, capacity, and length

Question 10 (2 points)

In which of the following situations is using a mutex technically not required for thread safety?

A. There are many threads getting values from a shared map, but there is only one thread that writes to the map.
B. There are many threads that are reading from the same text file, but each thread is accessing a different section of the file.
C. There are many threads that are writing to the same buffer, and each thread is trying to append to the end of the buffer.
D. All of the above situations require mutexes.

Question 11 (2 points)

Zachary is writing a server application that accepts requests from clients. To serve many users’ network connections at the same time, he has chosen to run a new thread for each incoming connection. Alice argues that this design could make the server crash due to resource exhaustion.

Which of the following resources can be exhausted by an unbounded number of connections?

i. space on a specific thread’s stack
ii. the server process’s virtual memory
iii. the computer’s physical memory
iv. the server process’s file descriptors

A. i, ii
B. ii, iii
C. iii, iv
D. Alice is wrong. We don’t run out of anything because there’s a lot of virtual memory.

Question 12 (2 points)

Consider the below C program that starts several processes.

#include <stdio.h>
#include <unistd.h>

int main() {
    if (fork() && fork()){
        printf("1");
    } else {
        printf("2");
    }
    return 0;
}

Assuming that fork() does not fail, what process tree represents the hierarchy of the processes created by this code?

In the diagrams below, the P node is the parent and a C node is a child. A node with a number represents the output from that particular process (parent or specific child).

Example

This example represents the execution of one call to fork() because the program starts by consisting of one parent process (the root of the tree), but then divides into the parent process and one child process.

Hint

In C, if the first expression in the && statement evaluates to false, the second expression will not be evaluated.

Question 13 (2 points)

What will happen in your WeensyOS program if you forget to check if kalloc() returned a nullptr in fork()?

A. A memory leak
B. A nullptr is mapped into a page table
C. Your process will always notice this immediately and exit prematurely
D. Nothing, it is just good practice to error check

Question 14 (2 points)

Which of the following is false?

A. Vertical scaling doesn’t always give performance benefits.
B. In general, horizontal scaling adds more complexity to systems than vertical scaling.
C. When measuring the performance of a distributed system, the most important metric is the number of operations per second the fastest component can handle.
D. None of the above.

Short Response

Please answer these questions as indicated in the question, or in a few sentences otherwise.

Question 15 (3 points)

Consider the code below:

char* x = "hello";
int* foo() {
    int y = 5;
    int* z = (int *)malloc(sizeof(int) * y);
    return z; 
}

For each of the following, specify in what region of memory (code part of static segment, read-only data part of static segment, data part of static segment, stack, heap) it is stored:

x
the string "hello"
y
the pointer z
the data pointed to by pointer z

Question 16 (3 points)

strncmp and memcmp are functions in the C standard library that compare data in memory. For example, the following usage is valid and both assert statements pass, as both functions return the same value:

const char* arg1 = "foo";
const char* arg2 = "foo";
assert(memcmp(arg1, arg2, 4) == 0);
assert(strncmp(arg1, arg2, 4) == 0);

Specify arguments that produce different results when passed to strncmp and memcmp, and specifically one zero result and one non-zero result. In other words, you want it to be the case that strncmp(arg1, arg2, arg3) != memcmp(arg1, arg2, arg3), and that one of the functions returns 0 and the other a non-zero value.

Use the man pages if you aren’t sure what these functions do (i.e., man strncmp and man memcmp).

Question 17 (3 points)

Consider the following code:

typedef struct Book {
  long ISBN;
  char name[16];
  int is_checked_out;
} book_t;

book_t arr[10];

Assume the beginning of the array is at byte offset of zero. What is the byte offset from the beginning of the array for the following things:

the fourth book struct in the array?
the is_checked_out field in the fourth book struct in the array?
the third character in the name field, in the fourth book struct in the array?

Question 18 (3 points)

Consider the following C program:

#include <stdio.h>

void foo(int a, long b) {
  long c = a + b; 
  printf("%ld\n", c);
}

int main(int argc, char** argv) {
  foo(2, 64);
  printf("Returned to main safe and sound\n");
  return 0;
}

Here are the assembly instructions for the foo and main functions.

Note: Each instruction is formatted as:
<instruction-address>: <instruction-bytes> <assembly-instruction>

000000000000068a <foo>:
 68a:	55                   	push   %rbp
 68b:	48 89 e5             	mov    %rsp,%rbp
 68e:	48 83 ec 20          	sub    $0x20,%rsp
 692:	89 7d ec             	mov    %edi,-0x14(%rbp)
 695:	48 89 75 e0          	mov    %rsi,-0x20(%rbp)
 699:	8b 45 ec             	mov    -0x14(%rbp),%eax
 69c:	48 63 d0             	movslq %eax,%rdx
 69f:	48 8b 45 e0          	mov    -0x20(%rbp),%rax
 6a3:	48 01 d0             	add    %rdx,%rax
 6a6:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
 6aa:	48 8b 45 f8          	mov    -0x8(%rbp),%rax
 6ae:	48 89 c6             	mov    %rax,%rsi
 6b1:	48 8d 3d d0 00 00 00 	lea    0xd0(%rip),%rdi        # 788 <_IO_stdin_used+0x8>
 6b8:	b8 00 00 00 00       	mov    $0x0,%eax
 6bd:	e8 9e fe ff ff       	callq  560 <printf@plt>
 6c2:	90                   	nop
 6c3:	c9                   	leaveq 
 6c4:	c3                   	retq   

00000000000006c5 <main>:
 6c5:	55                   	push   %rbp
 6c6:	48 89 e5             	mov    %rsp,%rbp
 6c9:	48 83 ec 10          	sub    $0x10,%rsp
 6cd:	89 7d fc             	mov    %edi,-0x4(%rbp)
 6d0:	48 89 75 f0          	mov    %rsi,-0x10(%rbp)
 6d4:	be 40 00 00 00       	mov    $0x40,%esi
 6d9:	bf 02 00 00 00       	mov    $0x2,%edi
 6de:	e8 a7 ff ff ff       	callq  68a <foo>
 6e3:	48 8d 3d a6 00 00 00 	lea    0xa6(%rip),%rdi        # 790 <_IO_stdin_used+0x10>
 6ea:	e8 61 fe ff ff       	callq  550 <puts@plt>
 6ef:	b8 00 00 00 00       	mov    $0x0,%eax
 6f4:	c9                   	leaveq 
 6f5:	c3                   	retq   
 6f6:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
 6fd:	00 00 00

When the program executes the instruction at address 0x6de, the call instruction will push the address 0x6e3 onto the stack. Why is this necessary?
What’s stored in %rip before and after the callq instruction is executed?
What’s the purpose of instruction at the address0x6c4?

Question 19 (3 points)

The diagram below represents the stack of a function that is vulnerable to a buffer overflow attack. Assume you have written an exploit file that takes advantage of this vulnerability and will redirect the control flow of the program to evil_func. Fill in the diagram below how the stack will look after your exploit is read into the buffer. Write an ‘X’ in the diagram for every byte in memory that can be filled with any character and write &evil_func to represent the address of evil_func. Format your answer as:

<address>: <bytes>

Here’s an example response.

The response,

0x7fffffffe400: X X X X X X X X
0x7fffffffe3f8: X X X X X X X X

corresponds to filling in the diagram like this:

Question 20 (3 points)

The kernel is the core, privileged part of the operating system, whose goals are to:

provide safe and convenient access to hardware
fairly share resources among processes
protect the OS and processes from each other

Pick two of these goals and describe how the kernel achieves each. (Expected length: 2-4 sentences)

Question 21 (3 points)

What are two problems with a computer that has just RAM (i.e., no SSD/disk or processor caches)? (Expected length: 1-2 sentences)

Question 22 (3 points)

Consider the difference in memory layout between a linked list and array. Explain why iterating over the elements of the array would be faster by using a cache, and why iterating over the elements of the linked list may not. (Expected length: 2-3 sentences)

Question 23 (3 points)

When might you choose to create threads over processes? When might you choose to use processes over threads? (Expected length: 2-4 sentences)

Long Response

Please answer these questions in a short paragraph, except where indicated otherwise or code is asked for. For questions that require you to write code, it might be helpful to visualize your responses in an online markdown editor such as dillinger.

Question 24 (5 points)

You are a C programmer who is very frustrated with only being allowed to allocate constant length arrays of characters. You decide that you are going to implement your own string struct, which will be a dynamically sized character array. What are some possible fields of this struct, and in what sections of memory would these fields be stored? If your struct contains pointers, indicate to which memory segment they would point.

Question 25 (5 points)

You want to create a wrapper for malloc that has header and footer metadata in each allocated block. Specifically, you would like your malloc wrapper to include the size of the block (as a size_t) at both the beginning and end of each allocated block. You plan to use your malloc_wrapper fucntion in place of malloc in your programs (but your wrapper should use malloc in its implementation). You can assume the size of a size_t is 8 bytes. Write your implementation in the heading provided below:

void* malloc_wrapper(size_t size){
  // <your code  here>
}

Question 26 (5 points)

In WeensyOS, you implemented the function sys_page_alloc to allocate a page of memory at the specified address. Now, consider implementing the reverse function, sys_page_free to free a page of memory at a specified address. Describe, at a high level, how you might implement this function. In your answer, describe which functions you would use in your implementation and what error cases you would have to check for. (Expected length: 4-6 sentences)

Question 27 (5 points)

You have the following two programs, and you’ve compiled them into executables.

Here’s program1.cpp


















#include <unistd.h>
 
int BUF_SIZE = 1000;
 
char* long_computation(char* buf);
 
int main() {
  char buf[BUF_SIZE];
  char* processed_buf;
  read(STDIN_FILENO, buf, BUF_SIZE);
 
  // processes the data, and allocates memory for processed_buf
  processed_buf = long_computation(buf);
 
  write(STDOUT_FILENO, processed_buf, BUF_SIZE);

  free(processed_buf);
}

Here’s program2.cpp




















#include <unistd.h>
 
int  BUF_SIZE = 1000;
 
char* long_computation2();
char* long_computation3(char* data, char* buf);

int main() {
  char buf[BUF_SIZE];
  char* processed_buf;
  char* data = long_computation2();
  read(STDIN_FILENO, buf, BUF_SIZE);
 
  // processes the data, and allocates memory for processed_buf
  processed_buf = long_computation3(data, buf);

  write(STDOUT_FILENO, processed_buf, BUF_SIZE);
 
  free(processed_buf);
}

You try running the programs as follows, using shell redirection to to send it input from a file on stdin and send its output to a file instead of stdout:

$ ./program1 < input.txt > output.txt
$ ./program2 < output.txt > final_output.txt

a) You notice that the programs take a long time to finish running. How could you speed up the process of generating final_output.txt without modifying the files? Explain why your suggestion works.

b) Combine the two programs into a single program which achieves the same functionality in less time.

Question 28 (5 points)

Alice is a TA for CS 131 and has developed a new grading server in C++, which may concurrently handle requests from multiple students.

The grading server program includes two synchronized std::map data structures: one maps student names to anonymous IDs (id_map), and the other maps anonymous IDs to assignments (assignment_map). However, when running the code below, the course staff finds that it sometimes hangs and appears to be stuck. After running thread sanitizer, they suspect that the problem lies in the two functions below, which add and remove students.

Give an example of a specific sequence of events that could cause this problem, and explain how to fix the code to prevent the issue. (You may assume that all students have unique names.)

void add_student(string name) {
  id_map_mtx.lock();
  assignment_map_mtx.lock();
  id_map[name] = generate_random_id();
  assignment_map[id_map[name]] = new assignment_info_t;
  assignment_map_mtx.unlock();
  id_map_mtx.unlock();
}

void delete_student(string name) {
  assignment_map_mtx.lock();
  id_map_mtx.lock();
  assignment_map.erase(id_map[name]);
  id_map.erase(name);
  id_map_mtx.unlock();
  assignment_map_mtx.unlock();
}

Question 29 (5 points)

We want to design synchronized linked lists. As a reminder, a linked list is a data structure consisting of nodes, each of which has some data and a pointer to the next node. So far there is a list_node_t struct that represents each list node, a list_t struct representing a list, and append and search functions below. We have a badly implemented fine-grained locking scheme that doesn’t work.

Linked List Program

// list node struct
typedef struct list_node {
  int value;              // value of the node
  struct list_node* next; // pointer to the next node
  std::mutex mtx;         // mutex for this node
} list_node_t;
 
// list struct
typedef struct list {
  list_node_t* head; // head node of the list
} list_t;
 
 
/**
 * Appends the given value to the end of the list.
 * Returns nothing.
 */
void append(list_t* list, int value) {
  // create a new node
  list_node_t* new_node = new list_node_t();
  new_node->next = NULL;
  new_node->value = value;

  // if the list is empty, set the head to new node
  if (list->head == NULL) {
    list->head = new_node;
    return;
  }

  // list is not empty; iterate to the end and append new node
  list_node_t* current = list->head;
  current->mtx.lock();
  while (current->next != NULL) {
    current->mtx.unlock();
    current = current->next;
    current->mtx.lock();
  }
  current->next = new_node;
  current->mtx.unlock();
}
 
/**
 * Searches the list for the target value.
 * Returns true if found; false otherwise.
 */
bool search(list_t* list, int value) {
  // list is empty
  if (list->head == NULL) {
    return false;
  }

  list_node_t* current = list->head;

  // iterate through the list
  current->mtx.lock();
  while (current->next != NULL) {
    current->mtx.unlock();
    // found the first instance of value
    if (current->value == value) {
      return true;
    }
    current = current->next;
    current->mtx.lock();
  }

  // we didn't find the value
  current->mtx.unlock();
  return false;
}

a) Your first task is to modify the code above to make it thread-safe, i.e., allow multiple threads to share a list_t object without corrupting its data or creating deadlocks. How would you modify the struct(s), append, and search functions to achieve thread-safety? For full credit, your solution must use fine grained locking.

Note: There is actually a race condition for the head node field of list_t struct. You may choose to fix this, but we want you to focus on the locking required on iterating through the list nodes. We will award credit for that part only.

b) Imagine a program that uses your synchronized linked list, with many threads searching through and appending elements into the list concurrently. Your friend Alice wants to write a function that takes the snapshot of the list and prints it out to users. In order to properly capture the state of the list, she wants to block all the threads from modifying the list until the function is done printing. Alice is thinking of using a condition variable, printing_cond_variable, to achieve this. Here is what she has so far:

Print Function

std::atomic<bool> is_printing; // global variable

// a function that blocks all threads, prints the list, and
// resumes all the threads again
void print_list() {
  // assume that only print_list will change this variable
  is_printing = true; 
  // this function prints the entire trie
  visit_list_and_print();
  is_printing = false;
  printing_cond_variable.notify_all(); 
}

Note: You can assume that all threads have access to the conditional variable.

Alice modifies the append and search functions to wait on the condition variable at the beginning of each function:

std::atomic<bool> is_printing; // global variable

void append(list_t* list, int value) {
  while (is_printing.load()) {
    printing_cond_variable.wait();
  }
  // [...]
}

bool search(list_t* list, int value) {
  while (is_printing.load()) {
    printing_cond_variable.wait();
  }
  // [...]
}

Unfortunately, Alice notices that this does not work. What is one problem with her approach?

Hint

Think about what would happen when print_list() is called when there are many threads working on different parts of the list, each executing append or search.

Question 30 (5 points)

In project 5, the shardmaster can be alerted to key-value servers (shardkvs) leaving via the Leave RPC. The specification of the Leave RPC is as follows:

The Leave RPC contains a message that specifies the server(s) that are leaving.
When a key-value server is shutting down, it sends a Leave RPC to the shardmaster.
When the shardmaster receives a Leave RPC, it updates its configuration and redistribute the shards hosted on the leaving server(s).

Alice has implemented her shardkv to use the Leave RPC as specified above. However, this only handles cases when the shardkv shuts down gracefully. Alice would also like to handle cases when key-value servers crash or fail!

Alice reads about a concept called “Heartbeats”, where messages are sent periodically between two servers to attest that they’re still alive. How can she use this idea to let her shardmaster detect crashed/unresponsive key-value servers?

In your answer:

specify any new RPCs and protobuf messages you may need to create, and
describe how the shardmaster and shardkv servers should handle these new messages.

When describing the new RPCs and how to implement this new behavior, model your answer after the above description of the Leave RPC. (Expected length: 4-6 sentences)

CS 131 Feedback

Any answer for these questions will receive credit.

Question 31 (1 point)

If you could make one change to CS 131 to improve it (other than firing the instructor and dropping the exams), what would you change?

Question 32 (1 point)

Which project (or topic) should we definitely keep for future years?

Question 33 (1 point)

Which project (or topic) should we change or replace?

You’re almost done!

To hand in your exam, please push to your cs131-s20-quizzes-YOURGITHUBUSERNAME repository.

Important: You’re welcome to push intermediate commits as you work on the quiz.

However, when you’ve completed the quiz, make sure to push a commit to your repository titled “Final Quiz Answers”. This will indicate to us that your handin is ready to be graded.

Once you push this commit, you can no longer edit your answers.

Finally, head to the grading server and add your quiz repository under in the “Final Quiz” section. Make sure to check that your answers show up in the diff displayed on the grading server.

Congrats on finishing your CS 131 final! We hope you enjoyed the course.