⚠️ This is not the current iteration of the course! Head here for the current offering.

CS 300/1310 Spring 2022 Midterm Quiz

Welcome to our midterm quiz! We’re hoping this quiz provides you with an opportunity to review your knowledge and what you've learned in CS 300 so far. Our questions are designed to check your conceptual understanding, rather than asking you to memorize details.

Please make sure you read the rules below before you get started.

Rules

The quiz is open book, open note, open computer. You may access the book and your own notes. You may also use computers or electronic devices to access your own class materials, public class materials, and online resources. Specifically:

However:

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Additionally, students are taking the quiz at different times. Do not post publicly about the quiz until given permission to do so by course staff. This includes general comments about quiz difficulty.

Completing your quiz

You have 80 minutes to complete the quiz starting from the time you press "Start". Different students may have different quizs. Enter your answers directly in this form; a green checkmark appears when an answer is saved.

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

You should not need to attach a drawing or other longer notes, but you may do so by adding them to a midterm subdirectory in your cs300-s22-projects repository. Make sure you push your changes to GitHub before time is up, and explain in this form where we should look at your notes.

Notes

Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

1. Data Representation (8 points)

QUESTION 1A. You're trying to figure out how much space you need to store a given string in memory. Your friend says "it's easy: just look at the string, count the number of letters it contains, and allocate that many bytes!" Give two reasons why your friend's approach might lead you to allocate an incorrect number of bytes for the string.

QUESTION 1B. You find a printed hexdump in a dumpster behind CIT, and have reason to believe that it represents the memory for a C struct. What is a likely definition of the struct, and what values are stored in the struct's members? [Hint: assume that the dump contains the full struct, and pay attention to the addresses in the left column.]

0x7ffcc6706980  43 53 33 30 30 20 3c 33
0x7ffcc6706988  21 00 00 00 09 00 00 00
0x7ffcc6706990  80 69 70 c6 fc 7f 00 00

QUESTION 1C. Consider the following program, which handles signed integers. What data type for f's return type should go in the blanks (_____) to compute the correct result? In your answer, specify the type and either the value returned in binary notation (0b..., omitting leading 1s or 0s) or "undefined behavior" if the program triggers undefined behavior.

_____ f(char i1, int i2) {
  return i1 + i2;
}

int main() {
  _____ retval = f(-1, -256);
  return 0;
}

2. C Programming (3 points)

QUESTION 2A. Consider the following function:

char* func(char* str) {
  char* ret = (char*) ((long*) ((int*) str + 2) + 1) + 3;
  return ret;
}

Which of the following expressions is the ret variable equivalent to? List all answers that apply.

  1. str + 6
  2. str + 19
  3. &(str[19])
  4. str[19]

QUESTION 2B. Joe Hackerbro is annoyed that he always needs to keep track of the sizes of integer arrays that he passes around as pointers. He decides that to make things easier, he will NUL-terminate his integer arrays, just like strings are NUL-terminated. Specifically, he'll use 4 NUL bytes, since on his machine integers are 4 bytes.

Besides taking extra space, what is one issue with Joe's proposal?

3. Project 1: Snake (5 points)

QUESTION 3A. Curious T. Hundred, a new student, is working on the Snake project, and just learned about dynamic memory allocation in lecture. Curious wonders when they should use dynamic lifetime (heap) memory in Snake, and why.

List two instances where dynamic lifetime memory is needed, and give one reason why it's necessary. The reason should be unrelated to returning pointers to objects that need to be accessed outside the function they're allocated in.

QUESTION 3B. When Curious implements food consumption for their snake, they decide to update board state first and then check the flags, rather than checking the next cell's flags before moving the snake.

Recall that the stencil contains the following constant definitions:

#define FLAG_PLAIN_CELL 0x0
#define FLAG_SNAKE 0x1
#define FLAG_WALL 0x2
#define FLAG_FOOD 0x4

Which of the following expressions checks if a cell is both a snake cell and a food cell, and has no other flags set? Choose one answer, and explain briefly why the others don't work.

  1. cell == 0x05
  2. cell ^ FLAG_SNAKE == FLAG_FOOD
  3. (cell & FLAG_SNAKE) && (cell & FLAG_FOOD)
  4. cell & (FLAG_SNAKE & FLAG_FOOD) != 0

2pts #1 is correct, as it combines exactly the bitflags 0b1 (0x1) and 0b100 (0x4) and no others.

4. Undefined Behavior, or Here Be Dragons (7 points)

QUESTION 4A. Consider the following C program:

int main() {
   // EXAMPLE 1
   int i;
   printf("My favorite number is %d\n", i);

   // EXAMPLE 2
   char* my_str = NULL;
   printf("my_str is %s\n", *my_str);

   // EXAMPLE 3
   unsigned int x = 0x7fff'ffff;
   printf("%d\n", x + 1);

   // EXAMPLE 4
   char* hello = "Hi!";
   printf("%s\n", hello[5]);

   // EXAMPLE 5
   int* i_ptr = (int*) malloc(8);
   i_ptr[0] = 12;
   printf("i_ptr[0] = %d\ni_ptr[1] = %d\n", *i_ptr, *(i_ptr + 1));
   free(i_ptr);
}

For each example, specify (a) if the code triggers undefined behavior or not; and (b) if it does trigger undefined behavior, name or explain what type of undefined behavior it is.

QUESTION 4B. Joe Hackerbro thinks that the notion of undefined behavior is silly. After all, Joe argues, a given compiler can't do "absolutely anything", but must actually produce some assembly that does something when a C program contains undefined behavior. Thus, Joe argues, the behavior is actually well-defined for the given compiler, and that even a program containing UB will always behave deterministically (i.e., the same way) as long as it is compiled with the same compiler.

Give one reason why Joe's argument is incorrect.

5. Alignment and Collections (7 points)

QUESTION 5A. What is the number of padding bytes for the following struct T?

struct T {
    struct Inner {
        char* name;
        long l;
        short s;
    } inner;
    char* str;
};

QUESTION 5B. Create a struct of size 12 with exactly 5 bytes of padding. Your answer should contain the struct definition.

QUESTION 5C. Frustrated with C and C++'s rule that compilers may not optimize struct size by reordering struct members, you decide to propose a new programming language: C!! ("C bang bang"). The new language's specification is identical to C's, including its syntax. The only difference is that C!! lacks the requirement that compilers must lay out struct members in declaration order in memory.

Below is a C/C!! program and four functions. Assuming each function gets called as FUNC from main() in the program, specify all functions that may produce different results when compiled as C and C!!, and give a brief justification.

Program:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

typedef struct item {
    char type;
    unsigned int time;
    char category;
    char* value;
} item_t;

void FUNC();

int main() {
  FUNC();
  return 0;
}

Function f:

void f() {
    int item_sz = sizeof(char) + 3 +
                  sizeof(unsigned int) +
                  sizeof(char) + 7 +
                  sizeof(char*);
    assert(item_sz == sizeof(item_t));
}

Function g:

void g() {
    item_t* t = malloc(sizeof(item_t));
    t->type = 'A';
    printf("item type: %c\n", *(char*)t);
}

Function h:

void h() {
    item_t* items = malloc(sizeof(item_t) * 50);
    item_t** copied = malloc(sizeof(item_t*) * 50);
    for (int i = 0; i < 50; i++) {
        copied[i] = &items[i];
    }
    items[0].type = 'A';
    printf("item type: %c\n", (*copied[0]).type);
}

Function i:

void i() {
    item_t* t = malloc(sizeof(item_t));
    t->time = 1647703580;
    printf("item time: %d\n", *(int*)((char*)t + 4));
}

6. Project 2: DMalloc (6 points)

QUESTION 6A. Which of the following memory bugs can a debugging memory allocator like your DMalloc implementation not detect, and why?

  1. heap use-after-free
  2. double free
  3. reading outside the boundaries of a heap-allocated block
  4. writing one byte beyond the boundary of a heap-allocated block
  5. writing at b[1000005] for a 5-byte heap-allocated block

QUESTION 6B. The AddressSanitizer is similar to your DMalloc allocator, but it uses a so-called "shadow heap" in addition to padding allocations with secret values. The shadow heap contains metadata about which heap bytes are part of valid allocations, and gets updated every time the heap layout changes (on malloc and free). Sanitizer error messages include a hexadecimal representation of the shadow heap, like the following:

  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 fa fa fa fd fa fa fa 00 fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa

Here, each two hexadecimal digits (byte) indicates the state of eight heap bytes: 00 indicates 8 valid bytes, while fa indicates 8 bytes of padding, and fd indicates 8 bytes in a free allocation.

In the above heap layout, it is possible for a program to write past the valid allocation represented at shadow heap address 0x0c047fff8002 in such a way that the sanitizer cannot detect the wild write. How many bytes past the allocation would the application write?

[Hint: be sure to take into account that each byte in the shadow heap represents eight actual heap bytes! The addresses in the left column are shadow heap addresses, not the addresses of actual heap data.]

7. Assembly (6 points)

QUESTION 7A. Alice C. Coder made a C source code file yesterday, but her project partner Eve L. Zhu deleted all the lines of code in the file by accident today. Fortunately, Alice still has the executable (which she compiled with compiler optimizations disabled). She uses objdump to extract the assembly code, and finds the following:

main:
.LFB0:
    mov    %edi, -4(%rbp)
    mov    %esi, -8(%rbp)
    mov    -4(%rbp), %edx
    mov    -8(%rbp), %eax
    add    %edx, %eax
    cmp    $300, %eax
    jle    .L2
    mov    $1, %eax
    ret
.L2:
    mov    -8(%rbp), %eax
    add    %eax, %eax
    ret

Please help Alice reconstruct the deleted source code to the best of your ability. You can make plausible assumptions about the C types removed on compilation.

QUESTION 7B. Given the poor collaboration with Alice, Eve worries about her grade. She bribes a TA to obtain the executable of the project autograder, and looks for a way to change her grade via a buffer overflow attack.

In the disassembly, Eve finds a vulnerable function set_grade that reads the grade from stdin. The function's disassembly is as follows:

0000000000401173 <set_grade>:
  401173:	55                   		push   %rbp
  401174:	48 89 e5             		mov    %rsp, %rbp
  401177:	48 83 ec 20          		sub    $0x20, %rsp
  40117b:	c7 45 fc 00 00 00 00 		movl   $0x0, -0x4(%rbp)
  401182:	48 8d 45 e0          		lea    -0x20(%rbp), %rax
  401186:	48 89 c7             		mov    %rax, %rdi
  401189:	b8 00 00 00 00      	 	mov    $0x0, %eax
  40118e:	e8 cd fe ff ff       		callq  401060 <gets@plt>
  401193:	8b 45 fc             		mov    -0x4(%rbp), %eax
  401196:	c9                   		leaveq
  401197:	c3                   		retq

How large will Eve have to make her input to gets() such that it overwrites set_grade's return address? Include padding and the bytes to overwrite the return address in your total size. Answers are represented as decimal numbers.

8. Processor Caching (2 points)

QUESTION 8A. Aliza P. Hacker has invested her hard-earned TA pay into a new Macbook Pro with Apple's M1 MAX processor, which is based on the ARM64 architecture and has a whopping 28 MB L2 cache and a 48 MB L3 cache. When Aliza runs the matrixMultiply program from Lab 4 on her new computer, she finds to her surprise that there isn't much of a performance difference between the different iteration orders.

Explain, with specific reference to the memory size and access patterns in matrixMultiply, why this might be.

9. Project 3: Caching I/O (4 points)

QUESTION 9A. You have an implementation of Project 3 that contains single-slot cache of size 4 bytes, and always immediately pre-fetches the entire cache starting from the file's read/write head for all open calls and for seeks not currently located in the cache. You want to run the stride_cat program on the following file:

cs300 rocks!!\n

What combination of parameters for stride_cat (BLOCK_SIZE and STRIDE) will minimize the number of bytes served by cache hits?

QUESTION 9B. Consider the following initial cs300.txt file and sequence of operations calling into the caching I/O library from the project.

Initial cs300.txt file:

cs330 is a systems course.

Sequence of operations:

char* buf = malloc(10);
struct io300_file* f = io300_open(“cs300.txt”, "cs300");
ssize_t r = io300_read(f, buf, 9);
ssize_t w = io300_write(f, “the best”, 8);
int s = io300_seek(f, 28);
int wc = io300_writec(f, ‘!’);
int s = io300_seek(f, 3);
wc = io300_writec(f, ‘0’);
io300_close(f);

After the sequence of operations completes, what will file cs300.txt contain? (Choose one option.)

  1. cs300 is the bests course.
  2. cs300 is the best course.\0\0!
  3. cs300 is the bests course.\0\0!
  4. cs330 is a systems course.
  5. cs300 is the best course.

10. CS 300/1310 Feedback (6 points)

QUESTION 11A. If you could make one change to CS 300/1310 to improve it (other than firing the instructor and dropping the quizzes), what would you change? Any answer but no answer will receive full credit.

QUESTION 11B. What topics do you wish we had spent more time on? Any answer but no answer will receive full credit.


Congrats on finishing your CS 300 Midterm! 🥳