CSCI 0300/1310: Fundamentals of Computer Systems

CS 300/1310 Spring 2024 Midterm Quiz

Welcome to our midterm quiz! We’re hoping this quiz provides you with an opportunity to review your knowledge and what you've learned in CS 300 so far. Our questions are designed to check your conceptual understanding, rather than asking you to memorize details.

Please make sure you read the rules below before you get started.

Rules

Read these rules carefully! In this quiz, you CANNOT interact with others, use AI tools, or access shared documents or editors. If you violate these rules, we will treat this as a matter of academic integrity and follow the process for academic code conduct violations.

The quiz is open book, open note, open computer. You may access the book and your own notes. You may also use computers or electronic devices to access your own class materials, public class materials, and online resources. Specifically:

You may access a browser and a PDF reader.
You may access your own notes and assignment code electronically.
You may access the course site.
You may access our lecture notes, exercises, and section notes.
You may access our lecture, assignment, and section code.
You may access online resources, including search engines, Wikipedia, and other websites.
You may run a C or C++ compiler, including an assembler and linker.
You may access manual pages and common utilities, including calculators and a Python interpreter.

However:

You cannot use AI tools (like ChatGPT, Github Copilot, Google Bard, and others) during the quiz.
You cannot contact other humans with questions about the quiz — whether in person, via IM, or by posting questions to forums — with the exception of course staff.
You cannot interact with other humans via chat, text, phone calls, shared Google Docs, or anything like it.
You cannot ask the quiz questions on class discussion forums or other question forums such as Stack Overflow.
You cannot access solutions from any previous quiz, from this course or others, by paper or computer, except for those on the CS 300 course site.

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Additionally, students are taking the quiz at different times. Do not post publicly about the quiz until given permission to do so by course staff. This includes general comments about quiz difficulty.

Completing your quiz

You have 120 minutes to complete the quiz starting from the time you press "Start".

Different students may have different quizzes. Enter your answers directly in this form; a green checkmark appears when an answer is saved.

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

You should not need to attach a drawing or other longer notes, but you may do so by adding them to a midterm subdirectory in your cs300-s24-projects repository. Make sure you push your changes to GitHub before time is up, and explain in this form where we should look at your notes.

Notes

Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

1. Memory Organization (11 points)

Consider the following C program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
   char* name;
   int depth;
} SeaCreature;

const int MAX_NAME_SIZE = 30;

int main() {
  SeaCreature shark; // <----- PART 1C
  shark.name = (char *)malloc(MAX_NAME_SIZE * sizeof(char)); 
  strcpy(shark.name, "Great White Shark");
  
  shark.depth = 1200; // <---- PART 1D

  printf("Sea Creature: %s, Depth: %d meters\n", shark.name, shark.depth);

  free(???); // <------------- PART 1B

  return 0;
}

QUESTION 1A. What segment of memory (code, data, heap, or stack) is the memory for the following objects in?

main()
MAX_NAME_SIZE
shark.name
*shark.name (Refers to the memory pointed to by shark.name)
shark.depth

QUESTION 1B. What should be passed as the argument to free() at the line marked PART 1B?

QUESTION 1C. Malte wants shark to be on the heap instead of the stack. Write a line of code replacing the line marked PART 1C that would allocate the shark struct on the heap.

QUESTION 1D. How would the line marked PART 1D need to change now that shark is on the heap? Provide the new line of code.

2. Alignment (14 points)

You’re creating a new data struct called trainCar to represent one car in a train. Here is your definition of the struct:

struct trainCar {
 short indexInTrain;
 struct trainCar* priorTrainCar;
 struct trainCar* nextTrainCar;
 char trainCarDescription[9]; // 8 characters + NULL terminator
};

QUESTION 2A. What is the size of the trainCar struct? To express your answer, copy the struct definition above and add comments to indicate 1) how many bytes are allocated for each member of the struct, and 2) how many bytes of padding (if any) are between each member.

9pts
struct trainCar {
  short indexInTrain; // 2 bytes [1pt size]
  // 6 bytes of padding [2 pts:  size, position]
  struct trainCar* priorTrainCar; // 8 bytes [1pt]
  struct trainCar* nextTrainCar; // 8 bytes [1pt]
  char trainCarDescription[9]; // 9 bytes [1pt]
  // 7 bytes of padding [2pts]
}
In total: 40 bytes. 27 bytes + 13 bytes of padding so each member is aligned with the largest member which is a pointer and is 8 bytes long. [1pt]

QUESTION 2B. Create a struct of size 10 with 3 bytes of padding. Your answer should contain the struct definition and indicate how many bytes are allocated for each struct member, as well as the location and number of bytes of padding.

5pts

struct T {
  char a;  // 1 byte
  // 1 byte of padding
  short b; // 2 bytes
  char c; // 1 byte
  // 1 byte of padding 
  short d; // 2 bytes
  char e; // 1 byte
  // 1 byte of padding 
}

Rubric

+1 Only shorts and chars
+2 Correct padding
+1 Adds padding at end
+1 Total size of data is correct (7 bytes)

3. Assembly (11 points)

Before going on spring break, Komron wrote some C code for a group project, but accidentally sent Alaina only the assembly code.

 1 func:
 2    movl %edi, -20(%rbp) // first argument "a"
 3    movq %rsi, -32(%rbp) // second argument "b"
 4    movq $10, -16(%rbp)  // local stack variable "c"
 5    movl $0, -4(%rbp)    // local stack variable "d"
 6    jmp .L2
 7 .L4:
 8    movq -32(%rbp), %rax
 9    cmpq -16(%rbp), %rax
10    jge .L3
11    addq $1, -32(%rbp)
12 .L3:
13    addl $1, -4(%rbp)
14 .L2:
15    movl -4(%rbp), %eax
16    cmpl -20(%rbp), %eax
17    jle .L4
18    movq -32(%rbp), %rax
19    ret

Alaina sees that this is a function that takes two arguments, which are then stored somewhere on the stack. She also notices that two local variables are initialized and that the function contains a loop.

Alaina also remembers that cmpq X, Y followed by a jge instruction jumps if Y >= X or, equivalently, if X < Y.

Help Alaina further understand the assembly and reverse-engineer Komron’s code by answering the following questions:

QUESTION 3A. Identify the line number(s) that check the loop condition.

QUESTION 3B. Identify the line number where the loop variable is incremented.

QUESTION 3C. Identify the line number(s) that correspond to the body of the loop. Do NOT consider the loop condition and incrementing the loop variable as part of the body (e.g., in a for loop in C, the body would be what is between the for loop curly brackets {}).

QUESTION 3D. Which of the following C code examples could have produced the assembly code of the loop? [Hint: these examples differ only in the contents of the loop body. The for (...) part below is identical across all options and not part of the answer.]

Write your preferred option and give a brief justification.

Option 1:

for (...) {
  b = b + 1;
}

Option 2:

for (...) {
  while (d < a) {
    b = b + 1;
    d++;
  }
}

Option 3:

for (...) { 
   b = (long) a;
}

Option 4:

for (...) {
  if (b < c) {
    b = b + 1;
  }
}

4. Stack (15 points)

The CS300 TAs created a diagram of the stack and the value of the register %rax for the following program at the position marked in the code.

They eagerly showed their hard work to Nick; however, Nick may have found some errors in their diagram, which he told them to fix!

Stack diagram

Assume that the code was compiled with no optimizations and uses the base pointer (%rbp). Also assume that the below code excerpt contains no errors.

#include <stdio.h>

int mystery(int a, short b) {
  return a + b;
}

long more_mystery(long arg1, long arg2) {
  long x = arg1;
  x -= arg2;
  return x; // STACK DIAGRAM DRAWN JUST BEFORE THE `ret` INSTRUCTION EXECUTES
}

int main() {
  int res1;
  long res2;

  res1 = mystery(300, 330);
  res2 = more_mystery(-3, 30);

  printf("%d, %ld\n", res1, res2);

  return 0;
}

QUESTION 4A. Consider each of the following statements. For each one, write "error" if the statement correctly points out an error in the diagram, or "not an error" if not; then, explain your reasoning (using 1-2 sentences each). You must provide an explanation to receive credit.

more_mystery is missing its return address to main
The stack frame for function mystery is missing.
The address of res2 is mis-aligned.
Arguments arg1 and arg2 are in the wrong place, they should be stored to the right of the stored %rbp (at higher addresses).
%rax should contain the value of x (-33), not its address.

5. Caching I/O (8 points)

UTA Kazuya has come up with a genius new file caching concept inspired by the CPU cache. Whenever he opens a file, the file cache conceptually splits the file into 64 byte blocks starting from the beginning of the file. Unlike a typical file cache, which can store data starting from anywhere in the file, Kazuya's cache can only store 64 byte blocks that start at a multiple of 64, as shown below:

Caching diagram

Therefore, on a io300_seek call, Kazuya’s design loads, in its entirety whichever file block the seek target position falls into.

Kazuya finds that his reverse_byte_cat performance test runs much faster than that of UTA Alex Portland, who has a design that works as follows:

The cache can start at any position in the file;
On io300_seek, if the seek target is outside the cache, Alex's design sets the start of the cache to the seek target position and loads CACHE_SIZE bytes from the seek target position.

QUESTION 5A. Why is Kazuya’s caching design faster on reverse_byte_cat than Alex's? Explain your reasoning in a few sentences.

Hint: remember that reverse_byte_cat seeks backwards one byte at a time.

6. Undefined Behavior (10 points)

Consider the following C program. Specifically, for each point in the code marked "POINT <N>", consider whether executing that line causes undefined behavior.

#include <stdio.h>
#include <string.h>

void identify_boat(int* boat_loc, int** is_cruise_ptr) {
  free(boat_loc); 
  *is_cruise_ptr = malloc(sizeof(int));
  **is_cruise_ptr = 1;
}

int main() {
  int* boat_loc = malloc(sizeof(int));
  printf("Boat location: %d\n", *boat_loc); // <------------------ POINT 1

  char loc[] = "under the C";
  printf("My boat is %s\n", loc); // <---------------------------- POINT 2
  
  int size = strlen(loc);
  char buff[size];
  for (int i = 0; i < size; i++) {
    buff[i] = loc[i];
  }
  printf("My boat is %s\n", buff); // <---------------------------- POINT 3
  
  int other_boat_size = 2147483600;
  printf("My boat is %d feet long!", other_boat_size + 100); // <-- POINT 4
  
  int* is_cruise;
  identify_boat(boat_loc, &is_cruise);
  printf("Is cruise? %d\n", *is_cruise); // <---------------------- POINT 5
  
  free(is_cruise);
  free(boat_loc); // < -------------------------------------------- POINT 6
}

QUESTION 6A. For each marked line in the code, indicate if it causes undefined behavior. If the line causes undefined behavior, briefly explain why (a few words is fine, at most one sentence). If the line does not cause undefined behavior, no explanation is required.

You should have an answer for each of the following points in the code:

POINT 1
POINT 2
POINT 3
POINT 4
POINT 5
POINT 6

7. Snake (11 points)

UTA Ryan has written a new variant of Snake: Serpent (sea snake)! In this game, the snake is amphibious: it can be on land and in the water.

Like in Snake, Ryan keeps track of each cell state using bitflags. Here are definitions for each cell state (note that these differ from our Snake project):

#define LAND_CELL  0b0000
#define FLAG_SNAKE 0b0001
#define FLAG_WATER 0b0010
#define FLAG_WALL  0b0100
#define FLAG_FOOD  0b1000

QUESTION 7A. UTA Ryan is trying to show UTA/STA Doren how bitflags can represent the state of each cell. For each of following representations, consider whether it correctly represents a cell containing both a snake and water.

To express your answer, write "correct" if the representation is correct (i.e., represents a cell with a snake and water) and "incorrect" if it does not, then explain your reasoning (a few words is fine, at most one sentence).

3
0x0011
0b0010 + 0b0001
0b0001 & 0b0010

QUESTION 7B. Doren thinks reading the cell states in binary is too confusing. Instead, he proposes defining an enum with values like this:

enum cell_type {
   LAND_CELL = 0, 
   FLAG_SNAKE = 1, 
   FLAG_WATER = 2, 
   FLAG_WALL = 3, 
   FLAG_FOOD = 4, 
}

Doren argues that we can still use bitwise operations to check for a snake in water, such as (cell & FLAG_SNAKE) && (cell & FLAG_WATER). Why will this approach of using bitwise operators with the enum not work correctly? Give your answer in at most 1-2 sentences.

8. DMalloc (8 points)

In DMalloc, one common approach is to store metadata in a struct preceding the payload (i.e., the metadata is located before the payload, at lower addresses). Our adventurous UTA Joel suggests storing the metadata after the payload instead:

Dmalloc allocation layout

For this example, imagine that Joel’s metadata struct contains the following:

struct metadata {
    int freed;        // 1 if the payload has already been freed, 0 otherwise
    int secret_flag;  // if contains secret value, then refers to a valid, uncorrupted allocation
};

Joel also uses a std::map where each key stores a pointer to the start of a payload, and each corresponding value stores a pointer to the start of the metadata associated with that payload. The map is defined like this:

    std::map<void*, metadata*> payload_metadata_map;

QUESTION 8A. Although Joel’s metadata does not contain the size of the payload, he argues he can use the data stored in his map to compute the size of the payload. How can Joel use his map’s data to do this?

To express your answer, write a line of code that determines the payload size. To do this, assume that the payload starts at payload_ptr and the map is a global variable called payload_metadata_map.

[Hint: you can look up an entry in a std::map using the at() method, so payload_metadata_map.at(ptr) will give the map value at key ptr. We will not grade you on the minutiae of C++ syntax, so long as we can understand your answer.]

QUESTION 8B. UTA Sarah observes that Joel’s implementation of metadata does not handle boundary write errors correctly. Give two different examples of a situation in which Joel’s design will fail to detect a boundary write error. (Your examples should not merely differ in the size of the boundary overrun.)

9. CS 300/1310 Feedback (6 points)

QUESTION 9A. If you could make one change to CS 300/1310 to improve it (other than firing the instructors and dropping the quizzes), what would you change? Any answer but no answer will receive full credit.

QUESTION 9B. What topics do you wish we had spent more time on? Any answer but no answer will receive full credit.

10. Submit! (1 points)

QUESTION 10A. Write "I AM DONE" in the box. (You may still continue editing other questions until the time limit expires.)