« Back to the main CS 300 website

Project 3: Caching I/O

Checkpoint Due by Monday, March 13th (group form here)
Project Due Monday, March 20th at 6:00 PM EST


Caching, or the act of putting a small/fast/expensive data store in front of a large/slow/cheap one, is at the heart of many optimizations in computing. Examples include:

  1. The storage hierarchy (Hard Drive -> SSD -> DRAM -> L3-L1 Cache -> Registers), as seen in lectures.
  2. Memoization (i.e., saving results of expensive computations for future reuse).
  3. Your web browser’s cache, which prevents redundant network requests when you access pages you’ve recently visited.
  4. Content Delivery Networks, a technology that prevents long-distance Internet traversals by instead keeping copies of web pages close to users (this is the idea behind Akamai and Cloudflare, two billion-dollar companies).

In this assignment, you will be speeding up a performance-critical aspect of systems programming: the reading and writing of data to and from a filesystem. To do this, you will be using caching! The file I/O cache you will be implementing in this project is similar to the CPU cache discussed in lecture with some differences. Notably, the file I/O cache is defined in software, rather than provided by hardware (as CPU caches are). This allows the I/O cache to have variable size. Additionally, it is not necessary for your file I/O cache to have multiple fixed-size slots. In other words, the cache may store contiguous, variable-size segments of data up to the full cache size. A single slot cache (storing one contiguous segment of the file) can suffice.
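To make the single-slot idea concrete, here is one hypothetical shape the cache metadata could take. This is a sketch only: the field names, the fixed CACHE_SIZE, and the cache_contains helper are illustrative assumptions, not the stencil's actual layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define CACHE_SIZE 4096

/* Hypothetical metadata for a single-slot cache: the cache holds
 * `size` bytes of the file, starting at file offset `start`. */
struct cache_state {
    int    fd;               // file descriptor of the underlying file
    char   buf[CACHE_SIZE];  // the cached bytes
    off_t  start;            // file offset corresponding to buf[0]
    size_t size;             // how many bytes of buf hold valid file data
    off_t  head;             // current read/write position in the file
    bool   dirty;            // has buf been modified since the last flush?
};

// A file offset is a cache hit if it falls inside the cached segment.
static bool cache_contains(const struct cache_state *c, off_t off) {
    return off >= c->start && off < c->start + (off_t)c->size;
}
```

Whatever design you choose, you will need some equivalent of each field above to answer questions like "is this offset in the cache?" and "do I need to write the cache back before replacing it?".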


Due to the design of Hard Disk Drives (HDD, “disks”), interacting with data on disk involves waiting on several slow physical mechanisms. For example, magnetic hard disks require requests to wait for the platters to spin and the metal arms to move to the right place where they can read the data off the disk. If our programs were required to read and write their data solely from disk, they would be unacceptably slow. (We saw an example of this in the form of the disk-slow program in lectures.)

Fortunately, we can do better using caching! If we are willing to sacrifice a small amount of data integrity (i.e., in the event that your computer suddenly loses power, some data is lost), we can gain 100-1000x in performance. To do this, the computer temporarily holds the data you are using inside of its main memory (DRAM) instead of working directly with the disk. In other words, the main memory (DRAM) acts as a cache for the disk.

Project Description

You will be implementing an Input/Output (I/O) library that supports operations like read() (reading data from files), write() (writing data to files), or seek() (moving to a different offset within the file). Your I/O library uses a cache to prefetch data and reduce the number of disk operations required.

Your approach to implementing this project should be:

  1. Read the entire handout as some critical information is located in later sections.
  2. Fill out the conceptual questions.
  3. Together with a group, design your cache data structure and have it checked off by a TA. This includes the mechanics of how your cache works and all metadata required to maintain it.
  4. Fill out the functions in impl/student.c to implement caching file IO that conforms to the API specified in io300.h. In addition, you must ensure that make check reports that your implementation is correct, and make perf reports that your implementation is at most 5-10x slower than stdio (the standard library’s caching I/O functions) in all cases (see grading rubric).

Project Details


Ensure that your project repository has a handout remote. Type:

$ git remote show handout

If this reports an error, run:

$ git remote add handout https://github.com/csci0300/cs300-s23-projects.git

Then run:

$ git pull
$ git pull handout main

This will merge our Project 3 (fileio) stencil code with your repository.

Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the fileio directory in the working copy of your projects repository.

[Infrastructure Help]


This project, unlike DMalloc, is in C. This means you cannot use C++ language features and data structures in your solution.

In this project, we provide you with a number of existing test programs that do basic file manipulation (see test_programs/*.c). These programs are written in such a way that all interaction with the filesystem is done through the io300_file interface (a series of functions whose signatures are declared in io300.h). This means that if two libraries separately implement the functions in io300.h (potentially quite differently), the test programs will work with either implementation.

We provide you with two implementations, and you will develop a third.

  1. The naive implementation (impl/naive.c) reads from and writes directly to the disk without any caching. The initial project stencil is identical to this implementation.
  2. The standard I/O implementation (impl/stdio.c) leverages the existing C Standard Library’s “buffered” I/O, which does some clever caching.

The speed difference between these two solutions, as measured by running the test programs with each implementation, is astounding! Try it out for yourself:

$ make testdata    # generates 10MB test file
$ make IMPL=naive  # compiles naive implementation
$ time ./byte_cat /tmp/testdata /tmp/testout

real    0m18.515s
user    0m6.819s
sys     0m11.694s

$ make clean && make IMPL=stdio
$ time ./byte_cat /tmp/testdata /tmp/testout

real    0m0.140s
user    0m0.112s
sys     0m0.028s

The numbers will differ on your computer, but the general relationship (a ~100x performance difference) will hold. Note that real is the actual time taken (this is called “wall clock time”), user is the part of that time spent in userspace, and sys is the time spent in the kernel.

In the project, your task is to fill in the third implementation (impl/student.c) and make it perform as close to the stdio implementation as possible.

Note: This project deliberately leaves you a lot of design freedom. You should feel free to come up with clever caching designs and benchmark them.

Both simple and involved schemes will work, but cleverer schemes will get closer to the performance of stdio, and may sometimes even beat it! We will award extra credit for impressive performance (see the grading rubric).

You’re welcome (and encouraged) to work with your partner or in groups to come up with your design. As usual, the code you hand in must be your own.


We provide you with the following files:

File            Purpose
io300.h         A list of all the methods your implementation must supply to be used with the test programs.
impl/           Contains the C source for this project. impl/student.c is the only file you have to edit.
test_scripts/   Test scripts provided by us that working implementations will pass. Run make check and make perf to use them.
test_programs/  Contains the test programs that use your implementation to do IO.
test_files/     Some files to run your implementation on while you are developing.
example/        Some C programs whose behavior offers insight into some of the technical aspects of this project.

Getting Started

The core idea behind this project is the following:

When a program asks the I/O library to write a byte to a file, instead of immediately going to the disk, put that byte in a cache, then write the entire cache to disk at a later point.

The first thing you should do is look at io300.h to understand the public API that each implementation offers. Next, you should run make -B IMPL=stdio check to run the tests on the implementation we provide (you should also do make -B IMPL=naive check to see how ridiculously slow it is).

Next, take a look at the test programs (test_programs/*.c) to see how your IO library is going to be used.

Then, work out why the stencil code (and the naive implementation) is so slow. To do so, you can use a handy tool called strace (for syscall trace), which prints the system calls (calls from your program into the OS kernel) that happen when your program runs.

System calls are expensive: they require changing the processor’s privilege mode, lots of safety checks, and they invoke kernel code. Moreover, the read or write syscalls often go directly to disk.

Start out by investigating how the naive implementation performs on a 755 byte test file:

$ make -B IMPL=naive
$ strace -e trace=read ./byte_cat test_files/words.rot13.txt /tmp/out
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\v\2\0\0\0\0\0"..., 832) = 832
[...]
read(3, "1", 1) = 1
read(3, ".", 1) = 1
read(3, "1", 1) = 1
read(3, " ", 1) = 1
read(3, "V", 1) = 1
read(3, "a", 1) = 1
read(3, " ", 1) = 1
read(3, "g", 1) = 1
[...]

The initial lines relate to read() system calls that happen as the OS starts up your executable (which is encoded in a format called “ELF”). You can ignore those lines. At some point, you’ll see a lot of lines like read(3, "V", 1) = 1. A line like this means that the program (or, in this case, the I/O library used) invoked the system call read() with arguments 3 (a number that refers to file handle), "V", and 1, and that the return value was 1 (as indicated by = 1).

Think about what this output means. Can you see a reason why this is inefficient?

Give me more hints!

Let’s consider a thought experiment, this time with the write() system call.

Imagine that I am writing a program that will sequentially write 1000 bytes (all of value 'c') to a file, one byte at a time.

One way to do it would be to call write() 1000 times, once per byte.

for (int i = 0; i < 1000; i++) {
    write(fd, "c", 1);  // write(2) takes a pointer, so we pass the string "c"
}

Another option would be to create a local variable char buff[40] and put bytes into that. Once we have written 40 bytes to buff, we issue one single call to write(fd, buff, 40).

for (int i = 0; i < 1000; i++) {
    buff[i % 40] = 'c';
    // MORE LOGIC:
    // 1. check if buffer is full
    // 2. if so, call write(fd, buff, 40)
}

The critical thing to remember is that a call to write(fd, "c", 1) and a call to write(fd, buff, 40) take nearly the same amount of time, because the cost of executing a system call is much larger than the difference between moving 1 byte and moving 40 bytes.

So with the second solution, we can reduce the number of syscalls by 40x.

In this project, you will be generalizing the second solution to deal with things like writing bytes backwards (decreasing indices) and writing chunks of bytes all at once.
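The 40x reduction can be demonstrated without touching a real file by counting calls through a stand-in for write(2). The fake_write counter and buffered_demo helper below are hypothetical names for this demo only:

```c
#include <stddef.h>

static int syscall_count = 0;

// Stand-in for write(2) so we can count invocations without a real file.
static size_t fake_write(const char *buf, size_t n) {
    (void)buf;
    syscall_count++;
    return n;
}

// Write 1000 'c' bytes through a 40-byte buffer, flushing when full.
// Returns the total number of (simulated) syscalls made.
static int buffered_demo(void) {
    char buff[40];
    size_t used = 0;
    for (int i = 0; i < 1000; i++) {
        buff[used++] = 'c';
        if (used == sizeof(buff)) {  // buffer full: one "syscall" per 40 bytes
            fake_write(buff, used);
            used = 0;
        }
    }
    if (used > 0)
        fake_write(buff, used);  // flush any remaining partial buffer
    return syscall_count;
}
```

With 1000 bytes and a 40-byte buffer, this makes 25 calls instead of 1000. Note the final flush of a partial buffer: your implementation needs the same step when a file is closed.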

Design Checkoff

To succeed at this project, it is crucial that you think conceptually about your design before you write code. One good way to check a design is to explain it to someone else. Hence, we ask that you work with a group. You may either come up with a design together, or explain your designs to each other. You’ll then sign up for a slot at TA hours together to get the design checked off.

With your group, design your cache. You must work with a group for this task (if you haven’t yet filled out the group form, do so here). Explain your cache design and the sequence below to get checked off by a TA at an assigned TA hours slot by the checkoff due date. Consider the following:

Now consider the following example code. The file we are caching is test_files/tiny.txt, and the cache size in this example is just 8 bytes.

io300_file* testFile = io300_open("test_files/tiny.txt", "tiny!");
ssize_t r = io300_read(testFile, buffer, 5); // assume buffer is at least 5 bytes
ssize_t w = io300_write(testFile, "aaa", 3);
r = io300_read(testFile, buffer, 2);
ssize_t s = io300_seek(testFile, 12);
w = io300_write(testFile, "aaa", 3);
r = io300_readc(testFile);
io300_close(testFile);

Based on your design, fill out the following template for every line in the sequence of operations:

State template

Your answer should consist of eight such pictures with notes on the state of the cache and the metadata. You can write these digitally (e.g., in a Google Doc) or on a piece of paper that you bring to TA hours. Drawing pictures on paper is an excellent way to work on this part of the assignment!

Now get your work checked off by a TA!

Things to Think About

Here are some questions to ask yourself before/while implementing:

  1. When should I call read() and fill my buffer?
  2. What happens to the data in the cache when a program reads or writes the last byte in the cache? What about one byte past this?
  3. How do I know or keep track of the fact that the cache has been modified?
  4. What happens when flush is called, but nothing in the cache has been changed?
  5. What happens if a syscall fails?
  6. What happens when I read a byte from a file that is all 1s (0xFF, 0b11111111)? How is this different than the integer -1? How does this mess with return values from things like read() and write()? What is the deal with unsigned vs signed char? (some answers to these can be found in example/io_return_example.c).
  7. What happens if we seek to a location that is within the cache? How should this differ from seeking to a location outside of the cache?
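Question 6 above trips up many implementations, so here is a small demonstration of the signed/unsigned char pitfall (the function name is hypothetical; see example/io_return_example.c for the course's own treatment):

```c
#include <assert.h>

/* Why readc must return an int, not a char: a byte of all 1s stored
 * in a signed char is indistinguishable from the -1 error/EOF value. */
int readc_return_demo(void) {
    signed char   s = (signed char)0xFF;  // a byte of all 1s
    unsigned char u = 0xFF;               // the same byte, unsigned
    assert(s == -1);   // sign-extends to -1: looks exactly like EOF!
    assert(u == 255);  // widens to 255: safely distinguishable from -1
    return u;
}
```

This is why your readc should widen bytes through unsigned char before returning them as an int.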

To help you out as you’re designing your cache, we’ll share what the contents of your file, your cache (i.e., the in-memory buffer), and the return value are after two selected lines of code.

After Line 2
After Line 6

Implementing Caching I/O

Now it’s time to get coding! Be sure to read the rest of the handout first, however.

Complete the functions in impl/student.c marked with TODO comments such that you implement cached file IO.

It will make sense to proceed step by step, and we provide some guidance on getting started below.

Your final implementation should:

  1. Conform to the API specified in io300.h.
  2. Pass all correctness tests run by make check.
  3. Perform no more than 5-10x slower than stdio in all cases tested in make perf.

We highly recommend that you thoroughly read and engage with the following sections when working on developing your caching I/O strategy.

Note: you cannot use any C Standard Library IO functions while implementing your library (this would defeat the purpose of the assignment!). These include functions like fread, fwrite, fgetc, fputc, fseek. Remember to only create one cache. You will not need to call malloc anywhere, except in the place that we already provide for you.


Making Progress

One of your early goals should be to get the byte_cat program working for a small text file.

$ touch /tmp/out.txt   # creates empty file
$ make -B
$ ./byte_cat test_files/tiny.txt /tmp/out.txt
$ cat /tmp/out.txt 
this is a test

The byte_cat program performs no seeks and is the simplest test case you can create. Here is a list of the functions that byte_cat uses: open, close, filesize, readc, and writec. We provide all of filesize, and open/close are simple, so you just have to get your reading and writing logic down!

If you want to break things down further, you can implement the read side, keeping the naive write logic in place, test if your implementation works, and then continue with the write side.

Debugging Tips

The easiest way to spend a lot of time on this project is to just write code, or to randomly edit your code, without thinking and debugging carefully. Make sure you use the below techniques to make your debugging productive!

Note that the stencil implementation passes all correctness tests, but it is very slow! Additionally, you might find that the performance tests pass locally but not on the grading server. Tests must pass on the grading server to receive credit.

As you work on the project, you will likely break the correctness tests in your quest to improve performance.

If you fail a correctness test, the first thing you will usually want to look into is whether it failed because of a crash in your code (which you can develop using familiar tools such as GDB), or whether it ran, but produced incorrect output. When you see incorrect output file contents, it makes sense to work out how they’re incorrect, by looking at the actual bytes in the input and output files! Check out the Debugging section below for some helpful tools, such as xxd and diff.

Make sure your program can handle non-ASCII characters!

Additionally, it can also help to look at the output from strace, which lists the system calls that your caching I/O library makes, to see if anything unexpected happens during the test.

If your performance is poor, strace should be your first point of call. Poor performance in this project is almost always explained by your implementation making too many system calls. Running tests under strace will help you understand exactly what sequence of system calls happen, and how many bytes they move into and out of the cache.

Finally, you will likely want to have debug output in your library that you can disable easily when running performance tests, as printf-type functions slow down your program a lot. We provide the helper function dbg() in the stencil for this purpose: you can change the value of the DEBUG_PRINT constant to turn the debug printing code on or off in a single keystroke. dbg() works just like printf(), but if DEBUG_PRINT is set to 0, the compiler will throw away the code in the function and turn it into a no-op.

Consider also writing your own tests if you want to test specific behavior. For more information, look at Testing and Appendix VI.

Finishing Up

Once you have a working implementation, test it with make check (correctness) and make perf (performance). You probably won’t match stdio on all performance benchmarks yet, but you will meet the performance bar for this project if you achieve the following:

  1. Your implementation is within 10x of stdio on the byte_cat benchmarks (byte_cat and reverse_byte_cat).
  2. Your implementation is within 5x of stdio on the block_cat tests (block_cat, reverse_block_cat, and random_block_cat).

If you get closer to stdio (or even beat it on some tests), we will award extra credit.


make check tests your implementation for correctness. It does this by running all of the test programs with a variety of inputs on random files.

make perf tests your implementation for speed and compares it to stdio.

You may also want to write additional tests that exercise specific features of your code individually, particularly if you plan to implement extra credit features. Appendix VI provides some steps for doing so.


You may find the following tools (in addition to GDB and sanitizers) helpful in debugging this project.


xxd <file> allows you to view the contents of binary files. This will be helpful when you have a buggy implementation and you want to see what is going wrong, by printing out what’s in your result file. The code below demonstrates an example.

$ echo 'this is ascii' > out.bytes
$ xxd out.bytes
00000000: 7468 6973 2069 7320 6173 6369 690a       this is ascii.
$ dd if=/dev/urandom of=out.bytes bs=32 count=1
1+0 records in
1+0 records out
32 bytes copied, 0.000336055 s, 95.2 kB/s
$ xxd out.bytes 
00000000: 65b7 6c53 69f3 f1ed e6d2 09eb ec66 9403  e.lSi........f..
00000010: f33c e929 d703 314f e7dd 5e6b 56a0 2d28  .<.)..1O..^kV.-(


diff <file1> <file2> will tell you if the input files differ. This may again be helpful when you have a buggy implementation and you want to figure out where in the output file you’re differing from the expected content.


strace, as mentioned above, is a tool that provides diagnostic information about the system calls a program is making. This may be especially helpful when you are trying to improve the performance of your implementation. A complete description of strace and its usage is in Appendix IV.

Note from Your TAs: Although using strace may seem intimidating, it is incredibly useful in improving performance! In particular, it can help you visualize the sequence of system calls.


Check out the function:

static void dbg(struct io300_file *f, char *fmt, ...)

in impl/student.c. Use it to debug while you are working, and then you can silence its output with one keystroke when you hand in. It acts just like printf(), but it also logs your file’s metadata so you can see what is happening as your program executes. When you add fields to your file structure, be sure to include them in the format string after the TODO in dbg().
Here is an example of using it.

int io300_writec(struct io300_file *f, int ch) {
    dbg(f, "writing char: %c\n", ch);
    ...

Extra Credit

To optimize your implementation and achieve more impressive performance, you can consider some of the following extensions! This extra credit work is not required to get an A, unless you are a graduate student taking CSCI 1310. In that case, you will need to implement and document at least one performance optimization. Possible optimizations include:

Adaptive Caching

Easy. Detect the direction of access within a file (are the reads occurring forward or in reverse?) and adapt your caching strategy to pre-fetch that memory. Measure how much of an impact adaptive caching has on performance over your basic implementation by creating additional test programs in test_scripts/correctness_tests.py and test_scripts/performance_tests.py. Document how you implemented adaptive caching and its performance impact in your README file.

Detecting Strides

Moderately difficult. Have your cache detect “strides”, or access patterns that jump by fixed amounts within a file. For example, reading byte 1, then byte 1001, then byte 2001, and pre-fetching appropriately. Measure how much of an impact detecting strides has on performance over your basic implementation by creating additional test programs in test_scripts/correctness_tests.py and test_scripts/performance_tests.py. Document how you implemented stride detection and its performance impact in your README file.

Memory-Mapped I/O

Difficult. Research “memory-mapped I/O” and how it works, then use it in your implementation when possible. Measure how much of an impact memory-mapped I/O has on performance over your basic implementation by creating additional test programs in test_scripts/correctness_tests.py and test_scripts/performance_tests.py. Document how you implemented memory-mapped I/O and its performance impact in your README file.
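As a rough taste of what memory-mapped I/O looks like, here is a minimal sketch that reads a file's bytes through mmap(2) instead of read(2). This is an illustrative assumption of one possible approach, not a drop-in optimization: a real implementation must also handle empty files, writes, file growth, and msync.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole (non-empty) file and copy up to n of its bytes into dst.
 * "Reading" becomes a plain memcpy with no per-byte syscalls. */
static ssize_t mmap_read(const char *path, char *dst, size_t n) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the fd is closed
    if (map == MAP_FAILED) return -1;
    if (n > len) n = len;
    memcpy(dst, map, n);  // no syscall: the kernel pages data in on demand
    munmap(map, len);
    return (ssize_t)n;
}
```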

Handing In & Grading

Handin instructions

As before, you will hand in your code using Git. In the fileio/ subdirectory of your project repository, you MUST fill in the text file called README.md.

Remind me again what the README.md should contain?
The README.md file will include the following:
  1. Any design decisions you made and comments for graders, under "Design Overview". If there's nothing interesting to say, just list "None".
  2. Any collaborators and citations for help that you received, under "Collaborators". CS 300 encourages collaboration, but we expect you to acknowledge help you received, as is standard academic practice.
  3. Notes on approximately how long it took you to complete the project. We won't use this for grading, but we collect the data to calibrate for future years.

Grading breakdown

Graduate students taking CSCI 1310 should implement and measure one performance optimization, and describe it in their README. Make sure to cover, in a few sentences, what the optimization is, why you expected it to improve performance, and whether it did or did not improve performance in your experiments.

Now head to the grading server, make sure that you have the “Caching I/O” page configured correctly with your project repository, and check that your tests pass on the grading server as expected.

Congratulations, you’ve completed the third CS 300 project! :page_facing_up:

Appendix I: Definitions of Terms

Here is a list of some words with non-obvious definitions.

  1. API (Application Programming Interface) – This is a formal contract that a piece of software uses to declare to the outside world how it can be used. An example of this would be the List<E> interface in Java which states that anything claiming to be a List can be asked to add(), remove(), insert(), …, and other “list-like” things.
  2. System Call (syscall) – A function provided by your operating system to accomplish important things like reading/writing to files. These are considered “slow” (relatively) because the kernel has to temporarily “stop” running your code, “start” running its own code (the syscall), and then “resume” your code.
  3. Seek – To move the read/write head of a file to a new position. For example, you can “seek” from the beginning of a file to the end of a file to start adding new data to the end. See Appendix II.
  4. Buffer – Often you will see variables with names like buf, buff, or buffer. These are common names for chunks of memory that you are using as an intermediate storage location before doing something else with that data.

Appendix II: Unix

Here are some statements about files that should be true for most Unices (UNIX-derived operating systems, such as Linux or macOS).

  1. A file is an ordered sequence of bytes - not text, not ASCII characters, not UTF-8 characters, but just bytes. The way we interpret the bytes present in a file leads to a file’s colloquial “type” (like a text file, or a CSV, or a compiled binary). To this end, file extensions (.txt, .csv, .html, .c, .exe) are largely irrelevant. They merely serve as hints as to how their bytes should be interpreted. In fact, there is nothing special about the . character in a filename. So rather than thinking about a file called malte.jpeg as an object called malte of type .jpeg, it should be considered to be a sequence of bytes whose name is malte.jpeg (additionally, you may take the .jpeg suffix to informally mean “this probably contains data in the JPEG format and is suitable to be opened by photo viewers”).

    $ gcc helloworld.c -o helloworld
    $ ./helloworld
    hello world
    $ gcc helloworld.c -o helloworld.cool.program.300
    $ ./helloworld.cool.program.300 
    hello world
    $ head -c 10 helloworld.cool.program.300 | xxd
    00000000: cffa edfe 0700 0001 0300     ..........

    It’s just bytes!

  2. The way we interact with files is through file descriptors. A file descriptor is an integer that the operating system uses to identify a file. Any time you want to do something to that file, you pass the operating system the file descriptor as a reference. The way we get file descriptors is with the open syscall.

    int fd = open("myfile.txt", O_RDONLY);

    From now on, if we want to do something with myfile.txt, we will pass the operating system fd so that it knows which file to manipulate on our behalf.

  3. Once a file has been opened, we can start manipulating it. When a file is open, the operating system keeps track of an index into that file that represents the “read/write head” of that file. This is like using your finger to scan across the lines while you read. Wherever the “head” points is the next byte to be read, and if we want to write a byte, the location where the next byte will be written.

  4. Once we have a file descriptor (which identifies a file), there are three basic operations we can do to modify the bytes within that file.
    4.1 read(fd, buffer, size) will read size bytes from the file identified by fd into the location in memory pointed to by buffer. read returns the number of bytes that were successfully read from the file, and then increments the file’s read/write head by that number of bytes.
    4.2 write(fd, buffer, size) will write size bytes starting at memory location buffer into the file identified by fd. write returns the number of bytes that were successfully written to the file, and then increments the file’s read/write head by that number of bytes.
    4.3 lseek(fd, index, SEEK_SET) will change the value of the read/write head of a file. This allows us to “skip” around in a file and read/write from arbitrary locations (we don’t always want to read/write sequentially from the start of a file).
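The three operations above compose naturally; the hypothetical helper below (head_demo is not a course function) shows how each call moves the read/write head:

```c
#include <fcntl.h>
#include <unistd.h>

/* Reads the first byte of a file, seeks to offset 5, and reads the
 * sixth byte, demonstrating how read(2) and lseek(2) move the head. */
static int head_demo(const char *path, char *first, char *sixth) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (read(fd, first, 1) != 1) { close(fd); return -1; }     // head: 0 -> 1
    if (lseek(fd, 5, SEEK_SET) != 5) { close(fd); return -1; } // head -> 5
    if (read(fd, sixth, 1) != 1) { close(fd); return -1; }     // head: 5 -> 6
    close(fd);
    return 0;
}
```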

Appendix III: Helpful Commands

man 2 open
man 2 close
man 2 read
man 2 write
man 2 lseek

Appendix IV: strace

strace (on Linux) allows you to view the system calls that a program makes. Here is an example of running strace on a plain hello world C program (fed in through stdin). Pay attention to the final lines of the output, namely the calls to write() and exit_group(). Everything else you see is just setup as the program starts and can be ignored.

$ printf 'int main(){printf("hello world\\n");}' \
>           | gcc -w -x c - && strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffe85175530 /* 26 vars */) = 0
brk(NULL)                               = 0x563c71960000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26696, ...}) = 0
mmap(NULL, 26696, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8ee6db4000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\35\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2030928, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8ee6db2000
mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8ee67a1000
mprotect(0x7f8ee6988000, 2097152, PROT_NONE) = 0
mmap(0x7f8ee6b88000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f8ee6b88000
mmap(0x7f8ee6b8e000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8ee6b8e000
close(3)                                = 0
arch_prctl(ARCH_SET_FS, 0x7f8ee6db34c0) = 0
mprotect(0x7f8ee6b88000, 16384, PROT_READ) = 0
mprotect(0x563c6fd55000, 4096, PROT_READ) = 0
mprotect(0x7f8ee6dbb000, 4096, PROT_READ) = 0
munmap(0x7f8ee6db4000, 26696)           = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
brk(NULL)                               = 0x563c71960000
brk(0x563c71981000)                     = 0x563c71981000
write(1, "hello world\n", 12hello world
)           = 12
exit_group(0)                           = ?
+++ exited with 0 +++

We can use strace to see how frequently our programs are making calls to read() and write() as an indirect way to measure performance.

The following is a demonstration of the number of these syscalls that take place when running byte_cat on a 755 byte long file.

Notice how the naive implementation makes around 750 reads and writes (this makes sense because the naive implementation calls read() and write() once per character), while the stdio implementation makes only a handful of read and write calls (fewer than 10).

$ wc -c test_files/words.rot13.txt 
755 test_files/words.rot13.txt
$ make -B IMPL=naive byte_cat
gcc -ggdb3 -Wall -Wextra -Wshadow -std=gnu11 -fsanitize=address -fsanitize=undefined -fsanitize=leak test_programs/byte_cat.c impl/naive.c -o byte_cat
$ strace ./byte_cat test_files/words.rot13.txt out.txt 2>&1 | grep read | wc -l
$ strace ./byte_cat test_files/words.rot13.txt out.txt 2>&1 | grep write | wc -l
$ make -B IMPL=stdio byte_cat
gcc -ggdb3 -Wall -Wextra -Wshadow -std=gnu11 -fsanitize=address -fsanitize=undefined -fsanitize=leak test_programs/byte_cat.c impl/stdio.c -o byte_cat
$ strace ./byte_cat test_files/words.rot13.txt out.txt 2>&1 | grep read | wc -l
$ strace ./byte_cat test_files/words.rot13.txt out.txt 2>&1 | grep write | wc -l

Appendix V: Implementation Swapping

This section explains how our test programs can use different implementations that match the io300.h API.

Consider the following to be our test program:

#include <stdio.h>

// extern means:
// "I'm not defining this function here, but someone
// else (another source file) will have to define it
// before the program is executed"
extern int getRandomNumber();

int main() {
    printf("%d\n", getRandomNumber());
}

Now we want an implementation of the getRandomNumber function that we can link with the test program to produce a fully functional number printer.

// real-rand.c -- delegate to C standard library (man 3 rand)
#include <stdlib.h>

int getRandomNumber() {
    return rand();
}

// my-rand.c -- our own implementation (very fast)
int getRandomNumber() {
    return 4;
}

Now when we compile the main test program, we have a choice. Since both implementations conform to the same public API (namely, they provide one function getRandomNumber :: void -> int), either can be swapped in and the main program won’t know the difference.

$ gcc test-prog.c real-rand.c && ./a.out
16807
$ gcc test-prog.c my-rand.c && ./a.out
4

This is the principle behind the different implementations for this project, but instead of the public API being a single function, it is a family of I/O-related functions defined in io300.h.

At a high level, to test your implementation, we will be doing something like this:

$ gcc our-test-program.c impl/stdio.c && time ./a.out
$ gcc our-test-program.c impl/naive.c && time ./a.out
$ gcc our-test-program.c impl/student.c && time ./a.out

to compare the speed of your implementation against the provided ones.

Appendix VI: Writing Custom Tests

While you are certainly not required to design your own tests (passing the autograder tests we provide is more than enough), this assignment has a lot of moving parts! This appendix will help you write custom tests, which you may want to do to exercise a specific feature (e.g. writing) without requiring everything else to work first.

Let’s create a test program, student_test.c which opens a file, checks its filesize, and writes out an equivalent number of Zs to an output file.

To create this custom test:

  1. Create a new test file, student_test.c, in the test_programs folder:
#include <stdio.h>
#include "../io300.h"

int main(int argc, char *argv[]) {
    /* our test takes 3 arguments: program, infile, and outfile
     * if the wrong number of arguments is provided, this test fails! */
    if (argc != 3) {
        fprintf(stderr, "usage: %s <in-file> <out-file>\n", argv[0]);
        return 1;
    }

    // open our input file, as we need to read its contents
    struct io300_file *in = io300_open(argv[1], "student test infile");
    // if our input file doesn't exist, this test fails!
    if (in == NULL) {
        return 1;
    }

    // open our output file
    struct io300_file *out = io300_open(argv[2], "student test z-ified outfile");
    // if our output file couldn't be opened, this test fails!
    if (out == NULL) {
        io300_close(in);
        return 1;
    }

    off_t const filesize = io300_filesize(in);
    // if we can't determine the input filesize, this test fails!
    if (filesize == -1) {
        fprintf(stderr, "error: could not compute filesize\n");
        io300_close(in);
        io300_close(out);
        return 1;
    }
    io300_close(in);

    // write out our Zs!
    for (int i = 0; i < filesize; i++) {
        // if a write fails, this test fails!
        if (io300_writec(out, 'Z') == -1) {
            fprintf(stderr, "error: write should not fail.\n");
            io300_close(out);
            return 1;
        }
    }
    io300_close(out);

    // if nothing went wrong, our test passed!
    return 0;
}
  2. Add your test name to the Makefile

In order to create the binary for this test, add its name (for this example, student_test) to TEST_PROGRAMS in the project Makefile. This ensures that the new test gets built along with the existing autograder tests when make is run.
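Concretely, this is a one-word edit to an existing variable. The other entries shown below are illustrative only (check your Makefile for the real list); what matters is appending student_test:

```make
# project Makefile (sketch) -- append the new test's name to the list
TEST_PROGRAMS = byte_cat block_cat reverse_byte_cat rot13 student_test
```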

  3. Test² – Test your test!

A bad test can be very misleading, leading you to think that your code is failing when in actuality it may be completely correct. Try running your test manually on sample files from test_files, and triple check that everything seems to work as intended.

  4. Add your test to the autograder

This project has autograders for both correctness (test_scripts/correctness_test.py) and performance (test_scripts/performance_test.py). Let's add our test to the correctness autograder, since we're testing the results of our write function.

Our autograder runs each provided test, and asks two questions: did it pass (i.e. returned 0), and is the output actually correct?

Let’s add a new Python function for our test:

def student_test(infile, outfile, outfile2):
    # create a file equal in size to infile of all Zs (manually);
    # (you don't need to understand this!)
    assert shell_return(f"printf '%*s' $(cat {infile} | wc -c) | tr ' ' 'Z' > {outfile2}") == 0
    # check if our test passes, and if the output file is correct
    return shell_return(f'./student_test {infile} {outfile}') == 0 \
        and files_same(outfile, outfile2)

You can see the two parts of our test above. Did it pass? Well, shell_return(f'./student_test {infile} {outfile}') == 0 means an error wasn’t raised. Is our output correct? If our manual and generated “Z-files” are the same, we can trust that it is: files_same(outfile, outfile2).

Finally, we can add our new test to our runtests block at the bottom of our autograder file, which handles the execution of each test:

runtests({
    # ...other existing tests
    'rot13': rot13,
    # your new test!
    'student_test': student_test
})
  5. Run the autograder

Huzzah: your test is complete! Running make check should provide the results of your custom test, which hopefully passes :)

byte_cat: PASSED
reverse_byte_cat: PASSED
block_cat_1: PASSED
ascii_independence: PASSED
block_cat_17: PASSED
block_cat_334: PASSED
block_cat_huge: PASSED
block_cat_gargantuan: PASSED
reverse_block_cat_1: PASSED
reverse_block_cat_13: PASSED
reverse_block_cat_987: PASSED
reverse_block_cat_huge: PASSED
reverse_block_cat_gargantuan: PASSED
random_block_cat: PASSED
rot13: PASSED
student_test: PASSED

This test was just one sample test you could write. All of our autograder tests are available in test_scripts/, so feel free to reference them in creating your own. Further, the process for adding performance tests is very similar; reference test_scripts/performance_test.py to see the specific syntax used for its tests!

Be careful when modifying tests or the autograder, as changing existing tests could lead to misleading autograder results.

Acknowledgements: The IO project was developed for CS 300.