Lab 1: C Programming and Makefiles
Due Tuesday, February 6th at 8:00 PM EST
Before attempting this lab, please make sure that you have:
1. Completed Lab 0 – This will ensure that your Docker container and grading server account are set up properly.
2. Completed the Diversity Survey – Your grades for Lab 0 and Lab 1 will depend on whether you’ve submitted this (though you don’t have to answer any of the questions).
3. Signed up for Section on CAB – Our first round of sections will start this week!
Introduction
The purpose of this lab is to give you some experience with writing and understanding the syntax of C programs and with the tools used to compile and run them. After this lab, you will also be more familiar with pointers and why C and C++ use them.
If you take away anything from this course, hopefully, it’s that Computer Systems are not magic and that much of it actually makes a lot of sense. Don’t be afraid to look up questions on Stack Overflow and Linux Man Pages (which provide great documentation on C library functions), and if that doesn’t help, ask on EdStem!
Why C?
Here are some of the highlights: C is an imperative programming language that was mainly developed as a systems programming language to write operating systems. The main features of the C language include low-level access to memory, a simple set of keywords, and a clean style; these features make C suitable and widely used for systems programming. C gives you a huge amount of power over what the computer does, which helps you optimize the performance of your programs and allows you to write low-level software that interacts directly with hardware. It also gives you the awesome feeling of really being in control. But with that power comes the responsibility to use it correctly: C has very few safeguards to protect your program’s data or exit gracefully when you make mistakes, and it will happily overwrite your memory with garbage or make your program explode. Don’t worry, though, we’ll help you find and avoid these mistakes!
If you are looking for a detailed tutorial on C, check out the links on our C primer.
Assignment
Assignment Installation
Start with the cs300-s24-labs-YOURNAME
repository you used for Lab 0.
First, ensure that your repository has a handout
remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s24-labs.git
Then run:
$ git pull
$ git pull handout main
This will merge our Lab 1 stencil code with your previous work.
You may get a warning about “divergent branches”.
This means you need to set your Git configuration to merge our code by default. To do so, run (within the same directory in your labs repository workspace):
$ git config pull.rebase false
$ git pull handout main
If you have any “conflicts” from Lab 0 (although this is unlikely), resolve them before continuing further. Run git push
to save your work back to your personal repository.
Assignment Part I: C Programming
In this part of the lab, you will be writing a program that will reverse an array of strings (or, as they are known in C, char pointers). You will be writing two functions in the file reverse.c
and you will test your implementation with the code found in test_reverse.c
.
Setup
After you set up the lab, you should find within the lab1
folder a couple of files:
File | Description
reverse.h | Header file for reverse.c. Contains declarations for the function you should be implementing (explained below).
reverse.c | You will be writing your code in this file.
test_reverse.c | Contains the test suite with which your implementation will be tested.
Header Files
You’ll notice that there are three files in the provided stencil code. reverse.c
and test_reverse.c
are similar to what we’ve seen before, containing C code. But what about reverse.h
?
Files ending in the .h
extension are called header files, and declare functions so that they can be used in multiple different .c
files. Without a header file, test_reverse.c
wouldn’t be able to use the functions that you create in reverse.c
, which would make testing impossible!
reverse.h
includes the signature of the reverse_arr
function, but with no implementation:
void reverse_arr(char** arr, int num);
and then test_reverse.c
has this line at the top, telling the C compiler to look for function declarations in reverse.h:
#include "reverse.h"
This way, when the reverse_arr
function is used in test_reverse.c
, the C compiler checks reverse.h
for a matching signature, and the implementation from reverse.c
is linked in when the executable is built.
Review of pointers and strings
Pointers are memory locations that store addresses (i.e., they “point” at whatever is at that address!). For instance, int*
is a pointer to an integer. On a 64-bit architecture (which most computers today use), the pointer occupies 8 bytes in memory, which store the address it points to. That address refers to the first byte of a 4-byte sequence of memory that stores an int
.
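To make these sizes concrete, here is a minimal sketch. It assumes a typical 64-bit platform where an int occupies 4 bytes (the C standard guarantees neither size), and the helper names read_through and print_sizes are made up for illustration; they are not part of the lab stencil.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper (not part of the lab stencil): returns the int
// stored at the address that p holds.
int read_through(const int* p) {
    assert(p != NULL);  // dereferencing a null pointer is undefined behavior
    return *p;
}

// Prints the sizes discussed above. On most 64-bit machines this reports
// 8 bytes for the pointer and 4 bytes for the int.
void print_sizes(void) {
    printf("sizeof(int*) = %zu, sizeof(int) = %zu\n", sizeof(int*), sizeof(int));
}
```

For an int variable x, the expression &x produces its address, so read_through(&x) returns the value of x.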
As you will notice, there isn’t an explicit data type called “string” in C. That’s because strings in memory are just a sequence of bytes, each represented as a char
(a 1-byte value). Instead of having a datatype explicitly called “string”, in C, you can think of char
pointers (i.e., char*
) as strings.
char* store = "hello";
for (int i = 0; *(store + i) != '\0'; i++) {
    printf("character: %c\n", *(store + i));
}
Since store
is defined as a char
pointer, store
will point to a byte of memory that stores a character. And if you increment the value of the pointer by 1 (going to the next box) and dereference that value, you would get the next character of the string. This raises the question: couldn’t you just keep incrementing this pointer? How would you know where the end of the string is?
The answer: all strings in C are terminated by a NUL
byte (a char
storing numeric value 0), also known as \0
. This byte indicates that you have reached the end of the string.
Consider the following memory layout for the example program above:
  store           0x2000 0x2001 ....
+---+---+      +-----+-----+-----+-----+-----+------+
| 0x2000| ---> | 'h' | 'e' | 'l' | 'l' | 'o' | '\0' |
+---+---+      +-----+-----+-----+-----+-----+------+
Now think through what happens at each iteration of the loop:
- If i == 0, then *(store + 0) dereferences the memory address stored in store, which is 0x2000, and at address 0x2000, the character “h” is stored as a single byte. This is equivalent to writing store[0].
- If i == 1, then *(store + 1) dereferences the memory address store + 1, which is 0x2001. At 0x2001, there is the character “e”. This is equivalent to writing store[1].
What you saw here is an example of pointer arithmetic, that is, arithmetic on memory addresses.
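As a further illustration of pointer arithmetic, the sketch below walks a string until it reaches the terminating NUL byte and then subtracts the two addresses to get the length. The name count_chars is hypothetical; in real code you would most likely call the standard strlen function instead.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: counts the characters before the terminating
// NUL byte, much like the standard strlen function does.
size_t count_chars(const char* s) {
    assert(s != NULL);
    const char* p = s;
    while (*p != '\0') {
        p++;  // advance by one char, i.e., one byte
    }
    return (size_t)(p - s);  // distance between the two addresses
}
```

Because each char is one byte, the difference p - s is exactly the number of characters that were skipped over.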
Let’s get coding!
Task: You will be writing two functions in the file reverse.c
:
reverse_arr
will take in two inputs, a char*
array and the number of input elements in the array. And reverse_arr
will reverse the inputted array with the help of another function called swap
.
- Note: you can assume that the array contains exactly as many elements as the second argument specifies, that the array is never empty, and that all elements are defined (i.e., not
NULL
).
swap
will take in two elements from the array and swap them.
Note: Remember that pointers are also passed by value as arguments to a function, meaning that a copy of the address is passed. Thus, if you assign a new value to a pointer parameter, you change only the address held by the function's local copy, not the memory object that the pointer points to. Since the address value is copied when passing a pointer, such a change will not be reflected outside that function. To change data that the caller can see, you have to dereference the pointer and write to the memory it points to.
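The difference between changing a pointer and changing what it points to can be sketched with plain ints. Both function names below are invented for this illustration, and the example deliberately does not use strings, so it is not a solution to the task.

```c
#include <assert.h>
#include <stddef.h>

// Only changes the function's local copy of the address; the caller's
// pointer and the int it points to are unaffected.
void retarget_copy(int* p, int* other) {
    assert(p != NULL && other != NULL);
    p = other;
    (void)p;  // silence "set but not used" warnings
}

// Writes through the pointer, so the caller's int changes.
void write_through(int* p, int value) {
    assert(p != NULL);
    *p = value;
}
```

After calling retarget_copy(pa, &b), the caller's pa still points where it did before; after write_through(pa, 10), the int that pa points to holds 10.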
Running and Testing
Once you have finished writing your code, you are ready to test your implementation!
To test your code, we provide a file called test_reverse.c
which calls reverse_arr
, which you implemented in reverse.c
. In order to run it, you must compile it and link the two files together into an executable by running the following command:
$ gcc test_reverse.c reverse.c -o reverse_test
This generates an executable called reverse_test
. An executable is a special file that contains machine instructions encoded as 0s and 1s. Running this file causes the computer to perform those instructions.
In this case, those instructions are to run the program starting from main()
, which first parses input from the command line, reverses the array given, and calls functions that run the tests found in test_reverse()
. (One of the tests opens a file called test.txt
in the current directory, reverses each line of the file, and writes the result to an output file called testout.txt.
)
You run your executable via:
$ ./reverse_test <NUM_ELEMENTS> <ELEMENT0 ELEMENT1 ELEMENT2 ...>
For instance:
$ ./reverse_test 2 hello world
will print out the results of reversing the input array and running the test suite. If you fail a test, the output provides the expected result at a given index in the array and the actual result.
To debug, you may find it helpful to print what’s happening in your swap
and reverse_arr
functions. For instance, if you wanted to print the variable store from the code sample above, you could write:
printf("string: %s\n", store);
to print a string.
Hint: If you want to print out the value of a pointer, use the %p
syntax for printf.
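Here is a small sketch that combines both format specifiers. The helper name debug_print_string is hypothetical, and the address it prints will differ from run to run and from machine to machine.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical debugging helper: prints the address held by a char
// pointer (%p) followed by the string stored at that address (%s).
// Returns the number of characters printed, or a negative value on error.
int debug_print_string(const char* label, const char* s) {
    assert(label != NULL && s != NULL);
    // %p expects a void*, so cast the pointer explicitly.
    return printf("%s: address %p, contents \"%s\"\n", label, (void*)s, s);
}
```

For example, debug_print_string("store", store) prints one line containing both where the string lives and what it contains.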
Once all of your tests pass, you are ready to move on!
Assignment Part II: More on Compiling
As you saw from the previous section, you compiled your program by running:
$ gcc test_reverse.c reverse.c -o reverse_test
With the -o
flag, you can direct the output of the gcc compiler into a file specified by the argument following the flag. If you didn’t use the -o flag, you could run:
$ gcc test_reverse.c reverse.c
And this will produce an executable file called a.out
(this is just a default filename defined by the compiler), which you can run by typing ./a.out
.
Flags
The gcc compiler supports the use of hundreds of different flags, which we can use to customize the compilation process. Flags, typically prefixed by a dash or two (-<flag>
or --<flag>
), can help us in many ways from warning us about programming errors to optimizing our code so that it runs faster.
The general structure for compiling a program with flags is:
$ gcc <flags> <c-files> -o <executable-name>
Warning Flags:
-Wall
- One of the most common flags is the
-Wall
flag. It will cause the compiler to warn you about technically legal but potentially problematic syntax, including:
- Uninitialized and unused variables
- Incorrect return types
- Invalid type comparisons
-Werror
- The
-Werror
flag forces the compiler to treat all compiler warnings as errors, meaning that your code won’t be compiled until you fix the errors. This may seem annoying at first, but in the long run, it can save you lots of time by forcing you to take a look at potentially problematic code.
-Wextra
- This flag adds a few more warnings that are not covered by -Wall (and which will appear as errors thanks to
-Werror
). Some problems that -Wextra
will warn you about include:
- Assert statements that always evaluate to true because of the datatype of the argument
- Unused function parameters (only when used in conjunction with
-Wall
)
- Empty if/else statements.
Task: Add the -Wall
, -Werror
, and -Wextra
flags when compiling test_reverse.c
and fix the errors that come up.
Notice that in test_reverse.c
the main()
function takes in two parameters:
- What’s
argc
supposed to do?
argc
indicates the number of arguments passed into the program.
- Use argc and change the body of
main()
so that when:
argc == 1
then only the test suite should be executed
argc > 1
, the arguments on the command line are in the following order:
- The number of elements to be reversed
- The elements to be reversed
- For example:
./reverse_test 2 csci 300
- You should check to make sure that the number of elements entered by the user actually corresponds to the stated number of elements to be reversed.
- For example:
./reverse_test 2 csci
should cause an error
- Make sure to return 1 from main on an error, so that the OS can detect that your program exited with errors.
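To see how argc and argv relate, consider the sketch below. The helper list_args is made up for illustration and is not a solution to the task; it only shows that argv[0] is conventionally the program name, so argc is one larger than the number of arguments the user typed.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: prints every command-line argument and returns how
// many were supplied by the user. argv[0] is conventionally the program
// name, so it is counted by argc but not by the return value.
// This sketch assumes argc is at least 1.
int list_args(int argc, char** argv) {
    assert(argc >= 1 && argv != NULL);
    for (int i = 0; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
    return argc - 1;
}
```

If main passes its own argc and argv to this helper, running ./reverse_test 2 csci 300 gives argc == 4, and the helper returns 3.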
Debugging with Sanitizers: The warning flags don’t catch all errors. For example, memory leaks, stack or heap corruption, and cases of undefined behavior are often not detected by the compiler. You can use sanitizers to help with identifying these bugs! Sanitizers sacrifice efficiency to add additional checks and perform analysis on your code. You will be using these flags in the next lab in greater detail.
-
-fsanitize=address
- This flag enables the AddressSanitizer program, which is a memory error detector developed by Google. This can detect bugs such as out-of-bounds access to heap / stack, global variables, and dangling pointers (using a pointer after the object being pointed to is freed). In practice, this flag also adds another sanitizer, the LeakSanitizer, which detects memory leaks (also available via
-fsanitize=leak
).
-
-fsanitize=undefined
- This flag enables the UndefinedBehaviorSanitizer program. It can detect and catch various kinds of undefined behavior during program execution, such as using null pointers, or signed integer overflow.
-
-g
- This flag asks the compiler to generate and embed debugging information in the executable, such as which source lines and variable names the machine instructions correspond to. This provides more specific debugging information when you’re running your executable with gdb or address sanitizers. You will see this flag being utilized in the next lab.
Optimizations
In addition to flags that let you know about problems in your code, there are also optimization flags that will speed up the runtime of your code at the cost of longer compilation times. Higher optimization levels will optimize by running analyses on your program to determine if the compiler can make certain changes that improve its speed. The higher the optimization level, the longer the compiler will take to compile the program, because it performs more sophisticated analyses on your program. These are the capital O
flags, which include -O0
, -O1
, -O2
, -O3
, and -Os
.
-
-O0
- This will compile your code without optimizations — it’s equivalent to not specifying the
-O
option at all. Because higher optimization levels will often remove and modify portions of your original code, it’s best to use this flag when you’re debugging with gdb or address sanitizers.
-
-O3
- This will enable the most aggressive optimizations, which usually makes your code run the fastest.
Task: Time your program before you add the -O3
flag and then after you’ve added the -O3
flag to your compilation. Because this program is so small, you probably won’t be able to detect a difference in speed, but in future assignments where there is a lot more code, the optimization flag will come in handy.
The -O3
flag will ask the compiler to examine what your code is trying to do and rather than following the provided code verbatim it will replace it with machine instructions that functionally do the same thing, but in a more efficient manner.
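As a rough picture of what "functionally the same thing, but more efficient" can mean, compare the two functions below. This is only an illustration with made-up names: an optimizing compiler may transform a loop like the first into something resembling the second, but whether a particular compiler and version does so is not guaranteed.

```c
#include <assert.h>

// Straightforward loop: adds 1 + 2 + ... + n one term at a time.
unsigned long sum_loop(unsigned long n) {
    unsigned long total = 0;
    for (unsigned long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

// Closed form computing the same result without a loop.
unsigned long sum_formula(unsigned long n) {
    return n * (n + 1) / 2;
}

// Sanity check that both versions agree for a given (small) n.
void check_same(unsigned long n) {
    assert(sum_loop(n) == sum_formula(n));
}
```

Both functions return the same value for small n, but the second does a constant amount of work no matter how large n is.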
You can time your program by running the time command in your Docker container. For this exercise, pay attention to the real time, but if you’re curious about the different types of times below, check out this post.
$ time ./reverse_test
real 0m0.007s
user 0m0.002s
sys 0m0.000s
Assignment Part III: Makefiles
Now you know how to compile C programs! This is great, but actual software projects rarely require you to invoke the compiler directly like we did so far. Often (e.g., in the CS 300 projects!) you need to compile many source files and use specific sets of flags. It’s very easy to forget a flag or source file, and doing this all by hand on the command line is time-consuming. Additionally, when you have many source files (more than 2), it can be annoying to individually recompile/relink each source file when you make a change to it.
This is why the make tool was created! Running the make tool will read a file called the Makefile for specifications on how to compile and link a program. A well-written Makefile
automates all the complicated parts of compilation, so you don’t have to remember each step. Additionally, Makefiles can do tasks other than just program compilation: they can execute any shell command we provide.
In this part of the lab, you will be writing a Makefile to use when compiling your reverse array program.
A Makefile consists of one or more rules. The basic structure of a Makefile rule is:
<target>: <dependencies>
[ tab ]<shell_command>
- The target is the name of an output file generated by this rule, or a rule label that you choose in certain special cases.
- The dependencies are the files or other targets that this target depends on.
- The shell command is the command you want to run when the target or dependencies are out of date.
- General Rules:
- From gnu.org: A target is out of date if it does not exist or if it is older than any of the dependencies (by comparison of last-modification times). The idea is that the contents of the target file are computed based on information in the dependencies, so if any of the dependencies changes, the contents of the existing target file are no longer necessarily valid.
- If a target is out of date, running make
<target>
will first remake any of its target dependencies and then run the <shell_command>
.
- In general, the name of the Makefile target should be the same as the name of the output file, because then running make
<target>
will rebuild the target when the output file is older than its dependencies.
Linking is the process of combining many object files and libraries into a single (usually executable) file. If you look at the file test_reverse.c
, at the top, you can see there is an #include "reverse.h"
. This is so that we can use the functions that you wrote to test them, and as you can see, reverse_arr
is called in the function test_reverse
. You can link these two files together with the following Makefile rule:
reverse_test: test_reverse.c reverse.c reverse.h
	gcc test_reverse.c reverse.c -o reverse_test
The target is the executable named reverse_test, the dependencies are test_reverse.c
, reverse.c
, and reverse.h
. And to compile, instead of typing the shell command, you can just type:
$ make reverse_test
This will cause make to run the reverse_test
target, which will execute the command gcc test_reverse.c reverse.c -o reverse_test
if a reverse_test
executable doesn’t exist or if the reverse_test
executable is older than any of the dependencies. Notice how this only works properly if the name of the output executable is the same as the target name.
That was a lot of reading and information, but now you are ready to create your own Makefile!
Task:
- Create an empty Makefile by typing
touch Makefile
in your lab directory.
- Modify your Makefile so that it has one target,
reverse_test
, that will compile reverse.c
and test_reverse.c
.
- Run
make reverse_test
to make sure it compiles successfully. (You may need to delete the reverse_test
binary via rm -f reverse_test
to make this work.)
Variables
Makefiles support defining variables, so that you can reuse flags and names you commonly use. MY_VAR = "something"
will define a variable that can be used as $(MY_VAR) or ${MY_VAR}
in your rules. A common way to define flags for C program compilation is to have a CFLAGS
variable that you include whenever you run gcc. For example, you can then rewrite your target like this:
CFLAGS = -Wall -Werror
reverse_test: test_reverse.c reverse.c
	gcc $(CFLAGS) test_reverse.c reverse.c -o reverse_test
Automatic variables are special variables that can have a different value for each rule in a Makefile and are designed to make writing rules simpler. They can only be used in the command portion of a rule!
Here are some common automatic variables:
$@
represents the name of the current rule’s target.
$^
represents the names of all of the current rule’s dependencies, with spaces in between.
$<
represents the name of the current rule’s first dependency.
If we wanted to avoid repeating test_reverse.c
and reverse.c
in the command, we could rewrite our target like this:
reverse_test: test_reverse.c reverse.c reverse.h
	gcc $(CFLAGS) $^ -o $@
Task: Use regular variables (i.e. CFLAGS
) and automatic variables to simplify your Makefile and add the -O3
flag.
Note: you can do MY_VAR += <additional flags>
if you want to compile with more flags and only use one variable.
Phony Targets
There are also targets known as ‘phony’ targets. These are targets that themselves create no files, but rather exist to provide shortcuts for doing other common operations, like making all the targets in our Makefile or getting rid of all the executables that we made.
To mark targets as phony, you need to include this line before any targets in your Makefile:
.PHONY: target1 target2 etc.
Why do we need to declare a target as phony?
- To avoid a conflict with a file of the same name: We learned earlier that targets will only execute their
<shell_command>
if the target file is out-of-date. This is problematic because phony targets generally don’t create files under the target name. If somehow there exists a file under the same name as a phony target, the phony target’s command will never be run. You can avoid this by explicitly declaring a target as phony to specify to the make tool to rebuild the target even if it’s not “out-of-date”.
- To improve performance: there’s also a more advanced performance advantage that you can learn more about here.
Here are some common phony targets that we’ll be using in this course:
all
target
We use the all target to make all of the executables (non-phony targets) in our project with a single command. This is what it generally looks like:
all: target1 target2 target3
As you can see, there are no shell commands associated with the all target. In fact, we don’t need to include shell commands for all, because by including each target (target1, target2, target3) as dependencies for the all target, the Makefile will automatically build those targets in order to fulfill the requirements of all.
In other words, since the all
target depends on all the executables in a project, building the all target causes make to first build every other target in our Makefile.
clean
target
We also have a target for getting rid of all the executables (and other files we created with make) in our project. This is the clean target.
The clean target generally looks like this:
clean:
	rm -f exec1 exec2 obj1.o obj2.o
As you can see, the clean target is fundamentally just a shell command to remove all the executables and object files that we made earlier. By convention, the clean target should remove all content automatically generated by make. It must be a phony target, because by definition, make clean doesn’t generate output files (but rather removes them)!
Note: Be careful which files you put after the rm -f command
, as they will be deleted when you run make clean. Don’t put your .c
or .h
files because you might lose the code that you wrote!
format
target
In this class, you will notice that all of the Makefiles will also contain a format target, which uses a command called clang-format
to style your .c
and .h
files following a specified standard. A typical format command would look like this:
format:
	clang-format -style=Google -i <file1>.h <file2>.c ...
The above command will format any listed files according to Google’s coding conventions (a set of stylistic and technical conventions that Google engineers agreed to use).
Note: When using this, keep in mind the order of your #include
files. Formatting might change the order of include statements. This matters if, for example, a header file relies on other headers (such as standard library headers) that the including file happens to include before it. To avoid this problem, make sure that your header files are self-contained (i.e., include all the headers they need).
check
target
You’ll also notice a check target in the Makefiles we provide in future labs and projects. If you were to create a check target in this particular instance, the dependency for the check target is the reverse_test
executable.
Task: Add all
, clean
, and format
targets to your Makefile.
- Running make without any targets will run the first target in your Makefile. Consequently, you should place the
all
target as the first target so that typing make will automatically generate all the executables.
- Don’t forget to mark these targets as phony!
Simplifying Linking
It is often a good idea to break compilation of a large program into smaller sub-steps. Consider, for example, this command you used earlier:
gcc test_reverse.c reverse.c -o reverse_test
For this program, gcc creates two separate .o files, one for test_reverse.c
and one for reverse.c
and then links them together. But what if you had hundreds of source files?
Large vs. Small Projects: For small projects, the above works well. However, for large projects it can be much faster to generate intermediate .o files (so-called “object files”) and then separately link the .o files together into an executable. Linking is the process of combining multiple object files (which already contain machine code, but not a full program) into a full executable program.
Why does this make sense? Imagine a project that generates two shared libraries and four executables, all of which separately link a file called data.c
. Let’s say the data.o
file takes 1 second to compile. If you compile and link each library and executable in one command (without creating intermediate .o files), gcc will rebuild the data.o file six times, resulting in 6 seconds of build time. If you separately build the data.o file, you’ll compile the data.c file only once (taking 1 second) and then link it six times (which is much faster than compiling from scratch, especially with large source files). So, if linking takes 0.2 seconds per target, the total build time will be about 2.2 seconds (1 + 6 × 0.2) instead of 6 seconds.
Although this technique won’t yield a huge performance benefit in the case of our small lab, let’s try this to drive the concept of linking home! We can then use our Makefile to automate this process for us, so that we don’t have to regenerate every object file each time we edit one of the source files.
To do this, we first need to generate an object file for each source file, containing the machine instructions. Then we need to link these object files together into one executable.
To create the object files without linking them, we use the -c flag when running gcc. For example, to create object files for test_reverse.c
and reverse.c
, we would run:
$ gcc <flags> -c reverse.c -o reverse.o
$ gcc <flags> -c test_reverse.c -o test_reverse.o
This will generate reverse.o
and test_reverse.o
files. Then, to link the object files into an executable, we would run:
$ gcc test_reverse.o reverse.o -o <executable name>
The advantage of creating object files independently is that when a source file is changed, we only need to create the object file for that source file. For example, if we changed reverse.c
, we would just have to run gcc -c reverse.c -o reverse.o
to regenerate its object file, and then gcc reverse.o test_reverse.o
to relink the final executable, without also regenerating test_reverse.o
.
Task: In your Makefile, create targets for test_reverse.o
and reverse.o
, that each include the corresponding source file as a dependency.
- Each of these targets should compile its source file into an object file (not an executable). They also need the correct flags for optimization and debugging.
- Update your
reverse_test
target to use the .o
files.
- Update your clean and format targets.
- Thanks to this, make will only recompile each individual object file if that file’s source was changed. It may not make the biggest difference for this lab, but in a larger project doing this will save you lots of time.
Pattern Rules
The last Makefile technique we’ll discuss is pattern rules. These are very commonly used in Makefiles. A pattern rule uses the %
character in the target to create a general rule. As an example:
file_%: %.c
	gcc $< -o $@
The %
will match any non-empty substring in the target, and the %
used in the dependencies is replaced by the string that matched in the target. In this case, the rule specifies how to make any file_<name>
executable with another file called <name>.c
as a dependency. If <name>.c
doesn’t exist and can’t be made, make will report an error.
As you may have noticed, both the test_reverse.o and reverse.o targets are running the same command, which means that we can simplify it.
Task: Use pattern rules to simplify your Makefile targets such that you can generate reverse.o
and test_reverse.o
using only one rule rather than two separate rules.
If you need help, this documentation might help.
Handin instructions
Turn in your code by pushing your git repository to csci0300-s24-labs-YOURUSERNAME.git
.
Then, head to the grading server. On the “Labs” page, use the “Lab 1 checkoff” button to check off your lab.
Note: Lab checkoffs are tied to Git commits. So, when you check off Lab 1 (with a new commit), your grade for Lab 0 will disappear.
This is nothing to worry about! Your grade is still associated with the older commit, and if you select that commit from the dropdown on the grading server, you will be able to see the prior grade.
At the end of the semester, we will collate all lab grades across your commit history.
« Back to the main CS 300 website
Lab 1: C Programming and Makefiles
Due Tuesday, February 6th at 8:00 PM EST
1. Completed Lab 0 – This will ensure that your Docker container and grading server account are set up properly.
2. Completed the Diversity Survey – Your grades for Lab 0 and Lab 1 will depend on whether you’ve submitted this (though you don’t have to answer any of the questions).
3. Signed up for Section on CAB – Our first round of sections will start this week!
Introduction
The purpose of this lab is to give you some experience with writing and understanding the syntax of C programs and apply the tools used to compile and run them. After this lab, you will also be more familiar with pointers and why C and C++ use them.
If you take away anything from this course, hopefully, it’s that Computer Systems are not magic and that much of it actually makes a lot of sense. Don’t be afraid to look up questions on Stack Overflow and Linux Man Pages (which provide great documentation on C library functions), and if that doesn’t help, ask on EdStem!
Why C?
Here are some of highlights: C is an imperative programming language that was mainly developed as a systems programming language to write operating systems. The main features of the C language include low-level access to memory, a simple set of keywords, and clean style, these features make C suitable and widely-used for system programming. C gives you a huge amount of power over what the computer does, which helps optimize the performance of your programs and allows writing low-level sofware that interacts directly with hardware. It also gives you the awesome feeling of really being in control. But with that power comes the responsibility to use it correctly: C has very few safeguards to protect your program’s data or exit gracefully when you make mistakes, and it will happily overwrite your memory with garbage or make your program explode if you make mistakes. Don’t worry, though, we’ll help you find and avoid them!
If you are looking for a detailed tutorial on C, check out the links on our C primer.
Assignment
Assignment Installation
Start with the
cs300-s24-labs-YOURNAME
repository you used for Lab 0.First, ensure that your repository has a
handout
remote. Type:If this reports an error, run:
Then run:
This will merge our Lab 1 stencil code with your previous work.
You may get a warning about “divergent branches” like shown below:
This means you need to set your Git configuration to merge our code by default. To do so, run (within the same directory in your labs repository workspace):
$ git config pull.rebase false
$ git pull handout main
If you have any “conflicts” from Lab 0 (although this is unlikely), resolve them before continuing further. Run git push to save your work back to your personal repository.
Assignment Part I: C Programming
In this part of the lab, you will be writing a program that will reverse an array of strings (or, as they are known in C, char pointers). You will be writing two functions in the file reverse.c, and you will test your implementation with the code found in test_reverse.c.
Setup
After you set up the lab, you should find the following files within the lab1 folder:
reverse.h: Contains declarations for the function you should be implementing (explained below).
reverse.c
test_reverse.c
Header Files
You’ll notice that there are three files in the provided stencil code. reverse.c and test_reverse.c are similar to what we’ve seen before, containing C code. But what about reverse.h?
Files ending in the .h extension are called header files, and declare functions so that they can be used in multiple different .c files. Without a header file, test_reverse.c wouldn’t be able to use the functions that you create in reverse.c, which would make testing impossible!
reverse.h includes the signature of the reverse_arr function, but with no implementation, and test_reverse.c has the line #include "reverse.h" at the top, telling the C compiler to look for function declarations in reverse.h.
This way, when the reverse_arr function is used in test_reverse.c, the C compiler checks reverse.h for a matching signature, and the implementation in reverse.c is linked in when the executable is built.
Review of pointers and strings
Pointers are variables that store memory addresses (i.e., they “point” at whatever is at that address!). For instance, int* is a pointer to an integer. On a 64-bit architecture (which most computers today use), the pointer occupies 8 bytes in memory, which store the address it points to. That address refers to the first byte of a 4-byte sequence of memory that stores an int.
As you will notice, there isn’t an explicit data type called “string” in C. That’s because strings in memory are just a sequence of bytes, each represented as a char (a 1-byte value). Instead of having a datatype explicitly called “string”, in C, you can think of char pointers (i.e., char*) as strings.
Suppose store is defined as a char pointer that holds a string. Then store points to a byte of memory that stores a character. And if you increment the value of the pointer by 1 (going to the next box) and dereference that value, you would get the next character of the string. This raises the question: couldn’t you just keep incrementing this pointer? How would you know where the end of the string is?
The answer: all strings in C are terminated by a NUL byte (a char storing numeric value 0), also known as \0. This byte indicates that you have reached the end of the string.
Consider the following memory layout for the example program above:
Now think through what happens at each iteration of the loop:
When i == 0, *(store + 0) dereferences the memory address stored in store, which is 0x2000, and at address 0x2000, the character “h” is stored as a single byte. This is equivalent to writing store[0].
When i == 1, *(store + 1) dereferences the memory address store + 1, which is 0x2001. At 0x2001, there is the character “e”. This is equivalent to writing store[1].
.What you saw here is an example of pointer arithmetic, that is, arithmetic on memory addresses.
Let’s get coding!
Task: You will be writing two functions in the file reverse.c:
reverse_arr will take in two inputs, a char* array and the number of input elements in the array. reverse_arr will reverse the inputted array with the help of another function called swap. (… NULL).
swap will take in two elements from the array and swap them.
Note: Remember that pointers are also passed by value as an argument to a function, meaning a copy of the address is passed. Thus, if you assign a new value to the pointer parameter inside the function, only the address held by that local copy changes (rather than the memory object that the pointer points to), and the change will not be reflected outside that function.
Running and Testing
Once you have finished writing your code, you are ready to test your implementation!
To test your code, we provide a file called test_reverse.c which calls reverse_arr, which you implemented in reverse.c. In order to run it, you must compile it and link the two files together into an executable by running the following command:
$ gcc test_reverse.c reverse.c -o reverse_test
This generates an executable called reverse_test. An executable is a special file that contains machine instructions encoded as 0s and 1s. Running this file causes the computer to perform those instructions.
In this case, those instructions are to run the program starting from main(), which first parses input from the command line, reverses the array given, and calls functions that run the tests found in test_reverse(). (One of the tests will open a file called test.txt in the current directory, reverse each line of the file, and write it to an output file called testout.txt.)
You run your executable via:
$ ./reverse_test <NUM_ELEMENTS> <ELEMENT0 ELEMENT1 ELEMENT2 ...>
For instance:
$ ./reverse_test 2 hello world
will print out the results of reversing the input array and running the test suite. If you fail a test, the output provides the expected result at a given index in the array and the actual result.
To debug, you may find it helpful to print what’s happening in your swap and reverse_arr functions. For instance, if you wanted to print the variable store from the code sample above, you can pass it to printf with the %s format specifier to print a string.
Hint: If you want to print out the value of a pointer, use the %p syntax for printf.
Once all of your tests pass, you are ready to move on!
Assignment Part II: More on Compiling
As you saw from the previous section, you compiled your program by running:
$ gcc test_reverse.c reverse.c -o reverse_test
With the -o flag, you can direct the output of the gcc compiler into a file specified by the argument following the flag. If you didn’t use the -o flag, you could run:
$ gcc test_reverse.c reverse.c
And this will produce an executable file called a.out (this is just a default filename defined by the compiler), which you can run by typing ./a.out.
Flags
The gcc compiler supports the use of hundreds of different flags, which we can use to customize the compilation process. Flags, typically prefixed by a dash or two (-<flag> or --<flag>), can help us in many ways, from warning us about programming errors to optimizing our code so that it runs faster.
The general structure for compiling a program with flags is:
$ gcc <flags> <c-files> -o <executable-name>
Warning Flags:
-Wall: The -Wall flag will cause the compiler to warn you about technically legal but potentially problematic syntax.
-Werror: The -Werror flag forces the compiler to treat all compiler warnings as errors, meaning that your code won’t be compiled until you fix them. This may seem annoying at first, but in the long run, it can save you lots of time by forcing you to take a look at potentially problematic code.
-Wextra: The -Wextra flag enables additional warnings that are not covered by -Wall (and that are also treated as errors under -Werror).
Task: Add the -Wall, -Werror, and -Wextra flags when compiling test_reverse.c and fix the errors that come up.
Notice that in test_reverse.c the main() function takes in two parameters. What is argc supposed to do? argc indicates the number of arguments passed into the program.
Modify main() so that:
when argc == 1, only the test suite is executed;
when argc > 1, the arguments on the command line are in the following order: the number of elements, followed by the elements themselves (for example, ./reverse_test 2 csci 300). A count that does not match the elements given, such as ./reverse_test 2 csci, should cause an error.
Debugging with Sanitizers: The warning flags don’t catch all errors. For example, memory leaks, stack or heap corruption, and cases of undefined behavior are often not detected by the compiler. You can use sanitizers to help with identifying these bugs! Sanitizers sacrifice efficiency to add additional checks and perform analysis on your code. You will be using these flags in the next lab in greater detail.
-fsanitize=address (related: -fsanitize=leak)
-fsanitize=undefined
-g
Optimizations
In addition to flags that let you know about problems in your code, there are also optimization flags that will speed up the runtime of your code at the cost of longer compilation times. At higher optimization levels, the compiler runs analyses on your program to determine whether it can make changes that improve its speed. The higher the optimization level, the longer the compiler will take to compile the program, because it performs more sophisticated analyses. These are the capital O flags, which include -O0, -O1, -O2, -O3, and -Os.
-O0: Equivalent to passing no -O option at all. Because higher optimization levels will often remove and modify portions of your original code, it’s best to use this flag when you’re debugging with gdb or address sanitizers.
-O3
Task: Time your program before you add the -O3 flag and then after you’ve added the -O3 flag to your compilation. Because this program is so small, you probably won’t be able to detect a difference in speed, but in future assignments where there is a lot more code, the optimization flag will come in handy.
The -O3 flag will ask the compiler to examine what your code is trying to do and, rather than following the provided code verbatim, replace it with machine instructions that functionally do the same thing, but in a more efficient manner.
You can time your program by running the time command in your Docker container. For this exercise, pay attention to the real time, but if you’re curious about the different types of times that time reports, check out this post.
Assignment Part III: Makefiles
Now you know how to compile C programs! This is great, but actual software projects rarely require you to invoke the compiler directly like we did so far. Often (e.g., in the CS 300 projects!) you need to compile many source files and use specific sets of flags. It’s very easy to forget a flag or source file, and doing this all by hand on the command line is time-consuming. Additionally, when you have many source files (more than 2), it can be annoying to individually recompile/relink each source file when you make a change to it.
This is why the make tool was created! Running the make tool will read a file called the Makefile for specifications on how to compile and link a program. A well-written Makefile automates all the complicated parts of compilation, so you don’t have to remember each step. Additionally, Makefiles can do tasks other than just program compilation: they can execute any shell command we provide.
In this part of the lab, you will be writing a Makefile to use when compiling your reverse array program.
A Makefile consists of one or more rules. The basic structure of a Makefile rule is:
<target>: <dependencies>
	<shell_command>
(The command line must be indented with a tab character.)
Running make <target> will first remake any of the target’s dependencies and then run the <shell_command>.
make <target> will only rebuild the target when the output file is missing or older than its dependencies.
Linking is the process of combining many object files and libraries into a single (usually executable) file. If you look at the file test_reverse.c, at the top, you can see there is an #include "reverse.h". This is so that we can use the functions that you wrote to test them, and as you can see, reverse_arr is called in the function test_reverse. You can link these two files together with the following Makefile rule:
reverse_test: test_reverse.c reverse.c reverse.h
	gcc test_reverse.c reverse.c -o reverse_test
The target is the executable named reverse_test, and the dependencies are test_reverse.c, reverse.c, and reverse.h. To compile, instead of typing the shell command, you can just type:
$ make reverse_test
This will cause the Makefile to run the reverse_test target, which will execute the command gcc test_reverse.c reverse.c -o reverse_test if a reverse_test executable doesn’t exist or if the reverse_test executable is older than any of the dependencies. Notice how this only works properly if the name of the output executable is the same as the target name.
That was a lot of reading and information, but now you are ready to create your own Makefile!
Task:
1. Create an empty Makefile by running touch Makefile in your lab directory.
2. Create a target, reverse_test, that will compile reverse.c and test_reverse.c.
3. Run make reverse_test to make sure it compiles successfully. (You may need to delete the reverse_test binary via rm -f reverse_test to make this work.)
Variables
Makefiles support defining variables, so that you can reuse flags and names you commonly use.
MY_VAR = "something" will define a variable that can be used as $(MY_VAR) or ${MY_VAR} in your rules. A common way to define flags for C program compilation is to have a CFLAGS variable that you include whenever you run gcc. For example, you can define CFLAGS once at the top of your Makefile and then write $(CFLAGS) in the gcc command of your target.
Automatic variables are special variables that can have a different value for each rule in a Makefile and are designed to make writing rules simpler. They can only be used in the command portion of a rule!
Here are some common automatic variables:
$@ represents the name of the current rule’s target.
$^ represents the names of all of the current rule’s dependencies, with spaces in between.
$< represents the name of the current rule’s first dependency.
If we wanted to stop repeating test_reverse.c and reverse.c in the command, we could rewrite our target to use $^ in place of the source files and $@ in place of the executable name (note that $^ expands to every dependency, so list only files that should be passed to gcc).
Task: Use regular variables (e.g., CFLAGS) and automatic variables to simplify your Makefile and add the -O3 flag.
Note: you can do MY_VAR += <additional flags> if you want to compile with more flags and only use one variable.
Phony Targets
There are also targets known as ‘phony’ targets. These are targets that themselves create no files, but rather exist to provide shortcuts for doing other common operations, like making all the targets in our Makefile or getting rid of all the executables that we made.
To mark targets as phony, you need to include this line before any targets in your Makefile:
.PHONY: target1 target2 etc.
Why do we need to declare a target as phony?
By default, make only runs a target’s <shell_command> if the target file is out-of-date. This is problematic because phony targets generally don’t create files under the target name. If somehow there exists a file under the same name as a phony target, make may consider the target up to date and never run its command. You can avoid this by explicitly declaring a target as phony, which tells the make tool to run the target’s command even if it’s not “out-of-date”.
Here are some common phony targets that we’ll be using in this course:
all target
We use the all target to make all of the executables (non-phony targets) in our project simultaneously. This is what it generally looks like:
all: target1 target2 target3
As you can see, there are no shell commands associated with the all target. In fact, we don’t need to include shell commands for all, because by including each target (target1, target2, target3) as dependencies for the all target, the Makefile will automatically build those targets in order to fulfill the requirements of all.
In other words, since the all target depends on all the executables in a project, building the all target causes make to first build every other target in our Makefile.
clean target
We also have a target for getting rid of all the executables (and other files we created with make) in our project. This is the clean target.
The clean target is fundamentally just a shell command, typically rm -f followed by the files to delete, that removes all the executables and object files that we made earlier. By convention, the clean target should remove all content automatically generated by make. It must be a phony target, because by definition, make clean doesn’t generate output files (but rather removes them)!
Note: Be careful which files you put after the rm -f command, as they will be deleted when you run make clean. Don’t put your .c or .h files because you might lose the code that you wrote!
format target
In this class, you will notice that all of the Makefiles will also contain a format target, which uses a command called clang-format to style your .c and .h files following a specified standard. A typical format command runs clang-format on the listed files and formats them according to Google’s coding conventions (a set of stylistic and technical conventions that Google engineers agreed to use).
Note: When using this, keep in mind the order of your #include files. Formatting might change the order of include statements. This is something to consider if, for example, you are including a header file that relies on standard library headers included earlier in the same file. To avoid problems, make sure that your header files are self-contained (i.e., include all the headers they need).
check target
You’ll also notice a check target in the Makefiles we provide in future labs and projects. If you were to create a check target in this particular instance, the dependency for the check target would be the reverse_test executable.
Task: Add all, clean, and format targets to your Makefile. Make the all target the first target so that typing make will automatically generate all the executables.
Simplifying Linking
It is often a good idea to break compilation of a large program into smaller sub-steps. Consider, for example, this command you used earlier:
gcc test_reverse.c reverse.c -o reverse_test
For this program, gcc creates two separate .o files, one for test_reverse.c and one for reverse.c, and then links them together. But what if you had hundreds of source files?
Large vs. Small Projects: For small projects, the above works well. However, for large projects it can be much faster to generate intermediate .o files (so-called “object files”) and then separately link the .o files together into an executable. Linking is the process of combining multiple object files (which already contain machine code, but not a full program) into a full executable program.
Why does this make sense? Imagine a project that generates two shared libraries and four executables, all of which separately link a file called data.c. Let’s say the data.o file takes 1 second to compile. If you compile and link each of these six outputs in one command (without creating intermediate .o files), gcc will rebuild the data.o file six times, resulting in 6 seconds of build time. If you separately build the data.o file, you’ll compile the data.c file only once (taking 1 second) and then link it (which is much faster than compiling from scratch, especially with large source files). So, if linking takes 0.2 seconds per output, the total build time will be 1 + 6 × 0.2 = 2.2 seconds instead of 6 seconds.
Although this technique won’t yield a huge performance benefit in the case of our small lab, let’s try this to drive the concept of linking home! We can then use our Makefile to automate this process for us, so that we don’t have to regenerate all object files every time we edit one of the source files.
To do this, we need to first generate an object file for each source file, containing the machine instructions. Then we need to link these object files together into one executable.
To create the object files without linking them, we use the -c flag when running gcc. For example, to create object files for test_reverse.c and reverse.c, we would run:
$ gcc <flags> -c reverse.c -o reverse.o
$ gcc <flags> -c test_reverse.c -o test_reverse.o
This will generate reverse.o and test_reverse.o files. Then, to link the object files into an executable, we would run:
$ gcc test_reverse.o reverse.o -o <executable name>
The advantage of creating object files independently is that when a source file is changed, we only need to recreate the object file for that source file. For example, if we changed reverse.c, we would just have to run gcc -c reverse.c -o reverse.o to get the object file, and then gcc reverse.o test_reverse.o to get the final executable, without also regenerating test_reverse.o.
Task: In your Makefile, create targets for test_reverse.o and reverse.o that each include the corresponding source file as a dependency. Then update the reverse_test target to use the .o files.
Pattern Rules
The last Makefile technique we’ll discuss is pattern rules. These are very commonly used in Makefiles. A pattern rule uses the % character in the target to create a general rule. As an example, consider a rule whose target is file_% and whose dependency is %.c.
The % will match any non-empty substring in the target, and the % used in dependencies will be replaced by the target’s matched string. In this case, this will specify how to make any file_<name> executable with another file called <name>.c as a dependency. If <name>.c doesn’t exist or can’t be made, this will throw an error.
As you may have noticed, both the test_reverse.o and reverse.o targets are running the same command, which means that we can simplify it.
Task: Use pattern rules to simplify your Makefile targets such that you can generate reverse.o and test_reverse.o using only one rule rather than two separate rules.
If you need help, this documentation might help.
Handin instructions
Turn in your code by pushing your git repository to csci0300-s24-labs-YOURUSERNAME.git.
Then, head to the grading server. On the “Labs” page, use the “Lab 1 checkoff” button to check off your lab.
Note: Lab checkoffs are tied to Git commits. So, when you check off Lab 1 (with a new commit), your grade for Lab 0 will disappear.
This is nothing to worry about! Your grade is still associated with the older commit, and if you select that commit from the dropdown on the grading server, you will be able to see the prior grade.
At the end of the semester, we will collate all lab grades across your commit history.