C Primer Part 3 (Debugging Edition)
CS300 is a difficult class, so we’ve created this primer to help you figure out where certain issues you’re experiencing may be coming from.
Read on to learn how to debug your Docker setup, along with several strategies for debugging your code.
Using Print Statements
Using print statements is a great way to find out where your code may be going wrong.
Several uses for print statements are:
- Checking what variables are storing
- Verifying that the program is entering a certain function
- Enumerating the number of times a program is entering a function or loop
Print statements should be descriptive so you know:
- What you’re trying to test
- Where you are in your program
Example of a bad print statement:
int anInt;
anInt = 6;
if (anInt == 6) {
    printf("Here 1");
} else {
    printf("Here 2");
}
The printf statements above don’t indicate what you are trying to test. The two messages are also nearly identical, so you would have to go back to the code and check what “Here 1” meant to see whether the program worked.
Example of a slightly better print statement:
int anInt;
anInt = 6;
if (anInt == 6) {
    printf("In the if block");
} else {
    printf("In the else block");
}
While this message is better (we now know where the program ends up), we still don’t know whether being in the if block is what we want.
Example of a much better print statement:
int anInt;
anInt = 6;
if (anInt == 6) {
    printf("Assignment worked. In the if block.\n");
} else {
    printf("Assignment did not work. In the else block. "
           "anInt's value is: %d\n", anInt);
}
In the above example, the printf statements give you a lot more information.
First, you know whether or not the program succeeded. Second, you know where the program ended up. Third, if the program did not succeed, you see the actual value of the variable you wanted to test.
printf statements are a great way to debug your code when your program compiles but produces unexpected results.
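As a rough sketch of the other uses listed earlier (checking what variables store and counting how often a loop body runs), the hypothetical function below reports each iteration. The name count_even and the choice to print to stderr are assumptions made for this illustration, not part of the course code; stderr is used here because it is not fully buffered, so messages tend to appear even if the program crashes shortly afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical example: counts the even numbers in an array and
 * prints a descriptive message for every loop iteration. */
size_t count_even(const int *values, size_t n) {
    assert(values != NULL || n == 0); /* precondition for this sketch */
    size_t evens = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] % 2 == 0) {
            evens++;
            /* Says where we are, what we tested, and the value involved. */
            fprintf(stderr, "count_even: values[%zu] = %d is even (%zu so far)\n",
                    i, values[i], evens);
        } else {
            fprintf(stderr, "count_even: values[%zu] = %d is odd, skipping\n",
                    i, values[i]);
        }
    }
    /* Summary line: how many times the loop ran and what it found. */
    fprintf(stderr, "count_even: loop ran %zu times, found %zu even values\n",
            n, evens);
    return evens;
}
```

Each message names the function, the index, and the value, so you can tell from the output alone which branch ran and why.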
GDB
GDB lets you step through your code to see which path the program takes and what values variables hold as it runs. GDB can be used:
- If your program has a segmentation fault, and you want to figure out where and when it occurs.
- If you’re not sure why certain variables have the values they do.
- To understand how execution flows when your program runs and whether specific code ever executes.
- For many other reasons!
We highly suggest reviewing Lab 2 if you need a more thorough refresher on GDB.
We also highly recommend this GDB Guide or this GDB Cheat Sheet.
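To make the uses above a little more concrete, here is a small hypothetical function with a sketch of a GDB session in its comment. The function name average is an assumption for illustration, and the session assumes the program was compiled with -g so that GDB can see line and variable information.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: integer average of n values (n must be > 0).
 *
 * A GDB session for inspecting this function might look like:
 *   (gdb) break average      stop whenever average() is entered
 *   (gdb) run                start the program
 *   (gdb) print n            check the value of an argument
 *   (gdb) print *values@n    show the first n elements of the array
 *   (gdb) next               execute one source line at a time
 *   (gdb) print total        watch the running sum change
 *   (gdb) backtrace          show the chain of calls that led here
 */
int average(const int *values, size_t n) {
    assert(values != NULL && n > 0); /* avoids dividing by zero below */
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += values[i];
    }
    return (int)(total / (long)n);
}
```

If the assertion fails while running under GDB, the program stops at the abort and backtrace shows which caller passed the bad arguments.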
Inspecting File Contents
You may have times during this course when you need to read or write files and check their content, even if those files are not text files.
To check the contents of a binary (non-text) file, we recommend a hexdump tool like xxd.
Running xxd on a file “dumps” the contents of the file in hexadecimal format, so you can see what it contains even if the data does not consist of printable characters.
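To illustrate what a hex dump does conceptually, here is a minimal sketch that formats bytes as two-digit hexadecimal values. The helper format_hex is hypothetical and is not how xxd itself is implemented; real tools also print offsets and a printable-character column.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes each byte of data as two hex digits,
 * separated by spaces, into out. Returns the number of characters
 * written (excluding the terminating '\0'), or -1 if out is too small. */
int format_hex(const unsigned char *data, size_t n, char *out, size_t out_size) {
    assert(out != NULL && (data != NULL || n == 0));
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        size_t remaining = out_size - used;
        /* %02x prints one byte as exactly two lowercase hex digits. */
        int written = snprintf(out + used, remaining, "%s%02x",
                               i == 0 ? "" : " ", (unsigned int)data[i]);
        if (written < 0 || (size_t)written >= remaining) {
            return -1; /* output would have been truncated */
        }
        used += (size_t)written;
    }
    return (int)used;
}
```

For example, the four bytes 0xde, 0xad, 0x00, 0x41 are formatted as "de ad 00 41", which shows the zero byte even though it is not a printable character.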
To efficiently compare and contrast the contents of two text files we recommend:
diff -u <file1 name> <file2 name>
If the files differ, the above command reports where each difference is located and what the differing content is in each file.
However, diff does not work well on non-text files. But you can combine it with a hexdump tool to compare those!
$ xxd <file 1 name> > file1.hex
$ xxd <file 2 name> > file2.hex
$ diff -u file1.hex file2.hex
Note that the > character redirects the output from xxd into a file (e.g., file1.hex).
Address Sanitizers
Modern C compilers come with a handy tool called the address sanitizer. The address sanitizer helps detect invalid memory accesses and memory leaks, and is enabled via the flag -fsanitize=address.
In some projects, we compile your code with this flag by default, but you can disable it by adding ASAN=0 to your make command. However, you almost always want to pass your tests with sanitizers enabled. The grading server will compile your handin with sanitizers, so your code must pass the tests with sanitizers enabled to get full credit!
You may find that gdb sometimes conflicts with the address sanitizer. If so, you can re-make the project without sanitizers (make -B all ASAN=0). Just remember to re-enable sanitizers once you’ve finished debugging with GDB!
The biggest piece of advice we can give you is to read the errors you get from the sanitizer! The address sanitizer errors are very detailed, and often tell you where and why issues are occurring.
Common AddressSanitizer Errors
Here is a list of common ASAN errors:
- heap-buffer-overflow/underflow
Reason: Your code accessed memory outside a valid heap (dynamic lifetime) allocation.
Typical bugs that cause this: Off-by-one error on the size passed to malloc(); incorrect index calculation when accessing heap memory via array subscript or pointer arithmetic; copying more data into an allocation than it can hold.
Next steps: Check which allocation is affected (ASAN tells you this), and by how much. Investigate if you allocated too little memory or wrote too much data.
- stack-buffer-overflow/underflow
Reason: Your code accessed memory outside a valid memory region on the stack (automatic lifetime). Often involves arrays.
Typical bugs that cause this: Allocated a fixed-size array on the stack, but tried to write more data into it (e.g., 1005-character name into 1000-character array); off-by-one error on indexing; incorrect index calculation; writing a larger type into a variable of a smaller type (e.g., char* into int, or int into char); incorrect pointer arithmetic with stack addresses.
Next steps: Check the backtrace to see which function and which variable’s space you exceeded. Depending on the amount of overflow, ASAN may not be perfect at reporting the source, but it does usually correctly report the place where the memory access happens.
- dynamic-stack-buffer-overflow/underflow
Same as above, but for a dynamically-sized array in the stack segment.
- global-buffer-overflow/underflow
Same as above, but for global (static lifetime) variables.
- SEGV on unknown address
Reason: Your code dereferenced something as an address that isn’t a valid address in this program. Commonly involves address 0x0, the NULL pointer.
Typical bugs that cause this: You didn’t initialize a pointer variable, and ended up dereferencing some garbage left over in memory as an address; a pointer was NULL at runtime but got dereferenced; your code accidentally overwrote a pointer variable with data and you’re dereferencing that data (e.g., a deref of a small integer like 0x1).
Next steps: Find the pointer in question from ASAN’s backtrace, and figure out where its value comes from. This may involve checking where it gets assigned, and using GDB or print statements to find where the value of the pointer variable changes (this could happen due to a seemingly unrelated assignment if that assignment corrupts the pointer).
- heap-use-after-free
Reason: Your code accessed memory in a heap allocation after it was already freed.
Typical bugs that cause this: You left a pointer to a freed heap allocation in a data structure and later dereferenced it; you passed a pointer to already freed heap memory to somewhere that ended up dereferencing the pointer.
Next steps: Look at where the pointer is dereferenced (ASAN’s backtrace tells you), and then see where the allocation got freed (also in the backtrace). Try to figure out what happened in between, and how the pointer to the now-dead dynamic lifetime memory continued to exist.
- double-free
Reason: Your code calls free() twice with the same address as an argument, and the dynamic lifetime memory is dead on the second call. (Note that calling free() multiple times with the same address is fine if you called malloc() in between and it gave you the memory at this address again).
Typical bugs that cause this: Multiple cleanup code paths that clear up the same resources; pointers stored in data structures that already got freed elsewhere.
Next steps: Figure out where the offending free() happens, and where the prior free() call happened (both are in the ASAN backtraces). Then understand your logic and why both calls happened on the same pointer; change it to avoid this.
- stack-use-after-return
Reason: You returned a pointer to a stack (automatic lifetime) variable, whose lifetime has ended by the time the calling function gets to run again.
Typical bugs that cause this: Returning a pointer to a local variable, or into a local array.
Next steps: Check what you’re returning in the location indicated by the backtrace and trace it back to where it comes from.
- LeakSanitizer error
Reason: Your program did not call free() for a heap-allocated memory region before exiting.
Typical bugs that cause this: You forgot a free() call; or you lost track of a pointer that you needed to free, either by overwriting it or by not storing/passing it for another part of your code to free it.
Next steps: Find out which allocation is affected, and track where the pointer returned from malloc() gets passed or stored. Check that all code paths to the exit of the program end up calling free() on this pointer.
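Several of the heap-related errors above can be sketched with a small linked list. The code below is only an illustration under assumed names (struct node, copy_string, push_front, and free_list are all hypothetical); each comment describes the buggy variant that would trigger the corresponding ASAN report and shows one possible way to avoid it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build with the sanitizer enabled, for example:
 *   cc -g -fsanitize=address example.c
 * (example.c is a placeholder file name.) */

struct node {
    char *name;
    struct node *next;
};

/* heap-buffer-overflow: malloc(strlen(s)) would be one byte too small,
 * because the terminating '\0' needs space as well. */
char *copy_string(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1); /* copies the '\0' too */
    return copy;
}

/* Returns the new head, or NULL on allocation failure (list unchanged). */
struct node *push_front(struct node *head, const char *name) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->name = copy_string(name);
    if (n->name == NULL) {
        free(n); /* LeakSanitizer: free n on the error path as well */
        return NULL;
    }
    n->next = head;
    return n;
}

/* heap-use-after-free: `for (n = head; n; n = n->next) free(n);` reads
 * n->next after n was freed, so next is saved before the free() calls.
 * double-free: the caller's pointer is reset to NULL, which makes a
 * second call a harmless no-op. Returns the number of nodes freed. */
size_t free_list(struct node **head) {
    assert(head != NULL);
    size_t freed = 0;
    struct node *cur = *head;
    while (cur != NULL) {
        struct node *next = cur->next;
        free(cur->name);
        free(cur);
        cur = next;
        freed++;
    }
    *head = NULL;
    return freed;
}
```

Resetting the caller's pointer is one design choice among several; the important part is that only one code path owns each allocation and frees it exactly once.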