Lab 1: Writing and Debugging C Programs
Due February 11, 2020, at 8:00PM
Before attempting this lab, please make sure that you have:
1. Completed Lab 0 – This will ensure that your VM and grading server account are set up properly.
2. Completed the Diversity Survey – Your grades for Lab 0 and Lab 1 will depend on whether you’ve submitted this (though all questions are optional).
Introduction
The purpose of this lab is to give you some experience with the syntax and basic features of the C programming language, as well as to introduce you to a C debugging tool called gdb (the GNU Debugger). Learning C will help you understand a lot of the underlying architecture of the operating system and, as a whole, demystify how programs run.
If you take away anything from this course, hopefully it’s that computer systems are not magic and that much of how they work actually makes a lot of sense. Don’t be afraid to look up questions on Stack Overflow and in the Linux man pages (which provide great documentation on C library functions), and if that doesn’t help, ask on Piazza!
Why C?
Check out this article for more on why C programming is awesome! Here are some of the article’s highlights: C is a procedural programming language that was mainly developed as a systems programming language for writing operating systems. Its main features include low-level access to memory, a simple set of keywords, and a clean style. These features make C well suited to systems programming, such as operating system or compiler development.
If you are looking for a detailed tutorial on C, check out the links on our C primer.
Assignment
Assignment installation
Start with the cs131-s20-labs-YOURNAME repository you used for Lab 0.
First, ensure that your repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci1310/cs131-s20-labs.git
Then run:
$ git pull
$ git pull handout master
This will merge our Lab 1 stencil code with your previous work. If you have any “conflicts” from Lab 0 (very unlikely!), resolve them before continuing further. Run git push to save your work back to your personal repository.
Exercise 1: Running and Debugging
Here’s how to run a C program 
To run a C program, you first need to compile the source code into a binary. There are several widely used C compilers, but for this lab and CS 131, you will mostly use gcc (the C compiler from the GNU Compiler Collection).
In the next lab, we’ll go over more information on the compilation process.
# compile your C program into an executable binary (ones and zeros)
$ gcc name_of_program.c -o name_of_executable
# run the executable
$ ./name_of_executable
# Smile at the exciting output of your program.
However, sometimes things don’t go as planned, and instead of smiling, you’re pulling up your sleeves to solve a bug!
As in other programming languages, C programmers frequently make use of print statements to look at the state of their program (in C, you use the printf function for this). This so-called “printf debugging” is an important approach that can get you quite far, and you’ll probably use it a lot.
Often, however, you may wish that you could stop your program in its tracks (e.g., just before you hit a bug) and interactively inspect its state. This is what debugger tools like gdb are for.
Here’s how to debug a C program using the GDB Debugger 
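# compile your C program using the `-g` flag to compile with debugging info
$ gcc name_of_program.c -g -o name_of_executable
# run the executable in gdb
$ gdb name_of_executable
# set a breakpoint at a function
(gdb) b name_of_a_function
# run the program, optionally with arguments ARGS (if necessary)
(gdb) r ARGS
# display the source code as you debug
(gdb) layout src
# print a variable VAR
(gdb) p VAR
# Run other gdb commands
# Track down your bug
# quit out of gdb
(gdb) q
As explained on the GNU website, GDB can do four main things (plus other things) to help you catch bugs in the act: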
- Start your program, specifying anything that might affect its behavior.
- Make your program stop on specified conditions.
- Examine what has happened, when your program has stopped.
- Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
Here’s a cheatsheet of common gdb commands. Throughout this lab we’ll use a few.
Task:
- Take a look at math_prog.c. There are two bugs in this program – don’t fix them quite yet.
- Try compiling and running the program. (You’ll notice an unpleasant surprise.)
- Try running the program in gdb.
- Set a breakpoint at the function called add_arr, run the program, open the source code, and then print out the variable a.
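Example
# compile your C program using the `-g` flag to compile with debugging info
$ gcc math_prog.c -g -o math_prog
# run the executable
$ ./math_prog
# run the executable in gdb
$ gdb math_prog
# set a breakpoint at a function
(gdb) b add_arr
# run the program
(gdb) r
# display the source code as you debug
(gdb) layout src
# print the variable a
(gdb) p a
# quit gdb
(gdb) q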
Finding the Bugs using GDB 
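Note: For the remainder of this lab, try to refrain from using print statements to debug. The following gdb commands can be very helpful in debugging C programs (particularly the bt command), and the sooner you get familiar with working with gdb, the easier your life will be.
Once you’re stopped at a breakpoint at add_arr, run the following commands:
# continue the program to the next breakpoint or to termination
(gdb) c
# ... you should notice a SEGFAULT
# this should show you exactly where the fault occurred
(gdb) layout src
# this expression accesses invalid memory
(gdb) p *(c + i)
Cannot access memory at address 0xf0b5ff
# ... Hmm, where was the variable `c` initialized?
# print a backtrace of the program
# The 'bt' command is incredibly useful anytime you encounter a SEGFAULT.
(gdb) bt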
The bt command shows you the function calls that led up to where you currently are in the program (in our case, the segfault). Each function call comes with a stack frame, which contains information specific to that call (such as arguments and local variables). We will hear more about stack frames later in the course. In gdb, we can check out different frames (i.e., check out different function calls), like so:
# The 'f' command allows you to switch frames
# the command below switches to frame #1, which corresponds to the main function
(gdb) f 1
(gdb) p c
# ... Oh, `c` was declared in `main`, but never initialized
Hopefully you noticed that the pointer c is never initialized, so it holds an arbitrary address rather than pointing at valid memory! We can fix this in two ways:
- Stack allocate enough space for the whole array, and then pass in a pointer to that array to add_arr.
- Heap allocate enough space for the array, and then pass in a pointer to that array to add_arr.
(In this case, because we’re only using the array for a short period of time, the stack allocation makes sense.)
First, try it yourself, but here are some tips if you need help.
- To stack allocate the array, change the declaration to:
int c[6];
- To heap allocate the array (malloc and free are declared in <stdlib.h>):
int *c = malloc(sizeof(int) * 6);
/* ... use the pointer and when you're done ... */
free(c);
Once you fix the bug and re-compile your program, you should notice that the program no longer segfaults, but it’s still not working as expected.
Task: Use gdb to find (and then fix) the second bug.
Hint!
Typically when C programmers pass arrays as arguments to functions, they also include the length of the array as another argument to the function. Think about why they might do this.
Exercise 2: Let’s get programming! 
Take a look at simple_repl.c. This program reads in input from the terminal and breaks up a single line of text by either a space or comma! Fun fact: “REPL” stands for “read-eval-print loop”, and one place where you may have encountered a REPL before is the Python interpreter: you type a line, it evaluates it, and it prints some result.
As you’re reading through the code, here are some functions and variables you might want to look into:
- fgets
- printf
- stdin and stdout
- strtok (This is a wacky function that we’ll use later, so pay special attention to it.)
Task:
- Compile and run simple_repl.c. Enter a few lines of text to get a feel for how it works.
- You can also redirect standard input using the < symbol, so that instead of reading in commands from the terminal, the program reads them from a file.
Try:
$ ./simple_repl < files/three-star.csv
or
$ ./simple_repl < files/A_Christmas_Carol_in_Prose.txt
Similarly, you can write:
$ echo "hello world" | ./simple_repl
which pipes the output from echo into your REPL.
- Typing Ctrl-D signals end-of-file (EOF) to the program, which causes fgets to return NULL and the program to exit.
- Run the program in gdb, and perform the following commands:
  - Break at main.
  - Use the next (n) command until the call to fgets.
  - Use the print (p) and examine (x) commands to examine the contents of buf before and after the call to fgets.
- Where is the char array buf allocated (the stack or the heap)?
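Help
# set a breakpoint at main
(gdb) b main
# show the source code, and then run the program
(gdb) layout src
(gdb) r
# use the n command to execute the next line of code
(gdb) n
# keep using the n command until you're about to execute the `fgets`
(gdb) n
# ...
# print out the buffer before executing fgets
(gdb) p buf
# step over the fgets call; the program will hang
# (it's waiting for input from stdin for the fgets function)
(gdb) n
hello there # type a line of text
# print the buffer again; you should see the text you entered
(gdb) p buf
# examine (x) 10 characters (/10c) starting at buf
(gdb) x/10c buf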
strtok
In this section, you will be writing your own version of strtok. It might sound daunting, but we’ll walk you through it. Take a look at the link above if you need clarification on what exactly strtok does.
Note: You may have noticed that strtok maintains state internally from iteration to iteration. It does this by declaring a static local variable. Essentially, the function creates the variable in a region of memory that will persist until the end of the program (almost like a global variable), but the variable is only accessible within the function. This part has been written for you.
Task: Take a look at my_strtok.c. You’ll be implementing your own version of strtok.
- In simple_repl.c:
  - At the top, #include "my_strtok.h".
  - Change the calls to strtok to use my_strtok.
- Fill in my_strtok.c according to the TODOs in the comments.
  - Exclusively use pointer operations rather than array notation (brackets []).
  - Here are some functions you may want to look into: strtok, strspn, strlen, and strcspn.
  - Note: For the above functions, if you ever want to check out their behavior on edge cases (e.g., what would happen if you pass in an empty string, or a null string?), we highly recommend using repl.it for testing!
- You can test your code using simple_repl.c and some test cases in test_runner.c. Compiling and running test_runner.c will run the test cases in the function test_strtok.
  - In order to compile with your own implementation of strtok, you will need to add my_strtok.c to the source list. For instance, to compile the REPL with my_strtok(), the command would be:
gcc simple_repl.c my_strtok.c -g -o simple_repl
Note: Don’t worry about the interplay between my_strtok.c and my_strtok.h for now. If you are curious, a comment in my_strtok.h explains what it’s about, but we will go over compilation more in Lab 2!
getline
This REPL is really good at tokenizing based on commas and spaces now, but you may have realized that the program as a whole might struggle with parsing long sentences.
Task:
- Try running:
$ ./simple_repl < files/A_Christmas_Carol_excerpt.txt
This file contains the first two paragraphs of the Christmas Carol text file, and places each sentence on its own line. If you look at the output, you’ll see some weird-looking lines. This is because our program can’t parse more than 99 characters at a time.
- Why can’t our program parse more than 99 characters?
One solution to this problem is to increase our BUFFER_SIZE to something like 1,000,000 (roughly 1 MB), but in the cases where we’re reading smaller lines, this will waste a lot of space on our stack. Plus, what if someone had a really, really long line with more than a million characters? We really need to be able to dynamically adjust the size of our buffer (hint: the heap).
getline is a great function for this! It uses malloc and realloc to dynamically allocate memory as it reads in more characters from a file.
Task:
- Change your simple_repl.c to use getline!
- Test it on files/A_Christmas_Carol_excerpt.txt.
- Remember to free the character array before the program exits.
Hint
- Your char array no longer needs to be allocated on the stack. If you declare a NULL char pointer, getline will initialize it correctly.
- However, because getline will modify the char pointer itself (i.e., getline isn’t just changing the contents of what the pointer is pointing at, it’s changing the address that the pointer holds), it needs the address of that char pointer. The pointer variable itself can still be stack allocated.
[Optional] Lecture Review: How are C Programs Laid Out?
Before you start coding, let’s use the debugger to examine how our C program is laid out in memory.
Variables in C never overlap; each variable occupies distinct storage. Additionally, each variable in C has a lifetime, which is called storage duration by the standard. There are three different kinds of lifetime.
- static lifetime: The variable lasts as long as the program runs.
- automatic lifetime: The compiler allocates and destroys the memory automatically as the program runs, based on the variable’s scope (the region of the program in which it is meaningful).
- dynamic lifetime: The programmer allocates and destroys the object explicitly.
The compiler and operating system work together to put variables at different addresses. A program’s address space (which is the range of addresses accessible to a program) divides into regions called segments. Objects with different lifetimes are placed into different segments. The most important segments are:
Segment | Lifetime | Contains
Code (text, read-only data) | static, unmodifiable | program instructions and constant global variables
Data (data, bss) | static, modifiable | initialized and uninitialized non-constant global variables
Stack | automatic, modifiable | temporary local variables for each function call
Heap | dynamic, modifiable | memory that is explicitly allocated and deallocated
An executable is normally at least as big as the static-lifetime data (the code and data segments together). Since all that data must be in memory for the entire lifetime of the program, it’s written to disk and then when a program runs, the operating system loads the segments into memory. The stack and heap segments, by contrast, grow on demand.
Note on disks
A disk (an HDD, for hard disk drive, or an SSD, for solid-state drive) is a persistent form of storage for data. The data on disk is maintained after your computer shuts down or the power fails, but data in memory is not!
Let’s take a look at this in action! We’ll be looking at hello_world.c and the binary compiled from it.
Note: Many modern toolchains build position-independent executables by default, which lets the operating system load a program at randomized addresses and makes some serious attacks harder to carry out. We’re using the -fno-pic and -no-pie flags to turn this off, so that addresses are fixed and easier to examine for the purposes of this exercise.
# compile your program with the following flags
$ gcc hello_world.c -no-pie -fno-pic -g -o hello_world
$ gdb hello_world
# before setting any breakpoints, do the following in gdb:
(gdb) info files
# don't quit yet ...
Quick Interjection:
info files will print out the static segments that have been loaded into memory. The segments are formatted as:
[segment-start-address] - [segment-end-address] is [name-of-segment]
- We want to pay attention to .text (the C program’s instructions, i.e., its code), .rodata (read-only data), .data (initialized data), and .bss (uninitialized data). These are static segments of our program that have already been placed into memory.
- The Entry point: 0x400590 line refers to an address in the .text region of memory corresponding to the first instruction the program will run.
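# ... back to the terminal
# print the contents of GLOBAL_VAR
(gdb) p GLOBAL_VAR
200
# print the address of GLOBAL_VAR (the address may vary on your machine)
(gdb) p &GLOBAL_VAR
(int *) 0x601058 <GLOBAL_VAR>
# examine (x) the contents at the address of GLOBAL_VAR as an integer (/d)
(gdb) x/d &GLOBAL_VAR
0x601058 <GLOBAL_VAR>: 200
Notice that the address of the global variable GLOBAL_VAR is in the .data segment, the region where initialized global memory lives.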
Task:
- Examine the addresses of const_variable, uninitialized_variable, and main in gdb, and identify the segment of memory each has been loaded into.
- Now print out the first 33 strings in the .rodata section. Notice that any static strings used in the course of the hello_world program are stored in this section.
Hint:
- use x/d to examine as a decimal
- use x/s to examine as a string
- use x/c to examine as a character
- use x/a to examine as an address
- use x/i to examine as an instruction
- use x/3i to examine the next 3 instructions that begin at an address
- Similarly, x/3s will examine the first 3 strings beginning at an address
Now, let’s continue our program in gdb. Set a breakpoint in main and run.
(gdb) b main
(gdb) r
# Now in main:
(gdb) info proc mappings
# Again, don't quit yet ...
Here, the command info proc mappings shows the address ranges currently accessible to the program and their corresponding regions. Note that the mappings for this process currently include a stack (labeled [stack]), but not a heap.
Task:
- Step through to line 18 of hello_world.c (past the declaration of local_var) and then examine the address of local_var. What section is local_var contained in?
- Additionally, print local_var. Since it is a char pointer, this should show the address local_var points to and the value (string) at that address. In what section is the address contained in local_var? (Hint: you examined this section in the previous task.)
Tip:
- You can use the command layout src to see where you are in the code while it is running in gdb.
- To skip to line 18 of hello_world.c, you can use the next or n command in gdb so that you step over any function calls.
- Additionally, you can set a breakpoint on line 18 with b 18 and then use continue or c to continue straight to that line.
Now, let’s continue stepping through main until line 22 (past the initialization of heap_allocated).
Task:
- Once again, print out the addresses accessible to the program using info proc mappings. Do you notice any differences?
- Additionally, examine the address of heap_allocated, and identify the section it is contained in.
Hint:
The first time you examined the addresses accessible to the process right at the start of main, the program had not yet allocated any data in the heap. Hence, the heap was not listed as an accessible section.
Handin Instructions
You will turn in your code by pushing your git repository to github.com/csci1310/cs131-s20-labs-YOURNAME.git.
As a quick recap, you do this by running git commit: either use git commit -a to commit all changes, or use git add -p to interactively choose which changes to “stage” for commit, and then commit them using git commit. Finally, push your changes to your git repository via git push.
Then, head to the grading server. On the “Labs” page, use the “Lab 1 checkoff” button to check off your lab.
Note: Your lab grades are associated with the commit that you used as your lab checkoff, so when you check off your Lab 1, the grade for Lab 0 will no longer be shown. But rest assured: if you switch to the commit you used for the Lab 0 checkoff, you’ll hopefully see a 2/2 next to Lab 0 