Lecture 8: Stack, Buffer Overflow, and Intro to C++
Lecture code (Stack) – Lecture code (C++)
S1: The Stack, continued
You will recall the stack segment of memory from earlier lectures: it is where all variables with automatic lifetime are stored. These include local variables declared inside functions, but importantly also function arguments.
The arguments and local variables of f() live inside f()'s stack frame. Subsequent arguments (second, third, fourth, etc.) are stored at subsequently lower addresses below %rsp (see call02.s and call03.s for examples with more arguments), followed eventually by any local variables in the caller.
How does %rsp change?
The convention is that %rsp always points to the lowest (leftmost) stack address that is currently used. This means that when a function declares a new local variable, %rsp has to move down (left), and when a function returns, %rsp has to move up (right), back to where it was when the function was originally called.
Moving %rsp happens in two ways: explicit modification via arithmetic instructions, and implicit modification as a side effect of special instructions. The former happens when the compiler knows exactly how many bytes a function requires %rsp to move by, and involves instructions like subq $0x10, %rsp, which moves the stack pointer down by 16 bytes. The latter, side-effect modification happens when the push and pop instructions run. A push writes the contents of a register onto the stack memory immediately to the left of the current %rsp and also modifies %rsp to point to the beginning of this new data. For example, pushq %rax would write the 8 bytes from register %rax at address %rsp - 8 and set %rsp to that address; it is equivalent to movq %rax, -8(%rsp); subq $8, %rsp or subq $8, %rsp; movq %rax, (%rsp). A pop does the reverse: popq %rax reads the 8 bytes at (%rsp) into %rax and then moves %rsp up by 8.
Return Address
As a function executes, it eventually reaches a ret instruction in its assembly. The effect of ret is to return to the caller (a form of control flow, as the next instruction needs to change). But how does the processor know what instruction to execute next, and what to set %rip to?
It turns out that the stack plays a role here, too. In a nutshell, each function call stores the return address as the very first (i.e., rightmost) data in the callee's stack frame. (If the function called takes more than six arguments, the return address is to the left of the 7th argument in the caller's stack frame.)
The stored return address makes it possible for each function to know exactly where to continue execution once it returns to its caller. (However, storing the return address on the stack also has some dangerous consequences, as we will see shortly.)
We can now define the full function entry and exit sequence. Both the caller and the callee have responsibilities in this sequence.
To prepare for a function call, the caller performs the following tasks:
1. The caller stores the first six arguments in the corresponding registers.
2. If the callee takes more than six arguments, or if some of its arguments are large, the caller must store the surplus arguments on its stack frame (in increasing order). The 7th argument must be stored at (%rsp) (that is, the top of the stack) when the caller executes its callq instruction.
3. The caller saves any caller-saved registers (see last lecture's list). These are registers whose values the callee might overwrite, but which the caller needs to retain for later use.
4. The caller executes callq FUNCTION. This has an effect like pushq $NEXT_INSTRUCTION; jmp FUNCTION (or, equivalently, subq $8, %rsp; movq $NEXT_INSTRUCTION, (%rsp); jmp FUNCTION), where NEXT_INSTRUCTION is the address of the instruction immediately following callq.
To return from a function, the callee does the following:
1. The callee places its return value in %rax.
2. The callee restores the stack pointer to its value at entry ("entry %rsp"), if necessary.
3. The callee executes the retq instruction. This has an effect like popq %rip, which removes the return address from the stack and jumps to that address (because the instruction writes it into the special %rip register).
Finally, the caller then cleans up any space it prepared for arguments and restores caller-saved registers if necessary.
S2: Base Pointers and Buffer Overflow
Base Pointers and the %rbp Register
Keeping track of the entry %rsp can be tricky with more complex functions that allocate lots of local variables and modify the stack in complex ways. For these cases, the x86-64 Linux calling convention allows for the use of another register, %rbp, as a special-purpose register.
%rbp holds the address of the base of the current stack frame: that is, the rightmost (highest) address that points to a value still part of the current stack frame. This corresponds to the rightmost address of an object in the callee's stack, and to the first address that isn't part of an argument to the callee or one of its local variables. It is called the base pointer, since the address points at the "base" of the callee's stack frame (if %rsp points to the "top", %rbp points to the "base", i.e., the bottom). The %rbp register maintains this value for the whole execution of the function (i.e., the function may not overwrite the value in that register), even as %rsp changes.
This scheme has the advantage that when the function exits, it can restore its original entry %rsp by loading it from %rbp. In addition, it also facilitates debugging because each function stores the old value of %rbp to the stack at its point of entry. The 8 bytes holding the caller's %rbp are the very first thing stored inside the callee's stack frame, and they are right below the return address in the caller's stack frame. This means that the saved %rbps form a chain that allows each function to locate the base of its caller's stack frame, where it will find the %rbp of the "grand-caller's" stack frame, etc. The backtraces you see in GDB and in Address Sanitizer error messages are generated precisely using this chain!
Therefore, with a base pointer, the function entry sequence becomes:
1. The first instruction executed by the callee on function entry is pushq %rbp. This saves the caller's value for %rbp into the callee's stack. (Since %rbp is callee-saved, the callee is responsible for saving it.)
2. The second instruction is movq %rsp, %rbp. This saves the current stack pointer in %rbp (so %rbp = entry %rsp - 8). This adjusted value of %rbp is the callee's "frame pointer" or base pointer. The callee will not change this value until it returns. The frame pointer provides a stable reference point for local variables and caller arguments. (Complex functions may need a stable reference point because they reserve varying amounts of space.) Note, also, that the value stored at (%rbp) is the caller's %rbp, and the value stored at 8(%rbp) is the return address. This information can be used to trace backwards by debuggers (a process called "stack unwinding").
3. The function ends with movq %rbp, %rsp; popq %rbp; retq, or, equivalently, leave; retq. This sequence is the last thing the callee does, and it restores the caller's %rbp and entry %rsp before returning.
You can find an example of this in call07.s. Lab 3 also uses the %rbp-based calling convention, so make sure you keep the extra 8 bytes for storing the caller's %rbp on the stack in mind!
Buffer overflow attacks
Now that we understand the calling convention and the stack, let's take a step back and think of some of the consequences of this well-defined memory layout. While a callee is not supposed to access its caller's stack frame (unless it's explicitly passed a pointer to an object within it), there is no principled mechanism in the x86-64 architecture that prevents such access.
In particular, if you can guess the address of a variable on the stack (either a local within the current function or a local/argument in a caller of the current function), your program can just write data to that address and overwrite whatever is there.
This can happen accidentally (due to bugs), but it becomes a much bigger problem if done deliberately by malicious actors: a user might provide input that causes a program to overwrite important data on the stack. This kind of attack is called a buffer overflow attack.
Consider the code in attackme.cc. This program computes checksums of strings provided to it as command line arguments. You don't need to understand in deep detail what it does, but observe that the checksum() function uses a 100-byte stack-allocated buffer (as part of the buf union) to hold the input string, which it copies into that buffer.
A sane execution of attackme might look like this:
$ ./attackme hey yo CS300
hey: checksum 00796568, sha1 7aea02175315cd3541b03ffe78aa1ccc40d2e98a -
yo: checksum 00006f79, sha1 dcdc24e139db869eb059c9355c89c382de15b987 -
CS300: checksum 30335373, sha1 b93b5d831563598f15cae55ba445f0b3cd5da036 -
But what if the user provides an input string longer than 99 characters (remember that we also need the zero terminator in the buffer)? The function just keeps writing, and it will write over whatever is adjacent to buf on the stack.
From our prior pictures, we know that buf will be in checksum's stack frame, below the entry %rsp. Moreover, directly above the entry %rsp is the return address! In this case, that is an address in main(). So, if checksum writes beyond the end of buf, it will overwrite the return address on the stack; if it keeps going further, it will overwrite data in main's stack frame.
Why is overwriting the return address dangerous? It means that a clever attacker can direct the program to execute any function within the program. In the case of attackme.cc, note the run_shell() function, which runs a string as a shell command. This has a lot of nefarious potential – what if we could cause that function to execute with a user-provided string? We could print a lot of sad face emojis to the shell, or, more dangerously, run a command like rm -rf /, which deletes all data on the user's computer!
If we run ./attackme.unsafe (a variant of attackme with safety features added by modern compilers to combat these attacks disabled), it behaves as normal with sane strings:
$ ./attackme.unsafe hey yo CS300
hey: checksum 00796568, sha1 7aea02175315cd3541b03ffe78aa1ccc40d2e98a -
yo: checksum 00006f79, sha1 dcdc24e139db869eb059c9355c89c382de15b987 -
CS300: checksum 30335373, sha1 b93b5d831563598f15cae55ba445f0b3cd5da036 -
But if we pass a very long string with more than 100 characters, things get a bit more unusual:
$ ./attackme.unsafe sghfkhgkfshgksdhrehugresizqaugerhgjkfdhgkjdhgukhsukgrzufaofuoewugurezgureszgukskgreukfzreskugzurksgzukrestgkurzesi
Segmentation fault (core dumped)
The crash happens because the return address for checksum() was overwritten by garbage from our string, which isn't a valid address. But what if we figure out a valid address and put it in exactly the right place in our string?
This is what the input in attack.txt does. Specifically, using GDB, I figured out that the address of run_shell in my compiled version of the code is 0x400870 (an address in the code/text segment of the executable). attack.txt contains a carefully crafted "payload" that puts the value 0x400870 into the right bytes on the stack. The attack payload is 115 characters long because we need 100 characters to overrun buf, 3 bytes for the malicious return address, and 12 bytes of extra payload because stack frames on x86-64 Linux are aligned to 16-byte boundaries.
Executing this attack works as follows:
$ ./attackme.unsafe "$(cat attack.txt)"
OWNED
OWNED
OWNED
OWNED
OWNED
OWNED
sh: 7: ��5��: not found
Segmentation fault (core dumped)
The cat attack.txt shell command simply pastes the contents of the attack.txt file into the string we're passing to the program. (The quotes are required to make sure our attack payload is processed as a single string even if it contains spaces.)
S3: Introduction to C++
Why are we using C++ for the rest of the course?
You will by now have started to appreciate the power of the C programming language: it gives you direct access to memory, it matches very closely with the concepts in the underlying hardware, and it is able to achieve very high performance because the language adds little to no overhead to your program.
However, writing programs in C can feel a bit like trying to build your own car from scratch: very educational, but you really have to do everything yourself. C doesn't come with any data structures in a standard library, and as you will recall from the vectors part of Project 1, writing a generic data structure is somewhat painful. C++ tries to make things easier: it still gives you access to all the low-level power of C, but it also allows you to write code at a higher level of abstraction, using classes, objects, and various other advanced features.
So far, we've used the C programming language in the course. We will now increasingly start working in C++, which is a separate programming language from C. However, C++ and C are closely related: indeed, pretty much any valid C program is also a valid C++ program.
Note that C++ is a huge programming language, especially compared to C, and comes with many advanced (and some ill-advised) features. The C++ we will write in this course mostly focuses on the C-like subset of the language, with the addition of classes, objects, and some standard library data structures. If you'd like to learn more about the advanced features of C++, check out the links on our C/C++ Primer page.
Compiling C++ programs
Despite its similarity to C, C++ is a separate programming language with its own compilers and tools.
There are many C++ compilers, but the most widely-used ones are GCC's g++ (the C++ equivalent of gcc) and LLVM's clang++ (the C++ equivalent of clang). You can use either for the course; clang++ sometimes has easier-to-understand error messages than g++.
The good news, though, is that the command line options for these compilers are practically identical to those for their C equivalents, and that all your favorite build and debugging tools (Makefiles, GDB, sanitizers, etc.) still work for C++.
Classes and Objects
C++ is an object-oriented programming language, meaning that it includes the notion of classes that you can instantiate into objects. If you know Java, these concepts will seem very familiar to you. A class defines data and functionality associated with a specific type, and each class can have many individual instances in the form of objects of that class. Classes and objects help programmers organize their code, and allow for some data to be accessible only via specific functions – an idea known as "encapsulation". Object-orientation also allows you to write programs that are crazy difficult to understand, and some believe it's overkill – we won't pass judgement on this, but rather make use of C++'s object-oriented features to make our life as systems programmers easier.
Let's look at the specific example of a program that seeks to represent pets via the Animal class type. If you were to write a C program for this purpose, you might define a struct type that tracks information about a specific animal, such as its name, age, and weight:
typedef struct animal {
    char* name;
    int age;
    int weight;
} animal_t;
Specific instances of this structure can exist on the stack or on the heap:
int main() {
    animal_t stack_cat;
    stack_cat.name = "kitty";
    stack_cat.age = 5;
    stack_cat.weight = 10;

    animal_t* heap_dog = (animal_t*) malloc(sizeof(animal_t));
    heap_dog->name = "doggy";
    heap_dog->age = 8;
    // [...]
}
This works fine, but comes with several downsides:
- Any piece of code can set the values of any of the struct's members without validation; for example, nothing prevents a nonsensical assignment such as heap_dog->weight = 999.
- Any function that operates on an animal needs to explicitly take a pointer to the specific animal in question as an argument (recall how all the vector methods in Project 1 took a vector_t* as their first argument), so that the function body knows what memory to access.
C++ extends the C struct notion with functionality to allow instances of a type to have behavior (i.e., associated methods). To do so, you can define functions as part of the struct definition:
typedef struct Animal {
    const char* name;  // const: string literals cannot be modified in C++
    int age;
    int weight;

    // new in C++: define methods on instances of this struct
    void setWeight(int w) {
        if (w > 50) {
            printf("error: unrealistic weight!\n");
            return;
        }
        this->weight = w;
    }

    int getWeight() {
        return this->weight;
    }
} animal_t;
You'll notice the this keyword inside the setWeight and getWeight methods here. this is always a pointer to the instance of the struct that the method was called on – in other words, its type is Animal* in this example. This implicit access to a pointer to the current instance allows calling methods on an animal using the same syntax as C struct member access:
int main() {
    Animal stack_cat;
    stack_cat.name = "kitty";
    stack_cat.age = 5;
    stack_cat.setWeight(10);

    animal_t stack_dog; // can still use type alias, just like in C!
    stack_dog.name = "doggy";
    stack_dog.age = 5;
    stack_dog.setWeight(999); // will report an error
}
To be backwards-compatible with C, a C++ struct without any methods has exactly the same syntax and behaves exactly the same way as the C struct would. But you may note that even though our Animal struct defines handy methods to get and set the weight of the animal, including some validation in setWeight, there is nothing preventing code from directly modifying the weight member of the struct.
C++ provides the class keyword to help you define structs whose members are protected from arbitrary access. The definition of a class looks exactly the same as that of a struct, with the exception that members of a class are private by default, whereas members of a struct are public by default. In either, you can mark some members public and some private, as follows:
typedef class Animal {
  public:
    const char* name;  // const: string literals cannot be modified in C++
    int age;

  private:
    int weight;

  public:
    // to allow access to the private `weight` member via methods, these need to be public
    void setWeight(int w) {
        if (w > 50) {
            printf("error: unrealistic weight!\n");
            return;
        }
        this->weight = w;
    }

    int getWeight() {
        return this->weight;
    }
} animal_t;
These access modifiers split the definition into sections, and the compiler will prevent any access to private members from outside the methods associated with the class. (Both member variables, called fields, and member functions, called methods, can be private.)
Are there ways around access modifiers?
It's important to realize that access modifiers are merely a helpful aid to the programmer, not a failsafe protection mechanism. Only the compiler looks at access modifiers and checks them; once the compiler has turned the C++ code into assembly or machine code, no notion of access modifier protection remains. In particular, the access modifiers are never checked at program runtime!
But even the compiler can be fooled. Since C++ is a systems programming language, it allows for direct memory access, including pointer arithmetic. This actually provides a way for programs to circumvent the private access modifier: knowing at what byte offset in a class or struct a field is located is sufficient to form a pointer to that field, and to ultimately access the memory. No C++ compiler can prove the absence of such illegal accesses without additional hints; this is an instance of the pointer aliasing problem.
Finally, what if you want to create an instance of a class? In the above examples, we've already seen stack-allocated objects of the Animal class. To make heap-allocated objects, C++ uses the new keyword:
int main() {
    Animal* heap_cat = new Animal;
    heap_cat->name = "kitty";
    heap_cat->age = 5;
    heap_cat->setWeight(10);
}
new here works much like (Animal*) malloc(sizeof(Animal)), allocating sufficient heap memory for an Animal structure. On top of allocating memory, however, new also calls a special method on the class called the constructor. Constructors are helpful in order to initialize the fields of the object: recall that uninitialized memory may contain arbitrary garbage! (cpp1.cc in the lecture code shows an example of how a stack-allocated object can have surprising contents if the fields aren't set correctly.) To define a constructor, you add a method without a return type (think about this: what would the constructor return?) and with the same name as the class/struct name:
typedef class Animal {
  public:
    const char* name;  // const: string literals cannot be modified in C++
    int age;

  private:
    int weight;

  public:
    // constructor, takes two arguments
    Animal(const char* name, int age) {
        this->name = name;
        this->age = age;
        this->weight = 0; // can access private field from constructor
    }
    // ... other methods
} animal_t;

int main() {
    // calls constructor on creating stack-allocated object
    Animal stack_cat("kitty", 5);
    stack_cat.setWeight(10);

    // calls constructor on creating heap-allocated object
    Animal* heap_dog = new Animal("doggy", 5);
    // [...]
}
Constructors help set up objects; by default, C++ adds an empty zero-argument constructor to each struct or class that does not declare a constructor of its own, which is why the above examples without an explicit constructor call are still valid.
Initializer list syntax for constructors
Rather than writing several lines of the form this->field = ... in the constructor to initialize fields, C++ permits a shorthand syntax called an initializer list. Separated by a colon from the constructor declaration, the initializer list consists of a comma-separated list of field(argument) pairs. For example:
// the initializer list follows the colon
Animal(char* name, int age) : name(name), age(age), weight(0) {
    // obsoleted by initializer list
    // this->name = name;
    // this->age = age;
    // this->weight = 0;
}
Finally, how do you get rid of a heap-allocated object? For this purpose, and as a counterpoint to new, C++ provides the delete keyword. delete is to new what free() is to malloc().
You can see some more examples of C++ classes and structures in cpp1.cc.
Summary
Today, we understood in more detail how the stack segment of memory is structured and managed, and discussed how it grows and shrinks. We learned about how the compiler manages the stack pointer and how base pointers help it "unwind" the stack for debugging.
We then looked into how the very well-defined memory layout of the stack can become a danger if a program is compromised through a malicious input: by carefully crafting inputs that overwrite part of the stack memory via a buffer overflow, we can change important data and cause a program to execute arbitrary code.
In Lab 3, you will craft and execute buffer overflow attacks on a program yourself!
Finally, we talked about the C++ programming language and how it adds object-oriented features to C. We saw that C++ classes are basically fancy structs with methods (member functions), and how C++ allows you to create instances (objects) of such classes.