Lecture 9: C++ and Caching
» Lecture video (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 6pm Wednesday, February 26).
Introduction to C++
Why are we using C++ for the rest of the course?
You will by now have started to appreciate the power of the C programming language: it gives you direct access to memory, it matches very closely with the concepts in the underlying hardware, and it is able to achieve very high performance because the language adds little or no overhead to your program.
However, writing programs in C can feel a bit like trying to build your own car from scratch: very educational, but you really have to do everything yourself. C doesn't come with any data structures in a standard library, and as you will recall from the vectors part of Project 1, writing a generic data structure is somewhat painful. C++ tries to make things easier: it still gives you access to all the low-level power of C, but it also allows you to write code at a higher level of abstraction, using classes, objects, and various other advanced features.
So far, we've used the C programming language in the course. We will now increasingly start working in C++, which is a separate programming language from C. However, C++ and C are closely related: indeed, pretty much any valid C program is also a valid C++ program.
Note that C++ is a huge programming language, especially compared to C, and comes with many advanced (and some ill-advised) features. The C++ we will write in this course mostly focuses on the C-like subset of the language, with the addition of classes, objects, and some standard library data structures. If you'd like to learn more about the advanced features of C++, check out the links on our C/C++ Primer page.
Compiling C++ programs
Despite its similarity to C, C++ is a separate programming language with its own compilers and tools.
There are many C++ compilers, but the most widely-used ones are GCC's g++ (the C++ equivalent of gcc) and LLVM's clang++ (the C++ equivalent of clang). You can use either for the course; clang++ sometimes has easier-to-understand error messages than g++.
The good news, though, is that the command line options for these compilers are practically identical to those for their C equivalents, and that all your favorite build and debugging tools (Makefiles, GDB, sanitizers, etc.) still work for C++.
Classes and Objects
C++ is an object-oriented programming language, meaning that it includes the notion of classes that you can instantiate into objects. If you know Java, these concepts will seem very familiar to you. A class defines data and functionality associated with a specific type, and each class can have many individual instances in the form of objects of that class. Classes and objects help programmers organize their code, and allow for some data to be accessible only via specific functions – an idea known as "encapsulation". Object-orientation also allows you to write programs that are crazy difficult to understand, and some believe it's overkill – we won't pass judgement on this, but rather make use of C++'s object-oriented features to make our life as systems programmers easier.
Let's look at the specific example of a program that seeks to represent pets via the Animal class type. If you were to write a C program for this purpose, you might define a struct type that tracks information about a specific animal, such as its name, age, and weight:
typedef struct animal {
    char* name;
    int age;
    int weight;
} animal_t;
Specific instances of this structure can exist on the stack or on the heap:
int main() {
    animal_t stack_cat;
    stack_cat.name = "kitty";
    stack_cat.age = 5;
    stack_cat.weight = 10;

    animal_t* heap_dog = (animal_t*) malloc(sizeof(animal_t));
    heap_dog->name = "doggy";
    heap_dog->age = 8;
    // [...]
}
This works fine, but comes with several downsides:
- Any piece of code can set the values of any of the struct's members without validation; for example, nothing prevents nonsensical assignments like heap_dog->weight = 999;.
- Any function that operates on an animal needs to explicitly take a pointer to the specific animal in question as an argument (recall how all the vector methods in Project 1 took a vector_t* as their first argument), so that the function body knows what memory to access.
C++ extends the C struct notion with functionality to allow instances of a type to have behavior (i.e., associated methods). To do so, you can define functions as part of the struct definition:
typedef struct Animal {
    char* name;
    int age;
    int weight;

    // new in C++: define methods on instances of this struct
    void setWeight(int w) {
        if (w > 50) {
            printf("error: unrealistic weight!\n");
            return;
        }
        this->weight = w;
    }
    int getWeight() {
        return this->weight;
    }
} animal_t;
You'll notice the this keyword inside the setWeight and getWeight methods here. this is always a pointer to the instance of the struct that the method was called on – in other words, its type is Animal* in this example. This implicit access to a pointer to the current instance allows calling methods on an animal using the same syntax as C struct member access:
int main() {
    Animal stack_cat;
    stack_cat.name = "kitty";
    stack_cat.age = 5;
    stack_cat.setWeight(10);

    animal_t stack_dog; // can still use type alias, just like in C!
    stack_dog.name = "doggy";
    stack_dog.age = 5;
    stack_dog.setWeight(999); // will report an error
}
To be backwards-compatible with C, a C++ struct without any methods has exactly the same syntax and behaves exactly the same way as the C struct would. But you may note that even though our Animal struct defines handy methods to get and set the weight of the animal, including some validation in setWeight, there is nothing preventing code from directly modifying the weight member of the struct.
C++ provides the class keyword to help you define structs whose members are protected from arbitrary access. The definition of a class looks exactly the same as that of a struct, except that its members are private by default (a struct's members are public by default). In either case, you can explicitly mark some members as public and some as private, as follows:
typedef class Animal {
  public:
    char* name;
    int age;

  private:
    int weight;

  public:
    // to allow access to the private `weight` member via methods, these need to be public
    void setWeight(int w) {
        if (w > 50) {
            printf("error: unrealistic weight!\n");
            return;
        }
        this->weight = w;
    }
    int getWeight() {
        return this->weight;
    }
} animal_t;
These access modifiers split the definition into sections, and the compiler will prevent any access to private members from outside the methods associated with the class. (Both member variables, called fields, and member functions, called methods, can be private.)
Are there ways around access modifiers?
It's important to realize that access modifiers are merely a helpful aid to the programmer, not a failsafe protection mechanism. Only the compiler looks at access modifiers and checks them; once the compiler has turned the C++ code into assembly or machine code, no notion of access modifier protection remains. In particular, the access modifiers are never checked at program runtime!
But even the compiler can be fooled. Since C++ is a systems programming language, it allows for direct memory access, including pointer arithmetic. This actually provides a way for programs to circumvent the private access modifier: knowing at what byte offset in a class or struct a field is located is sufficient to form a pointer to that field, and to ultimately access the memory. No C++ compiler can prove the absence of such illegal accesses without additional hints; this is an instance of the pointer aliasing problem.
Finally, what if you want to create an instance of a class? In the above examples, we've already seen stack-allocated objects of the Animal class. To make heap-allocated objects, C++ uses the new keyword:
int main() {
    Animal* heap_cat = new Animal;
    heap_cat->name = "kitty";
    heap_cat->age = 5;
    heap_cat->setWeight(10);
}
new here works much like (Animal*) malloc(sizeof(Animal)), allocating sufficient heap memory for an Animal structure. On top of allocating memory, however, new also calls a special method on the class called the constructor. Constructors are helpful in order to initialize the fields of the object: recall that uninitialized memory may contain arbitrary garbage! (cpp1.cc in the lecture code shows an example of how a stack-allocated object can have surprising contents if the fields aren't set correctly.) To define a constructor, you add a method without a return type (think about this: what would the constructor return?) and with the same name as the class/struct name:
typedef class Animal {
  public:
    char* name;
    int age;

  private:
    int weight;

  public:
    // constructor, takes two arguments
    Animal(char* name, int age) {
        this->name = name;
        this->age = age;
        this->weight = 0; // can access private field from constructor
    }
    // ... other methods
} animal_t;

int main() {
    // calls constructor on creating stack-allocated object
    Animal stack_cat("kitty", 5);
    stack_cat.setWeight(10);

    // calls constructor on creating heap-allocated object
    Animal* heap_dog = new Animal("doggy", 5);
    // [...]
}
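Constructors help set up objects. If you don't define any constructor yourself, C++ adds an implicit zero-argument constructor to the struct or class, which is why the earlier examples without an explicit constructor are still valid. (This implicit constructor leaves fields of types like int and char* uninitialized. Once you define a constructor of your own, as above, C++ no longer adds the implicit one, so a plain Animal stack_cat; declaration would no longer compile.)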
Initializer list syntax for constructors
Rather than writing several lines of the form this->field = ... in the constructor to initialize fields, C++ permits a shorthand syntax called an initializer list. Separated by a colon from the constructor declaration, the initializer list consists of a comma-separated list of field(argument) pairs. For example:
//                            | initializer list
//                            v
Animal(char* name, int age) : name(name), age(age), weight(0) {
    // obsoleted by initializer list
    // this->name = name;
    // this->age = age;
    // this->weight = 0;
}
Finally, how do you get rid of a heap-allocated object? For this purpose, and as a counterpoint to new, C++ provides the delete keyword. delete is to new what free() is to malloc().
You can see some more examples of C++ classes and structures in cpp1.cc.
Standard Library Data Structures
One big advantage of C++ over C is that C++ comes with a large standard library with many common data structures implemented. We will use some of these data structures in the rest of the course.
The data structure part of the C++ library is called the Standard Template Library (STL), and it contains various "container" structures that represent different kinds of collections. For example:
- std::vector is a vector (dynamically-sized array) similar to the vector you implemented in Project 1.
- std::map provides an ordered key-value map, with an API somewhat similar to a Python dictionary, though with much stricter rules (fixed key and value types, no nesting, and others). The ordered map is typically implemented as a balanced binary search tree (usually a red-black tree), so many operations are O(log N) complexity for a map of size N.
- std::unordered_map provides an unordered key-value map, implemented as a hashtable with most operations having O(1) amortized complexity. Again, the API is somewhat similar to a Python dictionary, but with all the constraints of a map and the added constraint that the key type must be hashable (true of the primitive C++ types, but requires additional implementation for more complex types).
- std::set and std::unordered_set provide ordered and unordered set abstractions, with APIs that support addition, removal, membership checking and other set operations.
STL collections are generic, meaning that they can hold elements of any type. This is extremely handy, because it means that we don't need separate implementations for, say, a vector of integers and a vector of strings. Recall that in your C vector implementation for Project 1, you had to use void* pointers and explicit element size arguments to make the vector generic; fortunately, generic C++ data structures require no such things. To tell the data structure what specific types it should assume, we include the types in angle brackets when we refer to the data structure type: for example, a std::vector<int> is a vector of ints, while a std::vector<Animal> would be a vector of Animal objects, and std::vector<int*> is a vector of pointers to integers.
How do generic data structures work?
The details of how generic C++ STL data structures work are complex and related to an advanced feature of the C++ language called "templating". You won't need to understand how to write templated classes for this course, but you can think of this as writing a class with one or more type parameters that the compiler searches and replaces with the actual types before it compiles your code. For example, a std::vector<T> specifies a type parameter T for the type of the vector elements, and all code implementing the vector will use T to refer to the element type. Only when you actually use, e.g., a vector<int> will the compiler generate and compile code for a vector of integers and appropriately set all element sizes in the code.
You can declare both stack-allocated and heap-allocated STL container data structures, and cpp2.cc shows some examples. However, one very important thing to realize is that these C++ data structures may themselves allocate memory on the heap (in fact, they usually do!), even if the data structure itself is declared as stack-allocated. If you think about this, this makes sense: all of these data structures are dynamic in size, i.e., you can add and remove elements in your code as you wish. This means that the data structures cannot be entirely on the stack or in the static segment, since both of these segments require object storage sizes to be known at compile time.
We won't be able to cover in detail all the APIs that STL collections offer in lectures, and we encourage you to make use of the reference links on our C++ primer page to explore them. The reference material can seem verbose and confusing at first; often, it's easiest to look at the code examples included in the documentation for specific methods to develop an intuition for how you use them. The methods you want often have relatively obvious names (e.g., push_back(T element) adds an element to the back of a std::vector<T>, and contains(T element) checks if a std::set<T> contains element, though contains only exists since C++20), but not always (e.g., std::vector<T> has no contains method, so you search it with std::find from <algorithm>; and one easy way to insert into a std::map<K, V> is to use emplace(K key, V value)).
Caching
We are now switching gears to talk about one of the most important performance-improving concepts in computer systems. This concept is the idea of cache memory.
Why are we covering this?
Caching is an immensely important concept to optimize performance of a computer system. As a software engineer in industry, or as a researcher, you will probably find yourself in countless situations where "add a cache" is the answer to a performance problem. Understanding the idea behind caches, as well as when a cache works well, is important to being able to build high-performance applications.
We will look at specific examples of caches, but a generic definition is the following: a cache is a small amount of fast storage used to speed up access to larger, slower storage.
A cache basically works by storing copies of data whose primary home is on slower storage. A program can access data faster when it's located "nearby", in fast storage. Examples of this include storing data from the computer's hard disk in memory, or storing information in memory in caches closer to the processor.
Caches abound in computer systems: processors have caches for primary memory (RAM). The operating system uses primary memory (RAM) as a cache for disks and other stable storage devices. Running programs reserve some of their private memory to cache the operating system's cache of the disk. You will explore these caches further in Lab 3.
The programs diskio-slow and diskio-fast in the lecture code illustrate the huge difference caching can make to performance. Both programs write bytes to a file they create (the file is simply called data; you can see it in the lecture code directory after running these programs).
diskio-slow is a program that writes data to the computer's disk (SSD or hard disk) one byte at a time, and ensures that the byte is written to disk immediately and before the operation returns (the O_SYNC flag passed to open ensures this). It can write a few hundred bytes per second – hardly an impressive speed, as writing a single picture (e.g., from your smartphone camera) would take several minutes if done this way!
diskio-fast, on the other hand, writes to disk via a series of caches. It easily achieves write throughputs of hundreds of megabytes per second: in fact, it writes 50 MB in about a tenth of a second on my laptop! This happens because these writes don't actually go to the computer's disk immediately. Instead, the program just writes to memory and relies on the operating system to "flush" the data out to stable storage over time in a way that it deems efficient. This improves performance, but it does come with a snag: if my computer loses power before the operating system gets around to putting my data on disk, it may get lost, even though my program was under the impression that the write to the file succeeded.
Finally, consider how the concept of caching abounds in everyday life, too. Imagine how life would differ without, say, fast access to food storage – if every time you felt hungry, you had to walk to a farm and eat a carrot you pulled out of the dirt. Your whole day would be occupied with finding and eating food! (Indeed, this is what some animals spend most of their time doing.) Instead, your refrigerator (or your dorm's refrigerator) acts as a cache for your neighborhood grocery store, and that grocery store acts as a cache for all the food producers worldwide.
Summary
Today, we talked about the C++ programming language and how it adds object-oriented features to C. We saw that C++ classes are basically fancy structs with methods (member functions), and how C++ allows you to create instances (objects) of such classes.
We also looked into the handy data structures provided by the C++ standard library, and got an initial feel for how you can use them to make your life easier.
Finally, we started talking about the notion of caching as a performance optimization at a high level; next time, we will dive deeper into how your processor's cache, a specific instance of the caching paradigm, works and speeds up program execution.