Lecture 9: C++ and Caching

» Lecture video (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 6pm Wednesday, February 26).

Introduction to C++

Why are we using C++ for the rest of the course?

You will by now have started to appreciate the power of the C programming language: it gives you direct access to memory, it matches the concepts in the underlying hardware very closely, and it can achieve very high performance because the language adds little to no overhead to your program.

However, writing programs in C can feel a bit like trying to build your own car from scratch: very educational, but you really have to do everything yourself. C doesn't come with any data structures in a standard library, and as you will recall from the vectors part of Project 1, writing a generic data structure is somewhat painful. C++ tries to make things easier: it still gives you access to all the low-level power of C, but it also allows you to write code at a higher level of abstraction, using classes, objects, and various other advanced features.

So far, we've used the C programming language in the course. We will now increasingly work in C++, which is a separate programming language from C. However, C++ and C are closely related: indeed, pretty much any valid C program is also a valid C++ program.

Note that C++ is a huge programming language, especially compared to C, and comes with many advanced (and some ill-advised) features. The C++ we will write in this course mostly focuses on the C-like subset of the language, with the addition of classes, objects, and some standard library data structures. If you'd like to learn more about the advanced features of C++, check out the links on our C/C++ Primer page.

Compiling C++ programs

Despite its similarity to C, C++ is a separate programming language with its own compilers and tools.

There are many C++ compilers, but the most widely-used ones are GCC's g++ (the C++ equivalent of gcc) and LLVM's clang++ (the C++ equivalent of clang). You can use either for the course; clang++ sometimes has easier-to-understand error messages than g++.

The good news, though, is that the command line options for these compilers are practically identical to those for their C equivalents, and that all your favorite build and debugging tools (Makefiles, GDB, sanitizers, etc.) still work for C++.

Classes and Objects

C++ is an object-oriented programming language, meaning that it includes the notion of classes that you can instantiate into objects. If you know Java, these concepts will seem very familiar to you. A class defines data and functionality associated with a specific type, and each class can have many individual instances in the form of objects of that class. Classes and objects help programmers organize their code, and allow for some data to be accessible only via specific functions – an idea known as "encapsulation". Object-orientation also allows you to write programs that are crazy difficult to understand, and some believe it's overkill – we won't pass judgement on this, but rather make use of C++'s object-oriented features to make our life as systems programmers easier.

Let's look at the specific example of a program that seeks to represent pets via the Animal class type. If you were to write a C program for this purpose, you might define a struct type that tracks information about a specific animal, such as its name, age, and weight:

typedef struct animal {
  char* name;
  int age;
  int weight;
} animal_t;
Specific instances of this structure can exist on the stack or on the heap:
int main() {
  animal_t stack_cat;
  stack_cat.name = "kitty";
  stack_cat.age = 5;
  stack_cat.weight = 10;

  animal_t* heap_dog = (animal_t*) malloc(sizeof(animal_t));
  heap_dog->name = "doggy";
  heap_dog->age = 8;

  // [...]
}
This works fine, but comes with several downsides:
  1. Any piece of code can set the values of any of the struct's members without validation; for example, nothing prevents nonsensical assignments like heap_dog->weight = 999;.
  2. Any function that operates on an animal needs to explicitly take a pointer to the specific animal in question as an argument (recall how all the vector methods in Project 1 took a vector_t* as their first argument), so that the function body knows what memory to access.

C++ extends the C struct notion with functionality to allow instances of a type to have behavior (i.e., associated methods). To do so, you can define functions as part of the struct definition:

typedef struct Animal {
  char* name;
  int age;
  int weight;

  // new in C++: define methods on instances of this struct
  void setWeight(int w) {
    if (w > 50) {
      printf("error: unrealistic weight!\n");
      return;
    }
    this->weight = w;
  }

  int getWeight() {
    return this->weight;
  }
} animal_t;
You'll notice the this keyword inside the setWeight and getWeight methods here. this is always a pointer to the instance of the struct that the method was called on – in other words, its type is Animal* in this example. This implicit access to a pointer to the current instance allows calling methods on an animal using the same syntax as C struct member access:
int main() {
  Animal stack_cat;
  stack_cat.name = "kitty";
  stack_cat.age = 5;
  stack_cat.setWeight(10);

  animal_t stack_dog; // can still use type alias, just like in C!
  stack_dog.name = "doggy";
  stack_dog.age = 5;
  stack_dog.setWeight(999); // will report an error
}

To be backwards-compatible with C, a C++ struct without any methods has exactly the same syntax and behaves exactly the same way as the C struct would. But you may note that even though our Animal struct defines handy methods to get and set the weight of the animal, including some validation in setWeight, there is nothing preventing code from directly modifying the weight member of the struct.

C++ provides the class keyword to help you define structs whose members are protected from arbitrary access. The definition of a class looks exactly the same as that of a struct, but you can use access modifiers to define some members to be public and some to be private (members of a class are private by default, while members of a struct are public by default), as follows:

typedef class Animal {
 public:
  char* name;
  int age;
 private:
  int weight;

 public:
  // to allow access to the private `weight` member via methods, these need to be public
  void setWeight(int w) {
    if (w > 50) {
      printf("error: unrealistic weight!\n");
      return;
    }
    this->weight = w;
  }

  int getWeight() {
    return this->weight;
  }
} animal_t;
These access modifiers split the definition into sections, and the compiler will prevent any access to private members from outside the methods associated with the class. (Both member variables, called fields, and member functions, called methods, can be private.)

Are there ways around access modifiers?

It's important to realize that access modifiers are merely a helpful aid to the programmer, not a failsafe protection mechanism. Only the compiler looks at access modifiers and checks them; once the compiler has turned the C++ code into assembly or machine code, no notion of access modifier protection remains. In particular, the access modifiers are never checked at program runtime!

But even the compiler can be fooled. Since C++ is a systems programming language, it allows for direct memory access, including pointer arithmetic. This actually provides a way for programs to circumvent the private access modifier: knowing at what byte offset in a class or struct a field is located is sufficient to form a pointer to that field, and to ultimately access the memory. No C++ compiler can prove the absence of such illegal accesses without additional hints; this is an instance of the pointer aliasing problem.

Finally, what if you want to create an instance of a class? In the above examples, we've already seen stack-allocated objects of the Animal class. To make heap-allocated objects, C++ uses the new keyword:

int main() {
  Animal* heap_cat = new Animal;
  heap_cat->name = "kitty";
  heap_cat->age = 5;
  heap_cat->setWeight(10);
}

new here allocates heap memory much like (Animal*) malloc(sizeof(Animal)) would, reserving sufficient space for an Animal structure. On top of allocating memory, however, new also calls a special method on the class called the constructor. Constructors are helpful in order to initialize the fields of the object: recall that uninitialized memory may contain arbitrary garbage! (cpp1.cc in the lecture code shows an example of how a stack-allocated object can have surprising contents if the fields aren't set correctly.) To define a constructor, you add a method without a return type (think about this: what would the constructor return?) and with the same name as the class/struct name:

typedef class Animal {
 public:
  char* name;
  int age;
 private:
  int weight;

 public:
  // constructor, takes two arguments
  Animal(char* name, int age) {
    this->name = name;
    this->age = age;
    this->weight = 0;  // can access private field from constructor
  }

  // ... other methods
} animal_t;

int main() {
  // calls constructor on creating stack-allocated object
  Animal stack_cat("kitty", 5);
  stack_cat.setWeight(10);

  // calls constructor on creating heap-allocated object
  Animal* heap_dog = new Animal("doggy", 5);

  // [...]
}
Constructors help set up objects; if a struct or class defines no constructor of its own, C++ adds an empty zero-argument constructor by default, which is why the above examples without an explicit constructor are still valid.

Initializer list syntax for constructors

Rather than writing several lines of the form this->field = ... in the constructor to initialize fields, C++ permits a shorthand syntax called an initializer list. Separated by a colon from the constructor declaration, the initializer list consists of a comma-separated list of field(argument) pairs. For example:

//                            | initializer list
//                            v
Animal(char* name, int age) : name(name), age(age), weight(0) {
 // obsoleted by initializer list
 // this->name = name;
 // this->age = age;
 // this->weight = 0;
}

Finally, how do you get rid of a heap-allocated object? For this purpose, and as a counterpoint to new, C++ provides the delete keyword. delete is to new what free() is to malloc().

You can see some more examples of C++ classes and structures in cpp1.cc.

Standard Library Data Structures

One big advantage of C++ over C is that C++ comes with a large standard library with many common data structures implemented. We will use some of these data structures in the rest of the course.

The data structure part of the C++ library is called the Standard Template Library (STL), and it contains various "container" structures that represent different kinds of collections. For example:

  - std::vector<T>: a dynamically sized array of elements of type T.
  - std::map<K, V> and std::unordered_map<K, V>: ordered and unordered key-value maps.
  - std::set<T> and std::unordered_set<T>: ordered and unordered sets of unique elements.

The difference between the ordered and unordered variants of these data structures matters when iterating over them: an ordered collection always guarantees the same, specific iteration order, while an unordered collection makes no such guarantee.

STL collections are generic, meaning that they can hold elements of any type. This is extremely handy, because it means that we don't need separate implementations for, say, a vector of integers and a vector of strings. Recall that in your C vector implementation for Project 1, you had to use void* pointers and explicit element size arguments to make the vector generic; fortunately, generic C++ data structures require no such things. To tell the data structure what specific types it should assume, we include the types in angle brackets when we refer to the data structure type: for example, a std::vector<int> is a vector of ints, while a std::vector<Animal> would be a vector of Animal objects, and std::vector<int*> is a vector of pointers to integers.

How do generic data structures work?

The details of how generic C++ STL data structures work are complex and related to an advanced feature of the C++ language called "templating". You won't need to understand how to write templated classes for this course, but you can think of this as writing a class with one or more type parameters that the compiler searches and replaces with the actual types before it compiles your code. For example, a std::vector<T> specifies a type parameter T for the type of the vector elements, and all code implementing the vector will use T to refer to the element type. Only when you actually use, e.g., a vector<int> will the compiler generate and compile code for a vector of integers and appropriately set all element sizes in the code.

You can declare both stack-allocated and heap-allocated STL container data structures, and cpp2.cc shows some examples. However, one very important thing to realize is that these C++ data structures may themselves allocate memory on the heap (in fact, they usually do!), even if the data structure itself is declared as stack-allocated. If you think about this, this makes sense: all of these data structures are dynamic in size, i.e., you can add and remove elements in your code as you wish. This means that the data structures cannot be entirely on the stack or in the static segment, since both of these segments require object storage sizes to be known at compile time.

We won't be able to cover in detail all the APIs that STL collections offer in lectures, and we encourage you to make use of the reference links on our C++ primer page to explore them. The reference material can seem verbose and confusing at first; often, it's easiest to look at the code examples included in the documentation for specific methods to develop an intuition for how you use them. The methods you want often have relatively obvious names (e.g., contains(T element) checks if a std::set<T> contains element, as of C++20; push_back(T element) on a std::vector<T> adds an element to the back), but not always (e.g., the easiest way to insert into a std::map<K, V> is to use emplace(K key, V value)).

Caching

We are now switching gears to talk about one of the most important performance-improving concepts in computer systems. This concept is the idea of cache memory.

Why are we covering this?

Caching is an immensely important concept to optimize performance of a computer system. As a software engineer in industry, or as a researcher, you will probably find yourself in countless situations where "add a cache" is the answer to a performance problem. Understanding the idea behind caches, as well as when a cache works well, is important to being able to build high-performance applications.

We will look at specific examples of caches, but a generic definition is the following: a cache is a small amount of fast storage used to speed up access to larger, slower storage.

A cache works by storing copies of data whose primary home is on slower storage. A program can access data faster when it's located "nearby", in fast storage. Examples of this include storing data from the computer's hard disk in memory, or storing data from memory in caches closer to the processor.

Caches abound in computer systems: processors have caches for primary memory (RAM). The operating system uses primary memory (RAM) as a cache for disks and other stable storage devices. Running programs reserve some of their private memory to cache the operating system's cache of the disk. You will explore these caches further in Lab 3.

The programs diskio-slow and diskio-fast in the lecture code illustrate the huge difference caching can make to performance. Both programs write bytes to a file they create (the file is simply called data; you can see it in the lecture code directory after running these programs).

diskio-slow is a program that writes data to the computer's disk (SSD or hard disk) one byte at a time, and ensures that the byte is written to disk immediately and before the operation returns (the O_SYNC flag passed to open ensures this). It can write a few hundred bytes per second – hardly an impressive speed, as writing a single picture (e.g., from your smartphone camera) would take several minutes if done this way!

diskio-fast, on the other hand, writes to disk via a series of caches. It easily achieves write throughputs of hundreds of megabytes per second: in fact, it writes 50 MB in about a tenth of a second on my laptop! This happens because these writes don't actually go to the computer's disk immediately. Instead, the program just writes to memory and relies on the operating system to "flush" the data out to stable storage over time in a way that it deems efficient. This improves performance, but it does come with a snag: if my computer loses power before the operating system gets around to putting my data on disk, it may get lost, even though my program was under the impression that the write to the file succeeded.

Finally, consider how the concept of caching abounds in everyday life, too. Imagine how life would differ without, say, fast access to food storage – if every time you felt hungry, you had to walk to a farm and eat a carrot you pulled out of the dirt. Your whole day would be occupied with finding and eating food! (Indeed, this is what some animals spend most of their time doing.) Instead, your refrigerator (or your dorm's refrigerator) acts as a cache for your neighborhood grocery store, and that grocery store acts as a cache for all the food producers worldwide.

Summary

Today, we talked about the C++ programming language and how it adds object-oriented features to C. We saw that C++ classes are basically fancy structs with methods (member functions), and how C++ allows you to create instances (objects) of such classes.

We also looked into the handy data structures provided by the C++ standard library, and got an initial feel for how you can use them to make your life easier.

Finally, we started talking about the notion of caching as a performance optimization at a high level; next time, we will dive deeper into how your processor's cache, a specific instance of the caching paradigm, works and speeds up program execution.