Lecture 24: Condition Variables, Summary, and Outlook

🎥 Lecture video (Brown ID required)
💻 Lecture code
❓ Post-Lecture Quiz (due 11:59pm, Monday, May 8).

Condition Variables

A condition variable supports the following operations:

- `wait(mutex)`: blocks the calling thread until another thread notifies the condition variable. While the thread blocks, the mutex is atomically released; it is re-acquired before `wait()` returns.
- `notify_one()`: unblocks one thread waiting on the condition variable.
- `notify_all()`: unblocks all threads waiting on the condition variable.

Logically, the writer to the bounded buffer should block when the buffer becomes full, and should unblock when the buffer becomes nonfull again. Let's create a condition variable, called nonfull_, in the bounded buffer, just under the mutex. Note that we conveniently named the condition variable after the condition under which the function should unblock; this will make the code easier to read later on. The write() method, which implements blocking, is in bbuffer-cond.cc. It looks like the following:

ssize_t bbuffer::write(const char* buf, size_t sz) {
    std::unique_lock guard(this->mutex_);
    while (this->blen_ == bcapacity) {  // #1
        this->nonfull_.wait(guard);
    }
    size_t pos = 0;
    while (pos < sz && this->blen_ < bcapacity) {
        size_t bindex = (this->bpos_ + this->blen_) % bcapacity;
        this->bbuf_[bindex] = buf[pos];
        ++this->blen_;
        ++pos;
    }
    return pos;
}

The new code at #1 implements blocking until the condition is met. This illustrates a common pattern when using condition variables: the condition variable's wait() function is almost always called in a while loop, and the loop condition tests the condition under which the function must block.

On the other hand, notify_all() should be called whenever a change we made might make the unblocking condition true. In our scenario, this means we must call notify_all() in the read() method, which takes characters out of the buffer and can therefore unblock the writer, as shown in the inserted code #2 below:

ssize_t bbuffer::read(char* buf, size_t sz) {
    std::unique_lock guard(this->mutex_);
    size_t pos = 0;
    while (pos < sz && this->blen_ > 0) {
        buf[pos] = this->bbuf_[this->bpos_];
        this->bpos_ = (this->bpos_ + 1) % bcapacity;
        --this->blen_;
        ++pos;
    }
    if (pos > 0) { this->nonfull_.notify_all(); }  // #2
    return pos;
}

With condition variables, our bounded buffer program runs significantly more efficiently: instead of making millions of calls to read and write, it now makes about 100,000 read calls and about 1 million write calls, since threads block while the buffer is full (for writes) or empty (for reads).

Why the while loop around cv.wait()?

Why is it necessary to have wait() in a while loop?

wait() is almost always used in a loop because of what we call spurious wakeups. Since notify_all() wakes up all threads blocking on a certain wait() call, by the time a particular blocking thread locks the mutex and gets to run, it's possible that some other blocking thread has already unblocked, made some progress, and changed the unblocking condition back to false. For this reason, a woken-up thread must revalidate the unblocking condition before proceeding further, and if the unblocking condition is not met, it must go back to blocking. The while loop achieves exactly this.

Infrastructure at Scale

Modern web services can have millions of users, and the companies that operate them run serious distributed systems infrastructure to support these services. The picture below shows a simplified view of the way such infrastructure is typically structured.

End-users contact one of several datacenters, typically the one geographically closest to them. Inside that datacenter, their requests are initially terminated at a load-balancer (LB). This is a simple server that forwards requests onto different frontend servers (FE) that run an HTTPS server (Apache, nginx, etc.) and the application logic (e.g., code to generate a Twitter timeline, or a Facebook profile page).

The front-end servers are stateless, and they contact backend servers for information required to dynamically generate web page data to return to the end-users. Depending on the consistency requirements for this data, the front-end server may either talk directly to a strongly-consistent database, or first check for the data on servers in a cache tier, which store refined copies of the database contents in an in-memory key-value store to speed up access to them. If the data is in the cache, the front-end server reads it from there and continues; if it is not in the cache, the front-end server queries the database.

Note that the database, which is usually itself sharded and which acts as the source of ground truth, is replicated across servers, often with a backup replica in another datacenter to protect against datacenter outages.

Finally, the preceding infrastructure serves end-user requests directly and must produce responses quickly. This is called a service or interactive workload. Other computations in the datacenter are less time-critical, but may process data from many users. Such batch processing workloads include data science and analytics, training of machine learning models, backups, and other special-purpose tasks that run over large amounts of data. The systems executing these jobs typically split the input data into shards and have different servers work on distinct partitions of the input data in parallel. If the computation can be structured in such a way that minimal communication between shards is required, this approach scales very well.


This is the end of CS 300! If you're thinking of courses to take next, here are some courses that dive deeper into the concepts we learned about in this course.


Condition variables are another type of synchronization object that make it possible to implement blocking of threads until a condition is satisfied (e.g., there is space in a bounded buffer again). This improves efficiency of the bounded buffer program, as threads no longer spin; for some other programs that require threads to wait, condition variables are actually required for correctness. To wait, a thread checks the condition while holding a mutex lock and, if the condition does not hold, calls wait() on the condition variable; wait() atomically releases the mutex and blocks the thread. Any threads waiting on a condition variable are unblocked by a call to notify_all (or notify_one) from another thread.

We finished off our discussion of thread synchronization by considering two details of the condition variable API: why wait() needs to atomically release the lock and block the calling thread, and why we need to wrap calls to wait() in a while loop. For both of these choices, it turns out that a condition variable without them allows for subtly incorrect executions when multiple threads interleave in a pessimal way, and we saw examples of this.