« Back to the main CS 0300 website
Project 5A: Concurrent Store 🗂🗃🗄
Due: Wednesday, May 3 at 6:00 PM EST
Introduction
A key-value store is a storage system in which data (a value) is addressed by a key; common examples are Python’s dictionaries or Java/C++ hash maps. Unlike file systems or other structures where objects are arranged in a hierarchical directory structure, keys map directly to their values. Key-value stores are popular for large-scale web applications because they are particularly amenable to performance and scalability optimizations, such as concurrent access/modification and sharding. Such optimizations are paramount to the success of Internet-scale applications like Facebook, AirBnB, and Twitter, which must handle thousands of connection requests and petabytes of data every day.
In this assignment, you will be exploring ways to implement and optimize a concurrent key-value store. Connections will be able to get, put, append, and delete key-value pairs, and dynamically add or remove servers to distribute load. This assignment will consist of two parts:
- In this part, Concurrent Store, you will implement multiple synchronized data structures for a multithreaded backend server to use. You will first implement a simple synchronized key-value store, then a key-value store with fine-grained synchronization for better scalability. You will also modify a given queue data structure to support synchronized access, which will be used by the server to accept multiple connections concurrently.
- In Part B, Distributed Store, you will distribute your store across multiple servers, turning it into a distributed system. Later lectures will cover this material!
Optional: Interested in real-world use cases of key-value stores and sharding?
If you’re curious how what you’re building relates to real-world industry use, take a look at these papers:
- Scaling Memcache at Facebook by Nishtala et al., which describes how Facebook deploys the memcached key-value store at scale. The store you’re implementing in the first part of this assignment is somewhat similar to memcached.
- Slicer: Auto-Sharding for Datacenter Applications by Adya et al., which describes Google’s auto-sharding system (called “Slicer”). The shardmaster you will implement in the second part of this assignment is similar to Google’s Slicer.
Other examples of key-value stores used in industry include Redis, RocksDB, and LevelDB.
Learning Objectives
This assignment will help you:
- Understand the fundamentals of threading, specifically managing and synchronizing threads.
- Think about problems introduced by concurrency, like deadlocks and race conditions on shared data structures.
- Familiarize yourself with using synchronization primitives, namely mutexes and condition variables.
Assignment Installation
Ensure that your project repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s23-projects.git
Then run:
$ git pull
$ git pull handout main
This will merge our Project 5 stencil code with your repository.
Important Note: This project requires a newer version of C++ (C++20) that the container’s compilers don’t completely support; you will thus need to install a newer C++ toolchain. Please run the following commands before proceeding:
$ sudo apt-get update
$ sudo apt-get install clang++-12
$ sudo apt-get install g++-10
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the `kvstore` directory in the working copy of your projects repository.
Infrastructure Help
Stencil and Initial State
The stencil includes implementations for the client and server, as well as some utility functions and classes you may find useful. We are providing you with a fully implemented client and server, so you are responsible for implementing the simple key-value store, the synchronized queue class, and the concurrent key-value store.
You may find it helpful to frequently refer back to this table in both parts of the assignment; the support code is substantial, but we will indicate both here and within the stencil which utilities may be useful in your implementation.
| Directories | Summary |
| --- | --- |
| `build/` | |
| `tests/` and `test_utils/` | See “Testing”. |
| `client/` and `repl/` | Contains the clients and relevant support code for interacting with your `KvServer` and `Shardmaster`s. You should not modify these. |
| `cmd/` | Contains the runner code that starts the client, server, and shardmaster. You should not modify these. |
| `common/` | Contains the definition of the `Shard` struct, as well as some useful helpers for managing or operating on them. Additionally, contains miscellaneous utility functions, such as color printing and string hashing. You will need `utils.hpp` for the concurrent key-value store, and `shard.hpp` for Distributed Store. (You might also find `config.{hpp, cpp}` helpful for Distributed Store.) |
| `kvstore/` | Contains the definition of the abstract `KvStore` interface, as well as a simple synchronized key-value store implementation. Implement your simple and concurrent key-value store here. |
| `net/` | Contains the networking support code used by the clients and servers. You should not modify these. |
| `server/` | Contains the definition of the `KvServer` class, and supporting data structures. Implement your synchronized queue here. You will also modify your server to make it sharding-aware in Distributed Store. |
| `shardmaster/` | Contains the definition of the `Shardmaster` class, and support code to process persistent connections. Implement your shardmaster in Distributed Store here. |
Compile and Run
To compile, `cd` into the `build/` directory and run `make`. This will create two executables: `simple_client` and `server`.
The server will not function properly until after you implement part 3!
To run the server, specify the port, and optionally the number of workers:
$ ./server <port> [n_workers]
To run the simple client, specify the server address:
$ ./simple_client <server hostname:port>
Example Usage
To connect multiple clients to a server, start by opening multiple terminals (or use `tmux`).
Terminal 1: runs `server` on port 5000 with the default number of workers:
$ ./server 5000
Your server will then be listening on `<hostname>:5000`; the server’s full address will be displayed in the terminal.
Terminal 2: runs `simple_client` to connect to the server, using the address displayed above (in this example, the hostname is `65a4af2c28ae`):
$ ./simple_client 65a4af2c28ae:5000
Using the Client
Once connected to the server, you can use one or more clients to run standard key-value store operations, like `get`, `put`, `append`, and `delete`. You can also use two additional requests, `multiget` and `multiput` (for multiple key-value pairs). For specific usage instructions, type `help` in the command prompt.
To exit the client (and any other executable), press `Ctrl-D` (or `Cmd-D`); this sends End-Of-File (`EOF`) to the program, causing it to exit cleanly. If your program hangs, use `Ctrl-C` (or `Cmd-C`) to force-quit the program.
Testing
We strongly recommend you test your code often while implementing, by compiling and running a server with a few clients (as shown above).
To run tests, you must be in the `build/` directory. Then, run either `make check concurrent_store` or `./run_tests concurrent_store`, which will run all of the Concurrent Store tests.
To run the tests for an individual step `#` in Concurrent Store, you can run either `make check A#` or `./run_tests A#`. For example, to run the tests for step 1, you can run
$ make check A1
or
$ ./run_tests A1
The tests are in the `tests/` directory, with each subdirectory corresponding to a part of the assignment. After implementing each step, the testing harness will run all tests related to that step. Specifically, the steps will run all tests in the following directories:
- Step 1: `kvstore_sequential_tests`
- Step 2: `kvstore_parallel_tests`
- Step 3: `queue_tests`
- Step 4: `kvstore_sequential_tests`
- Step 5: `kvstore_parallel_tests`
To run an individual test, run that specific test’s executable. For instance, for `test_put_get_delete`, run in the `build/` directory:
$ ./test_put_get_delete
The correctness tests can also optionally specify which type of KVStore to run (`SimpleKvStore` or `ConcurrentKvStore`) by passing in either `simple` or `concurrent` as an argument:
$ ./test_put_get_delete simple
The tests default to the `ConcurrentKvStore`.
If you want to write your own tests, add new files to the subdirectories in `tests/`. Be sure to take a look at the functions in `test_utils/test_utils.hpp`; they will make writing your own tests much easier.
Assignment
Background
Concurrency in C++
Concurrency is useful because it allows processes to do multiple things seemingly at the same time: for example, a process that is waiting for a slow operation (such as fetching data over the internet) can use the processor for something else until it completes. When a computer has multiple processors, concurrency also allows for parallelism, i.e., the computer actually does multiple things at the same time.
However, concurrency also introduces a new category of undefined behavior: race conditions. A primary goal for any concurrent program is to avoid undefined behavior due to race conditions between threads. A program that is free of such race conditions is known as thread-safe, which means any data shared between threads is protected by appropriate synchronization, in case threads try to access or modify it simultaneously. Some of the most common synchronization primitives are mutexes and condition variables, which you will be using in this assignment. Additionally, you will see atomic variables.
Optional: Refresher on synchronization primitives
Mutexes
A mutex provides mutual exclusion between threads. Specifically, the mutex ensures that no more than one thread can be active inside a critical section at a time. Before entering the critical section (which, for example, accesses a piece of shared data), threads attempt to lock the “associated” mutex, and unlock it once they are done. This guarantees that only the one thread that successfully locked the mutex can modify the data. Since a thread will hang until it can lock the mutex, deadlocks may occur if the mutex is never unlocked.
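For instance, here is a minimal sketch of guarding a shared counter with a mutex (the names are illustrative, not from the stencil):
#include <mutex>

std::mutex mtx;          // protects shared_count
int shared_count = 0;

void increment() {
  mtx.lock();            // at most one thread gets past this point at a time
  shared_count += 1;     // the critical section
  mtx.unlock();          // let the next thread in
}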
Condition Variables
Condition variables allow a thread to wait for a condition to be met. Changes in that condition will be caused by another thread, which is also responsible for notifying the threads that wait on the condition variable. Example conditions are “Is the queue still empty?” or “Has the number of clients exceeded a threshold?”. Instead of a thread continuously locking, checking, and unlocking over and over again, it can wait on a condition variable until another thread signals that it’s worth trying now.
For a thread to wait on a condition variable with the C++ `std::condition_variable`, the thread needs to first lock the associated mutex, and then pass that mutex to the condition variable. The thread will then be suspended and the mutex unlocked automatically, until the thread is woken up. When it is woken up because the wait condition is now met, the thread is guaranteed to have the mutex locked again.
Think about why this is necessary! If a thread is waiting for a condition, but the mutex stays locked while it is waiting, then nothing will be able to change the shared variable (e.g. a buffer or queue) and the condition will never be met.
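As a concrete sketch of this pattern (illustrative names, not stencil code):
#include <condition_variable>
#include <mutex>
#include <queue>

std::mutex mtx;
std::condition_variable cv;
std::queue<int> q;

int consume() {
  std::unique_lock lock(mtx);
  // wait() releases mtx while suspended and re-locks it before returning;
  // the predicate guards against spurious wakeups.
  cv.wait(lock, [] { return !q.empty(); });
  int item = q.front();
  q.pop();
  return item;
}

void produce(int item) {
  {
    std::unique_lock lock(mtx);
    q.push(item);
  }
  cv.notify_one();  // wake one thread waiting on cv
}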
Atomic Variables
In C++, we can use the `std::atomic` library to declare atomic variables. The compiler translates operations on these variables into atomic instructions, which always execute without racing with any other processor’s accesses.
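For example, a shared counter can be made race-free without any mutex:
#include <atomic>

std::atomic<int> counter{0};

void worker() {
  counter += 1;          // operators on std::atomic are atomic
  counter.fetch_add(5);  // explicit atomic read-modify-write
}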
A lock type that might be helpful to you is the “unique lock”, which automatically unlocks its mutex once the lock goes out of scope.
Example of Unique Locks
The `std::unique_lock` automatically unlocks a mutex once it goes out of scope.
std::mutex mtx;

void foo() {
  std::unique_lock guard(mtx);  // locks mtx
  // ... critical section ...
}  // guard goes out of scope here and unlocks mtx
Step 1: Single-Threaded Simple KV Store
In this section, you’ll implement a single-threaded simple key-value store. By “single-threaded,” we mean that only one thread can safely operate on the KVStore at a time. This means that for Step 1, you should not worry about concurrency. You may safely assume that only one thread will ever be operating on your KVStore at a time.
Take a look at how we define a `SimpleKvStore` (in `kvstore/simple_kvstore.hpp`):
class SimpleKvStore : public KvStore {
public:
SimpleKvStore() = default;
~SimpleKvStore() = default;
bool Get(const GetRequest *req, GetResponse *res) override;
bool Put(const PutRequest *req, PutResponse *) override;
bool Append(const AppendRequest *req, AppendResponse *) override;
bool Delete(const DeleteRequest *req, DeleteResponse *res) override;
bool MultiGet(const MultiGetRequest *req, MultiGetResponse *res) override;
bool MultiPut(const MultiPutRequest *req, MultiPutResponse *) override;
std::vector<std::string> AllKeys() override;
private:
};
Task: Define your internal key-value store. What data structure might be useful here? You may find this reference page useful.
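For example, a minimal sketch of one reasonable choice (the member name `store_` is illustrative, and this is not the only valid design):
// In kvstore/simple_kvstore.hpp, under private: (assumes #include <map>)
std::map<std::string, std::string> store_;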
Now it’s time to implement each of these methods. First, look at how we define the `Request` and `Response` structs in `net/server_commands.hpp`. The `Request` object contains the information being requested. The `Response` object is where your methods should store the response to the request (if a response is needed). For the `AllKeys` operation, return a vector of the keys. For the other operations, return `true` if the operation is successful; otherwise, return `false`.
Give me an example!
A call to `Get(const GetRequest *req, GetResponse *res)` takes in a pointer to a `GetRequest`, which has a `key` field. Your method should get the value corresponding to `key` from your internal key-value store and store it in the `value` field of the `GetResponse` that `res` points to.
If a `Response` struct is empty, that means you don’t need to provide an explicit confirmation that you fulfilled the request. For example, for a call to `Put(const PutRequest *req, PutResponse *res)`, you should insert the `key` and `value` from the `PutRequest` that `req` points to into your internal key-value store. Since `PutResponse` is an empty struct, you don’t need to do anything with it.
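Putting the two together, a hedged sketch of `Get` might look like this, assuming the `std::map`-backed store suggested above (the `key` and `value` fields are as described; check `net/server_commands.hpp` for the exact definitions):
bool SimpleKvStore::Get(const GetRequest *req, GetResponse *res) {
  auto it = this->store_.find(req->key);
  if (it == this->store_.end()) {
    return false;            // key not present, so the operation fails
  }
  res->value = it->second;   // hand the value back through the response
  return true;
}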
Your key-value store will support the following operations:
| Function | Arguments | Summary |
| --- | --- | --- |
| `Get` | `<key>` | Gets the value associated with `<key>` from the key-value store. Fails if it does not exist. |
| `Put` | `<key>`, `<value>` | Sets `<key>` to `<value>`; replaces existing values. |
| `Append` | `<key>`, `<value>` | Appends `<value>` to the current value associated with `<key>`. Creates a new key-value pair if it does not already exist. |
| `Delete` | `<key>` | Deletes `<key>` from the key-value store, returning the last saved value. Fails if it does not exist. |
| `MultiGet` | `<key_1> ... <key_n>` | Gets the values associated with keys `<key_1> ... <key_n>` from the key-value store. Fails if any key does not exist. |
| `MultiPut` | `<key_1> ... <key_n>`, `<value_1> ... <value_n>` | Sets `<key_1> ... <key_n>` to `<value_1> ... <value_n>`, respectively. |
| `AllKeys` | None | Returns all of the keys from the database. |
Task: Implement the methods in `kvstore/simple_kvstore.cpp`.
When you are finished, you should pass all `kvstore_sequential_tests`. You can check that your implementation is correct by running in the `build/` directory:
$ make check A1
If you’re failing an individual test, you can run it directly, via
$ ./test_name simple
Tests default to using the `ConcurrentKvStore` (which you’ll implement in Steps 4 and 5), so you’ll need to explicitly tell each test to use the `SimpleKvStore` by passing in `simple` as an argument.
Step 2: Concurrent Simple KV Store
We would like our server to be able to handle multiple requests in parallel. However, our simple key-value store can’t safely handle concurrent requests yet! In this step, you’ll now make your simple key-value store thread-safe.
Implementation
Task: Add synchronization primitive(s) to `SimpleKvStore` in `kvstore/simple_kvstore.hpp` to make it thread-safe. You should use at most one mutex in this part. (We will explore more efficient locking schemes in later steps.)
Task: Modify the methods in `kvstore/simple_kvstore.cpp` to enforce synchronized access, using the primitive(s) you defined in the previous task.
When you are finished, you should pass all `kvstore_parallel_tests`. You can check that your implementation is correct by running in the `build/` directory:
$ make check A2
Hints
- The key aspect of this key-value store is that any number of threads can be calling the same method at the same time. Always think: “What’s the worst possible sequence of events if multiple threads are running this code?” Keep this idea front and center when implementing the `SimpleKvStore` API functions here, and throughout the rest of the project.
- You want to make sure that the `MultiGet`, `MultiPut`, and `AllKeys` operations are atomic. Much like atomic instructions, an atomic operation should be executed as a single, indivisible unit, without interference or interleaving from other threads. That is, once started, an atomic operation should not be impacted by any other operations that may be executing simultaneously.
Atomic operations are often used in concurrent programming to ensure the integrity of shared data. In the context of your simple key-value store, other operations should not modify the store while the `MultiGet`, `MultiPut`, and `AllKeys` operations are running.
Step 3: Synchronized Queue
Now, your simple key-value store can safely handle concurrent requests. However, at the moment the server can’t handle concurrent requests yet! In this part, you’ll implement a synchronized queue, which the server will use to concurrently handle multiple clients and send their requests to your simple key-value store.
Your server will utilize a generic synchronized queue. The synchronized queue will allow multiple threads to add and remove elements safely, and wait for new elements to arrive. It is also handy to distribute requests from different clients across multiple threads!
Context
Synchronized?
A significant part of concurrency lies in making sure that multiple threads can safely access the same data structure concurrently. One common approach is to create a mutex for the data structure, lock it whenever it is accessed, and unlock it afterward. While effective, having to lock and unlock each time can clutter your code, and forgetting to do so will end up causing problems.
A synchronized data structure (or thread-safe data structure) is a data structure built specifically to be accessed concurrently by multiple threads, and does not need to be protected by an external mutex.
Generic?
You may have noticed the use of `template <typename T>` before class or function declarations for the queue. Its purpose is to allow you to define a class or struct that works with some arbitrary type, `T`, where `T` is fixed the moment you instantiate the class. This allows us to make a `synchronized_queue` of `int`s, `char`s, `long`s, `string`s, or any other struct! (This is how vectors and maps support arbitrary types.) We call a class or struct like this a generic, because it is not bound to a specific type.
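For instance, a tiny (hypothetical) generic struct:
#include <string>

template <typename T>
struct box {
  T contents;  // T is fixed when the struct is instantiated
};

box<int> b1{42};            // here, T is int
box<std::string> b2{"hi"};  // here, T is std::string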
Templates are just the way C++ does generics, and they can get very… involved. Don’t worry about it! You only need to know what we have already described.
More On C++ Templates
How does the server use the synchronized queue?
Take a look at `server/server.cpp`. In `start()`, we run this line:
this->client_listener = std::thread(&KvServer::accept_clients_loop, this);
which spawns a thread to run `accept_clients_loop`. While the server isn’t stopped, `accept_clients_loop` accepts incoming clients and adds them to `this->conn_queue`, which is a synchronized queue.
Going back to `start()`, the server then spawns “worker threads,” which each run `work_loop`:
this->workers.resize(this->n_workers);
for (auto &&worker : this->workers) {
worker = std::thread(&KvServer::work_loop, this);
}
`work_loop` pops a client off `this->conn_queue`, retrieves all the client’s requests, then passes each to `process_request` and sends the corresponding response back to the client. `process_request` calls the method in your KVStore corresponding to the request type (e.g., for a `GetRequest get_req`, it calls `this->store.Get(get_req, &get_res)`).
In summary, the server uses the synchronized queue to hold clients. It pushes new clients onto the queue in `accept_clients_loop`, and pops clients off the queue in `work_loop` to process their requests. To process the requests, the worker threads call `process_request`, which in turn calls the KVStore methods you implemented in Step 1. With a synchronized queue and multiple worker threads, there can be multiple requests made to your KVStore concurrently.
Queue Layout
This is the declaration of our `synchronized_queue` (in `server/synchronized_queue.hpp`):
template <typename T>
class synchronized_queue {
public:
size_t size();
bool pop(T *elt);
void push(T elt);
std::vector<T> flush();
void stop();
private:
std::queue<T> q;
std::mutex mtx;
std::condition_variable cv;
std::atomic_bool is_stopped{false};
};
Tip
Wherever you see `T`, mentally replace it with `int`, `char`, or your favorite struct. Once you instantiate the queue (say, as a `synchronized_queue<int>`), the compiler actually turns those `T`s into `int`s! Doing this mental replacement might make the code easier to understand.
Here’s a link on how this actually occurs, if you want to go down the rabbit hole.
As you can see, our `synchronized_queue` is nothing more than an underlying queue. Your task is to implement these methods and use the given synchronization primitives to make them thread-safe.
Implementation
Task: In `server/synchronized_queue.cpp`, implement the function bodies for `pop`, `push`, `flush`, `stop`, and `size`, and use the provided synchronization primitive(s) to support synchronized access. Be sure to review the function headers in `server/synchronized_queue.hpp`.
You will find the `std::queue` documentation very helpful!
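To give a flavor of what these methods involve, here is a hedged sketch of one plausible `push` (not necessarily the only correct shape; `pop` additionally needs to wait on the condition variable and handle `stop`):
template <typename T>
void synchronized_queue<T>::push(T elt) {
  {
    std::unique_lock lock(this->mtx);  // guard the underlying std::queue
    this->q.push(elt);
  }
  this->cv.notify_one();  // wake a thread blocked waiting for an element
}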
When you are finished, you should pass all `queue_tests`. You can check that your implementation is correct by running in the `build/` directory:
$ make check A3
Step 4: Single-Threaded Bucket-Based KV Store
Overview
Operations on your simple key-value store are thread-safe and concurrent, but there is no actual parallelism. Since two threads cannot modify the store at the same time, it is thread-safe. However, the entire map gets locked every time a thread wants to perform an operation on the store, which means that there are no parallel (i.e. simultaneous) operations.
This single mutex protection strategy is an example of coarse-grained locking: one lock protects access to a large section of code or an entire data structure (in this case, a key-value store). Coarse-grained locks are easier to implement and reason about, but also lead to significant lock contention (when a thread tries to acquire a lock already owned by another thread) and impose a limit on system concurrency. On the other hand, fine-grained locking attempts to reduce lock contention by splitting the data structure or code into smaller, potentially disjoint pieces that each have their own lock. Multiple threads can then operate on separate pieces in parallel. Fine-grained locks benefit from the idea that threads working on disjoint pieces do not need to exclude each other from the entire object in order to maintain thread safety; for instance, in a binary search tree, operations on the left subtree do not necessarily affect operations on the right subtree.
Fine-grained locking considerations
Not all data structures are particularly amenable to fine-grained locking. For example, finding a smaller lock granularity in Step 3’s `synchronized_queue` that maintains thread-safety and correctness while improving performance is significantly more difficult (but still possible!) than in, say, a binary search tree.
Finer-grained locking schemes also introduce system complexity and additional overhead from maintaining more locks; as a result, fine-grained locking is not always more performant, and premature optimization through fine-grained locks can introduce nasty concurrency bugs. As an example, an extreme implementation of fine-grained locking within a key-value store might associate a lock with each key-value pair. It might be informative to consider why associating a lock with each key-value pair is both inefficient and can introduce subtle concurrency bugs, depending on the underlying store implementation!
When considering lock granularity (both in this assignment and in future programs), it’s important to find a suitable granularity that balances performance and system complexity.
Example of Coarse vs. Fine-grained locking
An example of a data structure for which you could implement either coarse- or fine-grained locking is an array.
Coarse-grained locking would imply locking the entire array when any of it needs to be modified. This means that even if only a small part of the array will be modified, no other threads can access any part of the array.
This would look something like having an array-specific mutex.
my_array_mtx.lock();    // writing one element still locks the whole array
my_array[0] = 123;
my_array_mtx.unlock();

my_array_mtx.lock();    // the entire loop holds the single lock
for (int i = 0; i < sizeof(my_array) / sizeof(int); i++) {
  my_array[i] = i + 3;
}
my_array_mtx.unlock();
Alternatively, you could have fine-grained locking. This would involve ensuring that, instead of the entire array being unavailable, only the specific components changed will be unavailable to other threads. Thus, every “slot” in the array would need a corresponding mutex. While this allows for greater flexibility for threads, this structure may be a lot heavier on the system (because one would need to create more mutexes).
This would look something like:
my_array_mtx_array[0].lock();    // only slot 0 is unavailable to other threads
my_array[0] = 123;
my_array_mtx_array[0].unlock();

for (int i = 0; i < sizeof(my_array) / sizeof(int); i++) {
  my_array_mtx_array[i].lock();  // one mutex per slot
  my_array[i] = i + 3;
  my_array_mtx_array[i].unlock();
}
Notice that the scalability in the above example isn’t great either; this is because we chose too fine a grain 🌾 Thus, it’s always important to find a balance, as mentioned above.
Another very cool example of fine and coarse-grain locking is locking a linked list! Feel free to examine this link for more information.
Implementation
If the single-mutex locking strategy from Step 2 is too coarse-grained, and one lock for each key-value pair is too fine-grained, how can we design an efficient and scalable KVStore? We need to revisit our internal store design to support a new locking scheme: one where there’s more than one lock, but fewer locks than the number of keys.
For this part, you will implement the key-value store API as a bucket-based hash table.
Take a look at how we define a `ConcurrentKvStore` (in `kvstore/concurrent_kvstore.hpp`):
class ConcurrentKvStore : public KvStore {
public:
ConcurrentKvStore() = default;
~ConcurrentKvStore() = default;
bool Get(const GetRequest *req, GetResponse *res) override;
bool Put(const PutRequest *req, PutResponse *res) override;
bool Append(const AppendRequest *req, AppendResponse *res) override;
bool Delete(const DeleteRequest *req, DeleteResponse *res) override;
bool MultiGet(const MultiGetRequest *req, MultiGetResponse *res) override;
bool MultiPut(const MultiPutRequest *req, MultiPutResponse *res) override;
std::vector<std::string> AllKeys() override;
private:
DbMap store;
};
Notice that your concurrent key-value store will support the same operations as the `SimpleKvStore`, so refer to Step 1 for descriptions of each method. Your internal store is defined as type `DbMap`. We have already implemented the `DbMap` struct for you. Your responsibility is to write the methods in `kvstore/concurrent_kvstore.cpp`.
For this part, you may assume that only one thread will operate on your bucket-based KVStore at a time. We will add support for synchronized access in Step 5.
You are required to keep your internal store as type `DbMap`, and to maintain the internal structure of the `DbMap` (i.e., no changing this line):
std::array<std::list<DbItem>, BUCKET_COUNT> buckets;
However, you are not required to use `getIfExists`, `insertItem`, or `removeItem` if you don’t want to. We provide them because we think they will be helpful helper methods for you to work with your internal store, but if you’d rather write different ones (or do all of your work in `kvstore/concurrent_kvstore.cpp`), feel free.
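For orientation, mapping a key to its bucket might look roughly like this (a sketch only: `std::hash` stands in here for whatever string-hashing helper `common/utils.hpp` actually provides):
size_t bucket = std::hash<std::string>{}(key) % BUCKET_COUNT;
std::list<DbItem> &items = this->store.buckets[bucket];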
Task: Use the given `DbMap` implementation to fill in the function bodies for `Get`, `Put`, `Append`, `Delete`, `MultiGet`, `MultiPut`, and `AllKeys` in `kvstore/concurrent_kvstore.cpp`.
When you are finished, you should pass all `kvstore_sequential_tests`. You can check that your implementation is correct by running in the `build/` directory:
$ make check A4
If you’re failing a specific test, you can run it individually:
$ ./test_name
For Steps 4 and 5, you can either specify `concurrent` as an argument or just leave it blank (as in the example above).
Step 5: Concurrent Bucket-Based KV Store
Overview
In this part, you’ll modify your bucket-based key-value store implementation to support both safe and concurrent access.
Readers-writer locks
Recall that a data race can occur when a piece of shared data is not correctly protected. However, the key issue behind data races is mutability. On concurrent access and modification, instruction interleaving may occur and memory can become inconsistent; this is not an issue when different threads are only reading a piece of shared data.
As a result, one potential optimization is a readers-writer lock, or RWLock (`std::shared_mutex` in C++). RWLocks allow for concurrent access for read-only operations, but exclusive access for write operations. Before entering a critical section, threads may attempt to acquire either a shared (or read) lock or an exclusive (or write) lock; multiple threads may hold a shared lock at once, but like regular mutexes, only one thread may hold an exclusive lock. To ensure safe concurrent or exclusive access, reader threads cannot acquire a shared lock until there is no writer thread holding an exclusive lock.
As with mutexes, it is up to the programmer to properly protect shared data and lock critical sections within the program. RWLocks present an additional challenge over mutexes: programmers must also ensure that threads acquire the correct type of lock before entering a critical section or potentially modifying data. For example, acquiring a shared lock and then modifying shared data is undefined behavior. As you implement key-value store functions, keep in mind which functions are amenable to concurrent access, and which require exclusive access.
Example of RWLocks
In addition to the `lock` and `unlock` functions, the `std::shared_mutex` class provides two additional methods, `lock_shared` and `unlock_shared`:
std::shared_mutex shared_mtx;

shared_mtx.lock_shared();    // acquire a shared (read) lock
shared_mtx.unlock_shared();

shared_mtx.lock();           // acquire the exclusive (write) lock
shared_mtx.unlock();
In addition, like regular mutexes, C++ provides a wrapper class, `std::shared_lock`, that automatically unlocks the shared mutex once it goes out of scope. This has the same API as `std::unique_lock`:
void foo() {
std::shared_lock guard(shared_mtx);
}
Optional: Why not just use RWLocks all the time?
Especially in applications that experience read-heavy workloads, RWLocks permit more concurrency over regular mutexes. However, RWLocks should not be used as a drop-in replacement for mutexes: RWLocks come with additional overhead over simple mutexes, as they must contain additional state and logic to handle readers and writers. RWLocks also must avoid reader or writer starvation, where one operation is disproportionately favored. Wikipedia’s article provides an introduction to potential RWLock implementations and problems.
Task: Add synchronization primitives to `DbMap` to synchronize access to your buckets. We’d recommend using RWLocks!
You must use more than one mutex to guard your store.
Hints
Think about a more fine-grained approach that makes concurrent access to your store more efficient! See Step 4 for an explanation of coarse-grained vs. fine-grained locking.
Task: Modify the methods in `kvstore/concurrent_kvstore.cpp` to enforce synchronized access, using the primitives you defined in the previous task.
When you are finished, you should pass all `kvstore_parallel_tests`. You can check that your implementation is correct by running in the `build/` directory:
$ make check A5
Hints
You want to make sure that the `MultiGet`, `MultiPut`, and `AllKeys` operations are atomic. When implementing these methods, be mindful of your locking order. Specifically, you will want to enforce a hierarchical locking scheme and acquire all relevant locks before accessing or modifying the store. Consider why this is necessary, both to ensure atomic operations and to prevent deadlocks.
Much like atomic instructions, an atomic operation should be executed as a single, indivisible unit, without interference or interleaving from other threads. That is, once started, an atomic operation should not be impacted by any other operations that may be executing simultaneously.
Atomic operations are often used in concurrent programming to ensure the integrity of shared data. In the context of a key-value store, other operations should not modify the relevant buckets while the `MultiGet`, `MultiPut`, and `AllKeys` operations are running.
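As an illustration of a hierarchical scheme (a sketch under the assumption of one `std::shared_mutex` per bucket, here called `mutexes`; your member names and request fields may differ), a read-only multi-key operation could lock every relevant bucket in ascending index order, do all of its reads, and only then unlock:
// (assumes <algorithm>, <functional>, and <vector>)
std::vector<size_t> idxs;
for (const auto &key : req->keys) {
  idxs.push_back(std::hash<std::string>{}(key) % BUCKET_COUNT);
}
std::sort(idxs.begin(), idxs.end());                            // fixed global order
idxs.erase(std::unique(idxs.begin(), idxs.end()), idxs.end());  // lock each bucket once
for (size_t i : idxs) mutexes[i].lock_shared();
// ... read all requested keys ...
for (size_t i : idxs) mutexes[i].unlock_shared();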
At this point, your implementation should pass all of the Part A tests! You can test this by running in the `build/` directory:
$ ./run_tests concurrent_store
or
$ make check concurrent_store
Debugging with Sanitizers and GDB
Debugging concurrent code can be trickier than debugging single-threaded code. As threads interact with the shared data structures, it can be difficult to track down data races and deadlocks. Luckily, GDB and thread sanitizers can be really helpful.
Thread Sanitizers
Google’s ThreadSanitizer for C and C++ can help you detect data races, deadlocks, and thread leaks. To run the code with thread sanitizers, compile it with:
make TSAN=1
When compiling tests, you will need to specify `TSAN=1` too:
make check TSAN=1
or
./test.sh TSAN=1 <A1|A2|A3|A4|A5>
Example
Consider the following example:
#include <chrono>
#include <cstdio>
#include <thread>

int count = 0;
int child();  // defined below

int main() {
  std::thread p1(child);  // two threads race on the global `count`
  std::thread p2(child);
  p1.join();
  p2.join();
  printf("count %d\n", count);
}

int child() {
  while (count < 100) {   // unsynchronized read of count
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    count += 1;           // unsynchronized write to count
  }
  return 0;
}
When you run this code using a thread sanitizer, it will produce the following output:

Note: Open the image in a new tab for a larger image.
- The red message indicates what error is occurring (in this case, a data race).
- The blue messages indicate conflicting memory accesses, with a stack trace. In our case, the conflict occurs on line 25 and line 27. The report includes which thread is performing each access, and if the thread holds a mutex, it will also indicate that.
- Below these messages, the sanitizer might also indicate:
  - specific mutexes, when they were acquired, which threads acquired them, and sometimes when they were created
  - threads, and when they were created
- Refer to the ThreadSanitizerReportFormat page for more information on the report format.
GDB
Sometimes if your program hangs due to a deadlock, the thread sanitizer may not be able to return any reports. In these cases, gdb has some more helpful tools. Here are some useful gdb commands:
`info threads`: prints out existing threads, the Light Weight Processor (LWP) they’re running on, and the line of code they’re executing. The current thread will have an asterisk next to it.
3 Thread 0x7ffff3bff700 (LWP 9167) "example" __lll_lock_wait ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
4 Thread 0x7ffff33fe700 (LWP 9003) "example" __lll_lock_wait ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
`thread <thread-id>`: switches to thread `<thread-id>`.
(gdb) thread 4
[Switching to thread 4 (Thread 0x7ffff33fe700 (LWP 9003))]
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
`p <mutex>`: prints the fields of a mutex. Helpful for determining the owner of the lock, identified by its Light Weight Processor.
`Ctrl-Z`: stops the execution of the program. If your program is hanging, likely due to a deadlock, this allows you to examine the current state of the program. For example, if one of your threads is blocked on a mutex, you can switch to that thread, see which thread owns the mutex, and then switch to that thread.
`bt`: prints out a stack trace. Oftentimes, threads are blocked on a system call (locking a mutex, waiting on a condition variable, or reading/writing to a file descriptor). Backtrace is useful for discovering where in the codebase the blocking call was made.
`f <frame-number>`: switches to the provided frame number. Allows you to switch to frames within files you’re familiar with and determine which function called the blocking system call.
Handing In and Grading
Handin Instructions
As before, you will hand in your code using Git. In the `kvstore/` subdirectory of your project repository, you MUST fill in the text file called `README.md`.
By 6:00pm on Wednesday, May 3rd, you must have a commit with all steps completed.
Remind me again what the README should contain?
The README.md file will include the following:
- Any design decisions you made and comments for graders, under “Design Overview”. If there’s nothing interesting to say, just list “None”.
- Any collaborators and citations for help that you received, under “Collaborators”. CS 300 encourages collaboration, but we expect you to acknowledge help you received, as is standard academic practice.
- Notes on approximately how long it took you to complete the project. We won’t use this for grading, but we collect the data to calibrate for future years.
Grading Breakdown
- 20% (20 points) for passing the sequential simple KVStore tests (A1, 2 pts per test).
- 10% (10 points) for passing the parallel simple KVStore tests (A2, 2 pts per test).
- 16% (16 points) for passing the queue tests (A3, 2 pts per test).
- 20% (20 points) for passing the sequential advanced KVStore tests (A4, 2 pts per test).
- 34% (34 points) for passing the concurrent advanced KVStore tests (A5, 6 pts per test and 4 pts for passing them all).
Now head to the grading server, make sure that you have the “KVStore” page configured correctly with your project repository, and check that your KVStore runs on the grading server as expected.
Congratulations, you’ve completed the first part of the fifth CS 300 project! 
Acknowledgements: This project was developed for CS 300 by Richard Tang, Carolyn Zech, and Malte Schwarzkopf.
« Back to the main CS 0300 website
Project 5A: Concurrent Store 🗂🗃🗄
Due: Wednesday, May 3 at 6:00 PM EST
Introduction
A key-value store is a storage system in which data (a value) is addressed by a key; common examples are Python’s dictionaries or Java/C++ hash maps. Unlike file systems or other structures where objects are arranged in a hierarchical directory structure, keys map directly to their values. Key-value stores are popular for large-scale web applications because they are particularly amenable to performance and scalability optimizations, such as concurrent access/modification and sharding. Such optimizations are paramount to the success of Internet-scale applications like Facebook, AirBnB, and Twitter, which must handle thousands of connection requests and petabytes of data every day.
In this assignment, you will be exploring ways to implement and optimize a concurrent key-value store. Connections will be able to get, put, append, and delete key-value pairs, and dynamically add or remove servers to distribute load. This assignment will consist of two parts:
Optional: Interested in real-world use cases of key-value stores and sharding?
Learning Objectives
This assignment will help you:
Assignment Installation
Ensure that your project repository has a handout remote. Type:
If this reports an error, run:
Then run:
This will merge our Project 5 stencil code with your repository.
Important Note: This project requires a newer version of C++ (
C++20
) that the container’s compilers don’t completely support; you will thus need to install a newer C++ toolchain. Please run the following commands before proceeding:Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the
kvstore
directory in the working copy of your projects repository.Infrastructure Help
Stencil and Initial State
The stencil includes implementations for the client and server, as well as some utility functions and classes you may find useful. We are providing you with a fully implemented client and server, so you are responsible for implementing the simple key-value store, the synchronized queue class, and the concurrent key-value store.
You may find it helpful to frequently refer back to this table in both parts of the assignment; the support code is substantial, but we will indicate both here and within the stencil which utilities may be useful in your implementation.
build/
tests/
andtest_utils/
client/
andrepl/
KvServer
andShardmaster
s. You should not modify these.cmd/
common/
Shard
struct, as well as some useful helpers for managing or operating on them. Additionally, contains miscellaneous utility functions, such as color printing and string hashing. You will needutils.hpp
for the concurrent key-value store, andshard.hpp
for Distributed Store. (You might also findconfig.{hpp, cpp}
helpful for Distributed Store.)kvstore/
KvStore
interface, as well as a simple synchronized key-value store implementation. Implement your simple and concurrent key-value store here.net/
server/
KvServer
class, and supporting data structures. Implement your synchronized queue here. You will also modify your server to make it sharding-aware in Distributed Store.shardmaster/
Shardmaster
class, and support code to process persistent connections. Implement your shardmaster in Distributed Store here.Compile and Run
To compile,
cd
into thebuild/
directory and runmake
. This will create two executables:simple_client
andserver
.The server will not function properly until after you implement part 3!
To run the server, specify the port, and optionally the number of workers:
To run the simple client, specify the server address:
Example Usage
To connect multiple clients to a server, start by opening multiple terminals (or use
tmux
).Terminal 1: runs
server
on port 5000 with the default number of workers:Your server will then be listening on
<hostname>:5000
; the server’s full address will be displayed in the terminal.Terminal 2: runs
simple_client
to connect to the server, using the address displayed above (in this example, the hostname is65a4af2c28ae
):Using the Client
Once connected to the server, you can use one or more clients to run standard key-value store operations, like
get
,put
,append
, anddelete
. You can also use two additional requests,multiget
andmultiput
(for multiple key-value pairs). For specific usage instructions, typehelp
in the command prompt.To exit the client (and any other executable), press
Ctrl-D
(orCmd-D
); this sendsEnd-Of-File
(EOF
) to the program, causing it to exit cleanly. If your program hangs, useCtrl-C
(orCmd-C
) to force-quit the program.Testing
We strongly recommend you test your code often while implementing, by compiling and running a server with a few clients (as shown above).
To run tests, you must be in the
build/
directory. Then, run eithermake check concurrent_store
or./run_tests concurrent_store
from inside thebuild/
directory, which will run all of the Concurrent Store tests.The tests are in the
tests/
directory, with each subdirectory corresponding to a part of the assignment. After implementing each step, the testing harness will run all tests related to that step. Specifically, the steps will run all tests in the following directories:kvstore_sequential_tests
kvstore_parallel_tests
queue_tests
kvstore_sequential_tests
kvstore_parallel_tests
To run an individual test, run that specific test’s executable. For instance, for
test_put_get_delete
, run in thebuild/
directory:The correctness tests can also optionally specify which type of KVStore to run (
SimpleKvStore
orConcurrentKvStore
) by passing in eithersimple
orconcurrent
as an argument:The tests default to the
ConcurrentKvStore
.If you want to write your own tests, add new files to the subdirectories in
tests/
. Be sure to take a look at the functions intest_utils/test_utils.hpp
— they will make writing your own tests much easier.Assignment
Background
Concurrency in C++
Concurrency is useful because it allows processes to do multiple things seemingly at the same time: for example, a process that is waiting for a slow operation (such as fetching data over the internet) can use the processor for something else until it completes. When a computer has multiple processors, concurrency also allows for parallelism, i.e., the computer actually does multiple things at the same time.
However, concurrency also introduces a new category of undefined behavior: race conditions. A primary goal for any concurrent program is to avoid undefined behavior due to race conditions between threads. A program that is free of such race conditions is known as thread-safe, which means any data shared between threads is protected by appropriate synchronization, in case threads try to access or modify it simultaneously. Some of the most common synchronization primitives are mutexes and condition variables, which you will be using in this assignment. Additionally, you will see atomic variables.
Optional: Refresher on synchronization primitives
Mutexes
A mutex provides mutual exclusion between threads. Specifically, the mutex ensures that no more than one thread can be active inside a critical section at a time. Before entering the critical section (which, for example, accesses a piece of shared data), threads attempt to lock the “associated” mutex, and unlock it once they are done. This guarantees that only the one thread that successfully locked the mutex can modify the data. Since a thread will hang until it can lock the mutex, deadlocks may occur if the mutex is never unlocked.
Condition Variables
Condition variables allow a thread to wait for a condition to be met. Changes in that condition will be caused by another thread, which is also responsible for notifying the threads that wait on the condition variable. Example conditions are “Is the queue still empty?” or “Has the number of clients exceeded a threshold?”. Instead of a thread continuously locking, checking, and unlocking over and over again, it can wait on a condition variable until another thread signals that it’s worth trying now.
For a thread to wait on a condition variable with the C++
std::condition_variable
, the thread needs to first lock the associated mutex, and then pass that mutex to the condition variable. The thread will then be suspended and the mutex unlocked automatically, until the thread is woken up. When it is woken up because the wait condition is now met, the thread is guaranteed to have the mutex locked again.Think about why this is necessary! If a thread is waiting for a condition, but the mutex stays locked while it is waiting, then nothing will be able to change the shared variable (e.g. a buffer or queue) and the condition will never be met.
Atomic Variables
In C++, we can use the
std::atomic
library to declare atomic variables. The compiler translates operations on these variables into atomic instructions, which always execute without racing with any other processor’s accesses.A specific type of mutexes that might be helpful to you are “unique locks”. These are a specific type of lock that automatically unlock once the lock is out of scope.
Example of Unique Locks
The
std::unique_lock
automatically unlocks a mutex once it goes out of scope.Step 1: Single-Threaded Simple KV Store
In this section, you’ll implement a single-threaded simple key-value store. By “single-threaded,” we mean that only one thread can safely operate on the KVStore at a time. This means that for Step 1, you should not worry about concurrency. You may safely assume that only one thread will ever be operating on your KVStore at a time.
Take a look at how we define a
SimpleKvStore
(inkvstore/simple_kvstore.hpp
):Task: Define your internal key-value store. What data structure might be useful here? You may find this reference page useful.
Now it’s time to implement each of these methods. First, look at how we define the
Request
andResponse
structs innet/server_commands.hpp
. TheRequest
object contains the information being requested. TheResponse
object is where your methods should store the response to the request (if a response is needed). For theAllKeys
operation, return a vector of the keys. For the other operations, returntrue
if the operation is successful; otherwise, returnfalse
.Give me an example!
A call to
Get(const GetRequest *req, GetResponse *res)
takes in a pointer to aGetRequest
, which has akey
field. Your method should get the value corresponding tokey
from your internal key-value store and store it in thevalue
field of theGetResponse
thatres
points to.If a
Response
struct is empty, that means you don’t need to provide an explicit confirmation that you fulfilled the request. For example, for a call toPut(const PutRequest *req, PutResponse *res)
, you should insert thekey
andvalue
from the PutRequest thatreq
points to into your internal key-value store. SincePutResponse
is an empty struct, you don’t need to do anything with it.Your key-value store will support the following operations:
Get
<key>
<key>
from the key-value store. Fails if it does not exist.Put
<key>
,<value>
<key>
to<value>
; replaces existing values.Append
<key>
,<value>
<value>
to the current value associated with<key>
. Creates a new key-value pair if it does not already exist.Delete
<key>
<key>
from the key-value store, returning the last saved value. Fails if it does not exist.MultiGet
<key_1> ... <key_n>
<key_1> ... <key_n>
from the key-value store. Fails if any key does not exist.MultiPut
<key_1> ... <key_n>, <value_1> ... <value_n>
<key_1> ... <key_n>
to<value_1> ... <value_n>
, respectively.AllKeys
Task: Implement the methods in
kvstore/simple_kvstore.cpp
.When you are finished, you should pass all
kvstore_sequential_tests
. You can check that your implementation is correct by running in thebuild/
directory:If you’re failing an individual test, you can run it directly, via
Tests default to using the
ConcurrentKvStore
(which you’ll implement in Steps 4 and 5), so you’ll need to explicitly tell each test to use theSimpleKvStore
by passing insimple
as an argument.Step 2: Concurrent Simple KV Store
We would like our server to be able to handle multiple requests in parallel. However, our simple key-value store can’t safely handle concurrent requests yet! In this step, you’ll now make your simple key-value store thread-safe.
Implementation
Task: Add synchronization primitive(s) to
SimpleKvStore
inkvstore/simple_kvstore.hpp
to make it thread-safe. You should use at most one mutex in this part. (We will explore more efficient locking schemes in later steps.)Task: Modify the methods in
kvstore/simple_kvstore.cpp
to enforce synchronized access, using the primitive(s) you defined in the previous task.When you are finished, you should pass all
kvstore_parallel_tests
. You can check that your implementation is correct by running in thebuild/
directory:Hints
SimpleKvStore
API functions here, and throughout the rest of the project.MultiGet
,MultiPut
, andAllKeys
operations are atomic. Much like atomic instructions, an atomic operation should be executed as a single, indivisible unit, without interference or interleaving from other threads. That is, once started, an atomic operation should not be impacted by any other operations that may be executing simultaneously.Atomic operations are often used in concurrent programming to ensure the integrity of shared data. In this context of your simple key-value store, other operations should not modify the store while the
MultiGet
,MultiPut
, andAllKeys
operations are running.Step 3: Synchronized Queue
Now, your simple key-value store can safely handle concurrent requests. However, at the moment the server can’t handle concurrent requests yet! In this part, you’ll implement a synchronized queue, which the server will use to concurrently handle multiple clients and send their requests to your simple key-value store.
Your server will utilize a generic synchronized queue. The synchronized queue will allow multiple threads to add and remove elements safely, and wait for new elements to arrive. It is also handy to distribute requests from different clients across multiple threads!
Context
Synchronized?
A significant part of concurrency lies in making sure that multiple threads can safely access the same data structure concurrently. One common approach is to create a mutex for the data structure, lock it whenever it is accessed, and unlock it afterward. While effective, having to lock and unlock each time can clutter your code, and forgetting to do so will end up causing problems.
A synchronized data structure (or thread-safe data structure) is a data structure built specifically to be accessed concurrently by multiple threads, and does not need to be protected by an external mutex.
Generic?
You may have noticed the use of
template <typename T>
before class or function declarations for the queue. Its purpose is to allow you to define a class or struct that works with some arbitrary type,T
, whereT
is defined the moment when you instantiate the class. This allows us to make asynchronized_queue
ofint
s,char
s,long
s,string
s, or any other struct! (This is how vectors and maps support arbitrary types.) We call a class or struct like this a generic, because it is not bound to a specific type.Templates is just the way C++ does generics, and they can get very… involved. Don’t worry about it! You only need to know what we have already described.
More On C++ Templates
How does the server use the synchronized queue?
Take a look at
server/server.cpp
. Instart()
, we run this line:which spawns a thread to run
accept_clients_loop
. While the server isn’t stopped,accept_clients_loop
accepts incoming clients and adds them tothis->conn_queue
, which is a synchronized queue.Going back to
start()
, the server then spawns “worker threads,” which each runwork_loop
:work_loop
pops a client offthis->conn_queue
, retrieves all the client’s requests, then passes each toprocess_request
and sends the corresponding response back to the client.process_request
calls the method in your KVStore corresponding to the request type (e.g., for aGetRequest get_req
, it callsthis->store.Get(get_req, &get_res)
.In summary, the server uses the synchronized queue to hold clients. It pushes new clients onto the queue in
accept_clients_loop
, and pops clients off the queue inwork_loop
to process their requests. To process the requests, the worker threads callprocess_request
, which in turn calls the KVStore methods you implemented in step 1. With a synchronized queue and multiple worker threads, there can be multiple requests made to your KVStore concurrently.Queue Layout
This is the declaration of our
synchronized_queue
(inserver/synchronized_queue.hpp
):Tip
Wherever you see
T
, mentally replace it withint
,char
, or your favorite struct. Once you instantiate the queue (say, as asynchronized_queue<int>
), the compiler actually turns thoseT
s intoint
s! Doing this mental replacement might make the code easier to understand.Here’s a link on how this actually occurs, if you want to go down the rabbithole.
As you can see, our
synchronized_queue
is nothing more than an underlying queue. Your task is to implement these methods and use the given synchronization primitives to make them thread-safe.Implementation
Task: In
server/synchronized_queue.cpp
, implement the function bodies forpop
,push
,flush
,stop
, andsize
, and use the provided synchronization primitive(s) to support synchronized access. Be sure to review the function headers inserver/synchronized_queue.hpp
.You will find the
std::queue
documentation very helpful!When you are finished, you should pass all
queue_tests
. You can check that your implementation is correct by running in thebuild/
directory:Step 4: Single-Threaded Bucket-Based KV Store
Overview
Operations on your simple key-value store are thread-safe and concurrent, but there is no actual parallelism. Since two threads cannot modify the store at the same time, it is thread-safe. However, the entire map gets locked every time a thread wants to perform an operation on the store, which means that there are no parallel (i.e. simultaneous) operations.
This single mutex protection strategy is an example of coarse-grained locking: one lock protects access to a large section of code or an entire data structure (in this case, a key-value store). Coarse-grained locks are easier to implement and reason about, but also lead to significant lock contention (when a thread tries to acquire a lock already owned by another thread) and impose a limit on system concurrency. On the other hand, fine-grained locking attempts to reduce lock contention by splitting the data structure or code into smaller, potentially disjoint pieces that each have their own lock. Multiple threads can then operate on separate pieces in parallel. Fine-grained locks benefit from the idea that threads working on disjoint pieces do not need to exclude each other from the entire object in order to maintain thread safety; for instance, in a binary search tree, operations on the left subtree do not necessarily affect operations on the right subtree.
Fine-grained locking considerations
Not all data structures are particularly amenable to fine-grained locking. For example, finding a smaller lock granularity in Step 3's `synchronized_queue` that maintains thread-safety and correctness while improving performance is significantly more difficult (but still possible!) than in, say, a binary search tree.

Finer-grained locking schemes also introduce system complexity and additional overhead from maintaining more locks; as a result, fine-grained locking is not always more performant, and premature optimization through fine-grained locks can introduce nasty concurrency bugs. As an example, an extreme implementation of fine-grained locking within a key-value store might associate a lock with each key-value pair. It might be informative to consider why associating a lock with each key-value pair is both inefficient and can introduce subtle concurrency bugs, depending on the underlying store implementation!
When considering lock granularity (both in this assignment and in future programs), it’s important to find a suitable granularity that balances performance and system complexity.
Example of Coarse vs. Fine-grained locking
An example of a data structure for which you could implement either coarse- or fine-grained locking is an array.
Coarse-grained locking would imply locking the entire array when any of it needs to be modified. This means that even if only a small part of the array will be modified, no other threads can access any part of the array.
This would look something like having an array-specific mutex.
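For instance (the `CoarseArray` type here is hypothetical):

```cpp
#include <cstddef>
#include <mutex>

// Coarse-grained sketch: a single mutex guards the whole array, so
// writes to different slots still exclude each other.
struct CoarseArray {
  std::mutex mtx;
  int data[100];

  void set(size_t i, int value) {
    std::lock_guard<std::mutex> guard(mtx);  // locks the entire array
    data[i] = value;
  }
};
```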
Alternatively, you could have fine-grained locking. This would involve ensuring that, instead of the entire array being unavailable, only the specific components changed will be unavailable to other threads. Thus, every “slot” in the array would need a corresponding mutex. While this allows for greater flexibility for threads, this structure may be a lot heavier on the system (because one would need to create more mutexes).
This would look something like:
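Here is a matching hypothetical sketch with one mutex per slot:

```cpp
#include <array>
#include <cstddef>
#include <mutex>

// Fine-grained sketch: locks[i] guards data[i], so threads writing
// to different slots can proceed in parallel, at the cost of keeping
// 100 mutexes around.
struct FineArray {
  std::array<std::mutex, 100> locks;
  std::array<int, 100> data;

  void set(size_t i, int value) {
    std::lock_guard<std::mutex> guard(locks[i]);  // locks one slot only
    data[i] = value;
  }
};
```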
Notice that in the above example, the scalability isn't great either – this is because we chose too fine a grain 🌾 Thus, it's always important to find a balance, as mentioned above.

Another very cool example of fine- and coarse-grained locking is locking a linked list! Feel free to examine this link for more information.
Implementation
If the single-mutex locking strategy from Step 2 is too coarse-grained, and one lock for each key-value pair is too fine-grained, how can we design an efficient and scalable KVStore? We need to revisit our internal store design to support a new locking scheme: one where there's more than one lock, but fewer locks than the number of keys.
For this part, you will implement the key-value store API as a bucket-based hash table.
Take a look at how we define a `ConcurrentKvStore` (in `kvstore/concurrent_kvstore.hpp`):
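In spirit, a bucket-based store looks something like the following sketch. The names, bucket count, and layout here are illustrative assumptions; the real definition is the one in `kvstore/concurrent_kvstore.hpp`:

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <list>
#include <string>

// Illustrative sketch of a bucket-based hash table. Each key hashes
// to one of a fixed number of buckets; each bucket is a list of
// key-value entries that is scanned linearly.
constexpr size_t kNumBuckets = 1024;  // hypothetical bucket count

struct DbItem {
  std::string key;
  std::string value;
};

struct DbMapSketch {
  std::array<std::list<DbItem>, kNumBuckets> buckets;

  static size_t bucket_for(const std::string& key) {
    return std::hash<std::string>{}(key) % kNumBuckets;
  }
};
```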
Notice that your concurrent key-value store will support the same operations as the `SimpleKvStore`, so refer to Step 1 for descriptions of each method. Your internal store is defined as type `DbMap`. We have already implemented the `DbMap` struct for you. Your responsibility is to write the methods in `kvstore/concurrent_kvstore.cpp`.

For this part, you may assume that only one thread will operate on your bucket-based KVStore at a time. We will add support for synchronized access in Step 5.
You are required to keep your internal store as type `DbMap`, and to maintain the internal structure of the `DbMap` (i.e., no changing this line):

However, you are not required to use `getIfExists`, `insertItem`, or `removeItem` if you don't want to. We provide them because we think they will be helpful helper methods for you to work with your internal store, but if you'd rather write different ones (or do all of your work in `kvstore/concurrent_kvstore.cpp`), feel free.
Task: Use the given `DbMap` implementation to fill in the function bodies for `Get`, `Put`, `Append`, `Delete`, `MultiGet`, `MultiPut`, and `AllKeys` in `kvstore/concurrent_kvstore.cpp`.
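For a flavor of what these methods involve, a single-threaded `Get`-style lookup over the illustrative sketch from above could look like this (the stencil's `getIfExists` helper presumably plays a similar role):

```cpp
#include <optional>
#include <string>

// Sketch: find the key's bucket, then scan that bucket's list.
// (Uses the hypothetical DbMapSketch defined earlier.)
std::optional<std::string> get(const DbMapSketch& db, const std::string& key) {
  const auto& bucket = db.buckets[DbMapSketch::bucket_for(key)];
  for (const auto& item : bucket) {
    if (item.key == key) {
      return item.value;
    }
  }
  return std::nullopt;  // key not present
}
```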
When you are finished, you should pass all `kvstore_sequential_tests`. You can check that your implementation is correct by running in the `build/` directory:

If you're failing a specific test, you can run it individually:
For Steps 4 and 5, you can either specify `concurrent` as an argument or just leave it blank (as in the example above).

Step 5: Concurrent Bucket-Based KV Store
Overview
In this part, you’ll modify your bucket-based key-value store implementation to support both safe and concurrent access.
Readers-writer locks
Recall that a data race can occur when a piece of shared data is not correctly protected. The key issue behind data races is mutability: on concurrent access and modification, instruction interleaving may occur and memory can become inconsistent. This is not an issue when different threads are only reading a piece of shared data.
As a result, one potential optimization is a readers-writer lock, or RWLock (`std::shared_mutex` in C++). RWLocks allow for concurrent access for read-only operations, but exclusive access for write operations. Before entering a critical section, threads may attempt to acquire either a shared (or read) lock or an exclusive (or write) lock; multiple threads may hold a shared lock at once, but like regular mutexes, only one thread may hold an exclusive lock. To ensure safe concurrent or exclusive access, reader threads cannot acquire a shared lock until there is no writer thread holding an exclusive lock.

As with mutexes, it is up to the programmer to properly protect shared data and lock critical sections within the program. RWLocks present an additional challenge over mutexes: programmers must also ensure that threads acquire the correct type of lock before entering a critical section or potentially modifying data. For example, acquiring a shared lock and then modifying shared data is undefined behavior. As you implement key-value store functions, keep in mind which functions are amenable to concurrent access, and which require exclusive access.
Example of RWLocks
In addition to the `lock` and `unlock` functions, the `std::shared_mutex` class provides two additional methods, `lock_shared` and `unlock_shared`:
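For example, a reader takes the shared lock while a writer takes the exclusive one (`shared_value` here is a hypothetical piece of shared data):

```cpp
#include <shared_mutex>

std::shared_mutex rw;
int shared_value = 0;

int read_value() {
  rw.lock_shared();   // many readers can hold the lock at once
  int v = shared_value;
  rw.unlock_shared();
  return v;
}

void write_value(int v) {
  rw.lock();          // exclusive: blocks all readers and writers
  shared_value = v;
  rw.unlock();
}
```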
In addition, like regular mutexes, C++ provides a wrapper class, `std::shared_lock`, that automatically unlocks the shared mutex once it goes out of scope. This has the same API as `std::unique_lock`:
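For instance, using the same hypothetical `shared_value`:

```cpp
#include <shared_mutex>

std::shared_mutex rw;
int shared_value = 0;

int read_value() {
  std::shared_lock lock(rw);  // lock_shared() now; unlock_shared() at scope exit
  return shared_value;
}

void write_value(int v) {
  std::unique_lock lock(rw);  // exclusive lock; unlock() at scope exit
  shared_value = v;
}
```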
Optional: Why not just use RWLocks all the time?
Especially in applications that experience read-heavy workloads, RWLocks permit more concurrency than regular mutexes. However, RWLocks should not be used as a drop-in replacement for mutexes: they come with additional overhead, as they must contain extra state and logic to handle readers and writers. RWLocks must also avoid reader or writer starvation, where one kind of operation is disproportionately favored. Wikipedia's article provides an introduction to potential RWLock implementations and problems.
Task: Add synchronization primitives to `DbMap` to synchronize access to your buckets. We'd recommend using RWLocks!

You must use more than one mutex to guard your store.
Hints
Think about a more fine-grained approach that makes concurrent access to your store more efficient! See Step 4 for an explanation of coarse-grained vs. fine-grained locking; one possibility is sketched below.
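Continuing the illustrative `DbMapSketch` from Step 4, a readers-writer lock per bucket might look like:

```cpp
#include <array>
#include <list>
#include <shared_mutex>

// Sketch: mutexes[i] guards buckets[i]. Operations on keys that hash
// to different buckets can then run in parallel. (Illustrative only,
// reusing DbItem and kNumBuckets from the earlier sketch.)
struct LockedDbMapSketch {
  std::array<std::list<DbItem>, kNumBuckets> buckets;
  mutable std::array<std::shared_mutex, kNumBuckets> mutexes;
};
```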
Task: Modify the methods in `kvstore/concurrent_kvstore.cpp` to enforce synchronized access, using the primitives you defined in the previous task.

When you are finished, you should pass all `kvstore_parallel_tests`. You can check that your implementation is correct by running in the `build/` directory:

Hints
You want to make sure that the `MultiGet`, `MultiPut`, and `AllKeys` operations are atomic. When implementing these methods, be mindful of your locking order. Specifically, you will want to enforce a hierarchical locking scheme and acquire all relevant locks before accessing or modifying the store. Consider why this is necessary, both to ensure atomic operations and to prevent deadlocks.

Much like atomic instructions, an atomic operation should execute as a single, indivisible unit, without interference or interleaving from other threads. That is, once started, an atomic operation should not be impacted by any other operations that may be executing simultaneously.

Atomic operations are often used in concurrent programming to ensure the integrity of shared data. In the context of a key-value store, other operations should not modify the relevant buckets while the `MultiGet`, `MultiPut`, and `AllKeys` operations are running.
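One common way to enforce such a hierarchy, sketched here against the illustrative per-bucket locks above, is to collect the bucket indices a request touches, deduplicate and sort them, and only then acquire the locks in ascending order:

```cpp
#include <set>
#include <string>
#include <vector>

// Sketch: acquire every relevant bucket lock in ascending index order
// before reading, so concurrent multi-key operations cannot deadlock.
void multiget_sketch(LockedDbMapSketch& db,
                     const std::vector<std::string>& keys) {
  std::set<size_t> indices;  // std::set is sorted and deduplicated
  for (const auto& key : keys) {
    indices.insert(DbMapSketch::bucket_for(key));
  }
  for (size_t i : indices) {
    db.mutexes[i].lock_shared();  // fixed ascending order = hierarchy
  }
  // ... read the requested keys from their buckets ...
  for (size_t i : indices) {
    db.mutexes[i].unlock_shared();
  }
}
```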
build/
directory:or
Debugging with Sanitizers and GDB
Debugging concurrent code can be trickier than debugging single-threaded code. As threads interact with the shared data structures, it can be difficult to track down data races and deadlocks. Luckily, GDB and thread sanitizers can be really helpful.
Thread Sanitizers
Google’s ThreadSanitizer for C and C++ can help you detect data races, deadlocks, and thread leaks. To run the code with thread sanitizers, compile it with:
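If you want to try ThreadSanitizer outside the provided build system, the underlying compiler flag is `-fsanitize=thread`; the file names below are placeholders:

$ clang++-12 -std=c++20 -fsanitize=thread -g myprogram.cpp -o myprogram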
When compiling tests, you will need to specify `TSAN=1` too:

or
Example
Consider the following example:
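In this illustrative stand-in, two threads increment a shared counter with no synchronization, which is a textbook data race:

```cpp
#include <iostream>
#include <thread>

int counter = 0;  // shared, unprotected

void add_many() {
  for (int i = 0; i < 100000; i++) {
    counter++;  // unsynchronized read-modify-write: data race!
  }
}

int main() {
  std::thread t1(add_many);
  std::thread t2(add_many);
  t1.join();
  t2.join();
  std::cout << counter << "\n";  // typically less than 200000
  return 0;
}
```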
When you run this code using a thread sanitizer, it will produce a report flagging the data race and showing the stack traces of the conflicting accesses.
GDB
Sometimes if your program hangs due to a deadlock, the thread sanitizer may not be able to return any reports. In these cases, GDB has some more helpful tools. Here are some useful GDB commands:
- `info threads`: prints out existing threads, the Light Weight Processor (LWP) they're running on, and the line of code they're executing. The current thread has an asterisk next to it.
- `thread <thread-id>`: switches to thread `<thread-id>`.
- `p <mutex>`: prints the fields of a mutex. Helpful for determining the owner of the lock, identified by its Light Weight Processor.
- `CTRL+Z`: stops the execution of the program. If your program is hanging, likely due to a deadlock, this lets you examine the current state of the program. For example, if one of your threads is blocked on a mutex, you can switch to that thread, see which thread owns the mutex, and then switch to the owning thread.
- `bt`: prints out a stack trace. Oftentimes, threads are blocked on a system call (locking a mutex, waiting on a condition variable, or reading/writing to a file descriptor). A backtrace is useful for discovering where in the codebase the blocking call was made.
- `f <frame-number>`: switches to the given stack frame. This lets you move to frames in files you're familiar with and determine which function ended up making the blocking system call.
Handin Instructions
As before, you will hand in your code using Git. In the `kvstore/` subdirectory of your project repository, you MUST fill in the text file called `README.md`.

By 6:00 PM on Wednesday, May 3rd, you must have a commit with all steps completed.
Remind me again what the README should contain?
The README.md file will include the following:

Grading Breakdown
Now head to the grading server, make sure that you have the “KVStore” page configured correctly with your project repository, and check that your KVStore runs on the grading server as expected.
Congratulations, you’ve completed the first part of the fifth CS 300 project!
Acknowledgements: This project was developed for CS 300 by Richard Tang, Carolyn Zech, and Malte Schwarzkopf.