« Back to the main CS 300 website
Project 5: Vunmo 
Due April 7, 2021 at 6:00pm AoE
Introduction
In this assignment, you will be implementing the multi-threaded backend of a Venmo-like banking service, as well as a useful synchronized data structure. The banking service will follow a client-server model, where your server will be able to support multiple clients in parallel. Clients will be able to pay and charge other clients, as well as withdraw, deposit, and check their balance.
Learning Objectives
- Understand the fundamentals of threading, specifically starting, managing, and synchronizing threads.
- Face problems introduced by concurrency, like deadlocks and race conditions on shared data structures.
- Familiarize yourself with using synchronization primitives, namely mutexes and condition variables.
Installation
Ensure that your project repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s21-projects.git
Then run:
$ git pull
$ git pull handout master
This will merge our Project 5 (Vunmo) stencil code with your repository.
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the vunmo directory in the working copy of your projects repository.
Infrastructure Help
Conceptual Questions
Instructions
- Write your answers to the following questions in your README.
- We strongly recommend that you answer these questions as you complete the project, rather than waiting until after finishing the project code. Answering the questions will help you immensely in understanding the concepts behind the code that you’ll be writing. We do not require an earlier submission of conceptual question answers for this project.
General Questions
- Discuss the tradeoffs between spawning a thread per request versus using a worker pool consisting of a fixed number of threads. Consider how performance will vary as the number of users increases.
- Describe end-to-end what happens when a client types in the command to withdraw money. Take a look at the functions documented in
vunmo-server.hh
, and explain which functions will get invoked. While we’re not looking for a description of implementation details, tell us how these functions work together to process client request(s).
- A popular online food ordering service, PackSnass, includes a feature which allows users to collect rewards and share them with their friends. The PackSnass infrastructure contains the
user_t
struct (to store user data) and send_rewards
function (to transfer rewards) listed below:

typedef struct user {
    int id;
    int rewards;
    std::mutex mtx;
} user_t;

void send_rewards(user_t* sender, user_t* receiver) {
    sender->mtx.lock();
    receiver->mtx.lock();
    sender->rewards--;
    receiver->rewards++;
    sender->mtx.unlock();
    receiver->mtx.unlock();
}
The PackSnass servers have multiple worker threads, so user requests can be executed in parallel. As such, what might go wrong if two users, Alice and Zachary, simultaneously attempt to send each other rewards? How might you change send_rewards
to fix this problem?
Responsible Computing Questions
Imagine that you work for a company that designs control systems for large-scale water treatment plants. A water treatment plant removes contaminants from water that humans will consume and factories will use. Representatives from the Providence city government are thinking about purchasing your system. However, the officials complain that your system has poor scalability when the number of contamination sensors to monitor gets large. You debug the scaling problem and learn that it is caused by the use of coarse-grained locking. A city official suggests that you modify the system to use fine-grained locking. However, the redesign would need to be finished quickly, or else you will lose the government contract.
- Explain the tradeoffs between fine-grained locking and coarse-grained locking, considering performance, development effort, and correctness considerations. (1 sentence)
- How would these differences affect residents of Providence? Describe one impact of coarse-grained locking, and one impact of fine-grained locking. These impacts could be positive or negative. (2–3 sentences)
This question was adapted from “Ethical Tradeoffs in System Design: Performance versus Correctness” by Embedded EthiCS @ Harvard, available under a Creative Commons Attribution 4.0 International License.
Stencil
The stencil includes implementations for the client and server, as well as some utility functions and classes you may find useful. We are providing you with a fully implemented client, so you are responsible for implementing key functions in the server, as well as the synchronized queue class.
| Name | Summary |
| --- | --- |
| synchronized_queue.cc | Implements the synchronized queue class. |
| vunmo-server.{hh, cc} | Implements the server class for Vunmo. |
| vunmo-client.{hh, cc} | Implements the client class for Vunmo. |
| server.cc | Instantiates and runs the Vunmo server. |
| client.cc | Instantiates and runs the Vunmo client. |
| common.hh | Structs and enums used by both server and client. |
| networking_helpers.{hh, cc} | Helper functions that deal with the networking. |
Compile and Run
To compile, run make or make all in your project directory. This will create two executables: server and client.
To run the server, specify the port and, optionally, the number of workers:
$ ./server [port] <workers>
To run the client, specify the server’s host and port, and optionally a client id:
$ ./client [host] [port] <id>
Example
To connect two clients to a server, start by opening three terminals (or use tmux).
Terminal 1: runs server with default number of workers
$ ./server 1234
Terminal 2: runs client with ID 17 and connects to server
$ ./client localhost 1234 17
Terminal 3: runs client with random ID and connects to server
$ ./client localhost 1234
Using the Client
Once connected to the server, you can use one or more clients to run any of the following commands:
| Command | Use |
| --- | --- |
| balance | Check current balance |
| deposit [amount] | Increases balance by amount |
| withdraw [amount] | Decreases balance by amount |
| pay [client] [amount] | Transfers amount to client |
| charge [client] [amount] | Notifies client that you request amount |
Whenever a client gets paid or receives a charge from another client, a notification will be displayed. If it is a charge, the client will be prompted for y/n, and will respond to the server accordingly.
Testing
While you write your code, we strongly recommend testing it often by compiling and running a server with a few clients (as shown above). This will also give you a better understanding of what behaviour is expected. We are also providing compiled demos for the server and client so you can compare your own output with them (we won’t be testing your output directly, so it doesn’t have to be identical word for word).
Additionally, you can run the test suite we provide with the run_tests script. This script will run tests inside the tests directory, which contains multiple categories of system and unit tests. (You can also invoke these tests via the familiar make check command.)
Basic Invocation:
$ ./run_tests [section]
Options:
-s or --short: stops at the first failure
-q or --quiet: only outputs the final score
-c or --clean: removes all test executables
-t or --timeout: sets the test timeout in seconds (default 15)
-z or --tsan: enables the thread sanitizer
Example
Runs queue
section of the tests and stops at the first failure, with the thread sanitizer enabled.
$ ./run_tests -s -z queue
Concurrency in C++
The main goal for any concurrent program is to avoid undefined behavior due to race conditions between threads. A program that is free of such race conditions is known as thread-safe, which means any data shared between threads is protected by appropriate synchronization in case threads try to access or modify it simultaneously. Some of the most common synchronization primitives are mutexes and condition variables, which you will be using in this project. Additionally, you will see atomic variables.
Synchronization Data Structures
Mutexes
A mutex provides mutual exclusion between threads. Specifically, the mutex ensures that no more than one thread can be active inside a critical section at a time. Before entering the critical section (which, e.g., accesses a piece of shared data), threads attempt to lock the “associated” mutex, and unlock it once they are done. This guarantees that only the one thread that successfully locked the mutex can modify the data. Since a thread will hang until it can lock the mutex, deadlocks may occur if the mutex is never unlocked.
Condition Variables
Condition variables allow a thread to wait for a condition to be met. Changes in that condition will be caused by another thread, which is also responsible for notifying the threads that wait on the condition variable. Example conditions are “Is the queue still empty?” or “Has the number of clients exceeded a threshold?”. Instead of a thread continuously locking, checking, and unlocking over and over again, it can wait on a condition variable until another thread signals that it’s worth trying now.
For a thread to wait on a condition variable with the C++ std::condition_variable
, the thread needs to first lock the associated mutex, and then pass that mutex to the condition variable. The thread will then be suspended and the mutex unlocked automatically, until the thread is woken up. When it is woken up because the wait condition is now met, the thread is guaranteed to have the mutex locked again.
Think about why this is necessary! If a thread is waiting for a condition, but the mutex stays locked while it is waiting, then nothing will be able to change the shared variable (e.g., a buffer or queue) and the condition will never be met.
Example
Here is a basic example with two threads, involving cookies because I was hungry when I wrote this.
The Variables:
int num_cookies = 0;
std::mutex num_cookies_mtx;
std::condition_variable num_cookies_cv;
Thread 1:
void bake_cookie() {
num_cookies_mtx.lock();
num_cookies++;
if (num_cookies == 1) {
num_cookies_cv.notify_one();
}
num_cookies_mtx.unlock();
}
Thread 2:
void i_am_hungry(){
std::unique_lock<std::mutex> lock(num_cookies_mtx);
while (num_cookies == 0) {
num_cookies_cv.wait(lock);
}
num_cookies--;
}
Thread 2 (i_am_hungry) will check whether there are cookies, and wait until there are. Thread 1 (bake_cookie) will add a cookie and notify any waiting threads that it is time for them to wake up, because the condition they were waiting for has been met.
One small caveat is that instead of checking in an if
statement, it is necessary to check in a while
statement. This is because, due to a quirk in how condition variables are implemented, waiting threads may spuriously wake up even if they have not been signalled. There is a historical “reason” for this, but now we just have to live with it.
Atomic Variables
There are times when a shared variable is only ever read or written in single, simple operations, so every access would need to be wrapped in mutex lock and unlock calls. To make this faster and simpler, we can use an atomic variable instead. These are variables that do not require mutexes, because every interaction with them is guaranteed to be atomic, or “all in one, uninterrupted action”, implemented via atomic machine instructions. In this project, we will use atomic booleans to indicate whether things have stopped. Some important functions in std::atomic
you will want to use are load
and store
.
Concurrency Trouble
When working with concurrency, debugging takes on a whole new meaning. Because the order of events is now not the same every time, you may occasionally see different behaviour when executing the same program. The two main things to look out for are data races and deadlocks. We strongly recommend you use the thread sanitizer to catch these bugs early. It will save you massive headaches, and will make debugging a lot easier.
Data Races
This is when a piece of shared data is not correctly protected, and two or more threads access or modify it. This may cause many different issues! For example, a thread might try to pull data off of an empty queue because it thought the queue wasn’t empty, or you might end up dereferencing an object pointer that was just freed by another thread. In some more bizarre cases, like when dealing with maps, this may sometimes cause infinite loops or other very unexpected behaviour, because of how these data structures are implemented.
Deadlocks
These occur when threads block on mutexes that will never be unlocked. For example, if two threads each have a mutex locked and try to acquire each other’s, both will hang forever, because neither can proceed to unlock the mutex the other needs. (Another example is a thread trying to lock a mutex it already holds. This happens surprisingly easily with chains of function calls!)
To avoid deadlocks, make sure you never forget to unlock a mutex, and have some consistent order in which threads acquire and release locks.
Synchronized Queue
In this project, you’ll be implementing a generic synchronized queue, which is used by both the server and client implementations. The synchronized queue allows multiple threads to add and remove elements safely. It’s handy when multiple worker threads rush to handle requests from different clients!
Context
Synchronized?
A significant part of concurrency lies in making sure that multiple threads can safely access the same data structure concurrently. One common approach is to create a mutex for the data structure, lock it whenever it is accessed, and unlock it afterward. While effective, having to lock and unlock each time can clutter your code, and forgetting to do so will end up causing problems.
A synchronized data structure (or thread-safe data structure) is a data structure built specifically to be accessed concurrently by multiple threads, and does not need to be protected by an external mutex.
Generic? (C++ Templates)
If you have read through the stencil you may have noticed the use of template <typename T>
before class or function declarations for the queue. Its purpose is to allow you to define a class or struct that works with some arbitrary type, T
, where T
is defined the moment when you instantiate the class. This allows us to make a synchronized_queue
of ints
, chars
, longs
, strings
, or any other struct! (This is how vectors and maps support arbitrary types.) We call a class or struct like this a generic, because it is not bound to a specific type.
Templates are just the way C++ does generics, and they can get very… involved. Don’t worry about it! You only need to know what we have already described.
More On C++ Templates
Queue Layout
This is the declaration of our synchronized_queue
:
template <typename T>
class synchronized_queue {
public:
std::size_t size();
bool pop(T *elt);
void push(T elt);
std::vector<T> flush();
void stop();
private:
std::queue<T> q;
std::mutex mtx;
std::condition_variable cv;
std::atomic_bool is_stopped{false};
};
Tip
Wherever you see T
, just mentally replace it with int
, char
, or your favorite struct. Once you instantiate the queue (say, as a synchronized_queue<int>
), the compiler actually turns those T
into int
! Doing this mental replacement might make the code easier to understand.
Here is more information on templates if you want to go down the rabbithole.
As you can see, our synchronized_queue
is nothing more than an underlying queue, a mutex to protect it, and a condition variable for threads to wait if the queue is empty (ignoring is_stopped
).
In the project, you will implement the public methods of the synchronized queue.
Stopping
The is_stopped
boolean exists in order to ensure all waiting threads are gone when a std::condition_variable
is destroyed. When an instantiated synchronized_queue
object goes out of scope (in this project, that is at the end of the program) its fields are automatically destroyed. There cannot be any threads waiting on a condition variable when it is destroyed; if there are, this will cause an error. This means that before the queue goes out of scope, you must signal all threads waiting on the condition variable, and prevent any new ones from waiting on it. This is where is_stopped
comes into play: once this variable is true
, no new threads should wait on the condition variable.
Note: You need to consider the is_stopped
variable when implementing pop
(and stop
, of course).
Thread-Safety
The key aspect of this queue is that any number of threads can be calling the same method at the same time. Always think: “What’s the worst possible sequence of events if multiple threads are running this code?” Keep this idea front and center when implementing the queue’s functions (and throughout the rest of the project).
Tip
When writing your code, consider any two consecutive lines. After a thread has finished executing the first line, but before it has started the next, any number of events (performed by other threads) may have happened. What could have changed? What guarantees does the thread have? What mutexes is it holding?
Implementation
Task: Implement the synchronized queue.
In synchronized_queue.cc
, fill in the function bodies for pop
, push
, flush
, and stop
. When you are finished, check that your implementation is correct by running:
$ ./run_tests queue
Server
There are many ways to approach implementing a server: which one a developer chooses comes down to how they weigh the tradeoffs between simplicity, efficiency, and scalability. The server you will implement for this project has been designed to be scalable and efficient (which means you’ll have fun implementing it).
Context
Efficiency
A simple single-threaded, single-client server could still work with multiple clients by handling them only one at a time. However, this would be very inefficient, especially if your computer has multiple processors. To take advantage of these processors, we need concurrency. With concurrency and the parallelism of multiple processors comes speed and efficiency, but also the need for synchronization when accessing shared data.
Scalability
One popular approach is to create a new thread every time a client connects, and have that thread be solely responsible for reading and processing the client’s request. While efficient, this approach does not work if your server expects a very large number of simultaneous clients, as the number of threads may grow very large. (Recall that each thread needs its own stack.)
A way to avoid creating a thread per client is to have a fixed number of worker threads and a single work queue. Each time a client sends a request, the request is added to the work queue and can be picked up by any of the worker threads. This approach can scale more easily because the number of threads is now independent of the number of clients. This is the approach we chose to use for this project.
Server Design
The server will consist of three main operational parts which will be initiated upon calling start
:
- One thread to listen for and accept new client connections (
accept_clients_loop
).
- One thread to receive and read client requests, and put them on the work queue (
receive_requests_loop
).
- Several worker threads to process the requests on the queue, and respond to the clients who issued them (
work_loop
).
These three components will run concurrently and work together by sharing data like accepted clients and accounts. It is very important that you synchronize their access to this shared data using synchronization primitives.
Starting the Server: start
This is the first function called after a server is created, and the moment when the three operational parts mentioned above are started. (Since you haven’t implemented the loops yet, the threads will do nothing, which is ok for now!) Take a look at the comments in the stencil for details.
Stopping the Server: stop
So soon? It’s important to clean up everything you create, and starting with start
and stop
is a good exercise. stop
is responsible for joining all un-detached threads and deallocating all memory that was allocated during the execution of your program.
You may notice step 3 in the stencil asks that you flush and delete all requests in the queue. Even though you haven’t put anything in the queue yet, you will in some of your other functions!
We have written some of stop
for you, so you just need to complete it.
Task: Implement start
and stop
.
When you are finished, check that you pass test1_start_stop
by running:
$ ./run_tests -s server_part1
Accepting Clients: accept_clients_loop
Clients talk to the server over the network, and specifically over a TCP connection. TCP connections (and network connections in general) are represented by file descriptors in Linux processes. In the project, a client connects to the server by opening a new connection, which causes the server to instantiate a client_conn_t
to store the file descriptors associated with the connection. If it is a new client, the server also creates an account_t
to hold their balance. Connections and accounts will be stored in the server’s client_conns
and accounts
maps respectively, where they are mapped to from the client’s id
.
Both the accounts
and client_conns
maps have an associated mutex, and so do each client_conn_t
and account_t
, so make sure you are always synchronizing your access when using these. When a thread locks one of the maps, no other threads can access anything in it. For this reason, in order to preserve concurrency, a thread should only lock the maps for as long as it needs to insert, remove, or look something up in one of the maps. In addition, try to avoid holding any more mutexes than absolutely necessary at any given time.
Task: Implement accept_clients_loop
.
When you are finished, check that you pass test2_accept_clients
by running:
$ ./run_tests -s server_part1
Receiving Requests: receive_requests_loop
Once a client has connected, it may begin sending requests in the form of request_t
structs. The fetch_requests
function will check client connections’ file descriptors, read from them, and return the new requests in a vector. The receive_requests_loop
is responsible for putting these requests into the work queue, so workers can start to process them.
Processing Requests: work_loop
Each worker thread is responsible for pulling requests off the queue (in work_loop
) and then updating account balances and sending responses and notifications to the corresponding clients (in process_request
).
Since each thread will be concurrently interacting with multiple accounts, you need to be mindful when accessing these accounts. To make this easier, we have provided functions stubs and descriptions for get_account
and get_two_accounts
, which you will be implementing and using.
Each request_t
has an associated request_type
, which can be one of BALANCE
, DEPOSIT
, WITHDRAWAL
, PAYMENT
, or CHARGE
. We are providing the handlers to modify the balances accordingly, but you are responsible for calling the appropriate handler for each type, as well as correctly accessing the client’s accounts.
If a request is a PAYMENT
or a CHARGE
, the server needs to send a notification to the client getting paid or requested.
Note: All handlers, helpers, and documentation of functions you are implementing can be found in the vunmo-server.hh
file.
Task: Implement work_loop
, process_request
, get_account
, and get_two_accounts
.
When you are finished, check that you pass test3_process_requests
by running:
$ ./run_tests -s server_part1
Moving Forward: The server_part1
section tests the most basic integrity of your server, and is not really checking for concurrency. server_part2
, by contrast, tests your concurrency. You can now see how your server handles a large number of clients all making many requests per second (these tests are in no particular order):
$ ./run_tests server_part2
While the basic_load_test
should not take longer than 15 seconds, you can let it run for longer by specifying a larger test timeout with -t [timeout]
.
Graduate students taking CSCI 1310 should complete one of the following questions.
Using concurrency for performance always requires reasoning about the tradeoff between the cost of synchronization and the performance benefit introduced by parallel execution.
- Consider the design of the Vunmo service, and your implementation. What types of usage patterns (e.g., transfer patterns between users) are likely to scale well to many threads, and which will scale poorly?
- Measure what the performance benefit of concurrency is in this assignment! To do so, you could: (1) create a single-threaded copy of Vunmo, without synchronization, and measure its performance compared to the concurrent implementation; or (2) you could measure the time requests spend in different parts of the system (e.g., the queue, the worker processing, etc.) and reason about which of these parts are parallelizable. To measure time, you might want to look into the
gettimeofday
system call.
Debugging with Sanitizers and GDB
Debugging concurrent code can be trickier than debugging single-threaded code. As threads interact with the shared data structures, it can be difficult to track down data races and deadlocks. Luckily, GDB and thread sanitizers can be really helpful.
Running a Single Test
To run a single test, compile it and then run the executable it creates:
$ make <path-to-test>
$ ./<path-to-test>
Example
$ make tests/queue/serial_push_all_flush
$ ./tests/queue/serial_push_all_flush
$ gdb tests/queue/serial_push_all_flush
Thread Sanitizers
Google’s ThreadSanitizer for C and C++ can help you detect data races, deadlocks, and thread leaks. To run the code with thread sanitizers, compile it with:
make TSAN=1
ThreadSanitizer Example
Consider the following example:
#include <chrono>
#include <cstdio>
#include <thread>

int count = 0;
int child();  // forward declaration so main can start the threads

int main() {
    std::thread p1(child);
    std::thread p2(child);
    p1.join();
    p2.join();
    printf("count %d\n", count);
}

int child() {
    while (count < 100) {  // unsynchronized read of count
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        count += 1;        // unsynchronized write: data race
    }
    return 0;
}
When you run this code using a thread sanitizer, it will produce the following output:

[ThreadSanitizer report: a red data race warning, followed by stack traces for the two conflicting memory accesses.]
- The red message indicates what error is occurring (in this case, a data race)
- The blue messages indicate conflicting memory accesses with a stack trace. In our case, the conflict occurs on line
25
and line 27
. It includes which thread is performing each access, and if the thread holds a mutex it will also indicate it.
- Below these messages, the sanitizer might also indicate:
- specific mutexes, when they were acquired, which threads acquired them, and sometimes when they were created
- threads, and when they were created
- Refer to the ThreadSanitizerReportFormat documentation for more information on the report format.
GDB
Sometimes if your program hangs due to a deadlock, the thread sanitizer may not be able to return any reports. In these cases, gdb has some more helpful tools. Here are some useful gdb
commands:
info threads
: prints out existing threads, the Light Weight Process (LWP) they’re running on, and the line of code they’re executing. The current thread will have an asterisk next to it.
3 Thread 0x7ffff3bff700 (LWP 9167) "example" __lll_lock_wait ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
4 Thread 0x7ffff33fe700 (LWP 9003) "example" __lll_lock_wait ()
at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
thread <thread-id>
: switches to thread <thread-id>
(gdb) thread 4
[Switching to thread 4 (Thread 0x7ffff33fe700 (LWP 9003))]
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
p <mutex>
: Prints the fields of a mutex. Helpful for determining which thread owns the lock, identified by its Light Weight Process (LWP) ID.
$2 = {<std::__mutex_base> = {_M_mutex = pthread_mutex_t = {Type = Normal,
Status = Acquired, possibly with waiters, Owner ID = 9167, Robust = No, Shared = No,
Protocol = None}}, <No data fields>}
CTRL+C
: Stops the execution of the program inside gdb. If your program is hanging, likely due to a deadlock, this allows you to examine the current state of the program. For example, if one of your threads is blocked on a mutex, you can switch to that thread, see which thread owns the mutex, and then switch to that thread.
bt
: Prints out a stack trace. Oftentimes, threads are blocked on a system call (locking a mutex, waiting on a condition variable, or reading/writing to a file descriptor). Backtrace is useful for discovering where in the codebase the blocking call was made.
f <frame-number>
: Switches to the provided frame number. Allows you to switch to frames within files you’re familiar with and determine which function called blocking system calls.
Handing In & Grading
Handin instructions
As before, you will hand in your code using Git. In the vunmo/
subdirectory of your project repository, you MUST fill in the text file called README.md
.
Remind me again what the README.md
should contain?
The README.md
file will include the following:
- Any design decisions you made and comments for graders, under "Design Overview". If there's nothing interesting to say, just list "None".
- Any collaborators and citations for help that you received, under "Collaborators". CS 300 encourages collaboration, but we expect you to acknowledge help you received, as is standard academic practice.
- Your answers to the conceptual questions at the start of this handout under "Conceptual Questions".
- Notes on approximately how long it took you to complete the project. We won't use this for grading, but we collect the data to calibrate for future years.
Grading breakdown
- 80% (80 points) for passing the functional tests in categories queue (16 points), server_part1 (24 points), and server_part2 (40 points). If your tests pass on the grading server with and without sanitizers, you’ve probably got all these points.
- 12% (12 points) answers to conceptual questions (4 points each).
- 8% (8 points) answers to responsible computing (RC) questions.
- Up to 10 points for extra credit.
Now head to the grading server, make sure that you have the “Vunmo” page configured correctly with your project repository, and check that your Vunmo tests pass on the grading server as expected.
Congratulations, you’ve completed the fifth CS 300 project! 
Acknowledgements: The Vunmo project was developed for CS 300.
« Back to the main CS 300 website
Project 5: Vunmo
Due April 7, 2021 at 6:00pm AoE
Introduction
In this assignment, you will be implementing the multi-threaded backend of a Venmo-like banking service, as well as a useful synchronized data structure. The banking service will follow a client-server model, where your server will be able to support multiple clients in parallel. Clients will be able to pay and charge other clients, as well as withdraw, deposit, and check their balance.
Learning Objectives
Installation
Ensure that your project repository has a
handout
remote. Type:If this reports an error, run:
Then run:
This will merge our Project 5 (Vunmo) stencil code with your repository.
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the
vunmo
directory in the working copy of your projects repository.Infrastructure Help
Conceptual Questions
Instructions
General Questions
- Consider the server code in vunmo-server.hh, and explain which functions will get invoked. While we're not looking for a description of implementation details, tell us how these functions work together to process client request(s).
- PackSnass has the user_t struct (to store user data) and send_rewards function (to transfer rewards) listed below. The PackSnass servers have multiple worker threads, so user requests can be executed in parallel. As such, what might go wrong if two users, Alice and Zachary, simultaneously attempt to send each other rewards? How might you change send_rewards to fix this problem?
Responsible Computing Questions
Imagine that you work for a company that designs control systems for large-scale water treatment plants. A water treatment plant removes contaminants from water that humans will consume and factories will use. Representatives from the Providence city government are thinking about purchasing your system. However, the officials complain that your system has poor scalability when the number of contamination sensors to monitor gets large. You debug the scaling problem and learn that it is caused by the use of coarse-grained locking. A city official suggests that you modify the system to use fine-grained locking. However, the redesign would need to be finished quickly, or else you will lose the government contract.
This question was adapted from “Ethical Tradeoffs in System Design: Performance versus Correctness” by Embedded EthiCS @ Harvard, available under a Creative Commons Attribution 4.0 International License.
Stencil
The stencil includes implementations for the client and server, as well as some utility functions and classes you may find useful. We are providing you with a fully implemented client, so you are responsible for implementing key functions in the server, as well as the synchronized queue class.
- synchronized_queue.cc
- vunmo-server.{hh, cc}
- vunmo-client.{hh, cc}
- server.cc
- client.cc
- common.hh
- networking_helpers.{hh, cc}
Compile and Run
To compile, run make or make all in your project directory. This will create two executables: server and client.
To run the server, specify the port and optionally the number of workers:
$ ./server [port] <workers>
To run the client, specify the server's host and port, and optionally supply a client id:
$ ./client [host] [port] <id>
Example
To connect two clients to a server, start by opening three terminals (or use tmux).
Terminal 1: runs server with default number of workers
$ ./server 1234
Terminal 2: runs client with ID 17 and connects to server
$ ./client localhost 1234 17
Terminal 3: runs client with random ID and connects to server
$ ./client localhost 1234
Using the Client
Once connected to the server, you can use one or more clients to run any of the following commands:
- balance
- deposit [amount]: deposits amount
- withdraw [amount]: withdraws amount
- pay [client] [amount]: pays amount to client
- charge [client] [amount]: notifies client that you request amount
Whenever a client gets paid or receives a charge from another client, they will be shown a notification. If it is a charge, they will be prompted for y/n, and will respond to the server accordingly.
Testing
While you write your code, we strongly recommend you test it often by compiling and running a server with a few clients (as shown above). This will also give you a better understanding of what behaviour is expected. We are also providing compiled demos for the server and client so you can compare your own output with these (we won't be testing your output directly, so it doesn't have to be identical word for word).
Additionally, you can run the test suite we provide with the run_tests script. This script will run tests inside the tests directory, which contains multiple categories of system and unit tests. (You can also invoke these tests via the familiar make check command.)
Basic Invocation:
$ ./run_tests [section]
Options:
- -s or --short: stops at the first failure
- -q or --quiet: only outputs the final score
- -c or --clean: removes all test executables
- -t or --timeout: sets the test timeout (default 15)
- -z or --tsan: enables the thread sanitizer
Example
Runs the queue section of tests and short-circuits on the first failure, with the thread sanitizer enabled:
$ ./run_tests -s -z queue
Concurrency in C++
The main goal for any concurrent program is to avoid undefined behavior due to race conditions between threads. A program free of such race conditions is known as thread-safe: any data shared between threads is protected by appropriate synchronization in case threads try to access or modify it simultaneously. Some of the most common synchronization primitives are mutexes and condition variables, which you will be using in this project. Additionally, you will see atomic variables.
Synchronization Data Structures
Mutexes
A mutex provides mutual exclusion between threads. Specifically, the mutex ensures that no more than one thread can be active inside a critical section at a time. Before entering the critical section (which, e.g., accesses a piece of shared data), threads attempt to lock the “associated” mutex, and unlock it once they are done. This guarantees that only the one thread that successfully locked the mutex can modify the data. Since a thread will hang until it can lock the mutex, deadlocks may occur if the mutex is never unlocked.
Condition Variables
Condition variables allow a thread to wait for a condition to be met. Changes in that condition will be caused by another thread, which is also responsible for notifying the threads that wait on the condition variable. Example conditions are “Is the queue still empty?” or “Has the number of clients exceeded a threshold?”. Instead of a thread continuously locking, checking, and unlocking over and over again, it can wait on a condition variable until another thread signals that it’s worth trying now.
For a thread to wait on a condition variable with the C++ std::condition_variable, the thread needs to first lock the associated mutex, and then pass that mutex to the condition variable. The thread will then be suspended and the mutex unlocked automatically, until the thread is woken up. When it is woken up because the wait condition is now met, the thread is guaranteed to have the mutex locked again.
Think about why this is necessary! If a thread waiting for a condition kept the mutex locked while it waited, then no other thread could ever change the shared state (e.g., a buffer or queue), and the condition would never be met.
Example
Here is a basic example with two threads, involving cookies because I was hungry when I wrote this.
The Variables:
Thread 1:
Thread 2:
Thread 1 will check if there are cookies, and wait until there are. Thread 2 will add a cookie, and notify any waiting threads that it is time for them to wake up because the condition they were waiting for has been met.
One small caveat is that instead of checking the condition in an if statement, it is necessary to check it in a while loop. This is because, due to an absolutely whack implementation of condition variables, waiting threads may spuriously wake up, even if they have not been signalled. There is a historical "reason" for this, but now we just have to live with it.
Atomic Variables
Sometimes a shared variable will just be read and checked, so every access would just be surrounded by mutex locking and unlocking. To make this faster and simpler, we can use an atomic variable instead. These are variables that do not require mutexes, because every interaction with them is guaranteed to be atomic, or "all in one, uninterrupted action", implemented via atomic machine instructions. In this project, we will use atomic booleans to indicate whether things have stopped. Some important functions in std::atomic you will want to use are load and store.
Concurrency Trouble
When working with concurrency, debugging takes on a whole new meaning. Because the order of events is now not the same every time, you may occasionally see different behaviour when executing the same program. The two main things to look out for are data races and deadlocks. We strongly recommend you use the thread sanitizer to catch these bugs early. It will save you massive headaches, and will make debugging a lot easier.
Data Races
This is when a piece of shared data is not correctly protected, and two or more threads access or modify it at the same time. This can cause many different issues! For example, a thread might try to pull data off of an empty queue because it thought the queue wasn't empty, or you might end up dereferencing an object pointer that was just freed by another thread. In some more bizarre cases, like when dealing with maps, this may cause infinite loops or other very unexpected behaviour, because of how these data structures are implemented.
Deadlocks
These occur when a thread is stuck waiting for a mutex that will never be unlocked: it will hang forever. For example, if two threads each currently hold one mutex and try to acquire a lock on each other's mutex, both will hang forever, because neither will ever release the mutex the other is waiting for. (Another example is a thread trying to lock the same mutex twice. This happens surprisingly easily with chains of function calls!)
To avoid deadlocks, make sure you never forget to unlock a mutex, and have some consistent order in which threads acquire and release locks.
Synchronized Queue
In this project, you’ll implementing a generic synchronized queue, which is used both by the server and client implementations. The synchronized queue allows multiple threads to add and remove elements safely. It’s handy when multiple worker threads rush to handle requests from different clients!
Context
Synchronized?
A significant part of concurrency lies in making sure that multiple threads can safely access the same data structure concurrently. One common approach is to create a mutex for the data structure, lock it whenever it is accessed, and unlock it afterward. While effective, having to lock and unlock each time can clutter your code, and forgetting to do so will end up causing problems.
A synchronized data structure (or thread-safe data structure) is a data structure built specifically to be accessed concurrently by multiple threads, and does not need to be protected by an external mutex.
Generic? (C++ Templates)
If you have read through the stencil, you may have noticed the use of template <typename T> before class or function declarations for the queue. Its purpose is to allow you to define a class or struct that works with some arbitrary type T, where T is defined at the moment you instantiate the class. This allows us to make a synchronized_queue of ints, chars, longs, strings, or any other struct! (This is how vectors and maps support arbitrary types.) We call a class or struct like this a generic, because it is not bound to a specific type.
Templates are just the way C++ does generics, and they can get very… involved. Don't worry about it! You only need to know what we have already described.
More On C++ Templates
Queue Layout
This is the declaration of our synchronized_queue:
template <typename T>
class synchronized_queue {
 public:
  std::size_t size();
  bool pop(T *elt);
  void push(T elt);
  std::vector<T> flush();
  void stop();

 private:
  std::queue<T> q;
  std::mutex mtx;
  std::condition_variable cv;
  std::atomic_bool is_stopped{false};
};
Tip
Wherever you see T, just mentally replace it with int, char, or your favorite struct. Once you instantiate the queue (say, as a synchronized_queue<int>), the compiler actually turns those Ts into int! Doing this mental replacement might make the code easier to understand.
Here is more information on templates if you want to go down the rabbit hole.
As you can see, our synchronized_queue is nothing more than an underlying queue, a mutex to protect it, and a condition variable for threads to wait on if the queue is empty (ignoring is_stopped).
In the project, you will implement the public methods of the synchronized queue.
Stopping
The is_stopped boolean exists to ensure all waiting threads are gone when a std::condition_variable is destroyed. When an instantiated synchronized_queue object goes out of scope (in this project, at the end of the program), its fields are automatically destroyed. There cannot be any threads waiting on a condition variable when it is destroyed; if there are, this will cause an error. This means that before the queue goes out of scope, you must signal all threads waiting on the condition variable, and prevent any new ones from waiting on it. This is where is_stopped comes into play: once this variable is true, no new threads should wait on the condition variable.
Note: You need to consider the is_stopped variable when implementing pop (and stop, of course).
Thread-Safety
The key aspect of this queue is that any number of threads can be calling the same method at the same time. Always think: "What's the worst possible sequence of events if multiple threads are running this code?" Keep this idea front and center when implementing the queue's functions (and throughout the rest of the project).
Tip
When writing your code, consider any two consecutive lines. After a thread has finished executing the first line, but before it has started to execute the next, any number of events (performed by other threads) may have happened. What could have changed? What guarantees does the thread still have? Which mutexes is it holding?
Implementation
Task: Implement the synchronized queue.
In synchronized_queue.cc, fill in the function bodies for pop, push, flush, and stop. When you are finished, check that your implementation is correct by running:
$ ./run_tests queue
Server
There are many ways to approach implementing a server: which one a developer chooses comes down to how they weigh the tradeoffs between simplicity, efficiency, and scalability. The server you will implement for this project has been designed to be scalable and efficient (which means you'll have fun implementing it).
Context
Efficiency
A simple single-threaded, single-client server could still work with multiple clients by handling them only one at a time. However, this would be very inefficient, especially if your computer has multiple processors. To take advantage of these processors, we need concurrency. With concurrency and the parallelism of multiple processors comes speed and efficiency, but also the need for synchronization when accessing shared data.
Scalability
One popular approach is to create a new thread every time a client connects, and have that thread be solely responsible for reading and processing the client’s request. While efficient, this approach does not work if your server expects a very large number of simultaneous clients, as the number of threads may grow very large. (Recall that each thread needs its own stack.)
A way to avoid creating a thread per client is to have a fixed number of worker threads and a single work queue. Each time a client sends a request, the request is added to the work queue and can be picked up by any of the worker threads. This approach can scale more easily because the number of threads is now independent of the number of clients. This is the approach we chose to use for this project.
Server Design
The server will consist of three main operational parts, which will be initiated upon calling start:
- accepting client connections (accept_clients_loop)
- receiving client requests (receive_requests_loop)
- processing requests with worker threads (work_loop)
These three components will run concurrently and work together by sharing data like accepted clients and accounts. It is very important that you synchronize their access to this shared data using synchronization primitives.
Starting the Server: start
This is the first function called after a server is created, and the moment when the three operational parts mentioned above are started. (Since you haven't implemented the loops yet, the threads will do nothing, which is OK for now!) Take a look at the comments in the stencil for details.
Stopping the Server: stop
So soon? It's important to clean up everything you create, and starting with start and stop is a good exercise. stop is responsible for joining all un-detached threads and deallocating all memory that was allocated during the execution of your program.
You may notice step 3 in the stencil asks that you flush and delete all requests in the queue. Even though you haven't put anything in the queue yet, you will in some of your other functions!
We have written some of stop for you, so you just need to complete it.
Task: Implement start and stop.
When you are finished, check that you pass test1_start_stop by running:
$ ./run_tests -s server_part1
Accepting Clients: accept_clients_loop
Clients talk to the server over the network, specifically over a TCP connection. TCP connections (and network connections in general) are represented by file descriptors in Linux processes. In the project, a client connects to the server by opening a new connection, which causes the server to instantiate a client_conn_t to store the file descriptors associated with the connection. If it is a new client, the server also creates an account_t to hold their balance. Connections and accounts are stored in the server's client_conns and accounts maps respectively, keyed by the client's id.
Both the accounts and client_conns maps have an associated mutex, and so do each client_conn_t and account_t, so make sure you are always synchronizing your access when using these. When a thread locks one of the maps, no other thread can access anything in it. For this reason, in order to preserve concurrency, a thread should only lock a map for as long as it needs to insert, remove, or look something up in it. In addition, try to avoid holding any more mutexes than absolutely necessary at any given time.
Task: Implement accept_clients_loop.
When you are finished, check that you pass test2_accept_clients by running:
$ ./run_tests -s server_part1
Receiving Requests: receive_requests_loop
Once a client has connected, it may begin sending requests in the form of request_t structs. The fetch_requests function will check client connections' file descriptors, read from them, and return the new requests in a vector. The receive_requests_loop is responsible for putting these requests into the work queue, so workers can start to process them.
Processing Requests: work_loop
Each worker thread is responsible for pulling requests off the queue (in work_loop) and then updating account balances and sending responses and notifications to the corresponding clients (in process_request).
Since each thread will be concurrently interacting with multiple accounts, you need to be mindful when accessing these accounts. To make this easier, we have provided function stubs and descriptions for get_account and get_two_accounts, which you will be implementing and using.
Each request_t has an associated request_type, which can be one of BALANCE, DEPOSIT, WITHDRAWAL, PAYMENT, or CHARGE. We are providing the handlers that modify the balances accordingly, but you are responsible for calling the appropriate handler for each type, as well as correctly accessing the clients' accounts.
If a request is a PAYMENT or a CHARGE, the server needs to send a notification to the client being paid or charged.
Note: All handlers, helpers, and documentation of the functions you are implementing can be found in the vunmo-server.hh file.
Task: Implement work_loop, process_request, get_account, and get_two_accounts.
When you are finished, check that you pass test3_process_requests by running:
$ ./run_tests -s server_part1
Moving Forward: The server_part1 section tests the most basic integrity of your server, and is not really checking for concurrency. server_part2, by contrast, tests your concurrency. You can now see how your server handles a large number of clients all making many requests per second (these tests run in no particular order):
$ ./run_tests server_part2
While the basic_load_test should not take longer than 15 seconds, you can let it run for longer by specifying a larger test timeout with -t [timeout].
Extra Credit
Graduate students taking CSCI 1310 should complete one of the following questions.
Using concurrency for performance always requires reasoning about the tradeoff between the cost of synchronization and the performance benefit introduced by parallel execution.
gettimeofday system call.
Debugging with Sanitizers and GDB
Debugging concurrent code can be trickier than debugging single-threaded code. As threads interact with the shared data structures, it can be difficult to track down data races and deadlocks. Luckily, GDB and thread sanitizers can be really helpful.
Running a Single Test
To run a single test, compile it and then run the executable it creates:
$ make <path-to-test>
$ ./<path-to-test>
Example
# to compile the serial_push_all_flush test
$ make tests/queue/serial_push_all_flush
# to run the test
$ ./tests/queue/serial_push_all_flush
# to run the test in gdb
$ gdb tests/queue/serial_push_all_flush
Thread Sanitizers
Google’s ThreadSanitizer for C and C++ can help you detect data races, deadlocks, and thread leaks. To run the code with thread sanitizers, compile it with:
ThreadSanitizer Example
Consider the following example:
When you run this code with the thread sanitizer, it will produce the following output:
Note: Open the image in a new tab for a larger image.
The report points to the data race between line 25 and line 27. It includes which thread is performing each access, and, if the thread holds a mutex, it will also indicate that.
GDB
Sometimes, if your program hangs due to a deadlock, the thread sanitizer may not be able to return any reports. In these cases, gdb has some more helpful tools. Here are some useful gdb commands:
info threads: prints out existing threads, the Light Weight Processor (LWP) they're running on, and the line of code they're executing. The current thread will have an asterisk next to it.
3 Thread 0x7ffff3bff700 (LWP 9167) "example" __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
4 Thread 0x7ffff33fe700 (LWP 9003) "example" __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
thread <thread-id>: switches to thread <thread-id>.
p <mutex>: prints the fields of a mutex. Helpful for determining the owner of the lock, identified by its Light Weight Processor.
$2 = {<std::__mutex_base> = {_M_mutex = pthread_mutex_t = {Type = Normal, Status = Acquired, possibly with waiters, Owner ID = 9167, Robust = No, Shared = No, Protocol = None}}, <No data fields>}
CTRL+Z: stops the execution of the program. If your program is hanging, likely due to a deadlock, this allows you to examine the current state of the program. For example, if one of your threads is blocked on a mutex, you can switch to that thread, see which thread owns the mutex, and then switch to the owner.
bt: prints out a stack trace. Oftentimes, threads are blocked on a system call (locking a mutex, waiting on a condition variable, or reading/writing to a file descriptor). Backtrace is useful for discovering where in the codebase the blocking call was made.
f <frame-number>: switches to the provided frame number. Allows you to switch to frames within files you're familiar with and determine which function called the blocking system call.
Handing In & Grading
Handin instructions
As before, you will hand in your code using Git. In the vunmo/ subdirectory of your project repository, you MUST fill in the text file called README.md.
Remind me again what the README.md should contain?
The README.md file will include the following:
Grading breakdown
- 12% (12 points) answers to conceptual questions (4 points each).
- 8% (8 points) answers to responsible computing (RC) questions.
- Up to 10 points for extra credit.
Now head to the grading server, make sure that you have the “Vunmo” page configured correctly with your project repository, and check that your Vunmo tests pass on the grading server as expected.
Congratulations, you’ve completed the fifth CS 300 project!
Acknowledgements: The Vunmo project was developed for CS 300.