
Project 5: Distributed Store

Due May 8, 2020 at 6:00pm ET
TA Hours: April 28 – May 11
(Note on late days: May 12th at 11:59pm ET is the latest you can turn in this assignment for credit)


In this assignment you’ll build a distributed, sharded key-value storage system, similar to those used to power major websites like Facebook, Twitter, and AirBnB! These sites have so many users that no single server can store all their data, or handle all the requests for data. Instead, they carefully split the data (and the requests accessing it) across many servers. This idea is called sharding.

Outside of a programming context, think about how large events or concerts might speed up their check-in process by having separate tables that check in attendees with names starting with A-K and L-Z. In systems programming, this idea is called “sharding”, as it splits the overall key space (e.g., A-Z) into many shards that can each be stored and managed on different servers.

Optional: Interested in real-world use cases of sharding?

If you’re curious how what you’re building relates to real-world industry use, take a look at these papers:

  • Scaling Memcache at Facebook by Nishtala et al., which describes how Facebook deploys the memcached key-value store at scale. The store you’re implementing in part 1 of this assignment is somewhat similar to memcached.
  • Slicer: Auto-Sharding for Datacenter Applications by Adya et al., which describes Google’s auto-sharding system (called “Slicer”). The shardmaster you will implement in the second part of this assignment is similar to Google’s Slicer.

Key-Value Stores

A key-value store is a storage system in which data (a value) is addressed by a key. Similar to a Python dictionary or a Java/C++ hashmap, the key maps directly to the value. For example, the key cs131_students might map to a list of student CS logins ([login1, login2, login3]). Note that this is unlike, say, a filesystem, in which files are arranged in a hierarchical directory structure, and you would need to navigate a path like /courses/cs131/students to get to the data. Key-value stores are popular for large-scale web applications because they are particularly amenable to sharding and other techniques for high performance.

The WeensyDB we saw in lectures is a key-value store. Examples of key-value stores used in industry include memcached, Redis, RocksDB, and LevelDB.


A shard is a subset of the key/value pairs stored in the overall key-value store. For example, if our key space is the integers [0, 100], the keys in the range [0, 50] might be one shard, the keys [51, 60] another shard, etc.

When thinking about sharding, it’s useful to understand the difference between vertical scaling and horizontal scaling. Vertical scaling is the act of scaling up a service or system by adding more computing power (CPU, RAM, etc.) to an existing machine. You might, for example, then run more threads to handle additional requests, like you did in the Vunmo project. Horizontal scaling, by contrast, is scaling by adding more machines. While vertical scaling seems simpler at first, it quickly runs into hardware limits (at some point you can’t put more processors in a machine). In contrast, in a world where cloud providers like AWS, Azure, and Google Compute Engine sell you virtual machines on demand, you can get a practically unlimited amount of compute power, memory, and storage space through horizontal scaling.

In contrast to multithreading (a vertical scaling technique), sharding is a horizontal scaling technique. By having multiple key-value servers, each handling operations for just a few of the shards, we increase our system throughput (operations per unit time) by just adding more key-value servers. Naturally this isn’t free – having all these servers work together requires coordination to make sure clients contact the right server. Moreover, we would like to be able to add and remove servers as needed without having to restart the entire system. We’ll build toward such a dynamic system in this assignment!

Conceptual Questions

  1. Describe the roles of the controller, the key-value servers, and the clients in this distributed system. How are they involved in processing a client request for key k?
  2. This assignment will use gRPC and Protocol Buffers (protobufs), which you will already have seen in Lab 8. These libraries help specify the format of messages between servers, and deliver the messages. In Vunmo, messages were just structs. What’s the benefit of using gRPC/Protobuf to define how our processes communicate, as opposed to the approach taken in Vunmo?
  3. Currently, the clients contact the shardmaster to get the current configuration of shards and then make RPCs to key-value servers. Clients will only re-query the shardmaster if a request to a server fails because the shard containing the key was moved away from that server. Instead of this model, the shardmaster could also be in charge of making requests to the key-value servers on clients’ behalf and passing the responses back to the clients. Why will this design result in lower system throughput?
  4. Please answer this question after completing the project. Our final shard distribution algorithm gave us an even distribution of the key space among our key-value servers. However, it’s possible that some keys/shards are queried much more frequently than others (they’re ‘hot’), meaning that servers holding these keys/shards will get a disproportionate number of requests. For a balanced load, we want each key-value server to receive approximately the same amount of client traffic. Sketch a design (without implementing it) for a new shardmaster that distributes load better than the version you’ve built, and give examples of how your new scheme is an improvement. You can change any messages, add new ones, or even add new RPC services. The key-value servers can be modified as well.


Assignment installation

Ensure that your project repository has a handout remote. Type:

$ git remote show handout

If this reports an error, run:

$ git remote add handout https://github.com/csci1310/cs131-s20-projects.git

Then run:

$ git pull
$ git pull handout master

This will merge our Project 5 stencil code with your repository.

Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the distributed directory in the working copy of your projects repository.

Infrastructure Help

Stencil & Initial State

Clone the repo (in your VM) and run ./setup.sh. This will install all required dependencies, so answer yes to all prompts (it might also take a while).

Stencil Breakdown

| Files | Purpose/Use |
| --- | --- |
| common/common.{h, cc} | Contains the definition of the shard_t struct, as well as some useful helpers for dealing with them. Be sure to check this out while implementing your shardmaster. Feel free to add helper functions here. |
| protos/{shardmaster, shardkv}.proto | Protobufs for our RPC services. |
| clients/{simple_}client | Clients for interacting with your shardmaster and shardkvs. |
| simple_shardkv/simpleshardkv.{h, cc} | Implement your simple shardkv server (part 1) here. |
| shardmaster/shardmaster.{h, cc} | Implement your shardmaster (part 2) here. |
| shardkv/shardkv.{h, cc} | Implement your new shardkv here (part 3). |
| test_utils/test_utils.{h, cc} | Contains utility functions to help you write your own tests. See the testing section for more info. |
| tests/ | See “How to Test”. |
| build/ | See “How to Build”. |
| dynamic_shardmaster/dynamic_shardmaster.{h, cc} | Optionally implement part 4 here (for extra credit). |

How to Build

This project uses CMake, a tool that automatically generates Makefiles for complex C/C++ projects.

cd into the build directory and run cmake -DCMAKE_BUILD_TYPE=debug .. followed by make to build your code. You will only have to run cmake once, subsequently just invoking make should be sufficient.

If you want to build a faster release build, run the same command with argument -DCMAKE_BUILD_TYPE=release. Note that you will have to run CMake again if you want to switch back to a debug build.

How to Test

To run the tests, run either make check or ./test.sh from inside the build directory.

The tests are in the tests directory, with each subdirectory corresponding to a part of the roadmap. After implementing part 1, you should pass the simple_shardkv tests. After part 2, the shardmaster_tests should pass and after part 3, integrated_tests should pass.

If you want to write your own tests just add new files to the subdirectories in tests (or create your own subdirectory within tests and put them there). Be sure to take a look at the functions in test_utils/test_utils.h – they will make writing your own tests much easier. Note that if you write your own tests you will need to rerun cmake -DCMAKE_BUILD_TYPE=debug .. in the build directory before they will be built by make.

Assignment Roadmap

1. A simple key-value server


Your first task is to implement a simple key-value server, called simple_shardkv. You can think of this component as the code that each shard in the system needs to run. Clients will use Remote Procedure Calls (RPCs) to interact with your server. RPCs are a way of making function or method calls between two processes (even if they’re on different machines!), typically without the programmer having to specify exactly how that communication takes place. Your server should support the following RPCs:

| Name | Arguments | Behavior |
| --- | --- | --- |
| Get | A key | Returns the associated value, or an error if the key is not present. |
| Put | A key and a value | Maps the specified key to the specified value, overwriting any previous value. |
| Append | A key and a value | Appends the specified value to the previously existing value for the given key. If the key wasn’t present, this call should be equivalent to a Put. |

Note: In this project, the keys are 32-bit unsigned integers and the values are strings.

We will use gRPC along with Protobuf to help us set up the RPCs. Protobufs (protocol buffers) are a convenient way to specify message formats, and we’ll use them to define our RPC services. gRPC lets us take a .proto file and generate a lot of the boilerplate code we’d otherwise need to write to implement the RPCs. Take a look at protos/shardkv.proto (the protobuf file) and compare it to build/shardkv.grpc.pb.{h, cc} and build/shardkv.pb.{h, cc} (the generated files, which can be built by running make from the build directory) if you want to see exactly how much code gets generated. (Check out Lab 8 for more information about Protobufs and gRPC.)


Take a look at simple_shardkv/main.cc first. While you shouldn’t modify this file, it’s useful to understand what’s going on. We read in the specified port, construct an instance of the SimpleShardkvServer class, and then register that instance as a listener on the specified port. Finally, we call server->Wait() which actually starts listening for RPCs.

Task SimpleShardKV:

  1. Complete the definition of the SimpleShardkvServer class (in simple_shardkv/simpleshardkv.h) to include the fields required to manage the server’s data. Think about what sort of data structures might be useful to achieve the Get, Put, and Append functionality. For example, a structure to assign keys to values or a mutex to protect data might be useful.
  2. Modify the SimpleShardkvServer class (in simple_shardkv/simpleshardkv.cc) to support the RPCs described in the table above. Clients will send Get/Put/Append RPCs to your server, and your job is to handle them appropriately (i.e., return/update the requested data or respond with an error).

Once you’re done with this step, you should be passing the simple_shardkv tests!

  1. Be sure to support multiple clients making concurrent RPC calls, and ensure your operations are thread-safe when they need to be. A good rule of thumb is to assume all RPC methods can be invoked at any time in any order by clients, so anytime you access data think about whether you need a mutex for it!
  2. Recall that the request and response structures are Protobufs, so if you want to pass data back to the client after a Get, you’ll need to modify the GetResponse structure. Check out the Protobuf documentation for the methods available (this is easier than trying to read the auto-generated code).

Testing it out with a Client

Each simple_shardkv server is standalone, and only knows about the requests it receives. This is already sufficient to run a simple distributed key-value store: you can run two simple_shardkv servers and arrange for the clients to send all requests for the first half of the keyspace to the first server, and all requests for the second half of the key space to the second server.

We provide you with a client (clients/simple_client) that implements its own, hard-coded sharding scheme to interact with several simple_shardkv servers.

The client will construct an RPC “stub” on its end. A stub is a dummy object that provides the interface specified by Shardkv::Service. When the client makes method calls on the stub, your SimpleShardkvServer methods will be invoked, even if the client and your server run on different machines!

The simple_client shards keys by computing key % n, where n is the number of servers, to choose the server for a key. For example, if you connected the client to two simple_shardkv servers, the client will ensure that the Put, Get, and Append RPCs on even keys are sent to the first server and the RPCs on odd keys are sent to the second server. You tell the client on startup how many servers there are and at what address and port they listen.

To run a simple_shardkv server (from the build folder):

$ ./simple_shardkv <PORT>

To run the simple_client (from the client directory):


From the course VM, to connect two simple_shardkv servers and one client, start by running vagrant ssh in three different terminals (or use tmux or screen to run multiple virtual terminals within a single terminal). The examples assume that your course VM’s hostname is ubuntu-bionic.

Terminal 1: runs server with host ubuntu-bionic on port 13101.

$ ./simple_shardkv 13101

Terminal 2: runs server with host ubuntu-bionic on port 13102.

$ ./simple_shardkv 13102

Terminal 3: runs a client that connects to the two servers.

$ ./simple_client ubuntu-bionic:13101 ubuntu-bionic:13102
put 10 "golden_snitch"
get 10
Get returned: "golden_snitch"

While this scheme does distribute load across servers, it has a few problems: we have no way to add new key-value servers or to handle existing ones leaving, and all server addresses and ports need to be known when starting the client. Don’t fret, though! You’ll write a better sharding scheme in Part 3!

2. Shardmaster v1


So how can we handle updating clients about key-value servers joining or leaving? To do so, you’ll implement a controller program called “Shardmaster”, which is responsible for assigning shards to servers. Clients can now query your shardmaster to find out which server is responsible for the key(s) they are interested in, before directly querying those servers. See the shardmaster directory for the stencil, and the common directory for useful utilities.


Your shardmaster should support keys in the range [MIN_KEY, MAX_KEY] (defined in common.h).

Here’s a table of the RPCs your shardmaster should support (take a look at protos/shardmaster.proto to see the message specifications):

| Name | Arguments | Behavior |
| --- | --- | --- |
| Join | The address a new key-value server listens on | The shardmaster should update how it has distributed shards to account for the new server. |
| Leave | The address the leaving key-value server is listening on | The shardmaster should update how it has distributed shards to account for the server leaving. Note that the key-value server itself isn’t necessarily shut down – it’s just not part of our configuration anymore. |
| Move | A shard and the address of a key-value server | Updates the shardmaster’s configuration to assign the specified shard to the specified server. This shouldn’t cause any other changes to how shards are distributed. |
| Query | None | Returns the current distribution of shards across key-value servers. |

Administrators can use the Join/Leave RPCs to add/remove servers. Both of these operations should cause the shardmaster to adjust how it has distributed shards – either giving the new server shards from an existing server, or redistributing the shards from a leaving server. Together with the Query RPC, this makes it possible for clients to adapt to a changing set of key-value servers in the system, as the shardmaster can always tell clients what the most up-to-date configuration is.

The Move RPC transfers a shard between two servers and shouldn’t cause any other configuration changes. We will use it to test your shardmaster.

Note that in this part of the project, you are only expected to adapt the configuration data on the shardmaster. In Part 3, you will actually move keys between key-value servers.

Shard redistribution for Join and Leave

We’ve alluded to this in our explanation of the Join and Leave RPCs, but how exactly should redistributing shards work? You’ll have more freedom to explore this in subsequent parts, but for now Join and Leave should redistribute keys using the following approach.

Maintain a list of key-value servers on the shardmaster and, regardless of the current configuration, assign each server an equal proportion of the keyspace, with servers that joined earlier getting lower keys. If the keys don’t divide evenly, the extra keys go to the servers that joined first. Below is an example of how that works.

Redistribution Walkthrough

Let’s say our keys range from 0 to 100. A server running on durmstrang:4000 joins. At this point, our shardmaster configuration should look like this.

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 100] |

Now, let’s say we get two more servers joining, hogwarts:9999 and gringotts:713. When hogwarts:9999 joins, our shardmaster should look like this:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 50] |
| hogwarts:9999 | [51, 100] |

When gringotts:713 joins, our shardmaster must now partition our keyspace into three, giving us:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 33] |
| hogwarts:9999 | [34, 67] |
| gringotts:713 | [68, 100] |

Now let’s say that hogwarts:9999 leaves and then rejoins. After it leaves, we have:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 50] |
| gringotts:713 | [51, 100] |

and when it rejoins we get:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 33] |
| gringotts:713 | [34, 67] |
| hogwarts:9999 | [68, 100] |

Now let’s walk through an example with Move. Let’s say durmstrang:4000, hogwarts:9999, and gringotts:713 join (in that order), and we get:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 33] |
| hogwarts:9999 | [34, 67] |
| gringotts:713 | [68, 100] |

Now, we move [20, 55] to gringotts:713, giving us:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 19] |
| hogwarts:9999 | [56, 67] |
| gringotts:713 | [20, 55], [68, 100] |

Now, if beauxbatons:1234 joins, we get:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 25] |
| hogwarts:9999 | [26, 50] |
| gringotts:713 | [51, 75] |
| beauxbatons:1234 | [76, 100] |

As you can see, the redistribution ignores any previous moves that were made. If, instead of beauxbatons:1234 joining, gringotts:713 left, we’d instead have:

| KV Server | Shards |
| --- | --- |
| durmstrang:4000 | [0, 50] |
| hogwarts:9999 | [51, 100] |

Once again, the prior move was ignored.

While this key distribution scheme works, it’s not always ideal. You’ll explore an improved scheme in conceptual question 4, and you can implement it as an extra credit exercise in part 4!

Task: StaticShardmaster

  1. Complete the definition of the StaticShardmaster class (in shardmaster/shardmaster.h) to include the fields necessary for the shardmaster to keep track of its servers. Remember to consider thread safety!
  2. Modify the StaticShardmaster class (in shardmaster/shardmaster.cc) to support the RPCs described in the table above. More details of these functions can be found in the stencil. (You may want to implement and test Join, Leave, and Query before implementing Move.)

Once you are done with this task, you should be passing the simple_shardkv and shardmaster_tests tests!


Move RPCs might not move entire existing shards. For example, we might have a server A that is responsible for the shard [0, 100]. If we move [25, 75] from A to server B, A should be left with the shards [0, 24] and [76, 100] while B contains [25, 75]. Moves might also transfer a range of keys that currently spans two different servers – think about edge cases!

Testing it out with a Client

The new client you’ll use to test this part is clients/client. The client can invoke all the RPCs specified in the above table on your shardmaster (run help for the commands you need), as well as Get/Put/Append.

This client initially queries the shardmaster for its current configuration of how shards are distributed among servers, then directly makes each RPC call on whatever server the shardmaster said has the key in question. Remember, if the configuration of shards has changed you may need to query the shardmaster for the updated configuration to find keys!

To run a shardmaster (from the build folder):

$ ./shardmaster <PORT>

To run the client (from the client directory):


From the course VM, to connect one shardmaster, one simple_shardkv server, and one client, start by opening three terminals (or use tmux/screen). Again, this assumes your course VM’s hostname is ubuntu-bionic.

Terminal 1: runs shardmaster with host ubuntu-bionic on port 13100

$ ./shardmaster 13100

Terminal 2: runs server with host ubuntu-bionic on port 13101

$ ./simple_shardkv 13101

Terminal 3: runs the client, which connects to the shardmaster and then joins two key-value servers (one of which is fake here, so requests to it fail):

$ ./client ubuntu-bionic 13100
join fake_server:1234
join ubuntu-bionic:13101
Shard {0, 500} on server fake_server:1234
Shard {501, 1000} on server ubuntu-bionic:13101
put 502 "successful"
Put server: ubuntu-bionic:13101
put 500 "failure"
Put server: fake_server:1234
method Put failed with status code 14
the error message was: DNS resolution failed

You can provide fake addresses in your Join/Leave commands – the client doesn’t actually verify that key-value servers are running at those addresses until you try to Get/Put/Append. This means you can test your key redistribution without updating your key-value servers (the next part!).

3. Applying configuration changes to the key-value servers

So far, the shardmaster sounds nice, but it doesn’t actually move keys between key-value servers. In other words, we haven’t touched on how the key-value servers handle configuration changes. If a subset of keys on server1 gets moved to server2, somehow we need to send the key-value pairs stored previously on server1 to their new home!

You’ll now write another key-value server called shardkv (see the shardkv directory for the stencil). The logic for handling Get/Put/Append requests shouldn’t change much, but your key-value server must now also react to configuration changes: it should periodically query the shardmaster for the current distribution of shards and transfer any key-value pairs it is no longer responsible for to their new server.

When transferring data from one shardkv server to another via Put RPCs, it’s important to retry until success (grpc::Status::OK) – we don’t want key-value pairs to be lost! The diagram in the walkthrough below shows an example of why this is important.

Message Flow Walkthrough

In the above diagram, both shardkv servers query the shardmaster and get the current configuration of shards. A client then sends a Move request, moving the shard [5, 10] from Shardkv1 to Shardkv2. Let’s say that Shardkv1 has data for the key 7, which is in the moved shard.

When Shardkv1 queries the shardmaster for the second time, it notices that it is no longer responsible for the shard [5, 10]. Consequently, Shardkv1 attempts to transfer the data for key 7 to Shardkv2 via a Put RPC. However, Shardkv2 has not yet queried the shardmaster, so it is unaware that it is now responsible for keys in the range [5, 10], and the Put fails. Shardkv2 realizes that it is responsible for key 7 the next time it queries the shardmaster, which means that Shardkv1’s next Put RPC will succeed.


  1. Complete the definition of the ShardkvServer class (in shardkv/shardkv.h) to include the fields necessary for your server. Just like the SimpleShardkvServer, your implementation should include a key-value store, but now you will also need to keep track of the server’s assigned shards from the shardmaster.
  2. Modify the ShardkvServer class (in shardkv/shardkv.cc) to support the RPCs Get, Put, and Append. You should now also implement QueryShardmaster, which should query the shardmaster for the distribution of keys.

Once you are done with this task, you should be passing all of the provided tests!


When retrying until success, it’s useful to have a timeout between attempts. You can look at the shardkv constructor in shardkv/shardkv.cc to see an example of waiting (sleeping) for a timeout in C++.

Note: QueryShardmaster is an RPC that the ShardkvServer periodically calls on itself, but other servers could also invoke this RPC remotely to make a ShardkvServer update its configuration!

You now run the shardkv executable (from the build directory) instead of simple_shardkv for each key-value server:


See the previous tasks for examples.

4. Implementing a Dynamic Shardmaster

You now have a fully functional distributed system that can reconfigure itself to move shards between servers. But it still implements the simple scheme that partitions the key space equally between your shardkv servers.

In the real world, keys (and therefore shards) are not equally popular. For example, celebrities on Twitter or Facebook have many more followers than most other users. If a shard stores a celebrity’s timeline, it will receive many more requests than others, which could overload the server holding that shard!

Now go back to conceptual question #4, and think about how you could adapt your shardmaster to implement a smarter redistribution scheme that takes into account how busy the shards are.

There is no single right answer here; we will award credit for any reasonable scheme.

Extra Credit:
For extra credit, implement the scheme that you described in conceptual question #4. You’ll need to provide proof of your improvements (for example, subjecting your new system and your old one to similar loads and collecting statistics on how frequently each server is contacted) to show how you’ve improved system throughput.


All the debugging techniques you’ve learned throughout the course apply to this project too. Finding bugs in distributed systems can be challenging, but we are sure you can do it! :+1:

Troubleshooting failure messages:

Here are some common error outputs from the test suite, and hints for how to deal with them.

Hanging Tests:

If a specific test remains running for more than about a minute, this is a sign that your implementation contains a deadlock. For this, GDB’s multi-threaded debugging tools will be useful. Again, you can run the particular test in GDB with the command gdb {test_name} from the /build directory.

After letting the program run in GDB for a bit, you can use Ctrl-C to pause the execution of all threads. After doing this, you can use the command info threads to see a list of all of the running threads and where they are in execution.

If it seems like a lot of threads are waiting on a mutex (their function is listed as __lll_lock_wait), you can select one of these threads (with the command thread <t_id>) and look at its backtrace. If you look at the frame just before the frame for the lock() function, you should be able to see which of your mutexes it is waiting for. You can get information about this mutex with the command p <mutex>. Some of the more useful information that this shows you is the ID of the thread that currently holds the mutex!

Handing In & Grading

Handin instructions

As before, you will hand in your code using Git. In the distributed/ subdirectory of your project repository, you MUST fill in the text file called README.md.

Remind me again what the README.md should contain?
The README.md file will include the following:
  1. Any design decisions you made and comments for graders, under "Design Overview". If there's nothing interesting to say, just list "None".
  2. Any collaborators and citations for help that you received, under "Collaborators". CS 131 encourages collaboration, but we expect you to acknowledge help you received, as is standard academic practice.
  3. Your answers to the conceptual questions at the start of this handout under "Conceptual Questions".
  4. Notes on approximately how many hours you spent actively working on the assignment. We won't use this for grading, but we collect the data to calibrate for future years.

Grading breakdown

Now head to the grading server, make sure that you have the “Distributed Store” page configured correctly with your project repository, and check that your tests pass on the grading server as expected.

Congrats! You’re officially done with the last project for the class :clap:

Acknowledgements: This project was developed for CS 131.