Project 6: Distributed Store
Due April 22, 2021 at 6:00pm AoE
(Note on late hours: owing to the tight grading schedule, you cannot use any late hours on this project.)
Introduction
In this assignment, you’ll implement a distributed, sharded key-value storage system that stores user data and post data for a new social media platform called Bacefook. Many major websites like Twitter, AirBnB, and Facebook have so many users that no single server can store all their data, or handle all the requests for data. Instead, they carefully split the data (and the requests accessing it) across many servers. This idea is called sharding.
Outside of a programming context, think about how large events or concerts might speed up their check-in process by having separate tables to check in attendees with names starting with A-K and L-Z. In systems programming, this idea is called “sharding”, as it splits the overall key space (e.g., A-Z) into many shards that can each be stored and managed on different servers.
Optional: Interested in real-world use cases of sharding?
If you’re curious how what you’re building relates to real-world industry use, take a look at these papers:
- Scaling Memcache at Facebook by Nishtala et al., which describes how Facebook deploys the memcached key-value store at scale. The store you’re implementing in part 1 of this assignment is somewhat similar to memcached.
- Slicer: Auto-Sharding for Datacenter Applications by Adya et al., which describes Google’s auto-sharding system (called “Slicer”). The shardmaster you will implement in the second part of this assignment is similar to Google’s Slicer.
How Bacefook Works
On Bacefook, users can choose their names and create posts on their timelines as well as other users’ timelines. Surprisingly, these timelines are very similar to “walls” on Facebook! Even though there can be multiple users with the same user name, every user_id is unique. The same goes for posts: even though there can be multiple posts with the same text, every post has a unique post_id. Although you are not asked to implement the timeline functionality, this concept may be helpful for a future feature of your distributed store.
Key-Value Stores
A key-value store is a storage system in which data (a value) is addressed by a key. Similar to a Python dictionary or a Java/C++ hashmap, the key maps directly to the value. For example, the key cs300_students might map to a list of student CS logins ([login1, login2, login3]). Note that this is unlike, say, a filesystem, in which files are arranged in a hierarchical directory structure, and you would need to navigate a path like /courses/cs300/students to get to the data. Key-value stores are popular for large-scale web applications because they are particularly amenable to sharding and other techniques for high performance.
The WeensyDB we saw in lectures is a key-value store. Examples of key-value stores used in industry include memcached, Redis, RocksDB, and LevelDB.
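To make the get/put idea concrete, here is a tiny, self-contained sketch of an in-memory key-value store in C++. This is purely illustrative; it is not part of the stencil and has none of the RPC or sharding machinery you will build in this project.

```cpp
#include <iostream>
#include <optional>
#include <string>
#include <unordered_map>

// A toy in-memory key-value store: keys and values are both strings,
// just like in this project.
class ToyKvStore {
 public:
  void put(const std::string& key, const std::string& value) {
    data_[key] = value;  // overwrites any previous value
  }

  std::optional<std::string> get(const std::string& key) const {
    auto it = data_.find(key);
    if (it == data_.end()) return std::nullopt;  // key not present
    return it->second;
  }

 private:
  std::unordered_map<std::string, std::string> data_;
};

int main() {
  ToyKvStore store;
  store.put("user_14", "malte");
  store.put("post_59", "Hello, I am using Bacefook");
  if (auto name = store.get("user_14")) std::cout << *name << "\n";
}
```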
For Bacefook, there would be at least two kinds of key-value pairs. For users, the key user_id would map to the user’s name. For posts, the key post_id would map to post text.
Example:
| Key-Value Pair Structure | Example |
|---|---|
| user_id → name | “user_14” → “malte” |
| post_id → post text | “post_59” → “Hello, I am using Bacefook” |
| user_id_posts → ?? | “user_14_posts” → ?? |
Note: The third key-value pair is necessary to implement one of the features (think about how timelines and append work). What should the given key be mapped to?
Sharding
A shard is a subset of the key/value pairs stored in the overall key-value store. For example, if our key space is the strings [AA, ZZ], the keys in the range [AA, AZ] might be one shard, the keys [BA, ZZ] another shard, etc.
When thinking about sharding, it’s useful to understand the difference between vertical scaling and horizontal scaling. Vertical scaling is the act of scaling up a service or system by adding more computing power (CPU, RAM, etc.) to an existing machine. You might, for example, then run more threads to handle additional requests, like you did in the Vunmo project. Horizontal scaling, by contrast, is scaling by adding more machines. While vertical scaling seems simpler at first, it’s easy to run into hardware limits (at some point you can’t add more processors to a machine). In contrast, in a world where cloud providers like AWS, Azure, and Google Compute Engine sell you virtual machines on demand, you can get a practically unlimited amount of compute power, memory, and storage space through horizontal scaling.
In contrast to multithreading (a vertical scaling technique), sharding is a horizontal scaling technique. By having multiple key-value servers, each handling operations for just a few of the shards, we increase our system throughput (operations per unit time) by just adding more key-value servers. Naturally this isn’t free – having all these servers work together requires coordination to make sure clients contact the right server. Moreover, we would like to be able to add and remove servers as needed without having to restart the entire system. We’ll build toward such a dynamic system in this assignment!
GDPR
The General Data Protection Regulation (GDPR) is an EU law implemented in 2018 that protects the security and privacy of user data. This law affects any corporation that collects data from people in the European Union. These new regulations require all personal data to be protected by companies, from names and addresses to web data like IP addresses and cookie data. In order to protect personal data, companies must design information systems with privacy in mind, such as anonymizing any identifiable information and giving users the right to request the erasure of their data.
Although Bacefook is based in the United States, many of its users live in the EU. As a result, you must ensure that your storage system complies with GDPR regulations.
Conceptual Questions
- Write your answers to the following questions in your README.
- We strongly recommend that you answer these questions as you complete the project, rather than waiting until after finishing the project code. Answering the questions will help you immensely in understanding the concepts behind the code that you’ll be writing. However, please answer Question 4 only after completing the rest of the project.
- The due date for the conceptual questions is the same as the project due date.
- Describe the roles of the controller, the key-value servers, and the clients in this distributed system. How are they involved in processing a client request for key k?
- This assignment will use gRPC and Protocol Buffers (protobufs), which you will already have seen in Lab 8. These libraries help specify the format of messages between servers, and deliver the messages. In Vunmo, messages were just structs. What’s the benefit of using gRPC/Protobuf to define how our processes communicate, as opposed to the approach taken in Vunmo?
- Currently, the clients contact the shardmaster to get the current configuration of shards and then make RPCs to key-value servers. Clients will only re-query the shardmaster if a request to a server fails because the shard containing the key was moved away from that server. Instead of this model, the shardmaster could also be in charge of making requests to the key-value servers on clients’ behalf and passing the responses back to the clients. Why will this design result in lower system throughput?
- Please answer this question after completing the project. Our final shard distribution algorithm gave us an even distribution of the key space among our key-value servers. However, it’s possible that some keys/shards are queried much more frequently than others (they’re ‘hot’), meaning that servers holding these keys/shards will get a disproportionate number of requests. For a balanced load, we want each key-value server to receive approximately the same amount of client traffic. Sketch a design (without implementing it) for a new shardmaster that distributes load better than the version you’ve built, and give examples of how your new scheme is an improvement. You can change any messages, add new ones, or even add new RPC services. The key-value servers can be modified as well.
SRC Questions
Before answering the questions, think about the privacy policies you have encountered. How many of those did you actually read? Were they easy to understand? Now, read this article about the comprehensibility of privacy policies.
- With these typical privacy policies in mind, do you think that users are giving informed consent? (1-2 sentences)
- Discuss the value of privacy legislation. Provide one example where privacy legislation could be beneficial, and one example where privacy legislation would fail to achieve the desired effect. These examples could be hypothetical or from real life. (3-5 sentences)
When we think about protecting user data, we often think about data stored by large corporations, such as social media data or financial data. However, every company that stores user data of EU citizens must comply with GDPR standards, including small businesses. Read this article about the impact of GDPR on small businesses.
- What is the problem here, and what is one potential solution? (3-5 sentences)
Assignment
Assignment installation
Ensure that your project repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s21-projects.git
Then run:
$ git pull
$ git pull handout master
This will merge our Project 6 stencil code with your repository.
Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be doing all your work for this project inside the distributed
directory in the working copy of your projects repository.
Infrastructure Help
Stencil & Initial State
Clone the repo (in your VM) and run ./setup.sh. This will install all required dependencies, so answer yes to all prompts (it might also take a while).
Stencil Breakdown
| Files | Purpose/Use |
|---|---|
| common/common.{h, cc} | Contains the definition of the shard_t struct, as well as some useful helpers for dealing with them. Be sure to check this out while implementing your shardmaster. Feel free to add helper functions here. |
| protos/{shardmaster, shardkv}.proto | Protobufs for our RPC services. |
| clients/{simple_}client | Clients for interacting with your shardmaster and shardkvs. |
| simple_shardkv/simpleshardkv.{h, cc} | Implement your simple shardkv server (part 1) here. |
| shardmaster/shardmaster.{h, cc} | Implement your shardmaster (part 2) here. |
| shardkv/shardkv.{h, cc} | Implement your new shardkv here (part 3). |
| test_utils/test_utils.{h, cc} | Contains utility functions to help you write your own tests. See the testing section for more info. |
| tests/ | See “How to Test”. |
| build/ | See “How to Build”. |
| dynamic_shardmaster/dynamic_shardmaster.{h, cc} | Optionally implement part 4 here (for extra credit). |
How to Build
This project uses CMake, a tool that automatically generates Makefiles for complex C/C++ projects.
cd into the build directory and run cmake -DCMAKE_BUILD_TYPE=debug .. followed by make to build your code. You will only have to run cmake once; subsequently, just invoking make should be sufficient.
If you want to build a faster release build, run the same command with the argument -DCMAKE_BUILD_TYPE=release. Note that you will have to run CMake again if you want to switch back to a debug build.
How to Test
To run the tests, run either make check or ./test.sh from inside the build directory.
The tests are in the tests directory, with each subdirectory corresponding to a part of the roadmap. After implementing part 1, you should pass the simple_shardkv tests. After part 2, the shardmaster_tests should pass, and after part 3, the integrated_tests should pass.
If you want to write your own tests, just add new files to the subdirectories in tests (or create your own subdirectory within tests and put them there). Be sure to take a look at the functions in test_utils/test_utils.h – they will make writing your own tests much easier. Note that if you write your own tests, you will need to rerun cmake -DCMAKE_BUILD_TYPE=debug .. in the build directory before they will be built by make.
Assignment Roadmap
1. A simple key-value server
Overview
Your first task is to implement a simple key-value server, called simple_shardkv
. You can think of this component as the code that each shard in the system needs to run. Clients will use Remote Procedure Calls (RPCs) to interact with your server. RPCs are a way of making function or method calls between two processes (even if they’re on different machines!), typically without the programmer having to specify exactly how that communication takes place. Your server should support the following RPCs:
| Name | Arguments | Behavior |
|---|---|---|
| Get | A key (user_<id>, post_<id>, or user_<id>_posts) | Returns associated value, or an error if the key is not present. Given user_<id>, return name; given post_<id>, return post text; given user_<id>_posts, return the associated value. |
| Put | A key (user_<id> or post_<id>) and a value (name or post text) | Maps the specified key to the specified value, overwriting any previous value. Assign a user_<id> to a name or assign a post_<id> to post text. Note: the Put RPC also includes a user field that stores the user’s ID in the case the put is for a post. You do not need to use this value until part 2. |
| Append | A key (user_<id>_posts) and a value (post_<id>) | Appends the specified value to the previously existing value for the given key. If the key wasn’t present, this call should be equivalent to a Put. |
| Delete | A key (post_<id> or user_<id>) | Deletes the key-value pair with the specified post_<id> or user_<id>. |
Note: In this project, the keys and the values are strings.
We will use gRPC along with Protobuf to help us set up the RPCs. Protobufs (protocol buffers) are a convenient way to specify message formats, and we’ll use them to define our RPC services. gRPC lets us take a .proto file and generate a lot of the boilerplate code we’d otherwise need to write to implement the RPCs. Take a look at protos/shardkv.proto (the protobuf file) and compare it to build/shardkv.grpc.pb.{h, cc} and build/shardkv.pb.{h, cc} (the generated files, which can be built by running make from the build directory) if you want to see exactly how much code gets generated. (Check out Lab 8 for more information about Protobufs and gRPC.)
Specification:
Take a look at simple_shardkv/main.cc first. While you shouldn’t modify this file, it’s useful to understand what’s going on. We read in the specified port, construct an instance of the SimpleShardkvServer class, and then register that instance as a listener on the specified port. Finally, we call server->Wait(), which actually starts listening for RPCs.
Task SimpleShardKV:
- Complete the definition of the SimpleShardkvServer class (in simple_shardkv/simpleshardkv.h) to include the fields required to manage the server’s data. Think about what sort of data structures might be useful to achieve the Get, Put, and Append functionality. For example, a structure to assign keys to values or a mutex to protect data might be useful.
- Modify the SimpleShardkvServer class (in simple_shardkv/simpleshardkv.cc) to support the RPCs described in the table above. Clients will send Get/Put/Append RPCs to your server, and your job is to handle them appropriately (i.e., return/update the requested data or respond with an error).
Once you’re done with this step, you should be passing the simple_shardkv tests!
Hint:
- Be sure to support multiple clients making concurrent RPC calls, and ensure your operations are thread-safe when they need to be. A good rule of thumb is to assume all RPC methods can be invoked at any time in any order by clients, so anytime you access data, think about whether you need a mutex for it!
- Recall that the request and response structures are Protobufs, so if you want to pass data back to the client after a Get, you’ll need to modify the GetResponse structure. Check out the Protobuf documentation for the methods available (this is easier than trying to read the auto-generated code). A rough sketch of both hints appears below.
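As an illustration of these hints, here is a minimal sketch of a thread-safe Get handler. This is not the stencil’s code: the exact gRPC method signature and the protobuf accessor names (key(), set_data()) depend on your protos/shardkv.proto, and data_ and mtx_ are assumed member fields you would add in the first step.

```cpp
#include <map>
#include <mutex>
#include <string>

#include <grpcpp/grpcpp.h>
#include "simpleshardkv.h"  // stencil header declaring SimpleShardkvServer

// Sketch only: accessor names (key(), set_data()) are assumptions based on a
// typical shardkv.proto; check the generated *.pb.h files for the real ones.
// Assumes SimpleShardkvServer has members:
//   std::map<std::string, std::string> data_;
//   std::mutex mtx_;
grpc::Status SimpleShardkvServer::Get(grpc::ServerContext* context,
                                      const GetRequest* request,
                                      GetResponse* response) {
  std::unique_lock<std::mutex> lock(mtx_);  // protect the map from concurrent RPCs
  auto it = data_.find(request->key());
  if (it == data_.end()) {
    return grpc::Status(grpc::StatusCode::NOT_FOUND, "key not present");
  }
  response->set_data(it->second);  // copy the value into the protobuf reply
  return grpc::Status::OK;
}
```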
Testing it out with a Client
Each simple_shardkv server is standalone, and only knows about the requests it receives. This is already sufficient to run a simple distributed key-value store: you can run two simple_shardkv servers and arrange for the clients to send all requests for the first half of the keyspace to the first server, and all requests for the second half of the key space to the second server.
We provide you with a client (clients/simple_client) that implements its own, hard-coded sharding scheme to interact with several simple_shardkv servers.
The client will construct an RPC “stub” on its end. A stub is a dummy object that provides the interface specified by Shardkv::Service. When the client makes method calls on the stub, your SimpleShardkvServer methods will be invoked, even if the client and your server run on different machines!
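For reference, the client-side call looks roughly like the sketch below (the stencil client already does this for you). The request/response field names (set_key(), data()) are assumptions about the messages in protos/shardkv.proto.

```cpp
#include <iostream>
#include <memory>
#include <string>

#include <grpcpp/grpcpp.h>
#include "shardkv.grpc.pb.h"  // generated from protos/shardkv.proto

// Minimal sketch of a client-side Get; field names (set_key, data) are
// assumptions about shardkv.proto, not the stencil's exact API.
void example_get(const std::string& server_address) {
  auto channel = grpc::CreateChannel(server_address,
                                     grpc::InsecureChannelCredentials());
  std::unique_ptr<Shardkv::Stub> stub = Shardkv::NewStub(channel);

  GetRequest request;
  request.set_key("post_10");
  GetResponse response;
  grpc::ClientContext context;
  grpc::Status status = stub->Get(&context, request, &response);
  if (status.ok()) {
    std::cout << "Get returned: " << response.data() << "\n";
  } else {
    std::cout << "Get failed: " << status.error_message() << "\n";
  }
}
```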
The simple_client uses a scheme that shards keys by id % n, where id is the numeric “id” field extracted from the key and n is the number of servers; the result picks the server for that key. For example, if you connected the client to two simple_shardkv servers, the client will ensure that the Put, Get, and Append RPCs on even keys are sent to the first server and the RPCs on odd keys are sent to the second server. You tell the client on startup how many servers there are and at what address and port they listen.
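A toy version of that rule might look like this. The helper names here are made up for illustration; the stencil client has its own implementation.

```cpp
#include <string>
#include <vector>

// Toy version of the simple_client's sharding rule: extract the numeric id
// from a key like "user_14" or "post_59" and pick a server with id % n.
// extract_id() is illustrative and assumes the key contains a numeric id.
size_t extract_id(const std::string& key) {
  size_t pos = key.find_first_of("0123456789");
  return std::stoul(key.substr(pos));
}

const std::string& pick_server(const std::string& key,
                               const std::vector<std::string>& servers) {
  return servers[extract_id(key) % servers.size()];
}

// pick_server("post_10", {"ubuntu-bionic:13101", "ubuntu-bionic:13102"})
// returns "ubuntu-bionic:13101" because 10 % 2 == 0.
```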
To run a simple_shardkv server (from the build folder):
$ ./simple_shardkv <PORT>
To run the simple_client (from the client directory):
$ ./simple_client <SIMPLE_SHARDKV_SERVER1> <SIMPLE_SHARDKV_SERVER2> ...
Example
From the course VM, to connect two simple_shardkv servers and one client, start by running vagrant ssh in three different terminals (or use tmux or screen to run multiple virtual terminals within a single terminal). The examples assume that your course VM’s hostname is ubuntu-bionic.
Terminal 1: runs server with host ubuntu-bionic on port 13101.
$ ./simple_shardkv 13101
Terminal 2: runs server with host ubuntu-bionic on port 13102.
$ ./simple_shardkv 13102
Terminal 3: runs a client that connects to the two servers.
$ ./simple_client ubuntu-bionic:13101 ubuntu-bionic:13102
put "post_10" "golden_snitch" "user_7"
get "post_10"
Get returned: "golden_snitch"
While this scheme does distribute load across servers, it has a few problems. One problem is that we have no capacity to add new key-value servers or deal with existing ones leaving, and that all server addresses and ports need to be known when starting the client. Don’t fret, though! You’ll write a better shard scheme in part 3!
2. Shardmaster v1
Overview:
So how can we handle updating clients about key-value servers joining or leaving? To do so, you’ll implement a controller program called “Shardmaster”, which is responsible for assigning shards to servers. Clients can now query your shardmaster to find out which server is responsible for the key(s) they are interested in, before directly querying those servers. See the shardmaster directory for the stencil, and the common directory for useful utilities.
Specifications:
Your shardmaster will organize data by partitioning the integer ID range [MIN_KEY, MAX_KEY] (defined in common.h).
Here’s a table of the RPCs your shardmaster should support (take a look at protos/shardmaster.proto to see the message specifications):
| Name | Arguments | Behavior |
|---|---|---|
| Join | The address a new key-value server listens on | The shardmaster should update how it has distributed shards to account for the new server. |
| Leave | The address the leaving key-value server is listening on | The shardmaster should update how it has distributed shards to account for the server leaving. Note that the key-value server itself isn’t necessarily shut down – it’s just not part of our configuration anymore. |
| Move | A shard and the address of a key-value server | Updates the shardmaster’s configuration to assign the specified shard to the specified server. This shouldn’t cause any other changes to how shards are distributed. |
| Query | None | Returns the current distribution of shards across key-value servers. |
| GDPR Delete | A numeric user_id | Deletes user information in accordance with the GDPR. |
Administrators can use the Join/Leave RPCs to add/remove servers. Both of these operations should cause the shardmaster to adjust how it has distributed shards – either giving the new server shards from an existing server, or redistributing the shards from a leaving server. Together with the Query RPC, this makes it possible for clients to adapt to a changing set of key-value servers in the system, as the shardmaster can always tell clients what the most up-to-date configuration is.
The Move RPC transfers a shard between two servers and shouldn’t cause any other configuration changes. We will use it to test your shardmaster.
Finally, this new API also includes a GDPRDelete RPC that should remove all information related to a user from the key-value store. You’ll need to think about what information that includes, and how you can track it so that GDPRDelete can efficiently obtain that information and remove it.
Note that in this part of the project, you are only expected to adapt the configuration data on the shardmaster. In Part 3, you will actually move keys between key-value servers.
Shard redistribution for Join and Leave
We’ve alluded to this in our explanation of the Join and Leave RPCs, but how exactly should redistributing shards work? You’ll have more freedom to explore this in subsequent parts, but for now Join and Leave should redistribute keys using the following approach.
Maintain a list of key-value servers on the shardmaster and, regardless of the current configuration, assign them each an equal proportion of the keyspace, with servers that joined earlier getting lower keys. If the keys don’t divide evenly, distribute the remaining keys across the servers that joined first. Below is an example of how that works.
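As a concrete sketch of that rule (not the stencil code; the real shardmaster works with shard_t structs from common/common.h rather than printing), the following standalone program computes the split for integer keys:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Sketch of the equal-split rule: divide [min_key, max_key] among servers in
// join order, giving any leftover keys (one each) to the earliest joiners.
void print_distribution(const std::vector<std::string>& servers,
                        int min_key, int max_key) {
  int total = max_key - min_key + 1;
  int n = static_cast<int>(servers.size());
  int base = total / n;    // keys every server gets
  int extra = total % n;   // leftover keys, one each to the first joiners
  int next = min_key;
  for (int i = 0; i < n; i++) {
    int size = base + (i < extra ? 1 : 0);
    printf("%s: [%d, %d]\n", servers[i].c_str(), next, next + size - 1);
    next += size;
  }
}

int main() {
  // Matches the walkthrough below: with ids 0..100, three servers get
  // [0, 33], [34, 67], [68, 100].
  print_distribution({"durmstrang:4000", "hogwarts:9999", "gringotts:713"}, 0, 100);
}
```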
Redistribution Walkthrough
Let’s say the ids used in our keys range from 0 to 100. A server running on durmstrang:4000 joins. At this point, our shardmaster configuration should look like this:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 100] |
Now, let’s say we get two more servers joining, hogwarts:9999 and gringotts:713. When hogwarts:9999 joins, our shardmaster should look like this:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 50] |
| hogwarts:9999 | [51, 100] |
When gringotts:713 joins, our shardmaster must partition our keyspace into three, giving us:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 33] |
| hogwarts:9999 | [34, 67] |
| gringotts:713 | [68, 100] |
Now let’s say that hogwarts:9999 leaves and then rejoins. After it leaves, we have:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 50] |
| gringotts:713 | [51, 100] |
and when it rejoins we get:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 33] |
| gringotts:713 | [34, 67] |
| hogwarts:9999 | [68, 100] |
Now let’s walk through an example with Move. Let’s say durmstrang:4000, hogwarts:9999, and gringotts:713 join (in that order), and we get:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 33] |
| hogwarts:9999 | [34, 67] |
| gringotts:713 | [68, 100] |
Now, we move [20, 55] to gringotts:713, giving us:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 19] |
| hogwarts:9999 | [56, 67] |
| gringotts:713 | [20, 55] [68, 100] |
Now, if beauxbatons:1234 joins, we get:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 25] |
| hogwarts:9999 | [26, 50] |
| gringotts:713 | [51, 75] |
| beauxbatons:1234 | [76, 100] |
As you can see, the redistribution ignores any previous moves that were made. If, instead of beauxbatons:1234 joining, gringotts:713 left, we’d instead have:
| KV Server | Shards |
|---|---|
| durmstrang:4000 | [0, 50] |
| hogwarts:9999 | [51, 100] |
Once again, the prior move was ignored.
While this key distribution scheme works, it’s not always ideal. You’ll explore an improved scheme in conceptual question 4, and you can implement it as an extra-credit exercise in part 4!
Task: StaticShardmaster
- Complete the definition of the StaticShardmaster class (in shardmaster/shardmaster.h) to include the fields necessary for the shardmaster to keep track of its servers. Remember to consider thread safety!
- Modify the StaticShardmaster class (in shardmaster/shardmaster.cc) to support the RPCs described in the table above. More details of these functions can be found in the stencil. (You may want to implement and test Join, Leave, and Query before implementing Move.)
Once you are done with this task, you should be passing the simple_shardkv and shardmaster_tests tests!
Hints
Move RPCs might not move entire existing shards. For example, we might have a server A that is responsible for the shard [0, 100]. If we move [25, 75] from A to server B, A should be left with the shards [0, 24] and [76, 100] while B contains [25, 75]. Moves might also transfer keys that are distributed on two different servers – think about edge cases!
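Here is a hedged sketch of that splitting logic. The Shard struct below is an assumed stand-in for the real shard_t in common/common.h (whose field names may differ):

```cpp
#include <vector>

// Assumed shape of a shard; the real shard_t lives in common/common.h.
struct Shard {
  int lower;
  int upper;
};

// Remove the moved range from an existing shard, returning what the original
// owner keeps: zero, one, or two leftover pieces.
std::vector<Shard> subtract(const Shard& owned, const Shard& moved) {
  std::vector<Shard> leftover;
  if (moved.upper < owned.lower || moved.lower > owned.upper) {
    return {owned};  // no overlap: the owner keeps the whole shard
  }
  if (moved.lower > owned.lower) {
    leftover.push_back({owned.lower, moved.lower - 1});  // piece below the move
  }
  if (moved.upper < owned.upper) {
    leftover.push_back({moved.upper + 1, owned.upper});  // piece above the move
  }
  return leftover;
}

// Example from the hint: subtract({0, 100}, {25, 75}) yields {0, 24} and {76, 100}.
```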
Testing it out with a Client
The new client you’ll use to test this part is clients/client. The client can invoke all the RPCs specified in the above table on your shardmaster (run help for the commands you need), as well as Get/Put/Append.
This client initially queries the shardmaster for its current configuration of how shards are distributed among servers, then directly makes each RPC call on whatever server the shardmaster said has the key in question. Remember, if the configuration of shards has changed, you may need to query the shardmaster for the updated configuration to find keys!
To run a shardmaster (from the build folder):
$ ./shardmaster <PORT>
To run the client (from the client directory):
$ ./client <SHARDMASTER_HOST> <SHARDMASTER_PORT> ...
Example
From the course VM, to connect one shardmaster, one simple_shardkv server, and one client, start by opening three terminals (or use tmux/screen). Again, this assumes your course VM’s hostname is ubuntu-bionic.
Terminal 1: runs shardmaster with host ubuntu-bionic on port 13100
$ ./shardmaster 13100
Terminal 2: runs server with host ubuntu-bionic on port 13101
$ ./simple_shardkv 13101
Terminal 3: runs the client, which connects to the shardmaster and then joins two key-value servers (one of which is fake here, so requests to it fail):
$ ./client ubuntu-bionic 13100
join fake_server:1234
join ubuntu-bionic:13101
query
Shard {0, 500} on server fake_server:1234
Shard {501, 1000} on server ubuntu-bionic:13101
put "user_502" "Bob"
Put server: ubuntu-bionic:13101
put "user_500" "Alice"
Put server: fake_server:1234
method Put failed with status code 14
the error message was: DNS resolution failed
You can provide fake addresses in your Join/Leave commands – the client doesn’t actually verify that key-value servers are running at those addresses until you try to Get/Put/Append. This means you can test your id redistribution without updating your key-value servers (the next part!).
3. Applying configuration changes to the key-value servers
So far, the shardmaster sounds nice, but it doesn’t actually move keys between key-value servers. In other words, we haven’t touched on how the key-value servers handle configuration changes. If a subset of ids on server1 gets moved to server2, somehow we need to send the key-value pairs stored previously on server1 to their new home!
You’ll now write another key-value server called shardkv (see the shardkv directory for the stencil). The logic for handling Get/Put/Append requests shouldn’t change much, but your key-value store should now:
- Periodically query the shardmaster for the current configuration of shards. If your server sees that shards it previously stored are now on another server, it should use the Put RPC to transfer the appropriate key-value pairs and cease to serve the shards after the transfer completes.
- Only respond to queries for keys that it handles as per its last contact with the shardmaster, and return an error for all other keys.
When transferring data from one shardkv server to another via Put RPCs, it’s important to retry until success (grpc::Status::OK) – we don’t want key-value pairs to be lost! The diagram in the walkthrough below shows an example of why this is important.
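A minimal sketch of such a retry loop is below. The message and field names (PutRequest, PutResponse, set_key(), set_data()) are assumptions about your shardkv.proto, and the 100 ms delay is an arbitrary illustrative choice:

```cpp
#include <chrono>
#include <string>
#include <thread>

#include <grpcpp/grpcpp.h>
#include "shardkv.grpc.pb.h"  // generated from protos/shardkv.proto

// Sketch of retrying a transfer until the destination accepts it. Message and
// field names here are assumptions; adapt them to your actual shardkv.proto.
void transfer_pair(Shardkv::Stub* dest, const std::string& key,
                   const std::string& value) {
  PutRequest request;
  request.set_key(key);
  request.set_data(value);
  while (true) {
    grpc::ClientContext context;  // a ClientContext cannot be reused, so make a new one per attempt
    PutResponse response;
    grpc::Status status = dest->Put(&context, request, &response);
    if (status.ok()) break;       // the new owner has stored the pair
    // The destination may not have learned about the new configuration yet;
    // wait a bit and try again rather than dropping the key-value pair.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
}
```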
Message Flow Walkthrough
[Diagram: both shardkv servers periodically query the shardmaster; a client Move transfers the shard [5, 10] from Shardkv1 to Shardkv2, and Shardkv1 then hands off its data via Put.]
In the above diagram, both shardkv servers query the shardmaster and get the current configuration of shards. A client then sends a Move request, moving the shard [5, 10] from Shardkv1 to Shardkv2. Let’s say that Shardkv1 has data for the key 7, which is in the moved shard.
When Shardkv1 queries the shardmaster for the second time, it notices that it is no longer responsible for the shard [5, 10]. Consequently, Shardkv1 attempts to transfer the data for key 7 to Shardkv2 via a Put RPC. However, Shardkv2 has not yet queried the shardmaster, so it is unaware that it is now responsible for keys in the range [5, 10]. The next time Shardkv2 queries the shardmaster, it realizes that it is responsible for key 7, which means that Shardkv1’s next Put RPC will succeed.
Task:
- Complete the definition of the ShardkvServer class (in shardkv/shardkv.h) to include the fields necessary for your server. Just like the SimpleShardkvServer, your implementation should include a key-value store, but now you will also need to keep track of the server’s assigned shards from the shardmaster.
- Modify the ShardkvServer class (in shardkv/shardkv.cc) to support the RPCs Get, Put, and Append. You should now also implement QueryShardmaster, which should query the shardmaster for the distribution of keys.
Once you are done with this task, you should be passing all of the provided tests!
Hints:
When retrying until success, it’s useful to have a timeout between each attempt. You can look at the shardkv constructor in shardkv/shardkv.cc to see an example of waiting (sleeping) for a timeout in C++.
Note: QueryShardmaster is an RPC that the ShardkvServer periodically calls on itself, but other servers could also invoke this RPC remotely to make a ShardkvServer update its configuration!
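Conceptually, the periodic refresh boils down to a loop like the sketch below. The refresh_from_shardmaster() name is made up; in the stencil, the refresh happens via the QueryShardmaster RPC, and the constructor shows the sleeping pattern.

```cpp
#include <chrono>
#include <thread>

// Illustrative only: every `interval`, ask the shardmaster for the current
// shard distribution and update local state. refresh_from_shardmaster() is a
// hypothetical stand-in for "issue a Query RPC to the shardmaster and cache
// the resulting shard assignment".
template <typename Server>
void config_poll_loop(Server* server, std::chrono::milliseconds interval) {
  while (true) {
    server->refresh_from_shardmaster();
    std::this_thread::sleep_for(interval);
  }
}
```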
You now run the shardkv executable (from the build directory) instead of simple_shardkv for each key-value server:
$ ./shardkv <PORT> <SHARDMASTER_HOSTNAME> <SHARDMASTER_PORT>
See the previous tasks for examples.
4. Implementing a Dynamic Shardmaster
You now have a fully functional distributed system capable of reconfiguring itself to move shards between servers. But it still implements the simple scheme that partitions the key space equally between your shardkv servers.
In the real world, like on Bacefook, keys (and therefore shards) are not equally popular. For example, celebrities on Twitter or Instagram have many more followers than most other users. If a shard stores a celebrity’s timeline, it will receive many more requests than others, which could overload the server holding that shard!
Now go back to conceptual question #4, and think about how you could adapt your shardmaster to implement a smarter redistribution scheme that takes into account how busy the shards are.
There is no single right answer here; we will award credit for any reasonable scheme.
Extra Credit:
For extra credit, implement the scheme that you described in conceptual question #4. You’ll need to provide proof of your improvements (for example, subjecting your new system and your old one to similar loads and collecting statistics on how frequently each server is contacted) to show how you’ve improved system throughput.
Debugging
All the debugging techniques you’ve learned throughout the course apply to this project too. Finding bugs in distributed systems can be challenging, but we are sure you can do it! 
Troubleshooting failure messages:
Here are some common error outputs from the test suite, and hints for how to deal with them.
STDERR: {test_name}: {path_to_test}:{line_number}: {function}: Assertion {test_function} failed
- This type of error means that your implementation runs without crashing, but its behavior does not match the expected output. GDB will be the most helpful tool for debugging this kind of failure.
- The executables that correspond to all tests can be found in the build directory and can be run in GDB with the command gdb {test_name}.
- The best place to start is to set a breakpoint at the test function you’re failing (for example test_query) and then step through this function to see where your implementation’s behavior differs from what is expected.
./test.sh: line 69: Segmentation fault (core dumped) "$EXEC" > "$TMP_STDOUT" 2> "$TMP_STDERR"
- This indicates that a segfault occurred during the execution of the test, so your code likely accesses invalid memory somewhere.
- You can use the Address Sanitizer to get more information about the segfault. To enable it, rerun cmake with the argument -DCMAKE_CXX_FLAGS="-fsanitize=address" followed by make.
- GDB is again going to help you debug this. You can run the failing test in GDB with the command gdb {test_name} from the /build directory.
- For a segfault, the best place to start is to just let the program run in GDB without any breakpoints. Once the segfault is reached, do a backtrace and look for the last frame in the trace that corresponds with code that you were responsible for writing. This function will be the best place to start looking in terms of tracking down the source of the fault!
Hanging Tests:
If a specific test remains running for more than about a minute, this is a sign that your implementation contains a deadlock. For this, GDB’s multi-threaded debugging tools will be useful. Again, you can run the particular test in GDB with the command gdb {test_name} from the /build directory.
After letting the program run in GDB for a bit, you can use Ctrl-C to pause the execution of all threads. After doing this, you can use the command info threads to see a list of all of the running threads and where they are in execution.
If it seems like a lot of threads are waiting on a mutex (their function is listed as _lock_wait), you can select one of these threads (with the command thread <t_id>), and look at its backtrace. If you look at the frame just before the frame for the lock() function, you should be able to see which of your mutexes it is waiting for. You can get information about this mutex with the command p <mutex>. Some of the more useful information that this shows you is the ID of the thread that currently holds the mutex!
Handing In & Grading
Handin instructions
As before, you will hand in your code using Git. In the sharding subdirectory of your project repository, you MUST fill in the text file called README.md.
Remind me again what the README.md should contain?
The README.md file will include the following:
- Any design decisions you made and comments for graders, under "Design Overview". If there's nothing interesting to say, just list "None".
- Any collaborators and citations for help that you received, under "Collaborators". CS 300 encourages collaboration, but we expect you to acknowledge help you received, as is standard academic practice.
- Your answers to the conceptual questions at the start of this handout under "Conceptual Questions".
- Notes on approximately how many hours you spent actively working on the assignment. We won't use this for grading, but we collect the data to calibrate for future years.
Grading breakdown
- 84% (84 points) for passing the functional tests in categories simple_shardkv (20 points), shardmaster (30 points), and integration (34 points). If your tests pass on the grading server, you’ve probably got all these points.
- 8% (8 points) answers to conceptual questions.
- 8% (8 points) answers to responsible computing questions.
- Extra Credit: up to 15 points for coming up with your own dynamic shard scheme and proving that your scheme is more efficient.
Now head to the grading server. Make sure that you have the “Distributed Store” page configured correctly with your project repository, and check that your tests pass on the grading server as expected.
Congrats! You’re officially done with the last project for the class 
Acknowledgements: This project was developed for CS 300.