Exercises: Distributed Systems
These exercises will help you prepare for quiz questions you may see for Block 4: Distributed Systems.
Acknowledgements: Some of these exercises were originally developed for Harvard's CS 61 course and were kindly shared by Eddie Kohler.
QUESTION DIST-1A. Which of the following system calls should a programmer expect to sometimes block (i.e., to return after significant delay)? Circle all that apply.
- None of these
QUESTION DIST-1B. Below are seven message sequence diagrams demonstrating the operation of a client–server RPC protocol. A request such as “get(X)” means “fetch the value of the object named X”; the response contains that value. Match each network property or programming strategy below with the diagram with which it best corresponds. You will use every diagram once.
#1—B, #2—A, #3—D, #4—C
(A—#2, B—#1, C—#4, D—#3)
QUESTION DIST-1C. List some resources that a DoS attack on a network server might exhaust.
At least: file descriptors, memory (stack), processes/threads. There are many correct answers, though! You can run out of virtual memory or even physical memory.
QUESTION DIST-1D. A server sets up a socket to listen on a connection. When a client wants to establish a connection, how does the server manage the multiple clients? In your answer indicate what system call or calls are used and what they do.
The server calls accept() on a listening file descriptor. This creates a new file descriptor that is specific to the connection with one particular client, so the server uses a different fd for each client.
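The behavior described in this answer can be sketched with Python's socket module, which wraps the same system calls. This is a minimal illustration rather than a reference solution: the helper names (read_all, demo_accept), the loopback address, and the message text are all assumptions made for the example. It shows that every call to accept() yields a separate file descriptor, distinct from the listening one, and that each descriptor carries only its own client's data.

```python
import socket


def read_all(conn):
    """Read from conn until the peer closes its sending side (EOF)."""
    chunks = []
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def demo_accept(num_clients=3):
    """Connect num_clients clients; return (listening fd, per-client fds, messages)."""
    # Listening socket on an ephemeral loopback port (port 0 lets the kernel choose).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    clients, conns = [], []
    try:
        for i in range(num_clients):
            # Client side: connect, send one message, then signal end of data.
            client = socket.create_connection(listener.getsockname())
            clients.append(client)
            client.sendall(f"hello from client {i}".encode())
            client.shutdown(socket.SHUT_WR)
            # Server side: accept() returns a NEW socket for this one client.
            conn, _peer = listener.accept()
            conns.append(conn)
        fds = [conn.fileno() for conn in conns]
        messages = [read_all(conn) for conn in conns]
        return listener.fileno(), fds, messages
    finally:
        for sock in clients + conns + [listener]:
            sock.close()


if __name__ == "__main__":
    listen_fd, fds, messages = demo_accept()
    print("listening fd:", listen_fd)
    print("per-client fds:", fds)
    print("messages:", messages)
```

A real server would typically hand each accepted descriptor to a thread, a process, or an event loop instead of handling clients one after another as this sketch does.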
QUESTION DIST-1E: How are sockets different from pipes?
Sockets have two different modes of use, client and server, which require different system calls to set up, while pipes have only one. Sockets can be used across machines with protocols like UDP or TCP, while pipes connect processes on the same logical machine. Each file descriptor associated with a socket is also bidirectional, while a pipe has two unidirectional file descriptors: one for reading and the other for writing.
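The directionality difference can be seen in a short sketch. The function names below are made up for illustration, and the check that the read end of a pipe rejects writes assumes Linux behavior (some other systems have implemented bidirectional pipes). socket.socketpair() is used only because it creates a connected pair of sockets without any client/server setup.

```python
import os
import socket


def pipe_roundtrip(data):
    """Send data through a pipe: write on one fd, read on the other."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, data)            # small payload, fits in the pipe buffer
        return os.read(read_fd, len(data))
    finally:
        os.close(read_fd)
        os.close(write_fd)


def pipe_read_end_is_writable():
    """Try to write to the read end of a pipe (expected to fail on Linux)."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(read_fd, b"x")
        return True
    except OSError:
        return False
    finally:
        os.close(read_fd)
        os.close(write_fd)


def socketpair_roundtrip():
    """Each end of a socket pair can both send and receive."""
    left, right = socket.socketpair()
    try:
        left.sendall(b"ping")
        request = right.recv(4)
        right.sendall(b"pong")      # same fd that just received now sends
        reply = left.recv(4)
        return request, reply
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print(pipe_roundtrip(b"hello"))
    print(pipe_read_end_is_writable())
    print(socketpair_roundtrip())
```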
DIST-2. Scalability in Distributed Systems
QUESTION DIST-2A: Brett Bro is an engineer at a Silicon Valley startup that develops a new social network for pets. Anticipating exponential growth, Brett argues that the team should build their storage backend "for scale" from the outset.
Which of the following quotes from Brett in the team discussion are correct statements?
- "Going distributed immediately won't delay our launch date. Building a distributed system isn't any more complex than writing a concurrent program for one server."
- "We can increase resilience to failures by sending each SET request to two randomly-chosen servers. Because of sharding, this will scale well, and because of replication, users will always see the latest data."
- "Sharding profile information by pet ID is a good idea, as it will increase our scalability for reads of the profile data."
- "Transactions that take locks for reading (GET requests) can release them after reading the necessary values, as having read the values under a lock guarantees atomicity."
- "We can use weak consistency to store and replicate friend requests because they are idempotent, so seeing and approving the request twice will have no detrimental effect other than user confusion."
1.: false because distributed systems must handle failures
2.: false because random choice isn't sharding; replication alone doesn't guarantee consistency
4.: false because other transactions could acquire the lock and write before the reading TX commits
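To make the contrast between quotes 2 and 3 concrete, here is a minimal sketch of sharding by pet ID. Everything in it is hypothetical: the shard count, the ShardedStore class, and the choice of SHA-256 as the hash are assumptions for illustration, not part of the exercise. The key property is that the shard is a deterministic function of the pet ID, so a later GET knows exactly which server to ask; a randomly chosen server would not give that guarantee.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of storage servers


def shard_for(pet_id, num_shards=NUM_SHARDS):
    """Map a pet ID to a shard index deterministically."""
    digest = hashlib.sha256(pet_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """In-memory stand-in for a set of storage servers, one dict per shard."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [{} for _ in range(num_shards)]

    def set(self, pet_id, profile):
        # Only the single shard that owns this pet ID is written.
        self.shards[shard_for(pet_id, len(self.shards))][pet_id] = profile

    def get(self, pet_id):
        # The same function finds the shard again, so one server handles the read.
        return self.shards[shard_for(pet_id, len(self.shards))].get(pet_id)


if __name__ == "__main__":
    store = ShardedStore()
    store.set("pet-42", {"name": "Rex"})
    print(shard_for("pet-42"), store.get("pet-42"))
```

Because reads for different pets land on different shards, adding servers spreads the read load, which is the scalability benefit quote 3 refers to.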
QUESTION DIST-2B: True or false: sharding works best if a single server can independently handle each request.
QUESTION DIST-2C: True or false: RPCs are as fast as function calls.
false (network latency)
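A rough way to see the difference is to time the same computation as a direct call and as a request over a socket. The sketch below is an assumption-laden illustration: the line-based "protocol", the function names, and the use of the loopback interface are all invented for the example. Loopback avoids real network latency entirely, so the measured gap understates what a call to another machine would cost; no particular timings are claimed here.

```python
import socket
import threading
import time


def add(x, y):
    return x + y


def serve(listener):
    """Toy RPC server: one connection, one 'x y' request per line."""
    conn, _peer = listener.accept()
    with conn, conn.makefile("rb") as rfile, conn.makefile("wb") as wfile:
        for line in rfile:                      # loop ends when the client closes
            x, y = map(int, line.split())
            wfile.write(f"{add(x, y)}\n".encode())
            wfile.flush()


def rpc_add(rfile, wfile, x, y):
    """Client stub: marshal arguments, send, wait for the reply, unmarshal."""
    wfile.write(f"{x} {y}\n".encode())
    wfile.flush()
    return int(rfile.readline())


def compare(n=200):
    """Return (local results, RPC results, local seconds, RPC seconds)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    server = threading.Thread(target=serve, args=(listener,))
    server.start()
    with socket.create_connection(listener.getsockname()) as sock, \
            sock.makefile("rb") as rfile, sock.makefile("wb") as wfile:
        start = time.perf_counter()
        rpc_results = [rpc_add(rfile, wfile, i, i) for i in range(n)]
        rpc_seconds = time.perf_counter() - start
    server.join()
    listener.close()
    start = time.perf_counter()
    local_results = [add(i, i) for i in range(n)]
    local_seconds = time.perf_counter() - start
    return local_results, rpc_results, local_seconds, rpc_seconds


if __name__ == "__main__":
    local, remote, local_s, rpc_s = compare()
    print("same results:", local == remote)
    print(f"local: {local_s:.6f}s  rpc over loopback: {rpc_s:.6f}s")
```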
QUESTION DIST-2D: True or false: developers of distributed systems need to ensure they only write packet-sized data into sockets.
false (kernel splits stream into packets)
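The following sketch illustrates the stream abstraction behind this answer, using TCP over the loopback interface. The function name, buffer size, and payload are assumptions for the example. The sender hands the kernel one large buffer with a single sendall() call and never thinks about packet sizes; the receiver simply reads until end of stream and reassembles whatever chunk sizes recv() happens to return.

```python
import socket
import threading


def stream_transfer(payload, recv_size=4096):
    """Send payload over loopback TCP; return (received bytes, chunk sizes seen)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()

    def sender():
        # One logical write of the whole payload; the kernel decides how to
        # divide the byte stream into packets.
        with socket.create_connection(listener.getsockname()) as out:
            out.sendall(payload)

    thread = threading.Thread(target=sender)
    thread.start()
    conn, _peer = listener.accept()
    sizes = []
    received = bytearray()
    with conn:
        while True:
            chunk = conn.recv(recv_size)    # returns at most recv_size bytes
            if not chunk:                   # empty result means end of stream
                break
            sizes.append(len(chunk))
            received += chunk
    thread.join()
    listener.close()
    return bytes(received), sizes


if __name__ == "__main__":
    data = bytes(range(256)) * 4096         # 1 MiB test payload
    received, sizes = stream_transfer(data)
    print("intact:", received == data, "chunks:", len(sizes))
```

The chunk sizes on the receiving side need not match how the sender wrote the data, which is why stream protocols that carry discrete messages add their own framing, such as a length prefix or a delimiter.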
QUESTION DIST-2E: True or false: strong consistency typically reduces scalability of a replicated storage system.