Concurrency
Theme Song: I Will Wait
A database running on a single thread, supporting a single client, is not particularly interesting. We want to be able to run multiple operations at the same time and serve multiple clients to boot. In this assignment, we'll implement both fine-grain and resource-level locking to ensure that your database is safe to use concurrently.
Note: This is a challenging assignment with a lot of moving parts. Moreover, you won't be able to use many late days on this assignment. Start early and code incrementally. Starting at the last minute is a very bad idea. Please do not take this advice as a challenge.
Background Knowledge
Fine-Grain Locking
You may recall mutexes and reader-writer locks from the Pager assignment - in this assignment, we'll expand our usage of locks to make our B+Tree and hash table thread-safe using fine-grain locking.
Fine-grain locking is a locking technique that involves locking a part of a data structure to ensure safe access rather than locking the entire data structure. In B+Trees, this means only locking the nodes that we're currently looking at, and in a hash table, this means only locking the buckets we're currently looking at. Clearly this is desirable - now we can have multiple tenants safely traversing and modifying the data structure at the same time, leading to huge speed ups. However, getting fine-grain locking right is incredibly nuanced, as you'll learn in this assignment.
Fine-Grain Locking on Pages
In the stencil code that we've provided, Pages have locks. Since the Page represents logical units in both hash tables and B+Trees, this Lock method will be instrumental in implementing the following two fine-grain locking schemes.
Fine-Grain Locking on Hash Tables
Hash tables are rather simple structures to lock; the only entry points into a bucket are through the lookup table. Therefore, for each read or write, we only need to lock the lookup table, then the bucket!
On reads, we first acquire a read lock on the lookup table, then find our bucket. Next, we acquire a read lock on our bucket, then release our read lock on the lookup table, read our value, then release our read lock from the bucket.
On writes, we first acquire a read lock on the lookup table, then find and lock our bucket. We could have grabbed a write lock, but there's no need to grab a write lock on the lookup table unless we are sure that we are going to split; this is called optimistic locking, and it can reduce unnecessary waiting for locks. After we've grabbed a write lock on the bucket, we check whether we could potentially split; if so, we grab a write lock on the lookup table and complete our insert and split. If we don't split, we simply insert. Afterwards, we release all of our locks. You aren't required to perform optimistic locking in this assignment - it's perfectly fine just to grab the write lock from the get-go. However, do ensure that you release the write lock if you don't need to hold onto it - otherwise, it's not fine-grain locking at all!
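To make the lock ordering concrete, here is a minimal, self-contained sketch of both paths on a toy hash table. Everything in it (the toy types, the modulo hash, the willSplit helper) is a hypothetical stand-in rather than the stencil's HashTable; the point is only the order in which locks are acquired and released.

package hashsketch

import "sync"

type toyBucket struct {
	mtx     sync.RWMutex
	entries map[int64]int64
}

type toyTable struct {
	mtx     sync.RWMutex // guards the lookup table
	buckets []*toyBucket
}

// Read path: lock the lookup table, pin the bucket, then let the table go.
func (t *toyTable) find(key int64) (int64, bool) {
	t.mtx.RLock()
	bucket := t.buckets[key%int64(len(t.buckets))] // toy hash function
	bucket.mtx.RLock()
	t.mtx.RUnlock() // the bucket is locked, so the lookup table can be released
	defer bucket.mtx.RUnlock()
	value, found := bucket.entries[key]
	return value, found
}

// Write path: optimistically take only a read lock on the lookup table.
func (t *toyTable) insert(key, value int64) {
	t.mtx.RLock()
	bucket := t.buckets[key%int64(len(t.buckets))]
	bucket.mtx.Lock() // write-lock the bucket before touching it
	t.mtx.RUnlock()

	if willSplit(bucket) {
		// Pessimistic path: the lookup table itself is about to change, so
		// restart the operation while holding a write lock on it.
		bucket.mtx.Unlock()
		t.mtx.Lock()
		// ... re-find the bucket, insert, split, and grow the lookup table ...
		t.mtx.Unlock()
		return
	}
	bucket.entries[key] = value
	bucket.mtx.Unlock()
}

func willSplit(b *toyBucket) bool { return false } // placeholder heuristic

If you skip the optimistic path and grab a write lock on the lookup table from the start, remember to release it as soon as the bucket is locked.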
Fine-Grain Locking on B+Trees
B+Trees are much more difficult structures to lock. There are a few major concerns. First, the structure of the tree can change under us as nodes split. Second, we don't want to be overly pessimistic in our locking, since holding a write lock on our root node locks all other clients out of the tree. Third, table scans and queries get complicated, especially once the resource-level locking described below is involved. And so on.
For sanity's sake, we will not be implementing fine-grain locking for selects, cursors, printing, or joins; we will focus on the main B+Tree operations: reading and writing. Primarily, we employ a scheme known as latch-crabbing or lock-crabbing, which ensures that the structure of the database won't change underneath us as we acquire locks. The following is a brief overview of lock-crabbing.
On reads, we first acquire a read lock on the root node. Then, we find the child node we want to go to, and grab a read lock on the child node. Only after locking the child node do we unlock the parent (root) node. This is called lock-crabbing, and is how we ensure that the shape of the tree doesn't change underneath us. Consider the case where we release the lock on the root before grabbing one on the child. In that split second, another thread could split the root node, making the child node obsolete. Crabbing avoids this issue entirely.
On writes, we first acquire a write lock on the root node. Then, we find the child node we want to go to, and grab a write lock on this child node. We only release the write lock on the root node if we can be sure that our child node will not split; if it can, then we hold onto the lock. As we recurse down, we hold locks on all parents that could potentially be affected by a node split. Eventually, we are guaranteed to unlock everything either after performing the write at a leaf node, or after a split is propagated up the tree. Because this algorithm is rather complicated, we've written a help doc here. Please use this and the associated helper functions in btree_subr.go when implementing locking!
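As a rough illustration of crabbing, here is a self-contained sketch over a hypothetical node type. The helpers it calls (childFor, lookup, hasRoomForInsert, insertHere) are placeholders invented for this sketch; in the actual assignment, the helpers in btree_subr.go do this bookkeeping for you, so follow the help doc rather than this code.

package btreesketch

import "sync"

type node struct {
	mtx      sync.RWMutex
	isLeaf   bool
	children []*node
}

// Hypothetical helpers, elided for brevity.
func (n *node) childFor(key int64) *node       { return n.children[0] }
func (n *node) lookup(key int64) (int64, bool) { return 0, false }
func (n *node) hasRoomForInsert() bool         { return true }
func (n *node) insertHere(key, value int64)    {}

// Read crabbing: lock the child, then release the parent, all the way down.
func crabbedGet(root *node, key int64) (int64, bool) {
	cur := root
	cur.mtx.RLock()
	for !cur.isLeaf {
		child := cur.childFor(key)
		child.mtx.RLock() // lock the child first...
		cur.mtx.RUnlock() // ...and only then release the parent
		cur = child
	}
	defer cur.mtx.RUnlock()
	return cur.lookup(key)
}

// Write crabbing: keep ancestors locked until we know the child cannot split.
func crabbedInsert(root *node, key, value int64) {
	root.mtx.Lock()
	held := []*node{root}
	cur := root
	for !cur.isLeaf {
		child := cur.childFor(key)
		child.mtx.Lock()
		if child.hasRoomForInsert() {
			// The child cannot split, so no ancestor can be affected: release them all.
			for _, ancestor := range held {
				ancestor.mtx.Unlock()
			}
			held = held[:0]
		}
		held = append(held, child)
		cur = child
	}
	cur.insertHere(key, value) // split propagation up the held ancestors is elided
	for _, n := range held {
		n.mtx.Unlock()
	}
}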
Transactions
Transactions are a way of grouping multiple actions into one, ACID-compliant package. That is to say, we are guaranteed that either all of the actions in a transaction succeed or none of them succeed, and that they are isolated from other transactions. Transactions acquire locks on the resources they are accessing to be sure that they can read and write safely. Critically, notice that the nature of transaction-level locks and data structure-level locks are very different. Transaction locks are completely unaware of the underlying representation of the data; we're only concerned with logical units, in order to preserve the integrity of the external view of the database. On the other hand, data structure-level locks are completely unaware of the data they're locking; they only care about the structure in which the data is stored. Thus, these two locking schemes are completely orthogonal to one another, and yet both are essential for a database serving multiple tenants concurrently.
Strict 2PL
Our transactions will adhere to strict two-phase locking. That is, we will acquire locks as we need them, but we will hold on to all of them until after we have committed our changes to the database. One great corollary of this scheme is that we can be absolutely sure that there will not be cascading rollbacks; that is, if a transaction aborts, no other transaction will have to roll back because of it! This makes our lives a lot easier when implementing aborts, but does mean that our transactions may wait unnecessarily for resources to open.
Deadlock Avoidance
We want to be sure that our transactions don't end up creating a deadlock; one way to do this is by detecting cycles in a "waits-for" graph. While a transaction is waiting for another transaction to free a particular resource, we should add an edge between it and the offending transaction to the "waits-for" graph. If a cycle is detected in this graph, that means that we have deadlocked and will not be able to proceed without dying. For example, if transaction A waits for a resource held by transaction B while B waits for a resource held by A, the edges A→B and B→A form a cycle, and one of the two must abort. The last transaction to join the graph should be the one that rolls back. Critically, remember to remove edges between transactions once transactions die or are otherwise no longer waiting for a resource - otherwise, you may detect deadlocks that don't exist!
Assignment Spec
Overview
In this project, you'll implement fine-grain locking on hash tables and B+Trees, then resource-level locking, and finally deadlock detection and avoidance! Note that the assignment parts are somewhat isolated from each other; feel free to work out of order on this assignment.
New REPL Commands
The transaction REPL now supports two new commands:
- transaction [begin|commit] - either starts or ends a transaction. Each client can have up to 1 transaction running at a time.
- lock <table> <key> - grabs a write lock on a resource. Useful for debugging.
Multiple Clients
Since we need to deal with multiple clients, we need to run BumbleBase as a server-client application rather than just as a command-line application. Running ./bumble -p transaction should now start a server at port 8335. Then, run ./bumble_client -p 8335 to connect to the database and start running queries as normal! Using a tool like tmux might help you manage multiple terminals.
In our implementation, each client can have up to one transaction running at a time. To simulate multiple transactions, we need multiple clients; hence, our database now supports multiple clients through ./bumble_client. To begin a transaction, run transaction begin, and to end it, run transaction commit. Commands issued without an active transaction will be treated as a transaction of one action (transaction begin, [action], transaction commit).
Stress Testing
Because it can be useful to clobber your database to detect deadlocks or unsafe behaviour using a shotgun approach, we've provided a new executable named bumble_stress to help with this. We've also provided a set of sample workloads in the workloads/ directory to run with bumble_stress - poke through workloads/README.md to get a sense of what each workload is doing, and feel free to make your own workloads!
To stress test your database, build and run ./bumble_stress -index=<btree|hash> -workload=<w.txt> -n=<n> -verify. The workload file should essentially mimic what would normally be piped through STDIN, separated by newlines. The numthreads argument (-n) specifies the number of threads that will run the workload - to be clear, we split the workload across n threads rather than duplicating the workload for n threads, meaning each line will be run exactly once, but possibly on a different thread. The index flag determines which kind of index you'll be stress testing. Lastly, the verify flag runs a verification check at the end of the stress test to ensure that the data structure is still consistent after the run. No project flag is required.
Stress testing is an especially experimental feature in the course. As a result, we will not be evaluating your code using this stress testing mechanism. Treat it as a way to discover bugs in your implementation and learn more, not as a metric for completion.
Part 0: Instrumenting Existing Code
Relevant files:
cmd/bumble/main.go
pkg/pager/page.go
pkg/hash/bucket.go
pkg/hash/table.go
pkg/btree/btree_subr.go
pkg/btree/btree.go
Once you download concurrency.zip, you'll see a concurrency subfolder; this should be placed as is in the pkg folder. As well, once you download bumble_client.zip and bumble_stress.zip, you'll see bumble_client and bumble_stress subfolders respectively; these should be placed as is in the cmd folder. Lastly, replace the corresponding files in your project with the files in the patch.zip folder provided below (i.e. replace btree.go, btree_subr.go, and hash_subr.go, and add the btree/verify.go and hash/verify.go files to the corresponding packages). Make sure that your linter doesn't throw any new errors in your main.go file. When finished, your file structure should look like this:
./
- cmd/
  - bumble/
  - bumble_client/
  - bumble_stress/
- pkg/
  - btree/
  - config/
  - db/
  - utils/
  - hash/
  - list/
  - pager/
  - repl/
  - query/
  - concurrency/
Next, you will need to uncomment all of the code blocks prefixed with [CONCURRENCY] in your main.go. If you're not sure if your main.go is working, we've provided a copy that should work below.
Next, you'll want to change the test target in your Makefile to include the -race flag. This will signal to the Go testing framework that you'd like to detect data races in your tests. You should end up with something like this:
test:
	go test ./test/* -v -race
Lastly, add the following function to your repl.go file:
// Run the REPL.
func (r *REPL) RunChan(c chan string, clientId uuid.UUID, prompt string) {
	// Get reader and writer; stdin and stdout if no conn.
	writer := os.Stdout
	replConfig := &REPLConfig{writer: writer, clientId: clientId}
	// Begin the repl loop!
	io.WriteString(writer, prompt)
	for payload := range c {
		// Emit the payload for debugging purposes.
		io.WriteString(writer, payload+"\n")
		// Parse the payload.
		fields := strings.Fields(payload)
		if len(fields) == 0 {
			io.WriteString(writer, prompt)
			continue
		}
		trigger := cleanInput(fields[0])
		// Check for a meta-command.
		if trigger == ".help" {
			io.WriteString(writer, r.HelpString())
			io.WriteString(writer, prompt)
			continue
		}
		// Else, check user commands.
		if command, exists := r.commands[trigger]; exists {
			// Call a hardcoded function.
			err := command(payload, replConfig)
			if err != nil {
				io.WriteString(writer, fmt.Sprintf("%v\n", err))
			}
		} else {
			io.WriteString(writer, "command not found\n")
		}
		io.WriteString(writer, prompt)
	}
	// Print an additional line if we encountered an EOF character.
	io.WriteString(writer, "\n")
}
Part 1: Fine-Grain Locking - Hash Tables
Relevant files:
pkg/hash/bucket.go
pkg/hash/table.go
Add locking to the following functions:
func (table *HashTable) Find(key int64) (utils.Entry, error)
func (table *HashTable) Split(bucket *HashBucket, hash int64) error
func (table *HashTable) Insert(key int64, value int64) error
func (table *HashTable) Update(key int64, value int64) error
func (table *HashTable) Delete(key int64) error
func (table *HashTable) Select() ([]utils.Entry, error)
Some hints:
- Be careful whether you use the read or write locks!
- Using defer can save a lot of headache if you know for certain that a function will unlock some resource when it returns (see the small example after this list).
- Depending on your scheme, you may/should not have to add any locking to Split - think about what you should lock to keep the whole table safe, though!
- Both getBucket and getBucketByPN now take a second parameter, which can be one of NO_LOCK, READ_LOCK, or WRITE_LOCK - pass this instead of locking the page directly. Don't forget to unlock the page when you are done!
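As a quick illustration of the defer hint above, here is a tiny, generic sketch; the function and its arguments are made up for this example, but the same pattern applies to any method that should hold a lock for its whole body.

package locksketch

import "sync"

// sumUnderLock holds the read lock for its entire body; the deferred unlock
// runs on every return path, so the lock can never leak.
func sumUnderLock(mtx *sync.RWMutex, values []int64) int64 {
	mtx.RLock()
	defer mtx.RUnlock()
	var total int64
	for _, v := range values {
		total += v
	}
	return total
}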
Part 2: Fine-Grain Locking - B+Trees
Relevant files:
pkg/btree/node.go
pkg/btree/btree.go
pkg/btree/btree_subr.go
Add locking to the following functions:
func (node *LeafNode) insert(key int64, value int64, update bool) Split
func (node *LeafNode) delete(key int64)
func (node *LeafNode) get(key int64) (value int64, found bool)
func (node *InternalNode) insert(key int64, value int64, update bool) Split
func (node *InternalNode) delete(key int64)
func (node *InternalNode) get(key int64) (value int64, found bool)
Some hints:
- Before anything else, read through the new helper methods in btree_subr.go! It will save you a LOT of time if you understand what they do.
- We handle super nodes for you in the btree.go file.
- Check out the Insert implementation in btree.go to see what else we handle for you.
- We guarantee these are the only functions you need to change.
- We check whether we need to unlock parents at the beginning of all of these function calls.
- Calling getChildAt with the second parameter set to true will lock the child page for you. After calling getChildAt and checking for errors, you should immediately call initChild to set the parent pointers correctly.
- For all leaf node functions, we will definitely need to eventually unlock the current node and all parent nodes. Use the force parameter with the unlockParent function to do this. You should also take a look at unlock.
- Lastly, think about whether we should unlock all parents in each case of a split. Critically, you'll have to look into the result of node.insertSplit in each insert call and react appropriately.
Part 3: Transactions
Relevant files:
pkg/concurrency/transaction.go
pkg/concurrency/lock.go
First, implement the following functions:
func (tm *TransactionManager) Lock(clientId uuid.UUID, table db.Index, resourceKey int64, lType LockType) error
func (tm *TransactionManager) Unlock(clientId uuid.UUID, table db.Index, resourceKey int64, lType LockType) error
Some notes:
- Transactions are identified by their clientId; whenever a new client connects to the server, it is assigned an ID, and since each client can only have one transaction at a time, its ID serves as a uuid for its transaction, should it exist.
- There are two abstractions surrounding how we represent locks. Firstly, a Resource is a unique identifier for a specific tuple in a database, consisting of its table and key. Secondly, a LockType helps us keep track of which lock is currently being held on a particular resource. A Transaction keeps track of which lock types it is holding.
- A Transaction should hold either a read or a write lock on a resource; never both, and certainly not multiple of each. Thus, if a transaction requests a duplicate lock, ignore the duplicate, and if a transaction requests a write lock on a resource when it holds a read lock on the same resource, upgrade the lock carefully.
- Remember to read-lock the tmMtx when you call GetTransaction!
- Remember to lock Transactions when you access their resources!
- Lock should check what lock we want to request, add an edge to our pGraph, check if there is a cycle and throw an error if there is, lock the resource if we can, and remove an edge from our pGraph. Unlock should remove the lock from our transaction's resources and unlock the resource. Be sure to keep track of what lock type the resource had.
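To tie these notes together, here is a self-contained sketch of the Lock flow they describe. Every identifier in it (the manager, txn, and waits-for graph types, along with all of their fields and helpers) is a hypothetical stand-in invented for this sketch, not the stencil's API; your real implementation should use the TransactionManager, Transaction, Resource, and Graph types in pkg/concurrency.

package txnsketch

import (
	"errors"
	"sync"
)

type kind int

const (
	readKind kind = iota
	writeKind
)

type res struct {
	table string
	key   int64
}

type txn struct {
	mtx  sync.RWMutex
	held map[res]kind // lock kinds this transaction currently holds
}

// Placeholder waits-for graph; Part 4 fills in the cycle detection.
type waitsForGraph struct{}

func (g *waitsForGraph) addEdgesToHolders(clientId string, r res) {}
func (g *waitsForGraph) removeEdgesFrom(clientId string)          {}
func (g *waitsForGraph) detectCycle() bool                        { return false }

type manager struct {
	mtx   sync.RWMutex
	txns  map[string]*txn       // clientId -> its running transaction
	locks map[res]*sync.RWMutex // resource -> the lock guarding it
	graph waitsForGraph
}

func (m *manager) lock(clientId string, r res, k kind) error {
	// 1. Find the transaction under a read lock on the manager's mutex.
	m.mtx.RLock()
	t, ok := m.txns[clientId]
	resourceLock := m.locks[r] // assumed to exist; a real manager creates it lazily
	m.mtx.RUnlock()
	if !ok {
		return errors.New("no transaction running for this client")
	}

	// 2. Handle duplicate and upgrade requests.
	t.mtx.Lock()
	held, already := t.held[r]
	t.mtx.Unlock()
	if already && (held == k || held == writeKind) {
		return nil // duplicate (or already stronger) request: ignore it
	}
	if already && held == readKind && k == writeKind {
		// Upgrade: give up the read lock before blocking on the write lock.
		// A real implementation must be careful here, per the note above.
		resourceLock.RUnlock()
	}

	// 3. Record who we might wait for, and bail out if that creates a cycle.
	m.graph.addEdgesToHolders(clientId, r)
	if m.graph.detectCycle() {
		m.graph.removeEdgesFrom(clientId)
		return errors.New("deadlock detected; this transaction should abort")
	}

	// 4. Block until the resource is free, then stop waiting.
	if k == writeKind {
		resourceLock.Lock()
	} else {
		resourceLock.RLock()
	}
	m.graph.removeEdgesFrom(clientId)

	// 5. Remember what we hold; strict 2PL releases it only at commit/abort.
	t.mtx.Lock()
	t.held[r] = k
	t.mtx.Unlock()
	return nil
}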
Part 4: Deadlock Avoidance
Relevant files:
pkg/concurrency/transaction.go
pkg/concurrency/deadlock.go
Implement the following function and instrument your prior transactions code with it:
func (g *Graph) DetectCycle() bool
Feel free to use any cycle detection algorithm you like; we recommend depth-first search. You'll most likely have to define some helper functions to complete this function.
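If it helps, here is one possible shape for the check. It runs over a plain adjacency map rather than the stencil's Graph type (that representation is an assumption made for this sketch): a depth-first search that reports a cycle whenever it reaches a vertex that is still on the current search path.

package deadlocksketch

// detectCycle walks a waits-for graph represented as an adjacency map from
// transaction id to the ids of the transactions it is waiting for.
func detectCycle(edges map[string][]string) bool {
	const (
		unvisited = iota
		inProgress // currently on the DFS path
		done
	)
	state := make(map[string]int)
	var visit func(v string) bool
	visit = func(v string) bool {
		state[v] = inProgress
		for _, w := range edges[v] {
			if state[w] == inProgress {
				return true // back edge: a cycle exists
			}
			if state[w] == unvisited && visit(w) {
				return true
			}
		}
		state[v] = done
		return false
	}
	for v := range edges {
		if state[v] == unvisited && visit(v) {
			return true
		}
	}
	return false
}

Adapt the traversal to however your Graph actually stores its vertices and edges.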
Testing
Run our unit tests on Gradescope. We've provided a subset of the tests here. Moreover, try out the bumble_stress executable on a few workloads.
Getting Started
To get started, download the following stencil packages: concurrency, bumble_client, bumble_stress, patch, and workloads. We've provided an updated main.go file here. See Part 0 for how to integrate this with the rest of your code.
Feedback
As this is a new course, we appreciate any feedback you have for us! If you enjoyed this assignment, hated this assignment, or have other thoughts to share, please do so here!