Helping Developers Build Reliable Distributed Systems
Distributed systems are increasingly pervasive; millions of users depend on them for employment, information, and entertainment. Unfortunately, building correct distributed systems is notoriously difficult. Distributed systems must function correctly in unreliable network environments where packets can be dropped and nodes can fail at any time. Distributed systems must maintain consistent views of data even when those data are replicated and shared across many machines. This talk presents two tools that help developers build correct distributed systems.
Oddity is a graphical debugger for distributed systems. Oddity allows developers to control the behavior of their systems and inspect how system state, both at individual machines and in the network, evolves over time. Oddity has been deployed to multiple distributed systems classes at UW in order to help students find bugs and better understand their systems.
Verdi is a framework for guaranteeing system correctness via formal verification. Verdi brings the strong guarantees provided by formal verification to distributed systems. In Verdi, network behaviors--including failure--are modeled formally using network semantics. Verdi introduces verified system transformers, which allow fault-tolerance mechanisms, expressed as transformations between pairs of network semantics, to be verified once and then used in multiple systems. My colleagues and I used Verdi to complete the first formal verification of the Raft consensus protocol, a crucial component of many real-world systems.