CSCI-2390: Privacy-Conscious Computer Systems

Projects

Final projects should seek to answer a research question through implementation of a new idea in a real system. This could take one of several forms:

Prototype a new, privacy-centered system design.
Apply a privacy-enhancing or privacy-preserving technique in an existing system, and measure its impact.
Conduct a study of privacy risks and deficiencies in existing software, and analyze what it would take to address them.

You may work on projects individually, or in groups of two to four students. Your project deliverables include a proposal, a progress report, a final paper describing design and implementation, your code, and a presentation. I will post the final presentation and writeup to the course website (unless you explicitly want it kept confidential for a good reason).

Important dates

October 4, 2024: submit your project proposal (by 11pm).
November 3, 2024: submit an individual progress report via Google form (by 11pm).
December 3 and 5, 2024: presentation and demo.
December 13, 2024: submit your code and final report.

Project proposal

Please use the OSDI 2024 submission template. Your proposal should be a one-page summary of what your idea is, how you plan to go about investigating it, and what techniques you will apply (or need to learn about beyond the course material).

Project ideas

Here's a list of some starter ideas to get you thinking. Please feel free to pursue your own ideas!

Reproducing results is an important part of research, and an awesome way to really learn how something works! Pick any of the systems we learn about in the course and implement your own version of it. Then try to reproduce their results! You may wish to simplify some aspects of the system to make the reproduction practical within the time available.

Several of the systems we read have complementary guarantees or cover different parts of the web ecosystem. Perhaps they can be combined to offer stronger protections! Some examples include: (1) combining K9db and Sesame to track policies per user and aggregate policies along with SQL queries, (2) applying IFC techniques to MPC frameworks to ensure computations are allowed and secure, and (3) tracking differential privacy budgets via Sesame policies.

Sesame is a system for policy enforcement in Rust web applications. In Sesame, each sensitive datum is stored within a privacy container with an associated policy. The datum is outside the application's reach and can only be manipulated by Sesame itself, by privacy regions that Sesame has approved, or by privacy regions that have been manually reviewed and approved by reviewers with sufficient permissions.

Applying Sesame to applications requires developers to specify their desired policies in Rust, and manually review and sign complex regions. Some ideas to reduce this effort include:

Create a small domain-specific language (DSL) for formally expressing privacy policies, including generating APIs for combining these policies and reasoning about their exact meaning.
Create a mechanism for verifying that a privacy region satisfies the requirements of the associated policy and context, e.g. by automatically analyzing the code via a theorem prover (e.g. Z3) or Rust verification tools (e.g. Verus).
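To make the first idea more concrete, here is a minimal sketch of what a combinable policy abstraction could look like. It is written in Python for brevity and is not Sesame's API: the `Policy` class, the `owner_only` and `has_role` helpers, and the shape of the context dictionary are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical context: facts about the request that wants to use the datum.
Context = Mapping[str, object]


@dataclass(frozen=True)
class Policy:
    """A named predicate over the context in which a datum is used."""
    name: str
    check: Callable[[Context], bool]

    def __and__(self, other: "Policy") -> "Policy":
        # Conjunction: both policies must allow the use.
        return Policy(f"({self.name} AND {other.name})",
                      lambda ctx: self.check(ctx) and other.check(ctx))

    def __or__(self, other: "Policy") -> "Policy":
        # Disjunction: either policy allowing the use is enough.
        return Policy(f"({self.name} OR {other.name})",
                      lambda ctx: self.check(ctx) or other.check(ctx))


def owner_only(owner: str) -> Policy:
    """Allow use only when the requesting user is the owner."""
    return Policy(f"owner={owner}", lambda ctx: ctx.get("user") == owner)


def has_role(role: str) -> Policy:
    """Allow use when the requesting user holds the given role."""
    return Policy(f"role={role}", lambda ctx: role in ctx.get("roles", ()))


# A grade may be seen by the student it belongs to, or by any instructor.
grade_policy = owner_only("alice") | has_role("instructor")
```

Because every combined policy carries a readable name, a reviewer can see exactly which rule was evaluated, e.g. `grade_policy.name`. A real DSL would additionally need a precise semantics for how policies combine when data from several containers is joined.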

The best way to evaluate whether a system or technique is effective is to apply it to a real-world application! A thorough case study entails comparing the performance of the modified application relative to the original, reasoning about the application-level guarantees provided by the system, and discussing the overall experience of applying the system.

Examples include applying K9db, RuleKeeper, Sesame, or Riverbed to existing web applications; building a Nostr client that mimics existing applications; and implementing a differentially-private version of Brown's Critical Review (or a similar review-aggregating application) or a general MPC survey application (e.g. an MPC version of Google Forms or SurveyMonkey).
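As a starting point for the differentially-private review aggregation idea, the sketch below releases the mean of a set of course ratings using the Laplace mechanism. It makes several assumptions that are not part of the project description: ratings lie on a 1 to 5 scale, the number of ratings is public, neighboring datasets differ by replacing one rating, and the function names are hypothetical. Python's `random` module is used only to keep the example short; it is not suitable for real deployments.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean_rating(ratings, epsilon, lo=1.0, hi=5.0, rng=None):
    """Release the mean rating with epsilon-DP via the Laplace mechanism.

    Assumes the number of ratings is public and that neighboring datasets
    differ by replacing a single rating.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()

    # Clamp every rating so one student can move the sum by at most hi - lo.
    clamped = [min(max(r, lo), hi) for r in ratings]
    n = len(clamped)
    sensitivity = (hi - lo) / n  # sensitivity of the mean

    noisy = sum(clamped) / n + laplace_noise(sensitivity / epsilon, rng)
    # Clamping the output is post-processing and does not weaken the guarantee.
    return min(max(noisy, lo), hi)


ratings = [4, 5, 3, 4, 5] * 20  # 100 hypothetical ratings, true mean 4.2
noisy_mean = dp_mean_rating(ratings, epsilon=0.5, rng=random.Random(0))
```

A full project would also have to track the privacy budget across the many statistics a review site publishes, and handle courses with very few reviews, where the noise dominates the signal.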

Paralegal is a verifier for privacy and security properties of Rust programs developed here in the ETOS group. Verification is a form of static analysis that analyzes the behavior of a program without running it. The Paralegal tool reasons about programs in terms of control and data-flow dependencies and is designed to verify properties with minimal specification effort.

In this project, you will apply Paralegal to an application to find potential bugs. Your task is to come up with privacy or security policies for the application and enforce them using Paralegal. This involves formalizing each policy as a constraint over permissible dependencies between important values in the program, then adding markers to the source code that identify those important values.
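To give a flavor of what "a constraint over permissible dependencies" means, here is a small sketch that checks one such constraint on a hand-written data-flow graph: every flow from a value marked `sensitive` to a value marked `external` must pass through a value marked `consent_check`. This is not Paralegal's policy language or API; the graph, the marker names, and the `find_violation` function are hypothetical and written in Python purely for illustration.

```python
from collections import deque


def find_violation(edges, markers, source, sink, guard):
    """Return a sink-marked node reachable from a source-marked node
    without passing through any guard-marked node, or None if none exists.

    edges:   dict mapping a node to the nodes its value flows into
    markers: dict mapping a node to the set of markers attached to it
    """
    sources = [n for n, ms in markers.items() if source in ms]
    queue = deque(sources)
    seen = set(sources)
    while queue:
        node = queue.popleft()
        if guard in markers.get(node, ()):
            continue  # flows that pass through a guard are permitted
        if sink in markers.get(node, ()):
            return node  # unguarded flow from a source to a sink
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


# Hypothetical data-flow graph of a newsletter feature.
edges = {
    "user.email": ["check_consent", "log_line"],
    "check_consent": ["send_newsletter"],
    "log_line": ["write_to_analytics"],
}
markers = {
    "user.email": {"sensitive"},
    "check_consent": {"consent_check"},
    "send_newsletter": {"external"},
    "write_to_analytics": {"external"},
}

violation = find_violation(edges, markers,
                           source="sensitive", sink="external",
                           guard="consent_check")
```

Here the check reports `write_to_analytics`, because the email address reaches the analytics service through a log line that never consults the consent check. A real verifier derives the graph from the program itself and also has to account for control-flow dependencies.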

This is a good project if you are interested in compilers and programming languages, want to learn more about Rust, and enjoy finding bugs. Some familiarity with, or willingness to learn, basic compiler concepts like data-flow and control-flow analysis is expected.

Do GDPR access and erasure rights apply to user data stored at Nostr relays? Does it make a difference whether the relay is paid and professionally hosted or whether it's a free relay hosted by some hobbyist in their basement? Can we use TEEs to provide deletion guarantees even when the relay is untrusted?

In this project, you will survey related GDPR case studies around decentralized or individually hosted applications, and form a conclusion about the degree to which GDPR applies to Nostr relays. You will then reduce the compliance burden on relay administrators by extending existing relay implementations with GDPR access and erasure functionality.

E2E encrypted messaging systems provide strong confidentiality for users, ensuring that eavesdroppers, such as hackers or hostile dictatorships, are unable to access the content of the conversation. However, such adversaries can identify that a particular user is sending such encrypted messages, and may compel them to decrypt or otherwise reveal the contents of these messages via the use of force.

Develop or extend an E2E encrypted messaging system to provide users with plausible deniability, either by hiding whether the messages were encrypted to begin with (e.g. via steganography) or by using a deniable encryption scheme that allows ciphertexts to decrypt to different "decoy" plaintexts.

This project involves designing a cryptographic mechanism that meets these requirements (with the help of the course teaching staff), and then implementing and evaluating that mechanism in a usable application. Some familiarity with cryptographic primitives (PK encryption and key exchange) is recommended.
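The simplest illustration of a ciphertext that decrypts to a "decoy" plaintext is the one-time pad: for any ciphertext and any decoy message of the same length, there is a key that decrypts the ciphertext to that decoy. The Python sketch below demonstrates only this property. It is not a practical design (the key is as long as the message and must never be reused), and the function names and messages are hypothetical.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(plaintext: bytes):
    """One-time pad: draw a fresh random key as long as the message."""
    key = secrets.token_bytes(len(plaintext))
    return key, xor_bytes(plaintext, key)


def decoy_key(ciphertext: bytes, decoy: bytes) -> bytes:
    """Compute a fake key under which `ciphertext` decrypts to `decoy`."""
    return xor_bytes(ciphertext, decoy)


real = b"meet at the north gate"
decoy = b"buy milk and some eggs"  # same length as the real message

key, ciphertext = encrypt(real)
fake_key = decoy_key(ciphertext, decoy)

recovered_real = xor_bytes(ciphertext, key)        # decrypts to the real message
recovered_decoy = xor_bytes(ciphertext, fake_key)  # decrypts to the decoy
```

A user who is compelled to reveal a key can hand over `fake_key`, and the adversary cannot tell from the ciphertext alone which key is genuine. The project's challenge is achieving a comparable property with practical key sizes inside a real messaging protocol.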

We read a paper on evaluating the usability of DP tools in the class, which studies how data practitioners use existing DP tools and libraries, and how effective these tools are at helping practitioners correctly apply DP and understand its various parameters and guarantees.

Design a similar study to evaluate the usability and effectiveness of existing systems for application developers. For example, you could evaluate two or three existing MPC frameworks, or comparable IFC systems.

Alternatively, you can help us continue development of Carousels, a resource estimation tool that aims to help non-experts implement correct and efficient MPC programs, and evaluate the effectiveness of this tool via a user study.

We will read a paper on evaluating the usability of DP tools in this class. If you are considering a project of this kind, we suggest you take a look at that paper. We will hold your project to a similar standard in terms of study design, methodology, and analysis, although we of course expect that you will have far fewer participants.

We strongly suggest that students taking on a project of this kind have prior experience designing and running user studies. Also, note that the study design needs to be completed several weeks ahead of project presentations, so that participants have enough time to complete the study and you have enough time to analyze the collected data and form conclusions.

K9db offers privacy compliance by construction for database-backed web applications. However, the system has some limitations: it stores duplicate copies of jointly-owned records, it only supports relational databases, and it can only delete all of a user's data on request (rather than selected subsets of it). You can make it better!

Investigate an alternative design to K9db that does not create duplicate copies for records that are jointly owned, e.g., using wrapped encryption keys.
Extend K9db to support account and data recycling, by setting a time-to-live for pieces of data, and automatically deleting unused data that exceeds that threshold.
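The time-to-live idea can be prototyped outside K9db before touching its C++ code. The sketch below uses Python's built-in `sqlite3` module and a hypothetical `records` table with a `last_used` timestamp column; neither the schema nor the `purge_expired` function comes from K9db.

```python
import sqlite3


def purge_expired(conn: sqlite3.Connection, now: float, ttl_seconds: float) -> int:
    """Delete records not used within the last `ttl_seconds`; return the count."""
    cutoff = now - ttl_seconds
    cur = conn.execute("DELETE FROM records WHERE last_used < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    " id INTEGER PRIMARY KEY, owner TEXT, payload TEXT, last_used REAL)"
)
# Hypothetical data; timestamps are seconds on an arbitrary clock.
conn.executemany(
    "INSERT INTO records (owner, payload, last_used) VALUES (?, ?, ?)",
    [
        ("alice", "old post", 1000.0),
        ("bob", "recent post", 9000.0),
        ("alice", "recent comment", 9500.0),
    ],
)

# At time 10000 with a TTL of 2000 seconds, only "old post" has expired.
deleted = purge_expired(conn, now=10000.0, ttl_seconds=2000.0)
remaining = [row[0] for row in conn.execute("SELECT payload FROM records ORDER BY id")]
```

Inside K9db, the interesting questions are what counts as a "use" that resets the timer, and what should happen to an expired record that is jointly owned by a user whose other data is still active.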

These ideas entail extending K9db's open-source implementation. Familiarity with C++ is very helpful, and some familiarity with the internals of a database is advantageous.