Due Thursday, May 11th at 6:00pm (EST)
You can use at most 56 late hours on this project!
When implementing a real-world storage system, it is essential to understand the legal frameworks that govern data privacy. In this assignment, you will implement features pertaining to different privacy regulations that exist in Europe and the US, with a particular focus on the European Union’s General Data Protection Regulation (GDPR) and the United States’ California Consumer Privacy Act (CCPA). Both these regulations are designed to ensure that users have agency over their data, including the right to both access and delete their information online.
Task: Please read the specific requirements of the GDPR and CCPA at the following links.
CCPA Right to Know – Users can request access to personal data from companies, as well as: the categories of their data, the categories of sources, the purpose of collecting that data, and information about what categories of data are being transferred to what types of third parties.
GDPR Right of Access (Article 15) – Users can receive a copy of their own data upon request, as well as the source of any data that was not collected from the subject. When relevant, they can also request: the reason their data has been processed, the categories of data being processed, who can access this data, and the duration for which their data will be stored. They must also be informed when their data is transferred to a third party.
Not much – for our application they share broadly similar data access rights!
CCPA Right to Delete – Users can request the deletion of all personal data at any time. Companies are not required to comply if the data is essential for: fulfilling the purpose for which it was collected, ensuring security and integrity, identifying and repairing bugs in functionality, avoiding infringing upon the rights of another consumer (e.g. to free speech), engaging in scientific, historical, or statistical research, “enabling solely internal uses that are reasonably aligned with the expectations of the consumer,” or complying with legal requirements.
GDPR Right to be Forgotten (Article 17) – Users can request the deletion of the personal data under six specific circumstances, including when it is no longer needed for the purpose it was collected, or the user has revoked consent. Companies are not required to comply if the data is essential for: avoiding infringing upon freedom of expression, complying with legal requirements, engaging in scientific or historical research, or reasons in the interest of public health.
The primary difference is that “the GDPR right only applies if the request meets one of six specific conditions while the CCPA right is broad. However, the CCPA also allows business to refuse the request on much broader grounds than the GDPR” (Practical Law).
This assignment will help you:
Ensure that your project repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci0300/cs300-s23-projects.git
$ git pull
$ git pull handout main
This will merge our Project 5B stencil code with your repository.
In this part of the assignment, you’ll be working with a specialized and more complicated key-value store for a new social media platform, Tweeter. On Tweeter, users can choose their usernames, write posts that appear on their profiles, and respond to other users’ posts (which appear on both users’ profiles). Tweeter has users who are in the EU, so Tweeter must comply with the GDPR’s right of access and right to be forgotten.
In Tweeter’s database, there are five kinds of key-value pairs.
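|Key Value Pair Structure|Example Return Value|
|user_id → username|“user_14” → “malte”|
|post_id → post content|“post_59” → “Hello, Tweeter!”|
|all_users → comma-separated list of user_ids|“all_users” → “user_13,user_14,user_160,”|
|user_id_posts → comma-separated list of that user’s post_ids|“user_14_posts” → “post_59,post_1,”|
|post_id_replies → comma-separated list of the post_ids replying to that post|“post_59_replies” → “post_60, post_61”|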
Even though multiple users can share the same username, every user_id is unique. The same goes for posts: even though multiple posts can have the same text, every post has a unique post_id.
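The list-valued entries above (all_users, user_14_posts, post_59_replies) are plain comma-separated strings, so any code that works with them has to split and rebuild them. The following is a minimal sketch of that step, not part of the stencil: the helper names SplitIdList and JoinIdList are hypothetical, and the sketch assumes only what the examples show, namely that lists may end in a trailing comma and may contain spaces after commas.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a list value such as "post_59,post_1," into IDs.
// Tolerates the trailing comma and the stray spaces seen in the examples.
std::vector<std::string> SplitIdList(const std::string& value) {
  std::vector<std::string> ids;
  std::stringstream ss(value);
  std::string item;
  while (std::getline(ss, item, ',')) {
    // Trim surrounding spaces, e.g. " post_61" becomes "post_61".
    const auto first = item.find_first_not_of(' ');
    if (first == std::string::npos) continue;  // empty or all-space entry
    const auto last = item.find_last_not_of(' ');
    ids.push_back(item.substr(first, last - first + 1));
  }
  return ids;
}

// Hypothetical helper: rebuild a list value, writing a comma after every ID
// (matching the "user_13,user_14,user_160," style).
std::string JoinIdList(const std::vector<std::string>& ids) {
  std::string out;
  for (const auto& id : ids) {
    out += id;
    out += ',';
  }
  return out;
}
```

Splitting into a vector before editing avoids substring mistakes, such as accidentally matching user_1 inside user_14 when searching the raw string.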
You’re in charge of making a decision on how to handle a particular user’s request to exercise their right to access and their right to be forgotten.
Task: Check your email! You should have received an email containing information about which stakeholder you have been assigned to complete this portion of the assignment.
You will write a function, GDPRDelete(), that performs the delete request for your assigned stakeholder. There are several ways one might implement privacy-conscious deletion (think back to last week’s section for ideas…). While many different kinds of delete can be acceptable for your stakeholder pair, you cannot opt out of implementing some form of delete and reject their request, although your implementation may not meet all of their hopes and expectations. (In the real world, there are cases where deletion requests have been rejected completely, but for the scope of this assignment, we’ve selected the dataset and the stakeholders such that you should be able to implement some kind of delete, even if it is an unsatisfactory one for your stakeholder.)
The caveat with this particular assignment is that you are also accountable to an ethical auditor who is charged with scrutinizing your company’s handling of GDPR compliance. Therefore, when satisfying your stakeholder’s request for deletion in this assignment, you must also consider other stakeholders that could be affected by this deletion and whose legitimate claims and interests might be infringed upon.
The following introduces each stakeholder pair and explains the context in which the deletion request occurs:
The signature of GDPRDelete is as follows:
bool GDPRDelete(std::string& user_id);
In other words, the function receives a single argument, which is the user ID of the data subject who is invoking the right to erasure (i.e., your data subject stakeholder). The function can use this argument to look up data related to the data subject in the KVStore, or to find the user’s identifier (e.g., “user_1”) in other data.
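To make the lookup pattern concrete, here is a rough sketch of one deliberately simple deletion strategy. It is an illustration under stated assumptions, not the expected solution: Store is a hypothetical in-memory stand-in for the KVStore (a plain std::unordered_map), and GDPRDeleteSketch and SplitIds are hypothetical names whose signatures differ from the real GDPRDelete(), which talks to the server instead. The strategy erases the user’s record, the text of their posts, and their entry in all_users, and it leaves other users’ replies untouched; whether that trade-off is right depends on your stakeholders.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the KVStore: a plain in-memory map.
using Store = std::unordered_map<std::string, std::string>;

// Split "post_59,post_1," into {"post_59", "post_1"}, dropping spaces and
// empty entries.
static std::vector<std::string> SplitIds(const std::string& value) {
  std::vector<std::string> ids;
  std::stringstream ss(value);
  std::string item;
  while (std::getline(ss, item, ',')) {
    item.erase(std::remove(item.begin(), item.end(), ' '), item.end());
    if (!item.empty()) ids.push_back(item);
  }
  return ids;
}

// One possible strategy (an assumption, not the required behavior):
// returns false if the user is unknown, true once their data is erased.
bool GDPRDeleteSketch(Store& store, const std::string& user_id) {
  if (store.find(user_id) == store.end()) return false;  // unknown user

  // Delete every post listed under "<user_id>_posts", then the list itself.
  auto posts_it = store.find(user_id + "_posts");
  if (posts_it != store.end()) {
    for (const auto& post_id : SplitIds(posts_it->second)) {
      store.erase(post_id);
    }
    store.erase(posts_it);
  }

  // Remove the user from the global user list.
  auto all_it = store.find("all_users");
  if (all_it != store.end()) {
    std::string rebuilt;
    for (const auto& id : SplitIds(all_it->second)) {
      if (id != user_id) rebuilt += id + ",";
    }
    all_it->second = rebuilt;
  }

  store.erase(user_id);  // finally, drop the username itself
  return true;
}
```

Note what this sketch does not do: it leaves the post_id_replies lists and any replies by other users in place, so traces of the deleted posts can remain. That limitation is exactly the kind of design choice your comments and written answers should justify.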
Note that the function does not receive information about the context in which the deletion happens (e.g., what other users’ data the data subject might want to delete, or what the other users’ views on this are). Your design could extend the KVStore with auxiliary metadata that captures relevant context (e.g., special key-value pairs that indicate users who are of special interest, such as public figures), and GDPRDelete() may draw on this data. Using such metadata is not a requirement, however.
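As an illustration of such auxiliary metadata, the sketch below checks whether a user appears in a special list-valued key. Everything here is an assumption made for the example: the key name public_figures, the helper IsPublicFigure, and the Store alias (an in-memory map standing in for the KVStore) are hypothetical and do not exist in the provided dataset or stencil.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the KVStore: a plain in-memory map.
using Store = std::unordered_map<std::string, std::string>;

// Returns true if user_id appears in the comma-separated list stored under
// the hypothetical metadata key "public_figures".
bool IsPublicFigure(const Store& store, const std::string& user_id) {
  auto it = store.find("public_figures");
  if (it == store.end()) return false;  // no metadata recorded
  std::stringstream ss(it->second);
  std::string item;
  while (std::getline(ss, item, ',')) {
    // Compare whole entries so that "user_1" does not match "user_14".
    if (item == user_id) return true;
  }
  return false;
}
```

A deletion routine could consult a check like this to decide, for example, whether to anonymize posts instead of erasing them.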
We will not grade you on the level of sophistication your GDPRDelete function achieves, but rather on whether it works. As long as your written answers justify your choices, you will get full credit, even if the function itself is simple.
We provide you with the data stored on Tweeter’s instance of KVStore here. The same data is also in the file gdpr/database.txt, and you can load it into your KVStore as follows.
In one terminal, run this command from the build directory to start a KVStore server listening on port 1234 (you can pick any number for the port; it just sets up a rendezvous point for your client to connect with the server):
build$ ./server 1234 8
In another terminal, run the client and feed in the data:
build$ ./simple_client 127.0.0.1:1234 < ../gdpr/database.txt
After running this command, typing print store into the first terminal (which runs the server) should show that your KVStore contains our dataset. Note that since the KVStore is an in-memory store, you will need to re-load the dataset every time you restart the server, as it loses its contents on shutdown.
Now, a client can connect to the server and make API requests; for instance, to fetch user_1’s data from the server, start up a new client and make a get request:
build$ ./simple_client 127.0.0.1:1234 get user_1
You might feel overwhelmed with the situation you’ve been presented with — we’ve engineered each situation such that the conflict is intentionally difficult to deal with. Because of this, we want you to consider the following questions before you touch any of the code. You don’t have to submit your answers for this, but we highly recommend writing down some notes for yourself.
Important note: we don’t expect you to find a solution that satisfies everyone — rather, the point is to make the most reasonable tradeoffs between the opposing parties’ claims.
Task 1: Implement GDPRDelete().
Task 2: Leave detailed (header and/or inline) comments in your code that explain what kind of delete you are implementing.
GDPRDelete() should operate on any keys (such as post_ID, etc.) you deem necessary to implement your chosen strategy of deletion.
To test your GDPRDelete() function, you can use the gdprdelete <user> command in the client as follows:
$ ./simple_client 127.0.0.1:1234 gdprdelete user_10
You can then use print store in the server to see how your store contents have changed.
The right to know and the right to delete affect all stakeholders of a database: its users, its operators, and those who use the data for technology, studies, and historical records. Now that you have implemented your version of privacy-conscious access and deletion, you should consider the potential pitfalls of your design choices in practice.
Task: Answer the following questions in your README file:
A good response identifies the legitimate claims that each stakeholder may have, explains why those claims are important, and compares the importance of both claims. It provides concrete reasons for (fully or partially) prioritizing or rejecting individual stakeholders’ claims and how those trade-offs are reflected in the chosen implementation of the right to be forgotten. A good response, importantly, also touches upon the limitations of those choices.
Now head to the grading server and make sure that you have the “KVStore” page configured correctly with your project repository.
Congratulations, you’ve completed your final CS 300 project!
Please hand in the files and answers for this assignment via Git in your cs300-s23-projects-YOURNAME repository. Put your answers into the README.md file in the kvstore/ subdirectory of your project repository, and also put all other files from this assignment into that directory.
By 6:00 PM on May 11th, you must have filled in the file README.md in the kvstore directory in your projects repo, and pushed your code.
Acknowledgements: This project was developed for CS 300 by Eva Schiller, Eva Lau, Colton Rusch, and Malte Schwarzkopf.