2952r: Systems Transforming Systems
Automatically augmenting existing systems with new capabilities—performance, security, and beyond!
The subject of the seminar is the development of new techniques,
tools, and systems for automatically augmenting existing software
systems with new capabilities — including, but not limited to,
parallelism, distribution, isolation, and security. The seminar will
have a dual focus: 60% on advanced scientific topics in systems and 40%
on academic communication, especially technical writing. These two foci
will be structured as overlapping layers — systems in the foreground,
technical communication in the background. A key goal will be for
students, working in teams of 4–5 members, to develop papers worthy of
scientific publication; as such, the course is structured around
projects designed carefully, and with appropriate support, to result in
paper publications in the systems community.
For example, techniques and systems developed in the seminar will automatically secure the following code against entire classes of real software supply-chain attacks:
let fs = require("fs"), lp = require("leftpad");
// Node-style callbacks receive the error first, then the data.
fs.readFile('./book.txt', 'utf-8', (err, data) => {
  if (err) throw err;
  console.log(lp(data));
});
…or automatically transform the following processing pipeline, which calculates the 10 highest temperatures in the U.S., to run on as many computers as there are available:
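To make the computation concrete, here is a minimal JavaScript sketch of the top-10 step and of the merge that distribution would rely on. It is an illustration only, not the course's pipeline or tooling: the input format (one `<station> <temperature>` reading per line), the sample readings, and the names `topTemperatures` and `mergeTop` are all assumptions.

```javascript
// Hypothetical input: one reading per line, "<station> <temperature>".
// The format and the sample data below are assumptions for illustration.
function topTemperatures(lines, k = 10) {
  return lines
    .map((line) => Number(line.trim().split(/\s+/)[1])) // take the temperature field
    .filter((t) => Number.isFinite(t))                  // drop malformed readings
    .sort((a, b) => b - a)                              // highest first
    .slice(0, k);
}

// Combine per-chunk results: the top k of the merged partial lists
// equals the top k of the whole input.
function mergeTop(partials, k = 10) {
  return partials.flat().sort((a, b) => b - a).slice(0, k);
}

const readings = ["PVD 71", "BOS 68", "PHX 104", "MIA 91", "bad-line"];
console.log(topTemperatures(readings, 3));
```

Because the top k of the whole input can be recovered from the top k of each chunk, every chunk can in principle be processed on a different machine and the partial results merged at the end.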
As part of the primary focus, we will investigate and push the state of the art in systems: building real systems, systematizing existing literature, and creating evaluation benchmarks and workloads. As part of the secondary focus, we will work on our academic and technical communication, with an emphasis on all aspects of writing and publishing academic papers. Throughout the semester, the topics of the former will serve as vehicles to explore, study, and practice the latter.
Schedule & Topics
Below is a tentative schedule. The final schedule will depend on participant interests.
Date | Science (project #) | Writing (recipe; tips) | Milestone |
---|---|---|---|
1:9/4 | Introduction | Intro; examples | Q, foundations: 1, 2, 3 |
2:9/11 | Performance transforms | Paper overview; foundations | Project selection, team matching, writing warmup |
3:9/18 | Security transforms | Related work; basics | All presentations |
4:9/25 | Benchmarks (#2, #7) | Running example; correctness | Running example |
5:10/2 | μServices & scaleout (#3) | Thesis & questions; actions | re:example, thesis notes |
6:10/9 | Fault tolerance (#9) | Evaluation; characters | evaluation structure |
7:10/16 | Static analysis (#11, #13) | Figures; cohe{sion,rence} | re:evaluation, intro notes |
8:10/23 | Specification mining (#12) | Introduction; emphasis | introduction |
9:10/30 | Combined program analysis (#1) | Abstract & title; framing | re:intro, abstract, title, rewrite |
10:11/6 | Sandboxing (#4) | Visuals; concision | visuals, e2e rewrite |
11:11/13 | Serverless (#8) | Semantics & related; shape | related work |
12:11/20 | Compartmentalization (#5) | Tech sections; elegance | technical sections |
11/27 | – | | |
13:12/4 | Everything else & AMA | Conclusion; next steps | Full paper |
Week 1 focuses on project selection, situation within a project, and project organization and planning. Week 2 focuses on related work — all teams will prepare their paper presentations this week. Implementation starts around week 3 and proceeds in parallel to other milestones to ensure the team gets results in time for each result-oriented milestone.
Reading & References
The seminar combines a scientific, practical component (exploring and building state-of-the-art systems) with a technical writing component.
Building & Exploring Systems
There is no required textbook for the scientific part of the course. Instead, we will draw from a variety of sources—mostly papers in software systems, programming languages, and computer security. Most of the papers reviewed in class will be project-specific — that is, teams will review their own literature and present it to the class. Examples of such papers include:
DiSh: Dynamic Shell-Script Distribution, by Tammam Mustafa, Pratyush Das, Konstantinos Kallas, Nikos Vasilakis. USENIX NSDI, 2023.
BinWrap: Hybrid Protection Against Native Node.js Add-ons, by George Christou, Grigoris Ntousakis, Eric Lahtinen, Sotiris Ioannidis, Vasileios P. Kemerlis, Nikos Vasilakis. ACM AsiaCCS, 2023.
Retrofitting Fine Grain Isolation in the Firefox Renderer, by Shravan Narayan, Craig Disselkoen, Tal Garfinkel, Nathan Froyd, Eric Rahm, Sorin Lerner, Hovav Shacham, Deian Stefan. USENIX Security, 2020.
Optimizing data-intensive computations in existing libraries with split annotations, by Shoumik Palkar, Matei Zaharia. ACM SOSP, 2019.
Program-mandering: Quantitative Privilege Separation, by Shen Liu, Dongrui Zeng, Yongzhe Huang, Frank Capobianco, Stephen McCamant, Trent Jaeger, Gang Tan. ACM CCS, 2019.
Scientific & Technical Writing
For the academic communication part of the course, we will draw from the following resources:
The Elements of Style, 4th Edition, by Strunk and White, Pearson (ISBN 978-0205309023). You can buy a physical copy (e.g., $7 on Amazon) or read it online.
The Sense of Style, by Steven Pinker, Viking (ISBN 978-0670025855). You can buy a physical copy (e.g., $20 on Amazon) or borrow a copy from the instructor.
Style: Lessons in Clarity and Grace, by Joseph Williams and Joseph Bizup, Pearson (12th edition, ISBN 978-0134080413). A physical copy is somewhat expensive (e.g., $40 on Amazon).
At times, we will use video or other media to get inspiration on or insights into other aspects of the course that are not easily covered through traditional textbooks or research papers — but also to work on our oral technical communication skills. Examples include videos on Node (see explicit thesis statement) and technical writing.
We will be using LaTeX, likely with the acmart class. Students can collaborate using GitHub or a service such as Overleaf, which simplifies LaTeX editing and collaboration through a web interface. A very short guide for typesetting with LaTeX is available for free.
Projects & Milestones
Broadly, a team will focus on developing one of three classes of papers:
Novel systems: present a new, real system, either through a global survey of the entire system or through a selective examination of specific themes embodied in it. These papers describe a new system, exemplify its use, describe some of its novel ideas, and evaluate it on real workloads across several dimensions, including several performance metrics.
Systematization of knowledge: evaluate, systematize, and contextualize existing knowledge. These papers provide an important new viewpoint on an established, major research area. The heart of the SoK paper is analysis: analyzing the existing literature and providing insights that could not be obtained by simply reading each of the individual papers.
Benchmark suites: collect, systematize, and automate a set of programs for evaluating new systems. These papers (1) identify a need in terms of an established benchmark set in an area, (2) collect a set of benchmarks that are representative of that area, meaning that results on these benchmarks generalize to other programs and systems, (3) evaluate the completeness and generalizability of these benchmarks using existing papers, and (4) automate the infrastructure needed for others to use them.
Projects & Milestones. Projects will focus on
state-of-the-art systems. Sample projects include (i) Language-based
module-level compartmentalization, (ii) Ahead-of-Time black-box
performance analysis, (iii) Controlling component side-effects with
/bin/try, and (iv) From distribution-oblivious systems to
scalable microservices. Milestones include all phases (and corresponding
sections) of writing a systems paper, starting with a project summary.
Example milestones include (i) overview of key works, (ii) the key idea
or thesis statement, (iii) the structure and contents of the system
evaluation, and (iv) a running example. A preliminary list of potential
projects and milestones can be found in this document.
Policies & Expectations
Policies. Unlike most classes, this one encourages collaboration. The subject of the class and some of the tools are complex, so it may be necessary to collaborate as a class on figuring out how best to internalize the content, use existing tools, and produce the results you want. This collaboration also brings different perspectives on the same subject and helps catch and correct misconceptions early on. You are thus encouraged to interact with each other as much as possible.
Other Details
To facilitate course interactions, we will set up a Discord server and a GitHub organization. In the first two weeks, participants need to complete a form that, apart from asking about background details, shares pointers to several resources.
Prerequisites
The expectation is that students will have taken at least one systems course. At Brown, these include CS0300/1310: Systems Fundamentals, CS0320: Software Engineering, CS0330/1330: Systems Intro, CS1260: Compilers & Analysis, CS1380: Distributed Systems, and CS1650: Software Security. Familiarity with a programming language, basic mathematical maturity, and excitement are important; the course will cover the rest. If unsure about taking this course, attend the first lecture.
Whereabouts & Contact
We meet on Wednesdays, 3–5:30pm, in Salomon 003 and on Zoom. Here is a Google Calendar with all the important info.