2952r: Systems Transforming Systems

Automatically augmenting existing systems with new capabilities—performance, security, and beyond!

The subject of the seminar is the development of new techniques, tools, and systems for automatically augmenting existing software systems with new capabilities, including, but not limited to, parallelism, distribution, isolation, and security. The seminar will have a dual focus: 60% on advanced scientific topics in systems and 40% on academic communication, especially technical writing. These two foci will be structured as overlapping layers: systems in the foreground, technical communication in the background. A key goal will be for students, working in teams of 4–5 members, to develop papers worthy of scientific publication; as such, the course is structured around carefully designed projects that, with appropriate support, should result in paper publications in the systems community.

Quick Links: Schedule & Topics, Reading & References, Projects & Milestones, Other Details.

For example, techniques and systems developed in the seminar will automatically secure the following code against entire classes of real software supply-chain attacks:

let fs = require("fs"), lp = require("leftpad");
// Node.js callbacks receive the error first, then the data.
fs.readFile('./book.txt', 'utf-8', (err, data) => {
  if (err) throw err;
  console.log(lp(data));
});

…or automatically transform the following processing pipeline, which calculates the 10 highest temperatures in the U.S., to run on as many computers as are available:

S="ftp://ftp.ncdc.noaa.gov/pub/data/noaa/2023" # This can point to our server, so that we don't overload NOAA's servers
curl -s $S | grep gz | tr -s ' ' | cut -d ' ' -f9 | sed "s;^;$S/;" | xargs -n1 curl -s | gunzip | cut -c 89-92 | grep -v 999 | sort -rn | head
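
To give a flavor of what such a transformation has to discover, here is a minimal sketch of the data-parallel decomposition behind the last stages of the pipeline (`grep -v 999 | sort -rn | head`). It is an illustration under stated assumptions, not the seminar's actual system: the function names `topN`, `split`, and `parallelTopN` are hypothetical, the "workers" are plain function calls rather than separate machines, and the readings are synthetic integers with 9999 standing in for a missing value.

```javascript
// Sequential reference: drop readings containing "999", sort descending,
// and keep the n largest (mirrors `grep -v 999 | sort -rn | head`).
function topN(readings, n = 10) {
  return readings
    .filter((r) => !String(r).includes("999"))
    .sort((a, b) => b - a)
    .slice(0, n);
}

// Split the input into k roughly equal chunks, one per hypothetical worker.
function split(items, k) {
  const size = Math.ceil(items.length / k);
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Data-parallel version: each chunk is reduced independently, and the
// partial results are merged by applying the same reduction once more.
function parallelTopN(readings, k, n = 10) {
  const partials = split(readings, k).map((chunk) => topN(chunk, n));
  return topN(partials.flat(), n);
}

// Synthetic readings (assumption: integers; 9999 marks a missing value).
const readings = Array.from({ length: 1000 }, (_, i) => ((i * 7919) % 1200) - 300);
readings.push(9999, 9999);

const sequential = topN(readings);
const parallel = parallelTopN(readings, 4);
console.log(sequential);
console.log(parallel);
```

The decomposition is valid because the n largest elements of the whole input are always among the n largest elements of some chunk, so merging the per-chunk results loses nothing. Stages such as `cut` and `grep` are even simpler, since they process each line independently.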

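Returning to the first example above, one way to think about securing it is to limit what each imported module is allowed to touch. The sketch below is an assumption-laden illustration rather than the technique developed in the seminar: `restrict` is a hypothetical helper, and the policy that this program only needs `fs.readFile` is assumed for the example.

```javascript
// Hypothetical helper: wrap a module object so that only allowlisted
// members can be read; any other read or write throws.
function restrict(mod, allowed, name = "module") {
  const allow = new Set(allowed);
  return new Proxy(mod, {
    get(target, prop, receiver) {
      // Symbols are let through so that built-in inspection keeps working.
      if (typeof prop === "symbol" || allow.has(prop)) {
        return Reflect.get(target, prop, receiver);
      }
      throw new Error(`access to ${name}.${String(prop)} denied`);
    },
    set(target, prop) {
      throw new Error(`write to ${name}.${String(prop)} denied`);
    },
  });
}

const fs = require("fs");

// Assumed policy: this program only ever needs fs.readFile.
const readOnlyFs = restrict(fs, ["readFile"], "fs");

// Small helper that reports whether an access succeeds or is denied.
function tryAccess(fn) {
  try {
    fn();
    return "allowed";
  } catch (e) {
    return e.message;
  }
}

const results = {
  readFile: typeof readOnlyFs.readFile,
  writeFile: tryAccess(() => readOnlyFs.writeFile),
};
console.log(results);
```

A compromised dependency handed `readOnlyFs` instead of `fs` could still read files but could not, for instance, overwrite them. Deriving such policies automatically, and enforcing them without rewriting the application by hand, is the harder part.
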
As part of the primary focus, we will investigate and push the state-of-the-art in systems — building real systems, systematizing existing literature, and creating evaluation benchmarks and workloads. As part of the secondary focus, we will work on our academic and technical communication — with an emphasis on all aspects of writing and publishing academic papers. We will be using the different topics of the former as vehicles to explore, study, and practice the latter throughout the semester.

Schedule & Topics

Below is a tentative schedule. The final schedule will depend on participant interests.

Date | Science (project #) | Writing (recipe; tips) | Milestone
1:9/4 | Introduction | Intro; examples | Q, foundations: 1, 2, 3
2:9/11 | Performance transforms | Paper overview; foundations | Project selection, team matching, writing warmup
3:9/18 | Security transforms | Related work; basics | All presentations
4:9/25 | Benchmarks (#2, #7) | Running example; correctness | Running example
5:10/2 | μServices & scaleout (#3) | Thesis & questions; actions | re:example, thesis notes
6:10/9 | Fault tolerance (#9) | Evaluation; characters | evaluation structure
7:10/16 | Static analysis (#11, #13) | Figures; cohe{sion,rence} | re:evaluation, intro notes
8:10/23 | Specification mining (#12) | Introduction; emphasis | introduction
9:10/30 | Combined program analysis (#1) | Abstract & title; framing | re:intro, abstract, title, rewrite
10:11/6 | Sandboxing (#4) | Visuals; concision | visuals, e2e rewrite
11:11/13 | Serverless (#8) | Semantics & related; shape | related work
12:11/20 | Compartmentalization (#5) | Tech sections; Elegance | technical sections
11/27 | | |
13:12/4 | Everything else & AMA | Conclusion; next steps | Full paper

Week 1 focuses on project selection, situation within a project, and project organization and planning. Week 2 focuses on related work — all teams will prepare their paper presentations this week. Implementation starts around week 3 and proceeds in parallel to other milestones to ensure the team gets results in time for each result-oriented milestone.

Reading & References

The seminar combines a scientific, practical component (exploring and building state-of-the-art systems) with a technical writing component.

Building & Exploring Systems

There is no required textbook for the scientific part of the course. Instead, we will draw from a variety of sources—mostly papers in software systems, programming languages, and computer security. Most of the papers reviewed in class will be project-specific — that is, teams will review their own literature and present it to the class. Examples of such papers include:

Scientific & Technical Writing

For the academic communication part of the course, we will draw from the following resources:

At times, we will use video or other media to get inspiration on or insights into other aspects of the course that are not easily covered through traditional textbooks or research papers — but also to work on our oral technical communication skills. Examples include videos on Node (see explicit thesis statement) and technical writing.

We will be using LaTeX, likely with the acmart class. Students can collaborate using GitHub or a service such as Overleaf, which simplifies editing LaTeX and collaborating through a web interface. A very short guide for typesetting with LaTeX is available for free.

Projects & Milestones

Broadly, a team will focus on developing one of three classes of papers:

Projects & Milestones. Projects will focus on state-of-the-art systems. Sample projects include (i) Language-based module-level compartmentalization, (ii) Ahead-of-Time black-box performance analysis, (iii) Controlling component side-effects with /bin/try, and (iv) From distribution-oblivious systems to scalable microservices. Milestones include all phases (and corresponding sections) of writing a systems paper, starting with a project summary. Example milestones include (i) overview of key works, (ii) the key idea or thesis statement, (iii) the structure and contents of the system evaluation, and (iv) a running example. A preliminary list of potential projects and milestones can be found in this document.

Policies & Expectations

Policies. Contrary to most classes, this one encourages collaboration: the subject of the class as well as some of the tools are complex—as a result, it may be necessary to collaborate as a class on figuring out how to best internalize the content, use existing tools, and produce the results you want. This collaboration additionally helps get different perspectives on the same subject, and even catch or correct misconceptions early on. You are thus encouraged to interact with each other as much as possible.

Other Details

To facilitate course interactions, we will set up a Discord server and a GitHub organization. In the first two weeks, participants need to complete a form that, apart from asking about background details, shares pointers to several resources.

Prerequisites

The expectation is that students will have taken at least one systems course. At Brown, such courses include CS0300/1310: Systems Fundamentals, CS0320: Software Engineering, CS0330/1330: Systems Intro, CS1260: Compilers & Analysis, CS1380: Distributed Systems, and CS1650: Software Security. Familiarity with a programming language, basic mathematical maturity, and excitement are important; the course will cover the rest. If unsure about taking this course, attend the first lecture.

Whereabouts & Contact

We meet on Wednesdays, 3–5:30pm, in Salomon 003 and on Zoom. Here is a Google Calendar with all the important info.