Milestone 5: Distributed Execution Engine

Collaboration: Individual milestone
Completion: About 12–16 hours
Deadline: Tuesday Apr. 9, 2024 (11:59PM ET)
Latest handout version: CS1380:2024:M5
GitHub repo:

The goal of this milestone is to implement a scalable programming model and abstractions for processing large data sets in a distributed fashion. The core of the execution engine combines two higher-order functions: a map function, which performs filtering and sorting, and a reduce function, which performs a summarization operation. These functions operate over the distributed storage system implemented in the previous milestone: the engine serializes the distributed functions, attaches and runs the various tasks in parallel, and manages all communication and data transfer between the parts of the system.
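
To make the map and reduce roles concrete, here is a minimal single-process sketch of the model in Python. It is an illustration only, not the milestone's required interface: the names `run_map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical, the dataset is a plain in-memory dictionary standing in for the distributed store, and the phases run sequentially rather than in parallel across nodes.

```python
from collections import defaultdict


def word_count_map(key, value):
    """Map (hypothetical example): emit one (word, 1) pair per word in the text."""
    return [(word, 1) for word in value.split()]


def word_count_reduce(key, values):
    """Reduce (hypothetical example): summarize all counts emitted for one word."""
    return (key, sum(values))


def run_map_reduce(dataset, map_fn, reduce_fn):
    """Run map, shuffle, and reduce sequentially over an in-memory dataset.

    Assumption: `dataset` is a dict of key -> value. In a distributed engine,
    each node would instead run `map_fn` over the keys it stores locally.
    """
    # Map phase: apply map_fn to every (key, value) pair.
    mapped = []
    for key, value in dataset.items():
        mapped.extend(map_fn(key, value))

    # Shuffle phase: group intermediate values by their output key.
    groups = defaultdict(list)
    for out_key, out_value in mapped:
        groups[out_key].append(out_value)

    # Reduce phase: apply reduce_fn once per distinct key.
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))


docs = {"doc1": "a b a", "doc2": "b c"}
counts = run_map_reduce(docs, word_count_map, word_count_reduce)
print(counts)
```

In a distributed setting, the shuffle step is where intermediate pairs would move between nodes, and the map and reduce functions would need to be serialized and sent to the nodes that hold the data.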