Due Date: Thursday, February 27, 6 PM. You CANNOT use late hours on the written homework.
Read the following paper excerpts and answer the questions below. Submit your answers as a single PDF on Gradescope here. TAs won’t be answering questions about the writeup in office hours; you should be able to answer the questions below through reading comprehension alone.
PipeDream – All Sections.
Megatron-LM – All Sections.
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM – Sections 1, 3, Skim Evaluation.
The total length of your response should be about 600-700 words, with approximate breakdowns specified below. Please adhere to the breakdowns; you will be penalized if, for example, your summary is too long and your other answers are too short. Additionally, we expect you to cite specific examples and evidence from the papers when answering the questions.