Introduction

Logistics

For now, the only thing to do is check out the course website. It has a lot of information about course policies, plans for assignments, and so on. If you have any questions, please email Doug.

Margaret Hamilton, Software Engineering, and the Moon

Margaret Hamilton was the Director of the Software Engineering Division at MIT’s Draper Laboratory in the ’60s. Among other things, she was in charge of the team that developed the in-flight software that ran on the Apollo 11 mission, which took humans to the Moon. She was also one of the people who coined the term “software engineering.” More on that in a second.

How many lines of code would you guess were in the Apollo in-flight system? The answer is about 145,000.

In CSCI 0111, you mostly wrote programs that were less than 100 lines of code or so (usually much less). When writing a program of that size, you might be able to hold the whole thing in your head at once; the whole thing might even fit on your screen! When you’re writing 145,000 lines of code, that’s probably not possible. The Apollo 11 team had to be able to think about particular pieces of code in isolation, without worrying about the details of other parts of the code. Software engineering is the study of building reliable software, and techniques for developing components in isolation are its bread and butter. An example of this kind of technique: functions! When you write a function that does some particular task, the code that calls that function doesn’t need to worry about the function’s body. We’ll be learning several other such techniques in this course.
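Here’s a tiny sketch of that idea in Python. The names `average` and `report` are made up for illustration and aren’t from any course assignment. The point is only that `report` relies on what `average` returns, not on how its body is written.

```python
def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


def report(scores):
    # The caller only relies on what average() promises to return,
    # not on how its body computes it.
    return f"Average score: {average(scores):.1f}"


print(report([90, 80, 100]))
```

If we later rewrote the body of `average` (say, with an explicit loop instead of `sum`), `report` wouldn’t need to change at all.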

By the way, here’s a chunk of the Apollo 11 code I grabbed at random (the source code is available here):

SETWO           TC      WOZERO          # GO SET WORD ORDER CODE TO ZERO.
        +1      CA      DNECADR         # RELOAD A WITH THE DNADR.
        +2      AD      MINB1314        # IS THIS A REGULAR DNADR?
                EXTEND
                BZMF    FETCH2WD        # YES.  (A MUST NEVER BE ZERO)
                AD      MINB12          # NO.  IS IT A POINTER (DNPTR) OR A
                EXTEND                  #       CHANNEL(DNCHAN)
                BZMF    DODNPTR         # IT'S A POINTER.  (A MUST NEVER BE ZERO)

DODNCHAN        TC      6               # (EXECUTED AS EXTEND)  IT'S A CHANNEL
                INDEX   DNECADR
                INDEX   0       -4000   # (EXECUTED AS READ)
                TS      L
                TC      6               # (EXECUTED AS EXTEND)
                INDEX   DNECADR
                INDEX   0       -4001   # (EXECUTED AS READ)
                TS      DNECADR         # SET DNECADR
                CA      NEGONE          #       TO MINUS
                XCH     DNECADR         #               WHILE PRESERVING A.
                TCF     DNTMEXIT        # GO SEND CHANNELS

What do we notice about this code? A couple of the things I noticed:

  • Almost every line of code has a comment explaining what it’s doing! (The comments are the bits after the `#` on each line.)
  • Nevertheless, the code is pretty hard to read. It’s written in a very old, very low-level programming language called AGC assembly language.

Luckily, in this class you’ll be working in much higher-level languages: Python (which you learned in CSCI 0111!) and Scala. These languages have built-in support for the kinds of software engineering techniques and concepts we’ll learn in the course (like functions!). I want to emphasize, however, that clean code with well-isolated components can be written in any programming language!

As an aside: Apollo 11 had 145,000 lines of code. Google has about 2 billion lines of code. Part of what makes that possible is that Google’s code is mostly written in languages like Python, not in AGC assembly language.

Sorting cards

Software engineering techniques will be one big theme in this course. Another theme will be algorithms.

In class, we did an exercise where we sorted playing cards and observed the steps we followed. See the lecture capture for details.

A sequence of steps taken to accomplish some task is called an algorithm. The algorithm most people seem to use to sort playing cards goes roughly like this:

  1. Start with the first two cards in your hand.
  2. If these cards are out of order, swap them.
  3. Now the first two cards in your hand are sorted. Look at the next card and put it in the right place in the sorted “section” at the beginning.
  4. Keep going like this until you’ve sorted your whole hand.

Let’s say we start with a hand like: 7, 2, K, 5, J. How will we sort this hand? The sequence of steps will look something like this:

7, 2, K, 5, J
2, 7, K, 5, J
2, 7, K, 5, J
2, 5, 7, K, J
2, 5, 7, J, K

The sequence of numbered steps described above is an algorithm: a precisely (if informally) described sequence of steps that accomplishes a task. In fact, if we replace “hand” with “list” and “card” with “item,” computer scientists have a name for this algorithm: Insertion Sort.
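As a rough preview, here is one way the numbered steps could look in Python. This is a sketch, not necessarily the version we’ll develop in lecture. The `RANKS` list, the `rank` helper, and the choice to treat aces as high are all assumptions made for this example.

```python
# Hypothetical rank order for this example: 2 is lowest, ace is highest.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]


def rank(card):
    """Return the position of a card in RANKS (higher means a bigger card)."""
    return RANKS.index(card)


def insertion_sort(hand):
    """Return a new list with the cards of `hand` in increasing rank order."""
    cards = list(hand)  # copy, so the caller's hand is left unchanged
    for i in range(1, len(cards)):
        current = cards[i]
        j = i - 1
        # Shift bigger cards in the sorted section one place to the right.
        while j >= 0 and rank(cards[j]) > rank(current):
            cards[j + 1] = cards[j]
            j -= 1
        cards[j + 1] = current  # drop the card into the gap
    return cards


print(insertion_sort(["7", "2", "K", "5", "J"]))
```

Each pass of the outer loop corresponds to “look at the next card and put it in the right place in the sorted section.”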

Another algorithm that a couple of students used goes roughly like this:

  1. Pick out the lowest card and move it to the front of your hand.
  2. Pick out the next lowest card and move it after that.
  3. Keep going like this until you’ve sorted your whole hand.

Computer scientists call this algorithm Selection Sort.
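The steps above might be sketched in Python roughly as follows. To keep things short, this sketch assumes cards are represented as plain integers (so J is 11 and K is 13), and the name `selection_sort` is just a label for this example. It moves the lowest remaining card forward, as in the description; many textbook versions swap two cards instead.

```python
def selection_sort(hand):
    """Return a new list with the values of `hand` in increasing order."""
    cards = list(hand)  # copy, so the caller's hand is left unchanged
    for start in range(len(cards)):
        # Find the position of the lowest card in the unsorted part.
        lowest = start
        for k in range(start + 1, len(cards)):
            if cards[k] < cards[lowest]:
                lowest = k
        # Pull that card out and move it to the end of the sorted part.
        cards.insert(start, cards.pop(lowest))
    return cards


# Assumed encoding of the hand 7, 2, K, 5, J as integers.
print(selection_sort([7, 2, 13, 5, 11]))
```

The outer `for` loop is one way of spelling out “keep going like this”: it repeats the pick-and-move step once for each position in the hand.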

Notice what this algorithm isn’t: a computer program. This is an abstract description of a computation, rather than a particular program that can be executed in a particular language. We can implement this algorithm as a computer program by making it more concrete. We’ve written a pretty precise sequence of steps above, and most people could probably execute the algorithm without making any mistakes if they worked slowly and carefully. Computers, however, need a little bit more precision. For instance, what does “keep going like this” mean?

Next time, we’ll see how to go from the Insertion Sort algorithm to Python code.