CS 131/CSCI 1310: Fundamentals of Computer Systems

Lecture 7: Assembly Language

» Lecture video (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 6pm Wednesday, February 19).

Assembly Language

We've now arrived at the end of our introduction to C programming. The rest of the course will mostly look at higher-level concepts built atop the understanding we have developed. But before the look at higher levels, we will briefly pull back the covers and see what happens at the level below C to make your programs run.

    | Web sites, Google, Facebook, AirBnB, etc.
--- |-------------------------------------------
    | Distributed systems     <-- block 4
 C  |-------------------------
 S  | Parallel programming    <-- block 3
    |-------------------------
 1  | C++ | Operating systems <-- block 2
 3  |-------------------------
 1  | C programming language  <-- we discussed this so far
    |-------------------------
    | Assembly language       <-- we will briefly cover this
--- |-------------------------------------------
    | Hardware (chips)

Now that you understand the C language and memory representations of data, you may wonder about the "magic" hexadecimal bytes that the compiler outputs to make your computer's processor do things like adding numbers. How does the compiler choose these bytes, and what bytes are valid?

Each computer architecture (such as x86-64, which most modern computers use and we're considering in this course) has an instruction set specified by the manufacturer. The instruction set, first and foremost, defines what sequences of bytes trigger specific behavior in the processor (e.g., adding numbers, comparing them for equality, or loading data from memory). But hexadecimal bytes are hard for humans to read, so the instruction set also comes with a human-readable assembly language that consists of short, mnemonic instructions that correspond directly to a byte encoding (i.e., each of these instructions corresponds to a specific, unique set of hexadecimal bytes).

Why are we covering this?

You will almost certainly never need to write code in assembly language yourself, but it is helpful to have at least some intuition of how to read it. Being able to read assembly helps you debug weird problems with your program, and can help you understand compiler optimizations betters. Some of the examples we look at will also illustrate to you the tricks used to make computers run our code efficiently.
If you're curious to learn more details about assembly, consider taking CSCI 0330, which explains it in more detail and with more hands-on exercises than we'll have time for!

From C code to instructions

Let's take a look at how your C program get turned into hexadecimal bytes that can run on your processor.

The compiler, which we've discussed a lot already, turns your C program into assembly language. Lots of cleverness and optimizations.

The assembler turns assembly language into a set of bytes in an executable. This a very direct translation.

Registers

Assembly instructions operate on registers, small pieces of very fast memory inside the processor. To process data stored in memory, the processor first needs to load it into registers; and once it has completed working on the data in a register, it needs to store it back to memory.

Registers are the fastest kind of memory available in the machine. x86-64 has 14 general-purpose registers and several special-purpose registers. The table below lists all basic registers, with special-purpose registers highlighted in yellow. You won't understand all columns yet, but you will soon and can then use this table as a reference (we won't ask you to memorize it in detail). You'll notice different naming conventions for subsets of the same register, a side effect of the long history of the x86 architecture (the first x86 processor, the 8086 was first released in 1978).

Full register name	32-bit (bits 0–31)	16-bit (bits 0–15)	8-bit low (bits 0–7)	8-bit high (bits 8–15)	Use in calling convention	Callee-saved?
General-purpose registers:
%rax	%eax	%ax	%al	%ah	Return value (accumulator)	No
%rbx	%ebx	%bx	%bl	%bh	–	Yes
%rcx	%ecx	%cx	%cl	%ch	4th function argument	No
%rdx	%edx	%dx	%dl	%dh	3rd function argument	No
%rsi	%esi	%si	%sil	–	2nd function argument	No
%rdi	%edi	%di	%dil	–	1st function argument	No
%r8	%r8d	%r8w	%r8b	–	5th function argument	No
%r9	%r9d	%r9w	%r9b	–	6th function argument	No
%r10	%r10d	%r10w	%r10b	–	–	No
%r11	%r11d	%r11w	%r11b	–	–	No
%r12	%r12d	%r12w	%r12b	–	–	Yes
%r13	%r13d	%r13w	%r13b	–	–	Yes
%r14	%r14d	%r14w	%r14b	–	–	Yes
%r15	%r15d	%r15w	%r15b	–	–	Yes
Special-purpose registers:
%rsp	%esp	%sp	%spl	–	Stack pointer	Yes
%rbp	%ebp	%bp	%bpl	–	Base pointer (general-purpose in some compiler modes)	Yes
%rip	%eip	%ip	–	–	Instruction pointer (Program counter; called $pc in GDB)	*
%rflags	%eflags	%flags	–	–	Flags and condition codes	No

Note that unlike primary memory (RAM) – which is what we think of when we discuss memory in a C/C++ program – registers have no addresses! There is no address value that, if cast to a pointer and dereferenced, would return the contents of the %rax register. Registers live in a separate world from the memory, and we need special instructions to move data to and from registers and memory.

Whenever you see %ZZZ in assembly code, this refers to a register named ZZZ. The x86-64 registers have confusing names because they evolved over time; each register also has multiple names that refer to different subsets of its bits. For example %rax, one of the general-purpose registers that is, by convention, used to pass return values from functions, is split into the following five names:

 63                              31              15      7       0
 +-------------------------------+-------------------------------+
 |                               |               |       |       |
 +---------------------------------------------------------------+

 |---------------------%rax (64 bits/8 bytes)--------------------|
                                 |-----%eax (32 bits/4 bytes)----|
                                                 |-%ax (16b/2B)--|
                                                 |--%ah--|--%al--| <-- 8 bits/1 byte each

Assembly instructions often have a suffix that indicates what input data size and register width they're operating on. For instance, a set of "move" instructions help load signed and unsigned 8-, 16-, and 32-bit quantities from memory into registers. movzbl, for example, moves an 8-bit quantity (a byte) into 32-bit register (a longword; e.g., %eax) with zero extension; movslq moves a 32-bit quantity (longword) into a 64-bit register (quadword; e.g., %rax) with sign extension.

What's up with long suddenly meaning 32-bits (4 bytes)?

Because of wonderful history of the x86 architecture, and to confuse you, a "long" in x86-64 hardware terms does not refer to the same things as a long integer type in C. Specifically, a x86-64 assembly long is 4 bytes, so it corresponds to a C int. The 8-byte long (or indeed any pointer type) in C uses "quad" instructions in x86-64 assembly, denoted by a q suffix.

Note that what looks like types (such as long, short, etc.) here merely refers to the register width used in the instruction. All actual types are removed from the program during compilation; there are no types in assembly (for examples, see asm06.s and asm07.s and their corresponding C source files in the lecture code).

Instructions

There are three basic kinds of assembly instructions:

Computation: These instructions computate on values, typically values stored in registers. Most have zero or one source operands and one source/destination operand, with the source operand coming first. For example, the instruction addq %rax, %rbx performs the computation %rbx := %rbx + %rax.
Data movement: These instructions move data between registers and memory – so they can move values from one register to another, from memory into a register, and from a register back to memory. Almost all move instructions have one source operand and one destination operand; the source operand comes first. For example, movq %rax, %rbx copies the contents of %rax into %rbx, so it performs the assignment %rbx = %rax.
Control flow: Normally the CPU executes instructions in sequence and in the order they appear in the assembly code (and, once translated into bytes, the order in memory). Control flow instructions change the next instruction the processor executes (something called the "instruction pointer", and stored in special register %rip). There are unconditional branches (the instruction pointer is set to a new value), conditional branches (the instruction pointer is set to a new value if a condition is true), and function call and return instructions.

Some instructions appear to combine computation and data movement. For example, given the C code int* pi; ... ++(*pi); the compiler might generate incl (%rax) rather than movl (%rax), %ebx; incl %ebx; movl %ebx, (%rax). However, the processor actually divides these complex instructions into tiny, simpler, invisible instructions called microcode, because the simpler instructions can be made to execute faster. The complex incl instruction actually runs in three phases: data movement, then computation, then data movement. This matters when we introduce parallelism.

Different assembly syntaxes

There are actually multiple ways of writing x86-64 assembly. We use the "AT&T syntax", which is distinguished from the "Intel syntax" by several features, but especially by the use of percent signs for registers. Sadly, and just to make things more confusing, the Intel syntax puts destination registers before source registers.

How to read an assembly file

Assembly files (and assembly layout in GDB, layout asm) can be confusing at first. The important tricks to reading them are the following:

Focus only on the instructions that you care about, and initially ignore anything else.
Work backwards from the return statement.

Here's an example assembly file (output from gcc -S testasm.c) to consider:

    .file "testasm.c"
    .text
    .globl  _Z1fiii
    .type   _Z1fiii, @function
_Z1fiii:
.LFB0:
    cmpl    %edx, %esi
    je  .L3
    movl    %esi, %eax
    ret
.L3:
    movl    %edi, %eax
    ret
.LFE0:
    .size   _Z1fiii, .-_Z1fiii
    .ident  "GCC: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0"
    .section        .note.GNU-stack,"",@progbits

There are many lines here that are effectively comments. All lines starting with a dot (e.g., .file or .ident) are of this kind: they constitute "directives", rather than instructions. Some directives tell the assembler what to do, but they're often unimportant to your understanding. All directive lines that end in a colon, however, are important: they constitute labels, which matter for control flow instructions (e.g., .L3:).

The actual instructions are on the indented lines between the labels. The processor will execute these top to bottom unless it's told to continue somewhere other than the next instruction by a control flow instruction (e.g., je). A good way to figure out the important parts of the instructions is often to work backwards from the ret instructions, which is the function's return point. Why do we work backwards? Going forward is more difficult because it requires understanding what the state of the processor's register is when we call the function (something not explicitly described in the file here). By working backwards from the return, we can often figure out what the function does without knowing its inputs.

Sometimes, we are also interested in looking at the assembly in an already-compiled object file – i.e., binary code after the translation from human-readable assembly language to bytes. This is called "disassembling" from executable instructions, and happens when we look at assembly using GDB, objdump -d, or objdump -S. This output looks different from compiler-generated assembly: in disassembled instructions, there are no intermediate labels or directives. This is because the labels and directives disappear during the process of generating executable instructions.

Here's the disassembly of our function above, coming from an object file (e.g. testasm.o):

And a disassembly of the same function, from an object file:

0000000000000000 <_Z1fiii>:
   0:   39 d6                   cmp    %edx,%esi
   2:   74 03                   je     7 <_Z1fiii+0x7>
   4:   89 f0                   mov    %esi,%eax
   6:   c3                      retq
   7:   89 f8                   mov    %edi,%eax
   9:   c3                      retq

Everything but the instructions is removed, and the helpful .L3 label has been replaced with an actual address. The function appears to be located at address 0. This is just a placeholder; the final address is assigned by the linking process, when a final executable is created.

Finally, here is some disassembly from that actual executable:

0000000000400517 <_Z1fiii>:
  400517:   39 d6                   cmp    %edx,%esi
  400519:   74 03                   je     40051e <_Z1fiii+0x7>
  40051b:   89 f0                   mov    %esi,%eax
  40051d:   c3                      retq
  40051e:   89 f8                   mov    %edi,%eax
  400520:   c3                      retq

The instructions are the same, but the addresses are different. (Other compiler flags would generate different addresses.)

Summary

Today, we looked at how the computer operates at the level just below C code: it executes a sequence of assembly instructions, which are small operations that translate into operations of the processor's circuits. Assembly is hard to write, but it is useful to be able to read it somewhat intuitively.

We saw that in assembly, there are computation, data movement, and control flow instructions, and that the compiler often produces somewhat unexpected instruction sequences to make things faster. This is part of why we use compilers: they are incredibly smart at distilling our programs down into the fastest possible sequence of instructions.

Next time, we will look at how function calls work and how the assembly instructions actually manage the automatic lifetime memory in the stack segment. After that, we will leave the low-level world of assembly and start moving up the systems stack (no pun in intended)!