Lecture 7: Assembly Language
» Lecture video (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 6pm Wednesday, February 19).
We've now arrived at the end of our introduction to C programming. The rest of the course will mostly look at higher-level concepts built atop the understanding we have developed. But before the look at higher levels, we will briefly pull back the covers and see what happens at the level below C to make your programs run.
| Web sites, Google, Facebook, AirBnB, etc. --- |------------------------------------------- | Distributed systems <-- block 4 C |------------------------- S | Parallel programming <-- block 3 |------------------------- 1 | C++ | Operating systems <-- block 2 3 |------------------------- 1 | C programming language <-- we discussed this so far |------------------------- | Assembly language <-- we will briefly cover this --- |------------------------------------------- | Hardware (chips)
Now that you understand the C language and memory representations of data, you may wonder about the "magic" hexadecimal bytes that the compiler outputs to make your computer's processor do things like adding numbers. How does the compiler choose these bytes, and what bytes are valid?
Each computer architecture (such as x86-64, which most modern computers use and we're considering in this course) has an instruction set specified by the manufacturer. The instruction set, first and foremost, defines what sequences of bytes trigger specific behavior in the processor (e.g., adding numbers, comparing them for equality, or loading data from memory). But hexadecimal bytes are hard for humans to read, so the instruction set also comes with a human-readable assembly language that consists of short, mnemonic instructions that correspond directly to a byte encoding (i.e., each of these instructions corresponds to a specific, unique set of hexadecimal bytes).
Why are we covering this?
You will almost certainly never need to write code in assembly language yourself, but it is helpful to have at least some intuition of how to read it. Being able to read assembly helps you debug weird problems with your program, and can help you understand compiler optimizations betters. Some of the examples we look at will also illustrate to you the tricks used to make computers run our code efficiently.
If you're curious to learn more details about assembly, consider taking CSCI 0330, which explains it in more detail and with more hands-on exercises than we'll have time for!
From C code to instructions
Let's take a look at how your C program get turned into hexadecimal bytes that can run on your processor.
The compiler, which we've discussed a lot already, turns your C program into assembly language. Lots of cleverness and optimizations.
The assembler turns assembly language into a set of bytes in an executable. This a very direct translation.
Assembly instructions operate on registers, small pieces of very fast memory inside the processor. To process data stored in memory, the processor first needs to load it into registers; and once it has completed working on the data in a register, it needs to store it back to memory.
Registers are the fastest kind of memory available in the machine. x86-64 has 14 general-purpose registers and several special-purpose registers. The table below lists all basic registers, with special-purpose registers highlighted in yellow. You won't understand all columns yet, but you will soon and can then use this table as a reference (we won't ask you to memorize it in detail). You'll notice different naming conventions for subsets of the same register, a side effect of the long history of the x86 architecture (the first x86 processor, the 8086 was first released in 1978).
|Full register name||32-bit
|Use in calling convention||Callee-saved?|
|%rax||%eax||%ax||%al||%ah||Return value (accumulator)||No|
|%rcx||%ecx||%cx||%cl||%ch||4th function argument||No|
|%rdx||%edx||%dx||%dl||%dh||3rd function argument||No|
|%rsi||%esi||%si||%sil||–||2nd function argument||No|
|%rdi||%edi||%di||%dil||–||1st function argument||No|
|%r8||%r8d||%r8w||%r8b||–||5th function argument||No|
|%r9||%r9d||%r9w||%r9b||–||6th function argument||No|
(general-purpose in some compiler modes)
(Program counter; called $pc in GDB)
|%rflags||%eflags||%flags||–||–||Flags and condition codes||No|
Note that unlike primary memory (RAM) – which is what we think of when we discuss memory in a C/C++ program
– registers have no addresses! There is no address value that, if cast to a pointer and dereferenced, would
return the contents of the
%rax register. Registers live in a separate world from the memory, and we need
special instructions to move data to and from registers and memory.
Whenever you see
%ZZZ in assembly code, this refers to a register named
ZZZ. The x86-64
registers have confusing names because they evolved over time; each register also has multiple names that
refer to different subsets of its bits. For example
%rax, one of the general-purpose registers that is,
by convention, used to pass return values from functions, is split into the following five names:
63 31 15 7 0 +-------------------------------+-------------------------------+ | | | | | +---------------------------------------------------------------+ |---------------------%rax (64 bits/8 bytes)--------------------| |-----%eax (32 bits/4 bytes)----| |-%ax (16b/2B)--| |--%ah--|--%al--| <-- 8 bits/1 byte each
Assembly instructions often have a suffix that indicates what input data size and register width they're
operating on. For instance, a set of "move" instructions help load signed and unsigned 8-, 16-, and 32-bit
quantities from memory into registers.
movzbl, for example, moves an 8-bit quantity (a byte) into
32-bit register (a longword; e.g.,
%eax) with zero extension;
movslq moves a 32-bit
quantity (longword) into a 64-bit register (quadword; e.g.,
%rax) with sign extension.
What's up with long suddenly meaning 32-bits (4 bytes)?
Because of wonderful history of the x86 architecture, and to confuse you, a "long" in x86-64 hardware terms does not refer to the same things as a
longinteger type in C. Specifically, a x86-64 assembly long is 4 bytes, so it corresponds to a C
int. The 8-byte long (or indeed any pointer type) in C uses "quad" instructions in x86-64 assembly, denoted by a q suffix.
Note that what looks like types (such as
short, etc.) here merely refers to the
register width used in the instruction. All actual types are removed from the program during compilation; there
are no types in assembly (for examples, see
asm07.s and their corresponding
C source files in the lecture code).
InstructionsThere are three basic kinds of assembly instructions:
- Computation: These instructions computate on values, typically values stored in registers. Most have
zero or one source operands and one source/destination operand, with the source operand coming first. For example,
addq %rax, %rbxperforms the computation
%rbx := %rbx + %rax.
- Data movement: These instructions move data between registers and memory – so they can move values
from one register to another, from memory into a register, and from a register back to memory. Almost all move
instructions have one source operand and one destination operand; the source operand comes first. For example,
movq %rax, %rbxcopies the contents of
%rbx, so it performs the assignment
%rbx = %rax.
- Control flow: Normally the CPU executes instructions in sequence and in the order they appear in the
assembly code (and, once translated into bytes, the order in memory). Control flow instructions change the next
instruction the processor executes (something called the "instruction pointer", and stored in special
%rip). There are unconditional branches (the instruction pointer is set to a new value), conditional branches (the instruction pointer is set to a new value if a condition is true), and function call and return instructions.
Some instructions appear to combine computation and data movement. For example, given the C code
int* pi; ...
++(*pi); the compiler might generate
incl (%rax) rather than
movl (%rax), %ebx; incl %ebx;
movl %ebx, (%rax). However, the processor actually divides these complex instructions into tiny, simpler,
invisible instructions called microcode, because the simpler instructions can be made to execute faster.
incl instruction actually runs in three phases: data movement, then computation, then data
movement. This matters when we introduce parallelism.
Different assembly syntaxes
There are actually multiple ways of writing x86-64 assembly. We use the "AT&T syntax", which is distinguished from the "Intel syntax" by several features, but especially by the use of percent signs for registers. Sadly, and just to make things more confusing, the Intel syntax puts destination registers before source registers.
How to read an assembly file
Assembly files (and assembly layout in GDB,
layout asm) can be confusing at first. The important
tricks to reading them are the following:
- Focus only on the instructions that you care about, and initially ignore anything else.
- Work backwards from the return statement.
gcc -S testasm.c) to consider:
.file "testasm.c" .text .globl _Z1fiii .type _Z1fiii, @function _Z1fiii: .LFB0: cmpl %edx, %esi je .L3 movl %esi, %eax ret .L3: movl %edi, %eax ret .LFE0: .size _Z1fiii, .-_Z1fiii .ident "GCC: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0" .section .note.GNU-stack,"",@progbits
There are many lines here that are effectively comments. All lines starting with a dot (e.g.,
.ident) are of this kind: they constitute "directives", rather than
instructions. Some directives tell the assembler what to do, but they're often unimportant to your understanding.
All directive lines that end in a colon, however, are important: they constitute labels, which matter for
control flow instructions (e.g.,
The actual instructions are on the indented lines between the labels. The processor will execute these top to
bottom unless it's told to continue somewhere other than the next instruction by a control flow instruction (e.g.,
je). A good way to figure out the important parts of the instructions is often to work backwards from
ret instructions, which is the function's return point. Why do we work backwards? Going forward
is more difficult because it requires understanding what the state of the processor's register is when we call the
function (something not explicitly described in the file here). By working backwards from the return, we can often
figure out what the function does without knowing its inputs.
Sometimes, we are also interested in looking at the assembly in an already-compiled object file – i.e.,
binary code after the translation from human-readable assembly language to bytes. This is called
"disassembling" from executable instructions, and happens when we look at assembly using GDB,
objdump -d, or
objdump -S. This output looks different from compiler-generated assembly:
in disassembled instructions, there are no intermediate labels or directives. This is because the labels and
directives disappear during the process of generating executable instructions.
Here's the disassembly of our function above, coming from an object file (e.g.
And a disassembly of the same function, from an object file: 0000000000000000 <_Z1fiii>: 0: 39 d6 cmp %edx,%esi 2: 74 03 je 7 <_Z1fiii+0x7> 4: 89 f0 mov %esi,%eax 6: c3 retq 7: 89 f8 mov %edi,%eax 9: c3 retqEverything but the instructions is removed, and the helpful
.L3label has been replaced with an actual address. The function appears to be located at address 0. This is just a placeholder; the final address is assigned by the linking process, when a final executable is created.
Finally, here is some disassembly from that actual executable:
0000000000400517 <_Z1fiii>: 400517: 39 d6 cmp %edx,%esi 400519: 74 03 je 40051e <_Z1fiii+0x7> 40051b: 89 f0 mov %esi,%eax 40051d: c3 retq 40051e: 89 f8 mov %edi,%eax 400520: c3 retqThe instructions are the same, but the addresses are different. (Other compiler flags would generate different addresses.)
Today, we looked at how the computer operates at the level just below C code: it executes a sequence of assembly instructions, which are small operations that translate into operations of the processor's circuits. Assembly is hard to write, but it is useful to be able to read it somewhat intuitively.
We saw that in assembly, there are computation, data movement, and control flow instructions, and that the compiler often produces somewhat unexpected instruction sequences to make things faster. This is part of why we use compilers: they are incredibly smart at distilling our programs down into the fastest possible sequence of instructions.
Next time, we will look at how function calls work and how the assembly instructions actually manage the automatic lifetime memory in the stack segment. After that, we will leave the low-level world of assembly and start moving up the systems stack (no pun in intended)!