CSCI 0300/1310: Fundamentals of Computer Systems

⚠️ This is not the current iteration of the course! Head here for the current offering.

Lecture 7: Alignment continued, Collection Rules, and Signed Number Representation

» Lecture video 1 (last 25 min have no audio) (Brown ID required)
» Lecture video 2 (2021; 5 min, remainder of alignment) (Brown ID required)
» Lecture video 3 (2021; Signed Integers) (Brown ID required)
» Lecture code
» Post-Lecture Quiz (due 11:59pm Sunday, February 20).

Alignment and Collection Rules, continued

In previous lectures, we built up to specifying a set of rules that govern how the C language expects data to be laid out in memory. We're now ready to write down these rules.

Here are the first two:

The first member rule says that the address of a collection (array, structure, or union [see below]) is the same as the address of its first member.
The array rule says that all members of an array are laid out consecutively in memory.

How are the members of a struct like list_node_t actually laid out in memory? This is defined by the struct rule, which says that the members of a struct are laid out in the order they're declared in, without overlap, and subject only to alignment constraints. These mysterious "alignment constraints" are what make our list_node_t have a size of 16 bytes even though its members only need 12.

By the first member rule, a struct starts at the same address as its first member, so the struct itself must be aligned at least as strictly as that member. (It turns out that, in practice, structures on the heap are aligned on 16-byte boundaries because malloc() on x86-64 Linux returns 16-byte aligned pointers; structures on the stack are aligned by the compiler.)

The size of a struct might therefore be larger than the sum of the sizes of its components. Since the compiler must lay out struct components in order, must obey each component's alignment constraint, and must ensure that components don't overlap, it sometimes has to introduce extra space in structs. This space is called padding, and it's effectively wasted memory. Our linked list node is an example of a situation where padding is required: the struct will have 4 bytes of padding after int v, to ensure that the list_node_t* member that follows is correctly aligned (at an address divisible by 8).

So, we can now specify the third rule:

The struct rule says that members of a struct are laid out in declaration order, without overlap, and with the minimum padding necessary to satisfy the struct members' alignment constraints.

In addition to these rules, there are three more we haven't covered or made explicit yet.

Aside: Unions

⚠️ We did not cover unions in the course this year. The following material is for your education only; we won't test you on it. Feel free to skip ahead to the other rules.

For the next rule, we need to learn about unions, which are another collection type in C.

A union is a C data structure that looks a lot like a struct, but which contains only one of its members. Here's an example:

union int_or_char {
  int i;
  char c;
};

Any variable u of type union int_or_char is either an integer (so u.i is valid) or a char (so u.c is valid), but never both at the same time. Unions are rarely used in practice and you won't need them in this course. The size of a union is the maximum of the sizes of its members (rounded up, if necessary, to a multiple of its alignment), and its alignment is the maximum of its members' alignments.

What are unions good for?

Unions are helpful when a data structure's size is of the essence (e.g., for embedded environments like the controller chip in a microwave), and in situations where the same bytes can represent one thing or another. For example, the internet is based on a protocol called IP, and there are two versions of it: IPv4 (the old one) and IPv6 (the new one, which permits more than 4 billion computers on the internet). But there are situations where we need to pass an address that either follows the IPv4 format (4 bytes) or the IPv6 format (16 bytes). A union makes this possible without wasting memory or requiring two separate data structures.

Now we can get to the next rule!

The union rule says that the address of all members of a union is the same as the address of the union.

Back to other rules!

The remaining two rules are far more important:

The minimum rule says that the memory used for a collection shall be the minimum possible without violating any of the other rules.
The malloc rule says that any call to malloc that succeeds returns a pointer that is aligned for any type. This rule has some important consequences: it means that malloc() must return pointers aligned for the maximum alignment, which on x86-64 Linux is 16 bytes. In other words, any pointer returned from malloc points to an address that is a multiple of 16.

One consequence of the struct rule and the minimum rule is that reordering struct members can reduce the size of structures! Look at the example in mexplore-structalign.c. The struct ints_and_chars defined in that file consists of three ints and three chars, whose declarations alternate. What will the size of this structure be?

It's 24 bytes. The reason is that each int requires 4 bytes (so, 12 bytes total), and each char requires 1 byte (3 bytes total), but alignment requires the integers to start at addresses that are multiples of four! Hence, we end up with a struct layout like the following:

0x... 00 ... 04 ... 08 ... 0c ... 10 ... 14 ...   <- addresses (hex)
     +------+--+---+------+--+---+------+--+---+
     |  i1  |c1|PAD|  i2  |c2|PAD|  i3  |c3|PAD|  <- values
     +------+--+---+------+--+---+------+--+---+

This adds 9 bytes of padding – a 37.5% overhead! The padding is needed because the characters only use one byte, but the next integer has to start on an address divisible by 4.

But if we rearrange the members of the struct, declaring them in order i1, i2, i3, c1, c2, c3, the structure's memory layout changes. We now have the three integers adjacent, and since they require an alignment of 4 and are 4 bytes in size, no padding is needed between them. After the integers, we can put the characters into contiguous bytes as well, since their size and alignment are 1.

0x... 00 ... 04 ... 08 ... 0c 0d 0e 0f ...   <- addresses (hex)
     +------+------+------+--+--+--+--+
     |  i1  |  i2  |  i3  |c1|c2|c3|P.|      <- values
     +------+------+------+--+--+--+--+

We only need a single byte of padding (6.25% overhead), as the struct must be padded to 16 bytes (why? Consider an array of ints_and_chars and the alignment of the next element!). In addition, the structure is now 16 bytes in size rather than 24 bytes – a 33% saving.

Signed number representation

Why are we covering this?

Debugging computer systems often requires you to look at memory dumps and understand what the contents of memory mean. Signed numbers have a non-obvious representation (negative values will appear as very large hexadecimal values), and learning how the computer interprets hexadecimal bytes as negative numbers will help you understand better what is in memory and whether that data is what you expect. Moreover, arithmetic on signed numbers can trigger undefined behavior in non-intuitive ways; this demonstrates an instance of undefined behavior unrelated to memory access!

Recall from last time that our computers use a little endian number representation. This makes reading the values of pointers and integers from memory dumps (like those produced by our hexdump() function) more difficult, but it is how things work.

Using positional notation on bytes allows us to represent unsigned numbers very well: the higher the byte's position in the number, the greater its value. You may have wondered how we can represent negative, signed numbers in this system, however. The answer is a representation called two's complement, which is what the x86-64 architecture (and most other architectures) uses.

Two's complement strikes most people as weird when they first encounter it, but there is an intuition for it. The best way to think about it is that adding 1 to -1 should produce 0. The representation of 1 in a 4-byte integer is 0x0000'0001 (N.B.: for clarity for humans, I'm using big endian notation here; on the machine, this will be laid out as 0x0100'0000). What number, when added to this representation, yields 0?

The answer is 0xffff'ffff, the largest unsigned integer representable in 4 bytes. If we add 1 to it, we flip the lowest hex digit from f to 0 and carry a one, which flips the next digit in turn, and so on. At the end, we have:

   0x0000'0001
 + 0xffff'ffff
--------------
 0x1'0000'0000 == 0x0000'0000 (mod 2^32)

The computer simply throws away the carried 1 at the top, since it's outside the 4-byte width of the integer, and we end up with zero, since all arithmetic on fixed-size integers is modulo their size (here, 16⁸ = 2³²). You can see this in action in signed-int.c.

More generally, in two's complement arithmetic, we always have -x + x = 0, so a negative number added to its positive counterpart yields zero. The principle that makes this possible is that -x corresponds to positive x, with all bits flipped (written ~x) and 1 added. In other words, -x = ~x + 1.

Signed numbers split their range in half, with half representing negative numbers and the other half representing 0 and positive numbers. For example, a signed char can represent numbers -128 to 127 inclusive (the positive range is one smaller because it also includes 0). The most significant bit acts as a sign bit, so all signed numbers whose top bit is set to 1 are negative. Consequently, the largest positive value of a signed char is 0x7f (binary 0111'1111), and the largest-magnitude negative value is 0x80 (binary 1000'0000), representing -128. The number -1 corresponds to 0xff (binary 1111'1111), so that adding 1 to it yields zero (modulo 2⁸).

Two's complement representation has some nice properties for building hardware: for example, the processor can use the same circuits for addition and subtraction of signed and unsigned numbers. On the downside, however, signed arithmetic in C comes with a nasty property: even though the hardware would simply wrap around, arithmetic overflow on signed numbers is undefined behavior.

Summary

Today, we learned some handy rules about collections and their memory representation. Then, we reviewed how these rules interact with alignment, particularly within structs. We saw that changing the order in which members are declared in a struct can significantly affect its size, meaning that alignment matters for writing efficient systems code.

We also learned more about how computers represent integers, and in particular about how they represent negative numbers in a binary encoding called two's complement. Next time, we'll learn that certain arithmetic operations on numbers can invoke the dreaded undefined behavior, and the confusing effects this can have, before we move on to talking about assembly code.