Lecture 5: Arrays, Structures, and Alignment
» Lecture code
» Post-Lecture Quiz (due 6pm Monday, February 8).
S1: Arrays
Why are we covering this?
As programmers, we often want to deal with collections of data, such as a million integers to sort. But in C and C++, such collections ultimately turn into bytes in memory that the compiler lays out in specific ways and that we as the programmer have to interpret correctly. Once you understand arrays in C and how they relate to pointers, you will understand how computers represent sequences of data. This becomes important in the vector part of Project 1, in Project 2, and in the OS part of the course.
We've danced around the concept of arrays in C for a while now. C's arrays are simple and lay out a set of equal-size elements consecutively in memory. In many ways, arrays and strings in C are very similar. In particular, you can always think of an array as a pointer to the start of its memory, above which point the elements of the array are laid out in sequence.
C allows you to declare, define, and initialize an array in one go, using curly bracket notation:
int arr[] = { 1, 2, 3 };
declares and defines an array of three integers and immediately initializes its
contents to the numbers 1 to 3.
How large is this array in memory? It contains three integers of four bytes each, so the array will use 12 bytes in memory. Note that the length of the array (3 elements) is not stored with it in memory! In general, it's up to you as the programmer to remember what an array's length is.
All elements of an array in C must have the same type and size in memory, and the length of an array is fixed.
In other words, you cannot have an array of { int, char, int, long }
, nor can you append elements to an
array. This is similar to Java arrays, but a significant difference compared to Python or OCaml lists.
You can also declare an array in C without initializing it. To do so, you put the array's desired length into square
brackets next to the variable name: int a[5]
is an array of five integers. But what is the size
of such an array in memory? You can calculate it manually: the array must be backed by sufficient memory to hold five
integers (20 bytes, since each integer is 4 bytes long). But it turns out C also has a handy keyword to help you get
the byte sizes of its types.
Finding object sizes with sizeof
The sizeof
keyword returns the size in bytes (!) of its argument, which can either be a type or an
object. Some examples:
- sizeof(int) is 4 bytes, because integers consist of four bytes;
- sizeof(char) is 1 byte;
- sizeof(int*) and sizeof(char*) are both 8 bytes, because all pointers on a 64-bit computer are 8 bytes in size;
- for int i = 5;, sizeof(i) is also 4 bytes, because i is an integer;
- for an array int arr[] = { 1, 2, 3 }, sizeof(arr) is 12 bytes;
- for a pointer to an array, such as int* p = &arr[0], sizeof(p) is 8 bytes, independent of the size of the array.
sizeof
is at the same time great and a huge source of confusion. It is crucial to remember that
sizeof
only returns the byte size of the known compile-time type of its argument. Importantly,
sizeof
cannot return the length of an array, nor can it return the size of the memory allocation
behind a pointer. If you call sizeof(ptr)
, where ptr
is a char*
, you will get
8 bytes (since the size of a pointer is 8 bytes), independent of whether that char*
points to a much larger
memory allocation (e.g., 100 bytes on the heap).
Why does sizeof work this way?
The dirty, but amazing, secret behind sizeof is that it results in no compiled code at all. Instead, the C compiler replaces any invocation of sizeof(type) or sizeof(expression) with the byte size of the argument known at compile time (so sizeof(int) just turns into a literal 4 in the program). Hence, sizeof cannot possibly determine runtime information like the size of a memory allocation; but it also requires zero processor operations and memory at runtime!
Arrays are just pointers!
Recall our array int a[5]
of five integers. This declaration will tell the compiler to set aside sufficient
memory to hold five integers (20 bytes, since each integer is 4 bytes long).
What are the contents of the memory for a
? Again, the memory is uninitialized, so it could be
anything! To actually fill in your array, you use subscript notation with square brackets on the left-hand side of an
assignment: a[0] = 1;
, a[1] = 2;
etc.
Now here's a curious, but super important detail. a[1]
means that we're assigning into the second
element of our array. Where is that element in memory – i.e., what does a[1]
really mean in terms of
the memory boxes that we write to?
Let's figure this out from first principles. a[0]
is the first element, which starts at the first address
in the array. That address is the same as a pointer to the first byte of a
in memory. In fact, we can
rewrite a[0]
as *((int*)a + 0)
! Let's tease this apart: (int*)a
means that we
want to treat a
as a pointer to an integer in memory, and the + 0
part adds zero to it using
pointer arithmetic. Then we dereference the resulting pointer, which gives us the value at a[0]
.
By this reasoning, what does a[1]
translate to? If you guessed *((int*)a + 1)
, that's
correct! But here's a snag: remember that integers are 4 bytes long. Pointer arithmetic always increments the address
stored in the pointer by the number of bytes of its type. In this case, this means that (int*)a + 1
adds
4 bytes to the address (so it's equivalent to (char*)a + 4
). Putting it all together, a[1] = 2;
turns out to really mean "write the integer 2 into the four memory boxes starting four boxes down from the address
in a
".
Here's an important takeaway: array subscript notation and pointer arithmetic are one and the same thing! In fact, the C compiler internally just turns your square-bracket subscript notation into pointer arithmetic.
More generally, the C language definition has a rule for collections of data: the first member rule, which says that the address of a collection is the same as the address of its first member. This rule applies to arrays, but we'll shortly see that it also applies to other structures.
There is a second rule for arrays, the array rule, which says that all elements ("members") of the array are laid out consecutively in memory.
Finally, you'll now see how strings and arrays are super similar: you can think of a string as an array of
char
elements, each one byte in length, with a bonus terminator element at the n+1th
position in the array (whose memory allocation must be n+1 bytes long – easy to forget!).
S2: Structures (struct): making new types
Why are we covering this?
C would be rather restricted if you could only use its primitive types (int, char and so on). To let you define new data types of your own, the language supplies the idea of a structure (struct). Structures occur all the time in real-world C programs, and they're very important both to understanding how we can lay out data in specific patterns in memory in order to communicate with hardware (one big purpose of systems programming languages when used to write operating systems!), and to understanding how C++ classes work.
The C language uses the struct
keyword to define new data structures: objects laid out in
memory in a specific format. This is a very powerful part of the language, and you'll find yourself using structs
a lot in your projects. In particular, you will encounter structs when you implement your vector in Project 1.
A structure declaration consists of the struct
keyword, a name for this structure, and a specification
of its members within curly braces:
struct x_t {
    int i1;
    int i2;
    int i3;
    char c1;
    char c2;
    char c3;
};
This defines a structure called x_t
(the _t
suffix is a convention and indicates that we're
dealing with a user-defined type), each instance of which contains three integers and three characters.
There are several ways to obtain memory for an instance of your struct in C: using a global, static-lifetime struct,
stack-allocating a local struct with automatic lifetime, or heap-allocating dynamic-lifetime memory to hold the struct.
The example below, based on mexplore-struct.c
, shows how stack- and heap-allocated structs work.
#include <stdio.h>
#include <stdlib.h>

int main() {
    // declares a new instance of x_t on the stack (automatic lifetime)
    struct x_t stack_allocated;
    stack_allocated.i1 = 1;
    stack_allocated.c1 = 'A';
    printf("stack-allocated structure at %p: i1 = %d (addr: %p), c1 = %c (addr: %p)\n",
           (void*)&stack_allocated,  // need to take address, as stack_allocated is not a pointer
           stack_allocated.i1,
           (void*)&stack_allocated.i1,
           stack_allocated.c1,
           (void*)&stack_allocated.c1);

    // makes a new instance of x_t on the heap (dynamic lifetime)
    struct x_t* heap_allocated = (struct x_t*)malloc(sizeof(struct x_t));
    if (heap_allocated == NULL) {
        return 1;  // allocation failed
    }
    heap_allocated->i1 = 3;
    heap_allocated->c1 = 'X';
    printf("heap-allocated structure at %p: i1 = %d (addr: %p), c1 = %c (addr: %p)\n",
           (void*)heap_allocated,  // already an address, so no & needed here
           heap_allocated->i1,
           (void*)&heap_allocated->i1,
           heap_allocated->c1,
           (void*)&heap_allocated->c1);
    free(heap_allocated);  // release the heap memory once we are done with it
    return 0;
}
Observe that we access struct members in two different ways: when the struct is a value (e.g., a
stack-allocated struct), we access member i1
as stack_allocated.i1
, using a dot to separate
variable and member name. (This is the same syntax that you'd use to access members of Java objects.) But if we're
dealing with a pointer to a struct (such as the pointer returned from malloc()
for our
heap-allocated struct), we use ->
to separate variable and member name. The arrow syntax
(->
) implicitly dereferences the pointer and then accesses the member. In other words,
heap_allocated->i1
is identical to (*heap_allocated).i1
.
Different ways to initialize a struct
Like arrays, structures also have an initializer list syntax that makes it easy for you to set the values of their members when creating a struct. For example, you could write struct x_t my_x = { 1, 2, 3, 'A', 'B', 'C' };, or even only partially initialize the struct via struct x_t my_x2 = { .i2 = 42, .c3 = 'X' };. Members that an initializer list leaves out are set to zero. A struct declared without any initializer is different: the values of its members in practice depend on where the memory comes from (static segment data is initialized to zeros; other segments are not), so it's generally best to treat such memory as uninitialized and set all members.
Sick of writing struct x_t all the time?
Normally, you always need to put the struct keyword in front of your new struct type whenever you use it. But this gets tedious, and the C language provides the helpful keyword typedef to save you some work. You can use typedef with a struct definition like this:
typedef struct {
    int i1;
    int i2;
    int i3;
    char c1;
    char c2;
    char c3;
} x_t;
... and henceforth you just write x_t to refer to your struct type.
By the first member rule, a pointer to a struct (like heap_allocated
above) always points to
the address of its first member.
S3: Linked List Example
Now let's build a useful data structure! We'll look at a linked list of integers here (linked-list.c
).
This actually consists of two structures: one to represent the list as a whole (list_t
) and one to
represent nodes in the list (list_node_t
). The list_t
structure contains a pointer to the
first node of the list, and (in this simple implementation) nothing else. The list_node_t
structure
contains the node's value (an int
) and a pointer to the next list_node_t
in memory.
typedef struct list_node {
    int value;
    struct list_node* next;
} list_node_t;

typedef struct list {
    list_node_t* head;
} list_t;
Why does the next pointer in list_node_t have type struct list_node*, not list_node_t*?
C compilers do not allow recursively-defined type definitions. In particular, you cannot use the type you're defining via typedef within its own definition. You can, however, use a struct pointer within the structure's definition. Think of it this way: struct list_node is already known to the compiler when the pointer occurs in the definition, but list_node_t isn't yet, as its definition only ends with the semicolon.
Note that you can only nest a pointer to a struct in its own definition, not an instance of the struct itself. Try to think of why that must be the case, remembering that C types must have fixed memory sizes at compile time!
A function to append a node to this list must take two arguments: the list to append to (a list_t*
) and
the element to append (an int
). It then needs to check if the list is empty (list->head == NULL
); if it is not, append()
needs to find the end of the list. It does so by following the
next
pointer in each node until it encounters a list_node_t
whose next
pointer is
NULL
. Once we have the end of the list, we allocate a new list_node_t
using
malloc()
, set its value and initialize its next
pointer to NULL
(as this will be
the new end of the list). Finally, we change the pointer of the current list end (either list->head for an empty list, or cur->next for a non-empty one) to point to the new node.
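void append(list_t* list, int value) {
    // find the last node; cur stays NULL if the list is empty
    list_node_t* cur = list->head;
    if (cur != NULL) {
        while (cur->next != NULL) {
            cur = cur->next;
        }
    }
    // allocate the new node, which becomes the new end of the list
    list_node_t* new_node = (list_node_t*)malloc(sizeof(list_node_t));
    new_node->next = NULL;
    new_node->value = value;
    // link it in: after the old last node, or as the head of an empty list
    if (cur != NULL) {
        cur->next = new_node;
    } else {
        list->head = new_node;
    }
}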
Exercise: how would you write a method to obtain the ith element of a linked list of integers?
The signature of this method is int* at(list_t* l, int index), i.e., it takes the list and an index as its arguments and returns a pointer to the integer stored at that index, and NULL if the index does not exist.
int* at(list_t* l, int index) {
    list_node_t* cur = l->head;
    if (cur == NULL) {
        return NULL;
    }
    int i = 0;
    do {
        if (i == index) {
            return &cur->value;
        }
        cur = cur->next;
        i++;
    } while (cur != NULL);
    return NULL;
}
This solution uses a do ... while loop, which is similar to a while loop, except that it executes at least once and the condition is checked after executing the loop.
What's the size of our two structs involved here? list is 8 bytes in size, because it only contains a pointer, and list_node needs 12 bytes for its members, as it contains a 4-byte int and an 8-byte pointer. (For reasons that we'll understand soon, sizeof(list_node_t) actually returns 16 bytes, however.)
S4: Alignment
Why are we covering this?
Since C requires you to work closely with memory addresses, it is important to understand how the compiler lays out data in memory, and why the layout may not always be exactly what you expect. If you understand alignment, you will get pointer arithmetic and byte offsets right when you deal with them, and you will understand why programs sometimes use more memory than you would think based on your data structure specifications.
The chips in your computer are very good at working with fixed-size numbers. This is the reason why the basic integer
types in C grow in powers of two (char
= 1 byte, short
= 2 bytes, int
= 4 bytes,
long
= 8 bytes). But it further turns out that the computer can only work efficiently if these fixed-size
numbers are aligned at specific addresses in memory. This is especially important when dealing with structs,
which could be of arbitrary size based on their definition, and could have odd memory layouts following the struct
rule.
Just like each primitive type has a size, it also has an alignment. The alignment means that all objects of this type must start at an address divisible by the alignment. In other words, an integer with size 4 and alignment 4 must always start at an address divisible by 4. (This applies independently of whether the object is inside a collection, such as a struct or array, or not.) The table below shows the alignment restrictions of primitive types on an x86-64 Linux machine.
Type | Size | Address restriction |
---|---|---|
char (signed char, unsigned char) | 1 | No restriction |
short (unsigned short) | 2 | Multiple of 2 |
int (unsigned int) | 4 | Multiple of 4 |
long (unsigned long) | 8 | Multiple of 8 |
float | 4 | Multiple of 4 |
double | 8 | Multiple of 8 |
T* | 8 | Multiple of 8 |
The reason for this lies in the way hardware is constructed: to end up with simpler wiring and logic, computers often move fixed amounts of data around. In particular, when the computer's processor accesses memory, it actually does not go directly to RAM (the random access memory whose chips hold our bytes). Instead, it accesses a fast piece of memory that contains a tiny subset of the contents of RAM (this is called a "cache" and we'll learn more about it in future lectures!). But building logic that can copy memory at any arbitrary byte address in RAM into this smaller memory would be hugely complicated, so the hardware designers chunk RAM into fixed-size "blocks" that can be copied efficiently. The size of these blocks differs between computers, but their existence reveals why alignment is necessary.
Let's assume there were no alignment constraints, and consider a situation like the one shown in the following:
          |  4B int   |                        <-- unaligned integer stored across block boundary
          | 2B  | 2B  |                        <-- 2 bytes in block k, 2 bytes in block k+1
----+-----------+-----------+-----------+--
... |  block k  | block k+1 | block k+2 | ...  <- memory blocks ("cache lines")
----+-----------+-----------+-----------+--
An unaligned integer could end up being stored across the boundary between two memory blocks. This would require the processor to fetch two blocks of RAM into its fast cache memory, which would not only take longer, but also make the circuits much harder to build. With alignment, the circuit can assume that every integer (and indeed, every primitive type in C) is always contained entirely in one memory block.
                |  4B int   |                  <-- aligned integer stored entirely in one block
                |    4B     |                  <-- all 4 bytes in block k+1
----+-----------+-----------+-----------+--
... |  block k  | block k+1 | block k+2 | ...  <- memory blocks ("cache lines")
----+-----------+-----------+-----------+--
The compiler, standard library, and operating system all work together to enforce alignment restrictions. If you want
to get the alignment of a type in a C program, you can use the sizeof
operator's cousin alignof
.
In other words, alignof(int)
is replaced with 4 by the compiler, and similarly for other types.
We can now write down a precise definition of alignment: The alignment of a type T
is a number
a
≥ 1 such that the address of every object of type T
is a multiple of a
.
Every object with type T
has size
sizeof(T)
, meaning that it occupies sizeof(T)
contiguous bytes of memory; and each object of
type T
has alignment alignof(T)
, meaning that the address of its first byte is a multiple of
alignof(T)
.
You might wonder what the maximum alignment is – the larger an alignment, the more memory might get wasted by
being unusable! It turns out that the 64-bit architectures we use today have maximum 16-byte alignment, which is
sufficient for the largest primitive type, long double
.
Note that structs are not primitive types, so they aren't as such subject to alignment constraints. However, each struct has a first member, and by the first member rule for collections, the address of the struct is the address of the first member. Struct members are ultimately primitive types (even with nested structures, eventually you'll end up with primitive type members after expansion), and those members do need to be aligned. We will talk more about this next time!
Alignment constraints also apply when the compiler lays out variables on the stack. mexplore-order.c
illustrates this: with all int
variables and char
variables defined consecutively, we end up
with the memory addresses we might expect (the three int
s are consecutive in memory, and the three
char
s are in the bytes below them). But if I move c1
up to declare it just after
i1
, the compiler leaves a gap below the character, so that the next integer is aligned correctly on a
four-byte boundary.
But: if we turn on compiler optimizations, there is no gap! The compiler has reordered the variables on the stack to
avoid wasting memory: all integers are again consecutive in memory, even though we didn't declare them in that order.
This is permitted, as there is no rule about the order of stack-allocated variables in memory (nor is there one about
the order of heap-allocated ones, though addresses returned from malloc()
do need to be aligned). If these
variables were in a struct
(as in x_t
), however, the compiler could not perform this
optimization because the struct rule forbids reordering members.
Summary
Today, we focused on how the C language represents collections of objects, and specifically looked at arrays and structs. We learned some handy rules about collections and their memory representation, which are summarized below:
- The first member rule says that the address of a collection is the same as the address of its first member.
- The array rule says that all members of an array are laid out consecutively in memory.
We also figured out pointer arithmetic in more detail and understood how it's related to array subscript syntax.
We also learned how a linked list of integers can be built from two C structures, and implemented a simple function that appends to our list. This is similar to what you'll do in the vector part of Project 1, except that you'll build a vector and not a linked list.
We also explored the tricky subject of alignment in memory, where the compiler sometimes wastes memory to
achieve faster program execution, and learned how the bytes of types larger than a char
are actually laid out in memory.