Disassembly C code for fun – Part 9: arrays
An array is a list of a fixed number of elements of the same type, stored in a contiguous block of memory. Strings are a type of array: a fixed number of char elements with a NUL byte ('\0') as the last element of the array.
The assembly generated for an array is therefore pretty similar to the one generated for strings.
Simple array
The code I’ll use in this example:
#include <stdio.h>
int main() {
int n[] = {1,2,3,4,5};
int i;
for (int i = 0; i <5; i++) {
printf("%d\n", n[i]);
}
return 0;
}
which generates this disassembly:
0x0000000100000e60 <main+0>: push %rbp
0x0000000100000e61 <main+1>: mov %rsp,%rbp
0x0000000100000e64 <main+4>: sub $0x30,%rsp
0x0000000100000e68 <main+8>: mov 0x1c9(%rip),%rax # 0x100001038
0x0000000100000e6f <main+15>: mov (%rax),%rax
0x0000000100000e72 <main+18>: mov %rax,-0x8(%rbp)
0x0000000100000e76 <main+22>: movl $0x0,-0xc(%rbp)
0x0000000100000e7d <main+29>: mov 0xcc(%rip),%rax # 0x100000f50
0x0000000100000e84 <main+36>: mov %rax,-0x20(%rbp)
0x0000000100000e88 <main+40>: mov 0xc9(%rip),%rax # 0x100000f58
0x0000000100000e8f <main+47>: mov %rax,-0x18(%rbp)
0x0000000100000e93 <main+51>: mov 0xc7(%rip),%ecx # 0x100000f60
0x0000000100000e99 <main+57>: mov %ecx,-0x10(%rbp)
0x0000000100000e9c <main+60>: movl $0x0,-0x28(%rbp)
0x0000000100000ea3 <main+67>: cmpl $0x5,-0x28(%rbp)
0x0000000100000eaa <main+74>: jge 0x100000ed9 <main+121>
0x0000000100000eb0 <main+80>: lea 0xad(%rip),%rdi # 0x100000f64
0x0000000100000eb7 <main+87>: movslq -0x28(%rbp),%rax
0x0000000100000ebb <main+91>: mov -0x20(%rbp,%rax,4),%esi
0x0000000100000ebf <main+95>: mov $0x0,%al
0x0000000100000ec1 <main+97>: callq 0x100000f0c <dyld_stub_printf>
0x0000000100000ec6 <main+102>: mov %eax,-0x2c(%rbp)
0x0000000100000ec9 <main+105>: mov -0x28(%rbp),%eax
0x0000000100000ecc <main+108>: add $0x1,%eax
0x0000000100000ed1 <main+113>: mov %eax,-0x28(%rbp)
0x0000000100000ed4 <main+116>: jmpq 0x100000ea3 <main+67>
0x0000000100000ed9 <main+121>: mov 0x158(%rip),%rax # 0x100001038
0x0000000100000ee0 <main+128>: mov (%rax),%rax
0x0000000100000ee3 <main+131>: mov -0x8(%rbp),%rcx
0x0000000100000ee7 <main+135>: cmp %rcx,%rax
0x0000000100000eea <main+138>: jne 0x100000efb <main+155>
0x0000000100000ef0 <main+144>: mov $0x0,%eax
0x0000000100000ef5 <main+149>: add $0x30,%rsp
0x0000000100000ef9 <main+153>: pop %rbp
0x0000000100000efa <main+154>: retq
0x0000000100000efb <main+155>: callq 0x100000f00 <dyld_stub___stack_chk_fail>
We can strip the prologue, the epilogue and the array initialisation with its canary (explained in the previous post) and reduce the code to just the body of the loop, from 0x100000eb0 to 0x100000ed1:
0x0000000100000eb0 <main+80>: lea 0xad(%rip),%rdi # 0x100000f64
0x0000000100000eb7 <main+87>: movslq -0x28(%rbp),%rax
0x0000000100000ebb <main+91>: mov -0x20(%rbp,%rax,4),%esi
0x0000000100000ebf <main+95>: mov $0x0,%al
0x0000000100000ec1 <main+97>: callq 0x100000f0c <dyld_stub_printf>
0x0000000100000ec6 <main+102>: mov %eax,-0x2c(%rbp)
0x0000000100000ec9 <main+105>: mov -0x28(%rbp),%eax
0x0000000100000ecc <main+108>: add $0x1,%eax
0x0000000100000ed1 <main+113>: mov %eax,-0x28(%rbp)
The first instruction is trivial: the LEA at 0x100000eb0 loads the address of the printf() format string into RDI.
The next two instructions are responsible for loading the correct array element into ESI, to be used in the printf() function call. The MOVSLQ at 0x100000eb7 sign-extends the loop counter stored at RBP-0x28 into RAX; the MOV at 0x100000ebb then does a calculation to point to the start of the array plus an offset. The formula is in the form:
address = displacement + base + index * scale
and replacing the terms with our values from the code:
address = -0x20 + RBP + RAX * 4
The first two terms of the operation (RBP-0x20) are the memory location of the first element of the array on the stack (the initialisation code copies the five values to the 20 bytes starting at RBP-0x20); the third and fourth terms together are the offset. On the first iteration of the loop's body the content of RAX is 0, so the result of the multiplication is 0; on the second iteration RAX is 1, so the result is 4, and so on up to 16 for the fifth element.
The number 4 is the size of a single element of the array, which in our case is the size of a 32-bit int (4 bytes). If an array of short or char had been defined, 2 or 1 would have been used as the second term of the offset's multiplication.
Optimisations
The code generated with -O3 is not exciting: the loop is completely unrolled and the array’s values are hardcoded. However, building the code with the -O1 optimisation is a little more interesting:
0x0000000100000eda <main+10>: xor %ebx,%ebx
0x0000000100000edc <main+12>: lea 0x6d(%rip),%r15 # 0x100000f50
0x0000000100000ee3 <main+19>: lea 0x7a(%rip),%r14 # 0x100000f64
0x0000000100000eea <main+26>: nopw 0x0(%rax,%rax,1)
0x0000000100000ef0 <main+32>: mov (%rbx,%r15,1),%esi
0x0000000100000ef4 <main+36>: mov %r14,%rdi
0x0000000100000ef7 <main+39>: xor %al,%al
0x0000000100000ef9 <main+41>: callq 0x100000f1a <dyld_stub_printf>
0x0000000100000efe <main+46>: add $0x4,%rbx
0x0000000100000f02 <main+50>: cmp $0x14,%ebx
0x0000000100000f05 <main+53>: jne 0x100000ef0 <main+32>
0x0000000100000f07 <main+55>: xor %eax,%eax
The RBX register is initialised to zero and will hold the offset from the base address of the array. The R15 register holds the base address of the array and R14 the address of the format string. The NOP that follows aligns the next instruction to a memory location that is a multiple of 16; as we saw in a previous post, this is a CPU cache optimisation.
The next instructions, from 0x100000ef0 to 0x100000f05, are the body of the loop. Now it is the RBX register that is incremented by 4 (instruction at 0x100000efe) on every iteration of the loop, until the offset points just past the last element of the array. The CMP at 0x100000f02 checks whether RBX is 20 (0x14): 20 divided by 4 equals 5, which would be the zero-based index of the next array element to be printed, but our array has only 5 elements and thus a maximum zero-based index of 4.