Author: Varun Gandhi (@typesanitizer)
Compilers generate assembly in the penultimate stage of compilation.
The assembly is processed by an assembler, which generates machine code for subsequent processing by a linker.
Assembly typically consists of instructions, each of which has a name and zero or more operands.
Code example showing comparison between assembly and Python.
Example 1: nop in assembly is like pass in Python.
Example 2: inc rax in assembly is like rax += 1 in Python.
Example 3: mov rax 42 in assembly is like rax = 42 in Python.
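To make the "name plus zero or more operands" shape concrete, here is a minimal sketch of a toy interpreter for the three instructions above. The run function, the tuple encoding of instructions, and the dict of registers are all assumptions made for illustration; this is not how a real assembler or CPU works, and mov only accepts a constant here.

```python
# Toy model (hypothetical): registers live in a dict, and each instruction
# is a tuple of (name, operand, operand, ...).
def run(program, registers):
    for name, *operands in program:
        if name == "nop":
            pass  # zero operands, no effect
        elif name == "inc":
            (dst,) = operands
            # 64-bit registers wrap around instead of growing without bound
            registers[dst] = (registers[dst] + 1) % 2**64
        elif name == "mov":
            dst, value = operands  # only "register, constant" is modelled
            registers[dst] = value
        else:
            raise ValueError(f"unknown instruction: {name}")
    return registers

regs = run([("nop",), ("mov", "rax", 42), ("inc", "rax")], {"rax": 0})
print(regs["rax"])  # 43
```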
Assembly instructions are specific to a family of processors.
A laptop with the x86_64 architecture may have the following assembly:
mov rax [rdx]
call bake_cake
For the same task, a phone with the arm64 architecture may have the following assembly:
ldr x0 [x1]
bl bake_cake
We will focus on x86_64 assembly.
Registers provide scratch space for processors to do calculations.
Data is loaded from memory into registers, where it is used for computation, and the results are stored back into memory.
Between the registers and memory sit one or more levels of CPU cache.
Processors can also use constants (aka immediates) for calculations.
Registers can be special purpose or general purpose.
rax is a general purpose register that is used for calculations and passing arguments and return values.
RFLAGS is a special purpose register that holds bitflags for information related to overflow, comparisons and so on.
Registers can overlap!
For example, the eax register is 32 bits wide. The lower 16 bits of eax form the ax register. The upper 8 bits of ax form the ah register. The lower 8 bits of ax form the al register.
eax itself represents the lower 32 bits of the rax register!
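The overlap can be sketched with plain bit masks and shifts. The sub_registers function below is a hypothetical helper, not anything the processor exposes; it only shows which bits of a 64-bit rax value each smaller name refers to.

```python
# Hypothetical helper: derive the overlapping views from one rax value.
def sub_registers(rax):
    eax = rax & 0xFFFF_FFFF  # lower 32 bits of rax
    ax = eax & 0xFFFF        # lower 16 bits of eax
    ah = (ax >> 8) & 0xFF    # upper 8 bits of ax
    al = ax & 0xFF           # lower 8 bits of ax
    return {"eax": eax, "ax": ax, "ah": ah, "al": al}

views = sub_registers(0x1122_3344_5566_7788)
for name, value in views.items():
    print(name, hex(value))
# eax 0x55667788
# ax 0x7788
# ah 0x77
# al 0x88
```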
Some instructions modify bitflags to provide additional information.
add rax 7
jo .overflow
Adding 7 to rax sets OF (overflow flag) to 1 on signed overflow; unsigned overflow is reported in CF (carry flag) instead. If OF is 1, jo makes execution jump to the instruction immediately after the .overflow label.
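As a rough sketch of how a 64-bit add sets its flags, the add64 function below (a hypothetical name) returns the wrapped result together with CF and OF. On x86, add sets OF when the result does not fit as a signed value and CF when it does not fit as an unsigned value; the other flags that add updates are left out here.

```python
MASK = 2**64 - 1

def to_signed(x):
    # Reinterpret a 64-bit pattern as a two's complement signed integer.
    return x - 2**64 if x >= 2**63 else x

def add64(a, b):
    """Return (result, CF, OF) for a 64-bit add of two bit patterns."""
    total = a + b
    result = total & MASK
    cf = int(total > MASK)  # unsigned result did not fit in 64 bits
    signed_sum = to_signed(a) + to_signed(b)
    of = int(not (-2**63 <= signed_sum < 2**63))  # signed result did not fit
    return result, cf, of

print(add64(2**63 - 1, 7))  # (9223372036854775814, 0, 1): jo would jump
print(add64(2**64 - 1, 7))  # (6, 1, 0): carry only, jo would not jump
```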
Different addressing modes can be tricky to understand.
Example 1: mov [rax] rdx in assembly is like *rax = rdx in C.
Example 2: mov rax [rdx + 2*rbx] in assembly is like rax = rdx[2*rbx] in C.
Terms and conditions apply: The exact mapping may require additional factors of 2, 4, or 8 depending on the types of different variables in the C code.
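To see where the extra factors come from, here is a minimal sketch that models memory as a Python bytearray. The helper names effective_address, load64 and store64 are made up for this example. Addresses in assembly count bytes, so indexing an array of 8-byte values needs a scale of 8, whereas C's rdx[rbx] applies that factor implicitly based on the pointer type.

```python
# Hypothetical model: memory is a flat bytearray, addresses are byte offsets.
def effective_address(base, index=0, scale=1):
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86_64 only allows a scale of 1, 2, 4, or 8")
    return base + scale * index

def load64(memory, address):
    # Read 8 bytes in little-endian order, as x86_64 does.
    return int.from_bytes(memory[address:address + 8], "little")

def store64(memory, address, value):
    memory[address:address + 8] = value.to_bytes(8, "little")

memory = bytearray(64)
# An array of three 8-byte integers starting at address 16.
for i, value in enumerate([100, 200, 300]):
    store64(memory, 16 + 8 * i, value)

# mov rax [rdx + 8*rbx]  is like  rax = rdx[rbx]  for 8-byte elements
rdx, rbx = 16, 2
rax = load64(memory, effective_address(rdx, rbx, scale=8))
print(rax)  # 300

# mov [rax] rdx  is like  *rax = rdx
rax, rdx = 40, 7
store64(memory, rax, rdx)
print(load64(memory, 40))  # 7
```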
Compiler Explorer (https://godbolt.org) is a friendly tool to explore the assembly generated by different compilers.
In practice, understanding assembly can be tricky due to the large number of instructions and concepts.
Search keywords: instruction selection, register allocation, position-independent code, global offset table, disassembler, SIMD instructions, atomic instructions, memory model.
Learning resources: Intel and ARM architecture manuals, Compiler Explorer, Agner Fog instruction tables, Computer Architecture and Compilers courses, Shenzhen I/O.