x86 Assembly
x86 Assembly
x86 (32-bit) and x86-64 (64-bit) assembly is syntactic sugar for machine code that gets executed by a CPU. The two flavors of assembly are intel and att, however intel is much more digestible.
Assembly Introduction Reference
Registers
There are 16 registers available, 3 of which are special purpose.
- rbp: base pointer (points to the bottom of the stack frame)
- rsp: stack pointer (points to the top of the stack)
- rip: instruction pointer (points to the instruction to be executed)
8 Byte Register | Lower 4 Bytes | Lower 2 Bytes | Lower Byte |
---|---|---|---|
rbp | ebp | bp | bpl |
rsp | esp | sp | spl |
rip | eip | ||
rax | eax | ax | al |
rbx | ebx | bx | bl |
rcx | ecx | cx | cl |
rdx | edx | dx | dl |
rsi | esi | si | sil |
rdi | edi | di | dil |
r8 | r8d | r8w | r8b |
r9 | r9d | r9w | r9b |
r10 | r10d | r10w | r10b |
r11 | r11d | r11w | r11b |
r12 | r12d | r12w | r12b |
r13 | r13d | r13w | r13b |
r14 | r14d | r14w | r14b |
r15 | r15d | r15w | r15b |
Segment Registers
Early CPU implementations had special registers for memory segmentation, essentially allowing more addresses than could fit in a single register. The x86-64 architecture does not use segmentation in 64-bit mode, though it is still supported for backwards compatibility. In 64-bit mode, all segment registers are forced to 0 except for FS and GS which allows the OS to use them for special purposes.
Register (2 Bytes) | Description |
---|---|
cs | Code segment |
ds | Data segment |
ss | Stack segment |
es | Extra segment |
fs | Extra segment |
gs | Extra segment |
Calling Conventions
When a function is called in assembly, generally what happens:
- Save caller-saved registers (if any)
- Push the arguments onto the stack (if any)
- Push the return address onto the stack
- Push the base pointer onto the stack
- Set the stack pointer to the base pointer
- Subtract the stack pointer to make room for local variables
- The return value will be stored in rax (64-bit) or eax (32-bit)
64-bit Conventions
- Function arguments are first put into registers
rdi
,rsi
,rdx
,rcx
,r8
,r9
respectively, then pushed onto the stack in reverse order if there are more - caller-saved registers:
r10
,r11
, and any registers that parameters are put into - callee-saved registers:
rbx
,rbp
,r12
,r13
,r14
, andr15
32-bit Conventions
- All function arguments are pushed onto the stack in reverse order
- caller-saved registers:
eax
,ecx
, andedx
- callee-saved registers:
ebx
,edi
, andesi
System Calls
System calls are a way for programs to request the kernel to do something on the user’s behalf, such as open, read, write files or fork and exec processes.
In 64-bit, the syscall
instruction is used with the syscall number in
rax
. Arguments are put into registers rdi
, rsi
, rdx
, rcx
,
r8
, and r9
. rcx
and r11
will be clobbered by the syscall and
the return value will be in rax
.
In 32-bit, the int 0x80
instruction is used with the syscall number
in eax
. Arguments are put into registers rbx
, rcx
, rdx
, rsi
,
rdi
, and rbp
. The return value will be in eax
.
Common System Call Numbers
Platform | Number | Description |
---|---|---|
32-bit | 0xb | execve(char *path, char *argv[], char *envp[]) |
64-bit | 0x3b | execve(char *path, char *argv[], char *envp[]) |
Flags
Flags are stored in a special 32-bit EFLAGS
register.
Bit | Name | Description |
---|---|---|
0 | CF | Carry Flag; set if the last arithmetic operation carried or borrowed a bit beyond the size of the register |
2 | PF | Parity Flag; set if the number of set bits in the least significant byte is a multiple of 2 |
4 | AF | Adjust Flag; carry of binary coded decimal numbers arithmetic operations |
6 | ZF | Zero Flag; set if the result of an operation is 0 |
7 | SF | Sign Flag; set if the result of an operation is negative |
8 | TF | Trap Flag; set if step by step debugging |
9 | IF | Interruption Flag; set if interrupts are enabled |
10 | DF | Direction Flag; stream direction. If set, string operations will decrement their pointer rather than incrementing it, reading memory backwards |
11 | OF | Overflow Flag; set if signed arithmetic operations result in a value too large for the register to contain |
12/13 | IOPL | I/O Privilege Level field (2 bits); I/O Privilege Level of the current process |
14 | NT | Nested Task flag; controls chaining of interrupts. Set if the current process is linked to the next process |
16 | RF | Resume Flag; response to debug exceptions |
17 | VM | Virtual-8086 Mode; set if in 8086 compatibility mode |
18 | AC | Alignment Check; set if alignment checking of memory references is done |
19 | VIF | Virtual Interrupt Flag; virtual image of IF |
20 | VIP | Virtual Interrupt Pending flag; set if an interrupt is pending |
21 | ID | Identification flag; support for CPUID instruction if can be set |
Instructions
Instructions are variable length and encoded in bytes (not included here). This is not an exhaustive list.
Instruction | Description |
---|---|
mov dst, src | Move data from register src to register dst |
mov dst, [src] | Move data from memory at register src to register dst |
mov [dst], src | Move data from register src to memory at register dst |
lea dst, [src] | load effective address; move the address of src to dst |
add dst, src | Add dst and src and store the result in dst |
sub dst, src | Subtract dst and src and store the result in dst |
xor dst, src | Xor dst and src and store the result in dst |
push src | Grow the stack (subtract bytes) and copy the value in src to the top of the stack |
pop dst | Copy the value at the top of the stack into dst and shrink the stack (add bytes) |
jmp addr | Set the instruction pointer (rip ) to addr |
call addr | Push rip onto the stack, then jump to addr |
leave | Resets the stack frame; set the stack pointer to the base pointer, then pop the base pointer from stack |
ret | Pop rip from the stack |
jnz addr | Jump to addr if the zero flag is not set |
jz addr | Jump to addr if the zero flag is set |
PIE and RELRO
Position Independent Executable and Relocation Read-Only are security measures to make exploiting binaries harder.
PIE means that the assembly code is position independent, and the linker can perform address space layout randomization (ASLR). This is only referring to the main executable (i.e. not dynamically linked code). In most cases, ASLR is applied to shared libraries, the stack, and the heap regardless of PIE. ASLR works in tandem with the PLT and GOT.
The PLT (Procedure Linkage Table) is a table of function stubs that
are called in place of the actual, dynamically linked functions. The
address of the actual dynamically linked function is stored in the
GOT (Global Offset Table), which is populated by the dynamic linker
at runtime. By default, the GOT is populated lazily whenever the PLT
stub is called, however these can be controlled at linkage time with the
-zlazy
and -znow
flags to ld
or gcc
.
An executable can be checked for PIE and linkage flags with
readelf --dynamic $EXE
Additionally, there are two types of RELRO: partial and full. Partial RELRO is the default and forces the GOT to come before the BSS in memory (see ELF). This prevents a buffer overflow of a global variable overwriting GOT entries. Full RELRO makes the entire GOT read-only, which increases the startup time for the executable as all symbols must be resolved before the program starts. This can significantly impact startup times, and as such is not the default.