ELF File Introduction: From Sections to Segments
Use structure, examples, and tools to connect ELF types, layout, relocations, and dynamic linking.
ELF (Executable and Linkable Format) is the most common object and executable file format on Unix-like systems. Understanding its structure helps you connect the key steps of compile → link → load → run.
Three ELF types
- Relocatable: Object files (.o) emitted by the compiler/assembler, waiting for the linker to merge and fix addresses.
- Executable: A program that can be loaded and run directly.
- Shared Object: A shared library (.so) linked at runtime.
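The three types are recorded in the e_type field of the ELF header. The following is a minimal sketch of reading that field, assuming only the header layout from the ELF specification; the function name elf_type and the sample bytes are made up for illustration.

```python
import struct

# e_type values defined by the ELF specification.
ET_NAMES = {1: "Relocatable", 2: "Executable", 3: "Shared Object"}

def elf_type(header: bytes) -> str:
    """Return the ELF type name from the first 18 or more bytes of a file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, f"other ({e_type})")

# Hypothetical header prefix for a little-endian 64-bit object file.
sample = b"\x7fELF\x02\x01\x01" + b"\x00" * 9 + struct.pack("<H", 1)
print(elf_type(sample))  # prints "Relocatable"
```

To try it on a real file, pass open(path, "rb").read(18). Note that position-independent executables are marked with the same e_type value as shared objects, so this check alone does not distinguish them.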
From source to execution: a clear chain
Source/Assembly
↓ Compile/Assemble
Object file (.o, with Sections)
↓ Link
Executable (ELF, with Segments)
↓ Loader
Mapped into memory and executed
Two perspectives: Section vs Segment
ELF provides two views:
- Linker view: ELF is a set of Sections that store code, data, symbols, relocations, etc.
- Loader view: ELF is a set of Segments that describe memory mappings and permissions (R/W/X).
A simplified correspondence:
Section view (linker)       Segment view (loader)
[ELF Header]                [ELF Header]
[Section Header Table]      [Program Header Table]
.text                       LOAD (R-X)  <- .text
.data                       LOAD (RW-)  <- .data + .bss
.bss
.symtab/.strtab
.rel.*
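The correspondence above can be sketched as a grouping step: sections that occupy memory are collected by the permissions they need, and everything else is left out of the loadable segments. This is a toy model, not how a real linker decides layout (that is driven by its linker script); the flag letters assigned to each section are assumptions in readelf notation.

```python
from collections import defaultdict

# (name, flags) pairs mirroring the simplified table; flags use readelf letters.
sections = [
    (".text", "AX"),
    (".data", "WA"),
    (".bss", "WA"),
    (".symtab", ""),
    (".strtab", ""),
    (".rel.text", ""),
]

def group_into_segments(sections):
    """Group allocatable sections by the permissions they need."""
    segments = defaultdict(list)
    for name, flags in sections:
        if "A" not in flags:
            continue  # not mapped into memory, so no LOAD segment covers it
        perms = "R" + ("W" if "W" in flags else "-") + ("X" if "X" in flags else "-")
        segments[perms].append(name)
    return dict(segments)

print(group_into_segments(sections))
# prints {'R-X': ['.text'], 'RW-': ['.data', '.bss']}
```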
Relocatable file example: ELF Header + Sections
A relocatable object (.o) header excerpt from readelf -h:
ELF Header:
  Class:                      ELF64
  Data:                       2's complement, little endian
  Type:                       REL (Relocatable file)
  Machine:                    Advanced Micro Devices X86-64
  Entry point address:        0x0
  Start of section headers:   0x2c0 (bytes into file)
  Number of section headers:  12
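For readers who want to see where these fields live, here is a minimal sketch that decodes a little-endian ELF64 header with Python's struct module. The field order follows the Elf64_Ehdr layout in the ELF specification; the sample header is synthetic, built to mirror the excerpt, and its e_shstrndx value of 11 is made up.

```python
import struct

# Little-endian Elf64_Ehdr: e_ident[16], then the fixed-width fields (64 bytes).
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")

def parse_ehdr64(data: bytes) -> dict:
    """Decode the header fields that readelf -h reports."""
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = EHDR64.unpack_from(data)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 header")
    return {"type": e_type, "machine": e_machine, "entry": e_entry,
            "phoff": e_phoff, "phnum": e_phnum,
            "shoff": e_shoff, "shnum": e_shnum}

# Synthetic header mirroring the excerpt; e_shstrndx=11 is a made-up value.
ident = b"\x7fELF\x02\x01\x01" + b"\x00" * 9
sample = EHDR64.pack(ident, 1, 62, 1, 0, 0, 0x2C0, 0, 64, 0, 0, 64, 12, 11)
info = parse_ehdr64(sample)
print(info["type"], info["machine"], hex(info["shoff"]), info["shnum"])
# prints: 1 62 0x2c0 12
```

Type 1 is REL and machine 62 is x86-64. A relocatable file has no program headers, which is why phoff and phnum are zero in this sample.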
A readelf -S excerpt (key columns only):
[Nr] Name       Type      Addr  Off   Size  ES  Flg  Lk  Inf  Al
[ 1] .text      PROGBITS  0000  0040  0038  00  AX    0    0  16
[ 2] .rel.text  REL       0000  0210  0018  08        5    1   8
[ 3] .data      PROGBITS  0000  0080  0020  00  WA    0    0   8
[ 4] .bss       NOBITS    0000  00a0  0010  00  WA    0    0   8
[ 5] .symtab    SYMTAB    0000  0240  00f0  18        6    8   8
[ 6] .strtab    STRTAB    0000  0330  0048  00        0    0   1
Field notes:
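- Addr: Virtual address of the section in memory (0 in relocatable files, because the linker has not assigned addresses yet).
- Off/Size: File offset and size in bytes (hexadecimal). For NOBITS sections such as .bss, Size is the size in memory; no bytes are stored in the file.
- Flg: Section attributes (A=alloc, occupies memory at run time; X=executable; W=writable).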
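The letters in the Flg column are a rendering of bits in the section header's sh_flags field. A small sketch of that decoding, using the three bit values defined by the ELF specification (the helper name flag_letters is made up):

```python
# Bit values from the ELF specification.
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4

def flag_letters(sh_flags: int) -> str:
    """Render the W/A/X bits in the order used in the table (W, A, X)."""
    letters = ""
    if sh_flags & SHF_WRITE:
        letters += "W"
    if sh_flags & SHF_ALLOC:
        letters += "A"
    if sh_flags & SHF_EXECINSTR:
        letters += "X"
    return letters

print(flag_letters(0x6))  # .text: alloc + exec, prints "AX"
print(flag_letters(0x3))  # .data/.bss: write + alloc, prints "WA"
```

Sections such as .symtab have none of these bits set, which is why their Flg column is empty.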
Object file layout (simplified)
0x0000 ELF Header
0x0040 .text
0x0080 .data
0x00a0 .bss (no file bytes)
0x0210 .rel.text
0x0240 .symtab
0x0330 .strtab
0x02c0 Section Header Table
Executable example: Program Headers and Segments
After linking, an executable includes Program Headers:
Program Headers:
  Type  Offset  VirtAddr  FileSiz  MemSiz  Flg  Align
  LOAD  0x0000  0x400000  0x0800   0x0800  R E  0x1000
  LOAD  0x1000  0x601000  0x0200   0x0300  RW   0x1000
Explanation:
- First LOAD: Contains .text, permissions R-X.
- Second LOAD: Contains .data + .bss, permissions RW-. MemSiz > FileSiz usually means .bss occupies memory only.
Use readelf -l to view Section to Segment mapping and see how Sections are merged into Segments.
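As a worked check of the two LOAD entries, the sketch below computes how many bytes of each segment exist only in memory (MemSiz minus FileSiz) and verifies that file offset and virtual address agree modulo the alignment, which is what lets the loader map the file page by page. The numbers are copied from the Program Headers excerpt; the helper names are made up.

```python
# Values copied from the Program Headers excerpt.
segments = [
    {"offset": 0x0000, "vaddr": 0x400000, "filesz": 0x0800, "memsz": 0x0800, "align": 0x1000},
    {"offset": 0x1000, "vaddr": 0x601000, "filesz": 0x0200, "memsz": 0x0300, "align": 0x1000},
]

def zero_fill_bytes(seg) -> int:
    """Bytes present in memory but not in the file (typically .bss)."""
    return seg["memsz"] - seg["filesz"]

def is_congruent(seg) -> bool:
    """File offset and virtual address must match modulo the alignment."""
    return seg["offset"] % seg["align"] == seg["vaddr"] % seg["align"]

for seg in segments:
    print(hex(seg["vaddr"]), hex(zero_fill_bytes(seg)), is_congruent(seg))
# prints:
# 0x400000 0x0 True
# 0x601000 0x100 True
```

The second segment has 0x100 bytes that the loader zero-fills, matching the gap between FileSiz and MemSiz.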
Relocation: turning placeholders into real addresses
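Object files often contain placeholder addresses that the linker fixes using .rel.* entries (or .rela.* entries, which carry an explicit addend and are the form x86-64 toolchains normally emit).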
A simplified example (pseudo-assembly):
mov data_items(,%rdi,4), %eax   # load one element of a global array (absolute address)
In the object file the encoding may contain placeholders:
8b 04 bd 00 00 00 00
After linking it becomes a real address (here 0x080490a0, stored in little-endian byte order):
8b 04 bd a0 90 04 08
Corresponding relocation entry (excerpt):
Relocation section '.rel.text' contains 1 entry:
  Offset  Info  Type         Sym.Name
  0x0008  ...   R_X86_64_32  data_items
Key idea: the linker patches specific offsets based on relocation tables.
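The patching step can be sketched in a few lines. This is an illustrative model of a 32-bit absolute relocation in REL style (the addend is whatever is already stored at the patch site), not the code of any real linker. The five padding bytes are an assumption chosen so that the placeholder lands at offset 0x0008, as in the relocation entry; the target address is the one shown in the linked encoding.

```python
import struct

def apply_abs32(code: bytearray, offset: int, symbol_addr: int) -> None:
    """Patch a 32-bit absolute relocation: stored value = symbol + addend."""
    (addend,) = struct.unpack_from("<I", code, offset)
    struct.pack_into("<I", code, offset, (symbol_addr + addend) & 0xFFFFFFFF)

# Hypothetical .text contents: 5 bytes of padding, then the instruction
# whose 4-byte placeholder starts 3 bytes in (offset 5 + 3 = 0x8).
text = bytearray(5) + bytearray.fromhex("8b04bd00000000")

apply_abs32(text, 0x8, 0x080490A0)
print(" ".join(f"{b:02x}" for b in text[5:]))
# prints: 8b 04 bd a0 90 04 08
```

Only the four placeholder bytes change; the opcode bytes before them are untouched.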
Shared libraries and PIC / GOT / PLT
Shared objects must load at arbitrary addresses, so they use PIC (position-independent code):
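- GOT (Global Offset Table): A table of address slots that the dynamic linker fills in at run time with the real addresses of variables/functions.
- PLT (Procedure Linkage Table): A table of small jump stubs, one per external function, used for lazy binding.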
A typical PLT entry (simplified):
push@plt:
jmp *GOT[push]
pushq $reloc_index
jmp plt0
The first call enters the dynamic linker; subsequent calls jump directly via the GOT entry.
Dynamic linking flow (high level)
- The dynamic linker loads required shared libraries.
- First call to an external symbol triggers PLT resolution.
- The resolved address is written to the GOT.
- Later calls jump directly to the resolved address.
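The flow above can be modeled as a toy simulation, with a dictionary standing in for the GOT. This only illustrates the control flow of lazy binding; the symbol table, the address, and the names call_via_plt and RESOLVER are all hypothetical, and a real dynamic linker works on machine addresses rather than Python objects.

```python
# Hypothetical address the "dynamic linker" would find in a loaded library.
LIBRARY_SYMBOLS = {"push": 0x7F0000001000}

RESOLVER = "resolver"        # stands in for the PLT0 -> dynamic linker path
got = {"push": RESOLVER}     # before the first call, the slot points back to the resolver
resolutions = []             # records how often the slow path runs

def call_via_plt(name: str) -> int:
    """Model 'jmp *GOT[name]': resolve on first use, then jump directly."""
    if got[name] == RESOLVER:
        resolutions.append(name)            # first call: enter the dynamic linker
        got[name] = LIBRARY_SYMBOLS[name]   # write the resolved address to the GOT
    return got[name]                        # the address control transfers to

first = call_via_plt("push")
second = call_via_plt("push")
print(hex(first), hex(second), resolutions)
# prints: 0x7f0000001000 0x7f0000001000 ['push']
```

The resolver runs once; the second call finds the real address already in the GOT slot.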
Useful tools
# ELF header, sections, segments
readelf -h a.out
readelf -S a.out
readelf -l a.out
# Disassembly and symbols
objdump -d a.out
objdump -t a.out
nm -n a.out
# Size, deps, strings
size a.out
ldd a.out
strings a.out
Summary
The key to ELF: linkers care about Sections, loaders care about Segments. Relocatable files emphasize linkability, executables emphasize loadability, and shared objects emphasize position independence and dynamic linking.
FAQ
Q1: Why keep the Section Header Table in executables?
A: Loaders do not need it, but debuggers and analysis tools rely on it.
Q2: Why does .bss not occupy file space?
A: Its contents are all zeros, so the file only records the size; the memory is allocated and zero-filled at load time.
Q3: Why are so many addresses 0 in object files?
A: They are placeholders patched later using relocation records.
Q4: Why do shared libraries require PIC?
A: They must load at different addresses across processes, so absolute addresses cannot be hard-coded.
Q5: Why are segment permissions page-based?
A: The MMU enforces permissions per page, so code and data are typically mapped into separate pages.
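As a small worked example of page granularity, the sketch below rounds a segment's address range out to page boundaries. The 4 KiB page size is an assumption (it matches the 0x1000 alignment in the program headers shown earlier), and page_range is a made-up helper name.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages

def page_range(vaddr: int, memsz: int, page_size: int = PAGE_SIZE):
    """Return the page-aligned [start, end) range that covers a segment."""
    start = vaddr & ~(page_size - 1)                             # round down
    end = (vaddr + memsz + page_size - 1) & ~(page_size - 1)     # round up
    return start, end

print([hex(x) for x in page_range(0x400000, 0x800)])  # ['0x400000', '0x401000']
print([hex(x) for x in page_range(0x601000, 0x300)])  # ['0x601000', '0x602000']
```

Even though the data segment needs only 0x300 bytes, a full page is mapped, and its permissions apply to that whole page.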