CFN Cloud
2026-01-09

ELF File Introduction: From Sections to Segments

Use structure, examples, and tools to connect ELF types, layout, relocations, and dynamic linking.

ELF (Executable and Linkable Format) is the standard object, executable, and shared-library format on Linux and most other Unix-like systems. Understanding its structure helps you connect the key steps of compile → link → load → run.

Three ELF types

  • Relocatable: Object files (.o) emitted by the compiler/assembler, waiting for the linker to merge and fix addresses.
  • Executable: A program that can be loaded and run directly.
  • Shared Object: A shared library (.so) linked at runtime.
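
As a minimal sketch of how a tool tells these three types apart: the 16-bit e_type field sits right after the 16-byte e_ident array at the start of every ELF file. The helper below decodes it from raw header bytes. The name elf_type is invented for this example, and the sample header is synthetic (built in the script, not read from a real file).

```python
import struct

# e_type values defined by the ELF specification.
ET_NAMES = {
    1: "REL (Relocatable file)",
    2: "EXEC (Executable file)",
    3: "DYN (Shared object file)",
}

def elf_type(header: bytes) -> str:
    """Return a readable name for the e_type field of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field immediately after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, f"other (0x{e_type:x})")

# Synthetic header prefix: magic, ELF64, little endian, version 1, padding,
# then e_type = 1 (relocatable). These bytes are an assumption for the demo.
sample = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", 1)
print(elf_type(sample))
```

To inspect a real file, pass the first 18 bytes read with open(path, "rb"). Position-independent executables (PIE) also report DYN, so the type alone does not distinguish them from shared libraries.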

From source to execution: a clear chain

Source/Assembly
  ↓ Compile/Assemble
Object file (.o, with Sections)
  ↓ Link
Executable (ELF, with Segments)
  ↓ Loader
Mapped into memory and executed

Two perspectives: Section vs Segment

ELF provides two views:

  • Linker view: ELF is a set of Sections that store code, data, symbols, relocations, etc.
  • Loader view: ELF is a set of Segments that describe memory mappings and permissions (R/W/X).

A simplified correspondence:

Section view (linker)               Segment view (loader)
[ELF Header]                        [ELF Header]
[Section Header Table]              [Program Header Table]
  .text                               LOAD (R-X) <- .text
  .data                               LOAD (RW-) <- .data + .bss
  .bss
  .symtab/.strtab
  .rel.* / .rela.*
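
Both views start from the same ELF header, which records where each table lives. The sketch below reads those fields from a 64-byte ELF64 header. It assumes the standard Elf64_Ehdr layout; locate_tables and TableInfo are names made up for this example, and the header is synthetic: the section table offset (0x2c0) and count (12) follow the relocatable-file excerpt in this article, while the remaining values (such as the section-name index 11) are placeholders.

```python
import struct
from typing import NamedTuple

class TableInfo(NamedTuple):
    phoff: int  # file offset of the Program Header Table (loader view)
    phnum: int  # number of program headers (segments)
    shoff: int  # file offset of the Section Header Table (linker view)
    shnum: int  # number of section headers

def locate_tables(header: bytes) -> TableInfo:
    """Read both table locations from a 64-byte ELF64 header."""
    if header[:4] != b"\x7fELF" or header[4] != 2:  # EI_CLASS 2 = ELF64
        raise ValueError("expected an ELF64 header")
    endian = "<" if header[5] == 1 else ">"
    # Elf64_Ehdr fields after the 16-byte e_ident array, in order:
    # type, machine, version, entry, phoff, shoff, flags,
    # ehsize, phentsize, phnum, shentsize, shnum, shstrndx
    f = struct.unpack_from(endian + "HHIQQQIHHHHHH", header, 16)
    return TableInfo(phoff=f[4], phnum=f[9], shoff=f[5], shnum=f[11])

# Synthetic relocatable-object header (values partly assumed, see lead-in).
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
obj_header = ident + struct.pack(
    "<HHIQQQIHHHHHH",
    1, 62, 1,        # type REL, machine x86-64, version 1
    0, 0, 0x2C0, 0,  # entry, phoff (none), shoff, flags
    64, 0, 0,        # ehsize, phentsize, phnum (no program headers)
    64, 12, 11,      # shentsize, shnum, shstrndx (placeholder)
)
print(locate_tables(obj_header))
```

A relocatable object typically has no program headers at all (phnum is 0), which is why only the section view applies to it; a linked executable carries both tables.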

Relocatable file example: ELF Header + Sections

A relocatable object (.o) header excerpt from readelf -h:

ELF Header:
  Class:                             ELF64
  Data:                              2's complement, little endian
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Entry point address:               0x0
  Start of section headers:          0x2c0 (bytes into file)
  Number of section headers:         12

A readelf -S excerpt (key columns only):

[Nr] Name       Type      Addr   Off    Size   ES Flg Lk Inf Al
[ 1] .text      PROGBITS  0000   0040   0038   00 AX  0  0  16
[ 2] .rela.text RELA      0000   0210   0018   18     5  1  8
[ 3] .data      PROGBITS  0000   0080   0020   00 WA  0  0  8
[ 4] .bss       NOBITS    0000   00a0   0010   00 WA  0  0  8
[ 5] .symtab    SYMTAB    0000   0240   00f0   18     6  8  8
[ 6] .strtab    STRTAB    0000   0330   0048   00     0  0  1

Field notes:

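  • Addr: Virtual address of the section at run time (0 in relocatable files; assigned by the linker).
  • Off/Size: File offset and size in bytes, in hexadecimal (for NOBITS sections such as .bss, the size applies to memory only).
  • Flg: Section attributes (A = occupies memory at run time, X = executable, W = writable).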
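
To make the columns concrete, here is a rough sketch that decodes one 64-byte Elf64_Shdr entry and renders its flags the way readelf prints them. It assumes a little-endian ELF64 file; parse_shdr and flag_letters are names chosen for this example, and the two entries are packed by hand from the .text and .bss rows above (the name field is left at 0 as a placeholder, since a real value is an offset into the section-name string table).

```python
import struct

SHT_PROGBITS = 1
SHT_NOBITS = 8  # section type used by .bss: size in memory, no bytes in the file
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4

SHDR_FORMAT = "<IIQQQQIIQQ"  # little-endian Elf64_Shdr, 64 bytes

def flag_letters(sh_flags: int) -> str:
    """Render the W/A/X bits in the order readelf prints them."""
    letters = ""
    if sh_flags & SHF_WRITE:
        letters += "W"
    if sh_flags & SHF_ALLOC:
        letters += "A"
    if sh_flags & SHF_EXECINSTR:
        letters += "X"
    return letters

def parse_shdr(raw: bytes) -> dict:
    """Decode one section header into the fields discussed above."""
    (_name, sh_type, flags, addr, offset, size,
     _link, _info, _align, _entsize) = struct.unpack_from(SHDR_FORMAT, raw)
    return {
        "flags": flag_letters(flags),
        "addr": addr,
        "offset": offset,
        "size": size,
        # NOBITS sections record a size but contribute no file bytes.
        "file_bytes": 0 if sh_type == SHT_NOBITS else size,
    }

# Hand-packed entries based on the .text and .bss rows of the excerpt.
text_hdr = struct.pack(SHDR_FORMAT, 0, SHT_PROGBITS,
                       SHF_ALLOC | SHF_EXECINSTR, 0, 0x40, 0x38, 0, 0, 16, 0)
bss_hdr = struct.pack(SHDR_FORMAT, 0, SHT_NOBITS,
                      SHF_WRITE | SHF_ALLOC, 0, 0xA0, 0x10, 0, 0, 8, 0)

print(parse_shdr(text_hdr))
print(parse_shdr(bss_hdr))
```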

Object file layout (simplified)

0x0000  ELF Header
0x0040  .text
0x0080  .data
0x00a0  .bss (no file bytes)
0x0210  .rela.text
0x0240  .symtab
0x0330  .strtab
0x02c0  Section Header Table (illustrative; in a real file it does not overlap .symtab)

Executable example: Program Headers and Segments

After linking, an executable includes Program Headers:

Program Headers:
  Type   Offset  VirtAddr  FileSiz MemSiz  Flg Align
  LOAD   0x0000  0x400000  0x0800  0x0800  R E 0x1000
  LOAD   0x1000  0x601000  0x0200  0x0300  RW  0x1000

Explanation:

  • First LOAD: Contains .text, permissions R-X.
  • Second LOAD: Contains .data + .bss, permissions RW-.
  • MemSiz > FileSiz usually means .bss occupies memory only.

readelf -l also prints a "Section to Segment mapping" table, which shows which Sections were merged into each Segment.
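
The MemSiz and FileSiz arithmetic can be checked with a small sketch. It decodes one Elf64_Phdr entry, assuming a little-endian ELF64 file, and reports the permissions and the number of zero-filled bytes. describe_segment is a name invented here, and the entry is packed by hand from the second LOAD line above (the physical address is assumed equal to the virtual address).

```python
import struct

PT_LOAD = 1
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4
PHDR_FORMAT = "<IIQQQQQQ"  # little-endian Elf64_Phdr, 56 bytes

def describe_segment(raw: bytes) -> dict:
    """Decode one program header and summarize how it maps into memory."""
    (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
     p_filesz, p_memsz, _p_align) = struct.unpack_from(PHDR_FORMAT, raw)
    perms = "".join(
        letter if p_flags & bit else "-"
        for letter, bit in (("R", PF_R), ("W", PF_W), ("X", PF_X))
    )
    return {
        "loadable": p_type == PT_LOAD,
        "perms": perms,
        "file_range": (p_offset, p_offset + p_filesz),
        "mem_range": (p_vaddr, p_vaddr + p_memsz),
        # Bytes present in memory but not in the file (typically .bss).
        "zero_fill": p_memsz - p_filesz,
    }

# Second LOAD entry from the excerpt; p_paddr is an assumed value.
data_seg = struct.pack(PHDR_FORMAT, PT_LOAD, PF_R | PF_W,
                       0x1000, 0x601000, 0x601000, 0x200, 0x300, 0x1000)
seg = describe_segment(data_seg)
print(seg["perms"], hex(seg["zero_fill"]))  # RW- 0x100
```

Here the 0x100 bytes between FileSiz and MemSiz are never read from the file; the loader maps them as zeroed memory.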

Relocation: turning placeholders into real addresses

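Object files contain placeholder values wherever an instruction or data word needs an address that is not known yet. The linker fills them in using relocation entries, stored in .rel.* sections or, on x86-64, in .rela.* sections whose entries carry an explicit addend.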

A simplified example (pseudo-assembly):

mov    data_items(,%rdi,4), %eax   # load one 32-bit element of a global array

In the object file the encoding may contain placeholders:

8b 04 bd 00 00 00 00

After linking it becomes a real address:

8b 04 bd a0 90 04 08

Corresponding relocation entry (excerpt):

Relocation section '.rela.text' contains 1 entry:
  Offset  Info   Type          Sym.Name + Addend
  0x0008  ...    R_X86_64_32S  data_items + 0

Key idea: the linker patches specific offsets based on relocation tables.
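
The patching step itself is small. The following is a simplified sketch of applying one absolute 32-bit relocation: it writes symbol address plus addend, little endian, at the given offset. apply_abs32 is a hypothetical helper, not a linker API. The buffer holds only the 7-byte instruction from the example, so the displacement starts at byte 3 here, whereas a real linker would use the section-relative offset from the relocation entry against the whole .text contents. The unsigned range check is also a simplification.

```python
import struct

def apply_abs32(code: bytearray, offset: int, symbol_addr: int, addend: int = 0) -> None:
    """Patch 4 bytes at `offset` with symbol_addr + addend (little endian)."""
    value = symbol_addr + addend
    if not 0 <= value < 2**32:
        raise OverflowError(f"value 0x{value:x} does not fit in 32 bits")
    struct.pack_into("<I", code, offset, value)

# Placeholder encoding from the object file: opcode bytes, then a zero displacement.
code = bytearray.fromhex("8b04bd00000000")

# Address assigned to data_items by the linker, taken from the linked bytes above.
apply_abs32(code, 3, 0x080490A0)
print(code.hex(" "))  # 8b 04 bd a0 90 04 08
```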

Shared libraries and PIC / GOT / PLT

Shared objects must load at arbitrary addresses, so they use PIC (position-independent code):

  • GOT (Global Offset Table): A per-module table of run-time addresses for variables and functions, filled in by the dynamic linker.
  • PLT (Procedure Linkage Table): A set of small jump stubs, one per external function, used for lazy binding.

A typical PLT entry (simplified):

push@plt:
  jmp *GOT[push]
  pushq $reloc_index
  jmp plt0

The first call enters the dynamic linker; subsequent calls jump directly via the GOT entry.

Dynamic linking flow (high level)

  1. The dynamic linker loads required shared libraries.
  2. First call to an external symbol triggers PLT resolution.
  3. The resolved address is written to the GOT.
  4. Later calls jump directly to the resolved address.
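
The flow above can be modeled in a few lines. This is a toy simulation only, not how a real dynamic linker is implemented: LazyBinder and call_via_plt are invented names, a Python dict stands in for the GOT, and an empty slot stands in for a GOT entry that still points back into the PLT stub.

```python
from typing import Any, Callable, Dict

class LazyBinder:
    """Toy model of PLT/GOT lazy binding."""

    def __init__(self, libraries: Dict[str, Callable[..., Any]]):
        self.libraries = libraries  # stands in for the loaded shared libraries (step 1)
        self.got: Dict[str, Callable[..., Any]] = {}  # resolved addresses
        self.resolutions = 0  # how many times the resolver ran

    def _resolve(self, name: str) -> Callable[..., Any]:
        # Step 2: the first call falls through to the resolver.
        self.resolutions += 1
        target = self.libraries[name]
        # Step 3: the resolved address is written into the GOT slot.
        self.got[name] = target
        return target

    def call_via_plt(self, name: str, *args: Any) -> Any:
        target = self.got.get(name)
        if target is None:
            target = self._resolve(name)
        # Step 4: later calls find the GOT slot filled and skip the resolver.
        return target(*args)

# "push" is a hypothetical library function used only for this demo.
binder = LazyBinder({"push": lambda stack, item: stack.append(item)})
stack: list = []
binder.call_via_plt("push", stack, 1)
binder.call_via_plt("push", stack, 2)
print(stack, binder.resolutions)  # [1, 2] 1
```

The resolver runs once even though the function is called twice, which is the point of lazy binding: symbols that are never called are never resolved.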

Useful tools

# ELF header, sections, segments
readelf -h a.out
readelf -S a.out
readelf -l a.out

# Disassembly and symbols
objdump -d a.out
objdump -t a.out
nm -n a.out

# Size, deps, strings
size a.out
ldd a.out
strings a.out

Summary

The key to ELF is: linkers care about Sections, loaders care about Segments. Relocatable files emphasize linkability, executables emphasize loadability, and shared objects emphasize relocation and dynamic linking.

FAQ

Q1: Why keep the Section Header Table in executables?
A: Loaders do not need it, but debuggers and analysis tools rely on it.

Q2: Why does .bss not occupy file space?
A: It only records the size; memory is allocated and zeroed at load time.

Q3: Why are so many addresses 0 in object files?
A: They are placeholders patched later using relocation records.

Q4: Why do shared libraries require PIC?
A: They must load at different addresses across processes, so absolute addresses cannot be hard-coded.

Q5: Why are segment permissions page-based?
A: The MMU enforces permissions per page, so code and data are typically mapped into separate pages.
