UB Chrinovic

The Mechanics of Displacement in x86-64 Branches

Introduction

To the human eye, an assembly dump looks like a map of fixed locations. When you disassemble code that contains control flow, which at the low level translates to branches, you see a label like loop_start or a memory address like 0x401005, and it is natural to assume the CPU is hard-wired to jump to that specific coordinate. However, if you peer into the raw machine code, that address is nowhere to be found. This article explores why.

Address Displacement in Instruction encoding

In the x86-64 architecture, branch instructions almost never encode the target address. Direct jumps, conditional jumps, and calls instead encode a displacement: a signed integer giving the distance from the end of the branch instruction (the address of the next instruction) to the destination. This is referred to as PC-relative addressing, or RIP-relative addressing on x86-64, where the program counter is the RIP register. You might ask why this is used: isn't an absolute address simpler than performing arithmetic to resolve the target? Well, hardware designers are pretty clever. This design solves two major problems at once: it saves precious bits in the instruction encoding, and it enables Position Independent Code (PIC). PIC in turn makes foundational security features like Address Space Layout Randomization (ASLR) practical.

The Assembly scenario

Imagine you have a simple loop. The JNE (Jump if Not Equal) instruction needs to jump back to the loop_start label, which sits at address 0x401005:

0000000000401000 <_start>:
401000:    mov ecx, 5 
401005:    dec ecx 
401007:    cmp ecx, 0 
40100a:    jne 401005
40100c:    ret

Machine code (hex)

If you look at the raw bytes for the jne instruction, you won't see the address 0x401005 anywhere. Instead, you see this:

0x40100a: 75 f9

The instruction decodes as follows:

- 75: the opcode for a short Jcc (conditional jump)
- f9: the relative offset
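To make the decoding concrete, here is a minimal Python sketch that splits the two bytes apart. The function name decode_short_jcc and the CONDITIONS table are illustrative names, not part of any real disassembler. It relies on the fact that short conditional jumps occupy opcodes 0x70 through 0x7F, with the low nibble selecting the condition.

```python
# Mnemonics for the short Jcc opcodes 0x70..0x7F, indexed by the low nibble.
# (Several conditions have aliases, e.g. jne/jnz; one name per slot is used here.)
CONDITIONS = ["jo", "jno", "jb", "jae", "je", "jne", "jbe", "ja",
              "js", "jns", "jp", "jnp", "jl", "jge", "jle", "jg"]


def decode_short_jcc(raw: bytes):
    """Split a 2-byte short conditional jump into (mnemonic, signed rel8)."""
    if len(raw) != 2 or not 0x70 <= raw[0] <= 0x7F:
        raise ValueError("not a short Jcc encoding")
    mnemonic = CONDITIONS[raw[0] & 0x0F]
    # Reinterpret the second byte as a signed 8-bit value (two's complement).
    rel8 = int.from_bytes(raw[1:2], "little", signed=True)
    return mnemonic, rel8


mnemonic, rel8 = decode_short_jcc(bytes.fromhex("75f9"))
print(mnemonic, rel8)  # jne -7
```

The second byte only makes sense once it is read as a signed value: as an unsigned number f9 is 249, but as an 8-bit two's complement number it is -7.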

Address resolution

The x86-64 rule for a relative jump is: Target Address = Address of next instruction + Signed Offset. For our jne instruction, that gives:

- Current Instruction Address = 0x40100a
- Instruction Length = 2 bytes
- Next Instruction Address (RIP) = 0x40100c
- Offset f9 = -7 in 8-bit two's complement

Now, we perform the arithmetic:

Target Address = 0x40100c + (-7)
Target Address = 0x401005

And what do you know: the instruction at address 0x401005 is the decrement instruction (dec ecx). The branch jumps backward to it, which is how you know we are performing a loop.
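The same resolution can be sketched in a few lines of Python. The helper names sign_extend and branch_target are made up for this illustration; the numbers are the ones from the jne instruction above.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def branch_target(instr_addr: int, instr_len: int, raw_offset: int,
                  offset_bits: int = 8) -> int:
    """Target = address of the next instruction + signed offset."""
    next_instr = instr_addr + instr_len
    return next_instr + sign_extend(raw_offset, offset_bits)


print(sign_extend(0xF9, 8))                   # -7
print(hex(branch_target(0x40100A, 2, 0xF9)))  # 0x401005
```

Note that the base of the calculation is the address after the jump (0x40100c), not the address of the jump itself. Using 0x40100a as the base would land two bytes short of the intended target.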

But why?

As we mentioned previously, hardware engineers are actually smart and were able to solve two major problems at once.

The Space Constraint of Instruction encoding

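On a fixed-width architecture, where every instruction is, say, 32 bits wide, encoding a full address within a single instruction would leave no space for the opcode which specifies the operation (e.g. branch) nor for a condition field (e.g. "if equal"). x86-64 instructions are variable-length, from 1 to 15 bytes, so here the pressure is on code size instead: a full 64-bit absolute address takes 8 bytes before you even count the opcode. A short conditional jump, by contrast, fits in 2 bytes: one opcode byte, which also carries the condition, and one signed byte that reaches targets within -128 to +127 bytes of the next instruction. Targets farther away use a 32-bit displacement, which is still half the size of a full address. By using an offset instead of an absolute address, the instruction only needs to store the distance to its target.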
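To put numbers on the saving, here is a rough Python sketch of how an assembler might choose an encoding for jne. The function encode_jne is hypothetical and handles only this one instruction. It uses the two forms x86-64 provides for it: 75 followed by an 8-bit displacement (2 bytes), and 0f 85 followed by a 32-bit displacement (6 bytes).

```python
import struct


def encode_jne(instr_addr: int, target: int) -> bytes:
    """Pick the smallest jne encoding that can reach `target`."""
    # Short form: 75 rel8. The displacement is measured from the next instruction.
    disp = target - (instr_addr + 2)
    if -128 <= disp <= 127:
        return bytes([0x75]) + struct.pack("<b", disp)
    # Near form: 0F 85 rel32. The instruction is 6 bytes long, so the base moves.
    disp = target - (instr_addr + 6)
    if -2**31 <= disp <= 2**31 - 1:
        return bytes([0x0F, 0x85]) + struct.pack("<i", disp)
    raise ValueError("target is out of range for a rel32 displacement")


short = encode_jne(0x40100A, 0x401005)
near = encode_jne(0x40100A, 0x402000)
print(short.hex(), len(short))  # 75f9 2
print(near.hex(), len(near))    # 0f85f00f0000 6
```

Even the longer form is smaller than the 8 bytes a full 64-bit address would need on its own, and most loop branches fit the 2-byte form.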

Position Independent Code (PIC)

If instructions contained hardcoded absolute addresses, your program would have to be loaded into the exact same spot in memory every time it runs, or the loader would have to patch every one of those addresses first.

By using offsets instead of absolute addresses, the code becomes relocatable. Branches, function calls, and data accesses are expressed as "address of the next instruction + offset", so the relationship between a branch instruction and its target remains correct regardless of the program's load address. This allows the kernel and dynamic loader to place executables and shared libraries at arbitrary addresses without modification, and it lets processes that map the same library at different addresses share a single copy of its code in memory.
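A small Python sketch shows why the same bytes keep working after relocation. It reuses the jne from the example; the second load address is an arbitrary made-up value standing in for wherever ASLR might place the code, and resolve is a hypothetical helper.

```python
JNE_BYTES = bytes.fromhex("75f9")  # identical bytes at every load address
JNE_OFFSET = 0x0A                  # offset of the jne from the start of _start
TARGET_OFFSET = 0x05               # offset of dec ecx from the start of _start


def resolve(load_base: int) -> int:
    """Resolve the jne target when the code is loaded at `load_base`."""
    instr_addr = load_base + JNE_OFFSET
    rel8 = int.from_bytes(JNE_BYTES[1:2], "little", signed=True)
    return instr_addr + len(JNE_BYTES) + rel8


# 0x401000 is the address from the listing; the second base is hypothetical.
for base in (0x401000, 0x7F3A12345000):
    print(hex(base), "->", hex(resolve(base)))
# 0x401000 -> 0x401005
# 0x7f3a12345000 -> 0x7f3a12345005
```

In both cases the branch lands on the instruction 5 bytes into the code, without a single byte of the encoding being changed.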

Conclusion

In x86-64, using displacements instead of absolute addresses is not just an encoding optimization. It enables relocatable, secure code. PC-relative addressing reduces instruction size, supports Position Independent Code, and underpins modern security mechanisms like ASLR. Understanding this low-level detail gives insight into both the efficiency and security considerations in modern CPU design.