.NET Information center: Understanding the x64 code models

Thursday, January 05, 2012

Understanding the x64 code models

An interesting issue that comes up when writing code for the x64 architecture is which code model to use. This probably isn’t a very well-known topic, but if one wants to understand the x64 machine code generated by compilers, it’s educational to be familiar with code models. There are also implications for optimization, for those who really care about performance down to the smallest instruction.

There’s very little information on this topic, online or elsewhere. By far the most important resource is the official x64 ABI, which you can obtain from the x86-64.org page (from now on I’m going to refer to it simply as "the ABI"). There’s also a bit of information in the gcc man-pages. The aim of this article is to provide an approachable reference, with some discussion of the topic and concrete examples that demonstrate the concepts in real-life code.

An important disclaimer: this is not a tutorial for beginners. The prerequisites are a solid understanding of C and assembly language, plus a basic familiarity with the x64 architecture.

Code models – motivation

References to both code and data on x64 are done with instruction-relative (RIP-relative in x64 parlance) addressing modes. The offset from RIP in these instructions is a signed 32-bit value, so an instruction can reach only about 2 GB in either direction. So what do we do when 32 bits are not enough? What if the program is larger than 2 GB? Then a case can arise where an instruction attempting to address some piece of code (or data) just can’t do it with its 32-bit offset from RIP.
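
To make the limit concrete, here is a minimal C sketch of the reachability check implied above. The function names `rel32_reachable` and `rel32_encode` are hypothetical and chosen for illustration only; they are not part of the ABI or of any toolchain. The sketch assumes the usual x64 convention that the displacement is measured from the address of the next instruction, and it relies on two's complement conversion from unsigned to signed, which mainstream compilers provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if `target` can be addressed from an instruction whose
 * *following* instruction starts at `next_rip`, using a signed 32-bit
 * RIP-relative displacement.
 */
bool rel32_reachable(uint64_t next_rip, uint64_t target)
{
    /* Unsigned subtraction wraps; reinterpret the result as a signed distance. */
    int64_t disp = (int64_t)(target - next_rip);
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/*
 * Computes the 32-bit displacement that would be stored in the instruction.
 * Aborts (via assert) if the target is out of range, which is exactly the
 * situation that larger code models have to deal with.
 */
int32_t rel32_encode(uint64_t next_rip, uint64_t target)
{
    assert(rel32_reachable(next_rip, target));
    return (int32_t)(int64_t)(target - next_rip);
}
```

With these helpers, a target 16 bytes before the next instruction encodes as a displacement of -16, while a target 0x80000000 bytes ahead is rejected because it exceeds the largest positive 32-bit offset by one.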