2005-10-11

What is x86-64? "Long Mode" memory model ...

Since I'm getting tired of explaining this over and over, it's definitely time for a Blog post!

Looking beyond the hardware changes in the current AMD64 (NUMA/HyperTransport) architecture from the previous AMD (EV6) architecture, x86-64 comes down to a new "memory model" for programmers. Even if you're not a programmer, this affects you as a user.

- OSes and Memory Models

"Software compatibility" is all about memory models -- the way the CPU handles address registers and address translation (typically with segment or page translation). Your OS's "kernel" (or main, controlling program) must be able to provide the "memory models" that programs need. What the programs do with those memory models is up to them -- and the source of compatibility issues.

The OS's kernel can provide any memory model that is compatible with its own. In most cases, this means any memory model that is the same or "smaller" although that can be a simplification. It all depends on what the memory model entails. E.g., some memory models that allow a program direct access to processor registers that are "protected" may not be allowed.

- History: DOS Memory Models

DOS is known for 3 main memory models. NOTE: There were actually more than just a few DOS Memory Models, but I don't want to spend my time talking about that.

- 16-bit DOS (16-bit offset) could provide 64KiB (COM)
- 20-bit DOS (16-bit segment + 16-bit offset) could provide newer 640KiB/1MiB (EXE)
- 32-bit DOS (16-bit segment + 32-bit offset) Protected Memory Interface (DPMI) could provide 16-64+MiB of memory, as well as legacy COM and EXE compatibility.
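
To make the 20-bit (EXE) case above concrete, here is a minimal sketch of how a 16-bit segment and a 16-bit offset combine into one 20-bit address. The function name is made up for illustration, and the wraparound at 1MiB assumes original 8086 behavior (20 address lines only).

```python
def real_mode_address(segment, offset):
    """Combine a 16-bit segment and 16-bit offset into a 20-bit address.

    Hypothetical helper for illustration. Assumes 8086-style behavior,
    where the result wraps around at 1MiB (20 address lines).
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is shifted left by 4 bits (multiplied by 16),
    # then the offset is added; the mask keeps the low 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment:offset pairs can name the same byte.
    print(hex(real_mode_address(0x1234, 0x0005)))
    print(hex(real_mode_address(0x1000, 0x2345)))
```

Because many segment:offset pairs land on the same address, code built for one model cannot simply hand its pointers to code built for another.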

Early OS/2, Novell NetWare (DOS-based) as well as all MS-DOS 7.x (Windows 95/98/Me) versions _use_ DPMI. NT and Linux can provide DOS/Win16 emulation via DPMI (that can run COM and EXE as well). In NT, this is the NT Virtual DOS Machine (NTVDM). In Linux, this is typically DOSEmu, DOSBox, etc... (although some software emulates a full 8086/8088, etc...).

[ NOTE: There were a few 24-bit (286/16MiB, 24-bit offset) and 32-bit (386/4GiB, 32-bit offset) Extenders that were not so compatible -- long story. ]

Programs and libraries of different memory models can_not_ use each other though. E.g., Win16 could not use Win32 libraries and vice-versa. Microsoft did, however, come up with Win16-on-Win32 so some shared objects, etc... could be used and run on the same OS.

Now let's talk about the 32-bit space and higher.

- 32-bit, 4GiB "Flat" Addressing (i386)

In Intel i386, there is a 16-bit segment register and a 32-bit offset register. The offset register is capable of addressing 32-bit (4GiB) "flat." The segment register is often used to designate a location in memory as the "start". Adding the 16-bit segment to the 32-bit offset is known as "normalization" (also used in 20-bit DOS addressing, only with a 16-bit offset register).

- 36-bit, Physical Address Extension (i686)

Intel i686 (Pentium Pro onward) processors can now map above 32-bit. They do this, in coordination with the 16-bit segment register, through Physical Address Extension (PAE), which is enabled by a bit on the CPU. PAE uses a 3-level translation table, hence 36-bit PAE mode. The offset still allows programs to use the same 32-bit flat space and work together; the OS is doing the translation and paging above 4GiB. Some programs can be written to PAE36, although interacting with 32-bit libraries may or may not be an issue (depending).

- NT and Linux PAE support

Before I can continue, I have to explain kernel (including I/O, memory mapped I/O and other reservations) and userspace.

NT (including NT 4.0, 5.0/2000 and 5.1/XP/2003) uses a split 2GiB/2GiB kernel/user model. The consumer and even entry Server versions (even XP Pro???) only allow an absolute of 4GiB -- possibly only 2GiB of "user" memory. Advanced Server versions still use the same model, but they support PAE and can map in 512MiB pages in the 2GiB user portion, up to 64GiB.

Yeah, forget more memory for running Battlefield 2 on XP. But even if you ran a Server version which did support more, it would be slower because of the paging overhead.

Linux kernels can use a variety of modes. In fact, if you have 1GiB or less of memory, it is recommended that you use the 32-bit 3GiB/1GiB user/kernel model, which gives you 960MiB (just under 1GiB) usable, but uses _no_ page translations (10-20% faster). The other mode is to use "HIMEM" or the 3-level PAE mode. The most popular is now the 4GiB/4GiB model. The kernel is always in memory, but is virtualized to a 4GiB area. It then maps in different 4GiB user pages as necessary.

But in all cases, there is still only a 32-bit offset register, and the 16-bit segment register and "normalized" 32-bit address is often used alongside page tables.

- 52-bit PAE and 48-bit "Long Mode" (x86-64)

x86-64 now introduces a new model called "Long Mode." On the page table side, it is a 4-level, 52-bit PAE approach that is completely compatible with 36-bit PAE. In other words, a 52-bit PAE kernel can also provide 36-bit PAE and 32-bit memory model compatibility (probably DPMI as well, to a point).
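
To show what "3-level" and "4-level" mean on the page table side, here is a rough sketch that splits a virtual address into its per-level table indices and a page offset. It assumes 4KiB pages (12 offset bits), a 2+9+9 bit split for 3-level PAE and a 9+9+9+9 bit split for 4-level Long Mode; the function and constant names are my own, not anything an OS actually exports.

```python
# Index widths per level, top level first (assumed layouts, 4KiB pages).
PAE_3_LEVEL = (2, 9, 9)            # 32-bit virtual address
LONG_MODE_4_LEVEL = (9, 9, 9, 9)   # 48-bit virtual address


def split_virtual_address(addr, index_bits, offset_bits=12):
    """Return (indices, offset) for a virtual address.

    Hypothetical helper: 'indices' lists the table index used at each
    translation level, top level first; 'offset' is the byte offset
    inside the final page.
    """
    offset = addr & ((1 << offset_bits) - 1)
    remaining = addr >> offset_bits
    indices = []
    # Peel index fields off from the lowest level upward.
    for bits in reversed(index_bits):
        indices.append(remaining & ((1 << bits) - 1))
        remaining >>= bits
    indices.reverse()
    return tuple(indices), offset


if __name__ == "__main__":
    print(split_virtual_address(0xC0101234, PAE_3_LEVEL))
    print(split_virtual_address(0x00007FFFFFFFF000, LONG_MODE_4_LEVEL))
```

The extra level is what lets the same style of table walk cover a much wider address, while the lower levels keep the same shape as in 36-bit PAE.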

"Long Mode" also offers a 48-bit (256TiB) "Flat" address model by using the 16-bit segment register as bits 32-47 to the 32-bit offset register of bits 0-31. This is the new "memory model" that programs and libraries may use.

48-bit "flat" programs cannot call 32-bit "flat" libraries and vice-versa. It's not the kernel, but the way programs talk to each other. Outside of the kernel, if an OS ships with 48-bit libraries, then they cannot be used by 32-bit programs. Therefore, some x86-64 OSes only ship with a PAE52 kernel, but still offer largely 32-bit libraries and programs, with a few 48-bit exceptions.

NOTE: FYI, there is a 40-bit (1TiB) _physical_ platform limitation to current AMD64 designs. It's due to the Athlon64/Opteron's core design being shared with the older, 40-bit EV6 Athlon. Yes, the original "32-bit" Athlon was capable, on the physical platform, of addressing 40-bit. Hint: EV6 was designed for the 64-bit Alpha 21264, not Athlon.
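
For reference, the address widths thrown around in this post convert to sizes with simple powers of two. A small sketch (the helper name is my own):

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def address_space_size(bits):
    """Return the size addressable with 'bits' address bits, e.g. '4GiB'."""
    size = 1 << bits  # 2**bits bytes
    unit = 0
    # Divide by 1024 until the value is below 1024 or units run out.
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size}{UNITS[unit]}"


if __name__ == "__main__":
    for bits in (16, 20, 32, 36, 40, 48, 52):
        print(f"{bits}-bit: {address_space_size(bits)}")
```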

- Windows 64-bit v. GNU/Linux x86-64

Windows XP 64-bit edition is still largely a 32-bit OS. It ships with a PAE52 capable kernel, but at the heart of Windows is Win32. Win32 is heavily x86 (data alignment ignorant, long story), and while Win64 is slowly developing a set of replacements, the main component is Win32-on-Win64 (WoW). It's a way for Win64 programs to use Win32 objects, which are still at the heart of the supposed "64-bit" OS.

Independent Software Vendors (ISVs) must come up with most of their own libraries in the absence of complete Win64 support. Unfortunately, most core OS libraries are still 32-bit, which is a major issue. As such, many 32-bit versions of applications run faster on the 64-bit version (let alone even the 32-bit version) than 64-bit versions of applications -- depending on whether the applications use 32-bit OS interfaces or 64-bit ones.

GNU platforms have been "64-bit clean" for a long time, and GNU/Linux for Alpha pioneered this over a decade ago. The result is that Red Hat and SuSE are shipping true 64-bit versions with 48-bit applications and libraries. The Linux Filesystem Hierarchy Standard (FHS) and newer IEEE POSIX 2001+ and X/Open Single UNIX Specification (SUS) v3 standards define that on 64-bit platforms that offer both 64-bit and 32-bit memory models, /lib contains the legacy 32-bit libraries and /lib64 contains the new 64-bit libraries.

So on a GNU/Linux platform, such as Fedora Core, Red Hat Enterprise Linux, SuSE Linux, etc... built for x86-64, all OS applications are 64-bit (48-bit memory model) and their libraries are located in /lib64 (48-bit memory model), while very few applications are 32-bit, and a good set of common libraries are available in 32-bit under /lib. This includes multimedia (Simple DirectMedia Layer, SDL), 3D (OpenGL), etc... Even the X Window System version 11 (X11) graphical system is truly 64-bit, but can display 32-bit X11 programs, even those that use OpenGL over X11 (GLX).
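
If you want to see for yourself which memory model a given library or program was built for, the ELF file header says so in its fifth byte (1 for 32-bit, 2 for 64-bit). A minimal sketch, with function names of my own choosing; any path you pass in is just an example and depends on your distribution:

```python
ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}  # values of the EI_CLASS byte


def elf_class(header):
    """Classify an ELF file from its first bytes (hypothetical helper)."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # Byte 4 of the identification block is EI_CLASS.
    return ELF_CLASSES.get(header[4], "unknown")


def elf_class_of_file(path):
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))


if __name__ == "__main__":
    # Made-up headers standing in for a /lib and a /lib64 library.
    print(elf_class(b"\x7fELF\x01"))
    print(elf_class(b"\x7fELF\x02"))
```

A 64-bit program cannot load a library whose header reports 32-bit, and vice-versa, which is exactly why both /lib and /lib64 exist.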

There are a few issues. One is the fact that web browser plug-ins are still typically 32-bit. Red Hat ships both a 32-bit and 64-bit (48-bit memory model) Mozilla with its products, but the more popular Firefox is only 64-bit and doesn't work with most plug-ins that are only available in 32-bit form (e.g., Sun only recently introduced a Linux/x86-64 version of its Java Runtime Environment, JRE).

But for the most part, GNU has been 64-bit for so long, with portability tools (e.g., Autoconf, Automake, etc...) for building on Alpha (always 64-bit), SPARCv9 (64-bit), MIPS4000+ (64-bit), etc..., that almost all major Freedomware applications and libraries build without any modification. And most enterprise applications have been running on them from before Linux/x86-64 was an option.

This is wholly untrue of the Windows platform, which has always run as 32-bit even on the 64-bit Digital Alpha processor/platform.

