This article was published on the 27th of June 2018. This article was updated on the 30th of April 2020.
Assembly language consists of a set of instructions (commonly referred to as operation codes or “opcodes”) in combination with memory addresses. There are many instruction sets, and one can even write a custom instruction set. The most commonly known ones are the 32-bit (x86) and 64-bit (x86_64) sets, the ARM sets and the MIPS sets. The first two are used on the desktop and laptop platforms that are used every day. The ARM sets are mostly used on mobile phones, and the MIPS sets are often used in embedded devices.
This course starts with the 32-bit (x86) and 64-bit (x86_64) architectures. Other architectures might be added to this course at a later stage, but unless stated otherwise, the architecture used is either x86 or x86_64.
The registers on the CPU are equal in size to the architecture: 32 bits in x86 and 64 bits in x86_64. Early Central Processing Units (CPUs), such as the Intel 8008 (released in 1972), had registers of only 8 bits. Later, the 16-bit architecture emerged (e.g. the Intel 8086, released in 1978). The Intel 8086 was the first CPU of the x86 family. Notable processors in this family that followed the Intel 8086 were the Intel 80386 (32-bit architecture, also known as i386, released in 1985) and AMD’s Opteron (64-bit architecture, released in 2003). These CPUs are considered a family because each generation is backwards compatible with the previous ones. For this reason, modern 64-bit platforms can still run 32-bit binaries, provided that the operating system also supports this.
General Purpose Registers (GPR)
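In the x86 family, the registers have roughly stayed the same, although some registers have been added later on. The 16-bit Intel 8086 used eight General Purpose Registers (GPRs). These are given below, together with the Instruction Pointer, which is not a general purpose register but is listed here because it follows the same naming scheme.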
The Accumulator Register (AX): used in arithmetic and is often used to store the return value (if the value does not exceed the register size), or a pointer to the returned data
The Base Register (BX): contains a pointer to data (the DS register is used when in segmented mode)
The Counter Register (CX): used during loops (to keep track of the loop count) and in shifts/rotations of data
The Data Register (DX): used for I/O and arithmetic
The Stack Pointer Register (SP): points to the top of the stack
The Stack Base Pointer Register (BP): used to point to the base of the stack
The Source Index Register (SI): used during stream operations and points to the source location
The Destination Index Register (DI): used during stream operations and points to the destination location
The Instruction Pointer Register (IP): used to point to the next instruction, also known as the program counter (PC)
The Accumulator Register (AX), Base Register (BX), Counter Register (CX) and Data Register (DX) are each divided into two 8-bit registers. The lower half is accessible by replacing the X with an L (for Lower) and the higher half is accessible by replacing the X with an H (for Higher). This results in the following.
AX = AH and AL
BX = BH and BL
CX = CH and CL
DX = DH and DL
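The split of a 16-bit register into its two 8-bit halves can be imitated in C with a shift and a mask. This is only a sketch on ordinary integers, not real register access, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Upper 8 bits of a 16-bit value: the "H" half, such as AH for AX. */
uint8_t high_half(uint16_t reg) {
    return (uint8_t)(reg >> 8);
}

/* Lower 8 bits of a 16-bit value: the "L" half, such as AL for AX. */
uint8_t low_half(uint16_t reg) {
    return (uint8_t)(reg & 0xFF);
}

/* Rebuild the 16-bit value from its two halves. */
uint16_t join_halves(uint8_t high, uint8_t low) {
    return (uint16_t)(((unsigned)high << 8) | low);
}
```

With AX holding 0x1234, high_half gives 0x12 (AH) and low_half gives 0x34 (AL).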
The 32-bit architecture uses the same registers as the 16-bit variant, but each register has twice the size. The naming scheme is therefore different: each register has the prefix E, which stands for Extended:
EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI and EIP
The 64-bit architecture uses the same registers as the 32-bit variant and has registers twice the size of the 32-bit architecture. The 64-bit registers have the prefix R (which stands for Register), resulting in a different naming scheme:
RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI and RIP
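The smaller register names remain usable on the larger architectures and refer to the lowest bits of the bigger register. The article does not spell this out, so take it as an added assumption: EAX is the lower 32 bits of RAX and AX is the lower 16 bits of EAX. A rough model in C, with hypothetical function names:

```c
#include <stdint.h>

/* View of the lowest 32 bits of a 64-bit value (EAX within RAX). */
uint32_t lower_32(uint64_t rax) {
    return (uint32_t)(rax & 0xFFFFFFFFu);
}

/* View of the lowest 16 bits of a 64-bit value (AX within RAX). */
uint16_t lower_16(uint64_t rax) {
    return (uint16_t)(rax & 0xFFFFu);
}
```

For a RAX value of 0x1122334455667788, this model yields 0x55667788 for EAX and 0x7788 for AX.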
Other than the different naming scheme, the 64-bit architecture introduced 8 additional registers:
R8, R9, R10, R11, R12, R13, R14 and R15
The first 8 registers are counted as 0 through 7, which brings the total to 16 general purpose registers. Some of these registers are used for specific purposes, whereas others are free to use by the programmer. In the commonly used 64-bit calling conventions, the values in the registers R8, R9, R10 and R11 are considered volatile and should be considered lost when another function is called. The values in the registers R12, R13, R14 and R15 are non-volatile: a function that uses them must save their values and restore them before it returns.
An additional difference with the 32-bit variant is the way arguments are passed to another function. In the 8-bit, 16-bit and 32-bit variants, the stack is commonly used to pass arguments to a function. In the 64-bit variant, the first few arguments are stored in registers (RCX, RDX, R8 and R9 in the Microsoft x64 calling convention), whereas the remaining arguments are pushed on the stack. The difference exists because there are different calling conventions, which will be explained in the methods and macros: the call stack article.
Other than the General Purpose Registers, there are six Segment Registers. Their purpose is to store the value of the segments of the binary which is executed. Nowadays, these registers are not always used for their originally intended purpose, since memory is no longer accessed through segmentation but rather through a flat address space with paging. This topic will be discussed in a later article, but the registers and their original purpose are included here for the sake of completeness.
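The rule for the 64-bit variant can be written down as a small lookup. The sketch below assumes the register order named above (RCX, RDX, R8, R9), which matches the Microsoft x64 convention for integer and pointer arguments; other conventions use different registers. The function name is hypothetical.

```c
#include <stddef.h>

/* Where the argument with the given 0-based index is passed, assuming
 * the register order RCX, RDX, R8, R9 and integer or pointer arguments. */
const char *argument_location(size_t index) {
    static const char *const registers[] = { "RCX", "RDX", "R8", "R9" };
    const size_t register_count = sizeof registers / sizeof registers[0];

    /* The first four arguments travel in registers, the rest on the stack. */
    return index < register_count ? registers[index] : "stack";
}
```

For a call with six arguments, indexes 0 through 3 map to the four registers and indexes 4 and 5 map to the stack.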
The Code Segment Register (CS): contains the value of the code segment of the binary
The Data Segment Register (DS): contains the value of the data segment of the binary
The Extra Segment Registers (ES, FS and GS): the extra registers are filled with data from the operating system such as exceptions or thread handling
The Stack Segment Register (SS): contains the value of the stack segment of the binary
In the story of Gulliver’s Travels, the Lilliputians argue about whether to break eggs on the big end(ian) or the small end(ian). Danny Cohen used this analogy in his “On holy wars and a plea for peace” to address the conflict between the different methods of reading and storing values in memory.
When the Big Endian notation is used, the big end of the data is read first (which equals the lowest address in memory of the value). This matches the way humans write numbers, with the most significant digit first. The four characters a, b, c and d can also be written as a hexadecimal value. The characters are, respectively, 0x61, 0x62, 0x63 and 0x64. If these four characters were written as a single hexadecimal value with the Big Endian notation, the output would be 0x61626364 (which equals abcd). The most significant byte (or the big end) is placed at the lowest address.
The Little Endian notation is the inverse of the Big Endian notation, with the least significant byte (or the little end) at the lowest address. The same string abcd in Little Endian would be written as dcba, which equals 0x64636261 as a hexadecimal value.
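Both byte orders can be reproduced in C with shifts, independent of the machine the code runs on. The sketch below uses the value 0x61626364 from the text; the function names are illustrative, and the last helper is an extra that reports the byte order of the host itself.

```c
#include <stdint.h>
#include <string.h>

/* Byte stored at address offset 0..3 when value is written Big Endian:
 * the most significant byte sits at the lowest address. */
uint8_t big_endian_byte(uint32_t value, unsigned offset) {
    return (uint8_t)(value >> (8 * (3 - offset)));
}

/* Byte stored at address offset 0..3 when value is written Little Endian:
 * the least significant byte sits at the lowest address. */
uint8_t little_endian_byte(uint32_t value, unsigned offset) {
    return (uint8_t)(value >> (8 * offset));
}

/* Returns 1 if the machine running this code stores integers Little Endian. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1); /* read the byte at the lowest address */
    return first_byte == 1;
}
```

For 0x61626364, the Big Endian bytes at offsets 0 through 3 are a, b, c, d, while the Little Endian bytes are d, c, b, a.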
The flags register contains information about numerous specific settings. Each setting is saved in its own bit of the register. As a result, all these settings are saved in a single register, and yet they have specific names. The flags in the register are set as a result of some instructions, such as sub, add or cmp (compare).
Below, all flags that are stored in the flags register are explained, starting with the 16-bit architecture. Note that these flags are used in the x86 family as a whole.
The Carry Flag (CF): this flag is set when an addition needs to carry one bit over (e.g. 9+7 makes 6 with the carry set to one) or during subtraction when a bit is borrowed (e.g. 1-2 will set the carry to one)
The Parity Flag (PF): if the number of set bits in the least significant byte of the result is even, this flag is set. Otherwise it is not set
The Adjust Flag (AF): this flag is also known as the Auxiliary Flag (AF) or the Auxiliary Carry (AC). This flag functions as the Carry Flag between the lowest 4 bits (1 nibble) and the highest 4 bits (1 nibble) of an 8-bit register
The Zero Flag (ZF): if the result of an instruction is zero, the Zero Flag is set to one, otherwise it is set to zero
The Sign Flag (SF): a copy of the most significant bit of the result. If that bit equals one, the flag is set to one, otherwise it is equal to zero. When the result is interpreted as a signed value, a set flag means that the value is negative
The Trap Flag (TF): sets the CPU in single step mode. This mode executes only a single instruction at a time before it halts, as is required when debugging
The Interrupt Enable Flag (IF): if external interrupts are allowed, the flag is set to one. If external interrupts should be ignored, the flag is set to zero
The Direction Flag (DF): if the value of the flag is zero, stream operations process data from the lowest address towards higher addresses. If the value equals one, data is processed from the highest address towards lower addresses
The Overflow Flag (OF): if the result of a signed operation does not fit in the register, meaning that the sign bit of the result is wrong, this flag is set to one to signal the overflow. Otherwise, this flag will remain zero
The I/O Privilege Level Flag (IOPL): this flag is two bits in size, making it possible to contain higher values. This is required for the 4 privilege levels (0 through 3). The privilege level of a program should be equal to or less than the value of this flag. Otherwise, the requested action will be denied. This flag can only be altered from the kernel itself (ring 0)
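How the arithmetic flags follow from a single instruction can be modelled in C. The sketch below computes the flags for an 8-bit add; it is a simplified model with made-up names, not a description of how the hardware is built.

```c
#include <stdint.h>

/* The six arithmetic flags, each stored as 0 or 1. */
struct flags {
    int cf, pf, af, zf, sf, of;
};

/* Flags that an 8-bit add of a and b would produce. */
struct flags add8_flags(uint8_t a, uint8_t b) {
    unsigned wide = (unsigned)a + (unsigned)b; /* keeps the ninth bit */
    uint8_t result = (uint8_t)wide;
    struct flags f;
    int ones = 0;

    for (int bit = 0; bit < 8; bit++) {
        ones += (result >> bit) & 1;
    }

    f.cf = wide > 0xFF;                          /* carry out of bit 7 */
    f.pf = (ones % 2) == 0;                      /* even number of set bits */
    f.af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;     /* carry out of the low nibble */
    f.zf = result == 0;
    f.sf = (result & 0x80) != 0;                 /* most significant bit */
    /* Signed overflow: both inputs share a sign that the result lacks. */
    f.of = ((a ^ result) & (b ^ result) & 0x80) != 0;
    return f;
}
```

Adding 0x01 to 0xFF wraps to zero and sets CF, ZF, PF and AF, while adding 0x01 to 0x7F gives 0x80 and sets SF, OF and AF.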
The 32-bit architecture has twice as much space in each register when compared to the 16-bit architecture. The flags register is therefore also twice as large, meaning more flags can be used.
The Resume Flag (RF): used during debugging and debugging exceptions
The Virtual 8086 Mode (VM): if the CPU runs in 8086 compatibility mode, this flag is set to one, otherwise it is set to zero
The Alignment Check (AC): used during the alignment checking of memory addresses, where a one means enabled and a zero means disabled
The Virtual Interrupt Flag (VIF): the virtual version of the Interrupt Flag (IF)
The Virtual Interrupt Pending (VIP): set to one if a virtual interrupt is pending, otherwise it is equal to zero.
The CPUID Flag (ID): if a program is able to set and clear this flag, the CPU supports the CPUID instruction
The VAD Flag (VAD): allows the Virtual Address Descriptor to be accessed if set to one, otherwise it is set to zero
Similar to the shift from 16-bit to 32-bit architecture, the 64-bit architecture has twice the size of the 32-bit architecture. The bits in the upper half of the 64-bit flags register (which equals the newly added space, compared to the 32-bit flags register) are all reserved and are therefore not accessible.
Assembly language differs from most of the programming languages we know today, especially the code that is generated (and optimised) by the compiler. The easiest way to understand assembly language is to compare it with another language (any given one works). In this case, C is used as an example, since it provides the option to directly access memory. This helps to explain the usage of registers in practice. In this practical case, an integer with the value 5 is printed.
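Since every flag occupies its own bit, a flag can be read with a mask. The bit positions below are not given in the article; they are the positions documented for the x86 flags register and are added here as an assumption. The helper names are hypothetical.

```c
#include <stdint.h>

/* Masks for a selection of flags (bit positions in the flags register). */
#define FLAG_CF (1u << 0)   /* Carry Flag */
#define FLAG_PF (1u << 2)   /* Parity Flag */
#define FLAG_AF (1u << 4)   /* Adjust Flag */
#define FLAG_ZF (1u << 6)   /* Zero Flag */
#define FLAG_SF (1u << 7)   /* Sign Flag */
#define FLAG_TF (1u << 8)   /* Trap Flag */
#define FLAG_IF (1u << 9)   /* Interrupt Enable Flag */
#define FLAG_DF (1u << 10)  /* Direction Flag */
#define FLAG_OF (1u << 11)  /* Overflow Flag */

/* Returns 1 if the flag selected by mask is set, otherwise 0. */
int flag_is_set(uint32_t eflags, uint32_t mask) {
    return (eflags & mask) != 0;
}

/* The IOPL field is two bits wide (bits 12 and 13), so it holds 0..3. */
unsigned io_privilege_level(uint32_t eflags) {
    return (eflags >> 12) & 0x3u;
}
```

For a flags value of 0x3246, this reports ZF, PF and IF as set, CF as clear, and an I/O privilege level of 3.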
Example in x86 ASM
Firstly, 8 is subtracted from the stack pointer. Each push instruction lowers the stack pointer by another 4 bytes by itself, so this subtraction is most likely padding that the compiler adds to keep the stack aligned when printf is called. Since the arguments are pushed from right to left, the value which is to be printed is pushed onto the stack before the address of the literal string (which is located at 0x080484c0 and equals %d). The function sym.imp.printf is the default printf function which prints data to the stdout (standard output). Due to the found symbols (sym), Radare2 calls it an imported (imp) function with the name printf. The function printf is then called and uses the provided parameters. The return value of printf equals the number of characters that are written to the stdout, which would be one in this case. This value is saved in the accumulator register EAX.
0x0804841c sub esp, 8
0x0804841f push 5 ; 5
0x08048421 push 0x80484c0
0x08048426 call sym.imp.printf
Example in C
If one is to call the function printf with 1 variable, it would look as follows in C.
printf("%d", 5);
The function printf requires a literal string in which the format specifiers are provided. The %d specifier represents a signed decimal integer. The first value after the literal string equals the value of the first specifier that has been provided.
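The return value that ends up in EAX in the assembly example can be observed from C as well. A minimal sketch, with a wrapper function whose name is made up for this example:

```c
#include <stdio.h>

/* Prints value as a signed decimal integer and passes on the return
 * value of printf: the number of characters written to stdout. */
int print_decimal(int value) {
    return printf("%d", value);
}
```

Calling print_decimal(5) writes the single character 5 and returns 1, whereas print_decimal(-12) writes three characters and returns 3.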
Using (pseudo) C, one can visualise the structure used in assembly language. The literal string %d is located at the memory address 0x80484c0. After both push instructions, the stack pointer (ESP) points to the address of the literal string, and the value of the integer is located one variable (4 bytes) above it, at ESP plus 4. Note that the call frame is left out to avoid overcomplication. This matter is explained in detail in a later chapter.
printf((char *)0x80484c0, *(int *)(ESP + 4));
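Where each value ends up relative to ESP can be checked with a toy model of the stack. The sketch below replays the three instructions before the call from the assembly example on a small array. The struct, the stack size and the function names are invented for illustration, and the address is the one shown in the disassembly.

```c
#include <stdint.h>

#define STACK_BYTES 64u

/* A tiny downward-growing stack made of 4-byte slots. */
struct toy_stack {
    uint32_t slots[STACK_BYTES / 4];
    uint32_t esp; /* byte offset of the current top of the stack */
};

/* push: lower ESP by 4 bytes, then store the value at the new top. */
void toy_push(struct toy_stack *stack, uint32_t value) {
    stack->esp -= 4;
    stack->slots[stack->esp / 4] = value;
}

/* Replays the instructions before the call and returns the 4-byte value
 * found at ESP + offset (offset must be a multiple of 4 within the stack). */
uint32_t value_at_esp_plus(uint32_t offset) {
    struct toy_stack stack = { {0}, STACK_BYTES };

    stack.esp -= 8;                /* sub esp, 8 */
    toy_push(&stack, 5);           /* push 5 */
    toy_push(&stack, 0x80484c0u);  /* push 0x80484c0 */
    return stack.slots[(stack.esp + offset) / 4];
}
```

In this model, value_at_esp_plus(0) yields the address of the literal string and value_at_esp_plus(4) yields the integer 5, because the value pushed last sits at the lowest address.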
The chapter regarding assembly basics can be found here!