This article was published on the 7th of July 2018. This article was updated on the 30th of April 2020, as well as the 11th of May.
Conditions and loops are used in (nearly) all programming and scripting languages. Conditions can be compared to questions. If a certain condition is met, a certain action is executed. If the condition is not met, either another action is executed, or a part of the execution flow is skipped.
A loop can be used to repeat certain actions without the need to include the code more than once in the binary, and to avoid creating repetitive code from a programmer’s point of view. To understand when the execution flow is continued as usual or when it is skipped, one needs to understand how the decisions are made.
Conditions
The if-statement
There are multiple forms of conditions. Firstly, the so-called “if”-statement can be used to compare two or more variables. An example in C is shown below.
if (x == y) {
    printf("%d equals %d\n", x, y);
}
To be able to compile this, a simple C program can be used, as can be seen below.
#include <stdio.h>

int main(int argc, char** argv) {
    int x = 5;
    int y = 5;
    if (x == y) {
        printf("%d equals %d\n", x, y);
    }
    return 0;
}
If the two integers x and y are equal, the function printf is called, which prints the given string. If the two integers are not equal, the call to the printf function is never made. The same function in x86 assembly is shown below. Do note that the shown assembly instructions are not the complete function, as only the required instructions are shown to avoid clutter.
The binary is compiled with gcc using the command that is given below.
gcc ./input.c -o output.bin -s
Once that is complete, one can open and analyse the binary with Radare2.
r2 ./output.bin
[0x08048310]> aaaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Emulate code to find computed references (aae)
[x] Analyze consecutive function (aat)
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[x] Type matching analysis for all functions (afta)
Now that Radare2 has analysed all functions and created references to aid the reverse engineering process, the command afl (Analyse Function List) can be used to list all functions. The s (Seek) command is used to set the current address in the binary. Since the only user created function in the program is the function main, this is where the address is set to.
[0x08048310]> afl
0x080482ac 3 35 fcn.080482ac
0x080482e0 1 6 sym.imp.printf
0x080482f0 1 6 sym.imp.__libc_start_main
0x08048300 1 6 sub.__gmon_start_300
0x08048310 1 33 entry0
0x08048340 1 4 fcn.08048340
0x08048350 4 43 fcn.08048350
0x080483c0 3 30 entry2.fini
0x080483e0 8 43 -> 93 entry1.init
0x0804840b 3 74 main
[0x08048310]> s main
By default, the address in Radare2 is set to the entry point of the executable, which is named entry0. This can also be seen in the listing above: the address in the Radare2 prompt when the afl command is executed is 0x08048310, which corresponds to entry0 in the list of functions.
To disassemble the current function, one needs to use the command pdf (Print Disassembly Function). Note that the machine code (which is the second column) and the additional arrow which indicates the jump path are removed to improve readability.
[0x0804840b]> pdf
/ (fcn) main 74
| main (int arg_4h);
| ; var char *format @ ebp-0x10
| ; var int local_ch @ ebp-0xc
| ; var int local_4h @ ebp-0x4
| ; arg int arg_4h @ esp+0x4
| ; DATA XREF from 0x08048327 (entry0)
| 0x0804840b lea ecx, [arg_4h] ; 4
| 0x0804840f and esp, 0xfffffff0
| 0x08048412 push dword [ecx - 4]
| 0x08048415 push ebp
| 0x08048416 mov ebp, esp
| 0x08048418 push ecx
| 0x08048419 sub esp, 0x14
| 0x0804841c mov dword [format], 5
| 0x08048423 mov dword [local_ch], 5
| 0x0804842a mov eax, dword [format]
| 0x0804842d cmp eax, dword [local_ch]
| 0x08048430 jne 0x8048448
| 0x08048432 sub esp, 4
| 0x08048435 push dword [local_ch]
| 0x08048438 push dword [format]
| 0x0804843b push str.d_equals__d ; 0x80484e0 ; "%d equals %d\n" ; const char *format
| 0x08048440 call sym.imp.printf ; int printf(const char *format)
| 0x08048445 add esp, 0x10
| 0x08048448 mov eax, 0
| 0x0804844d mov ecx, dword [local_4h]
| 0x08048450 leave
| 0x08048451 lea esp, [ecx - 4]
\ 0x08048454 ret
The two variables, x and y, have a different name in Radare2, namely format and local_ch respectively. Both are first set to 5. The value of x (format) is then loaded into the accumulator register EAX, after which the cmp instruction compares it to the value of y (local_ch). The result of this comparison is stored in the Zero Flag of the flags register (EFLAGS). The jump instruction jne stands for Jump if Not Equal. If the two values that have been compared are not equal, the jump is taken. If the values are equal, the jump is not taken and execution continues with the next instruction.
A jump sets the instruction pointer (also known as the program counter) to the provided address, skipping the instructions in between. The 32-bit register that holds the instruction pointer is named EIP (Extended Instruction Pointer). The compare instruction subtracts its second operand from the first without storing the result, and sets the Zero Flag to either 0 or 1 accordingly. If the difference between the two compared values equals 0, the values are equal and the Zero Flag is set (1). Otherwise, the Zero Flag is cleared (0).
mov dword [format], 5
mov dword [local_ch], 5
mov eax, dword [format]
cmp eax, dword [local_ch]
jne 0x8048448
If the jump is taken, the instruction pointer is set to the address 0x8048448. This is the first line of code after the if-statement: return 0. At the address 0x8048448, the instruction mov eax, 0 resides, which places the return value in EAX, after which the main function returns with the ret instruction.
0x08048448 mov eax, 0
[...]
0x08048454 ret
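The interplay between cmp, the Zero Flag and the conditional jump can be modelled in a few lines of C. The sketch below is only an illustration under stated assumptions: the function names zero_flag_after_cmp, jne_taken and je_taken are made up for this example and do not exist in any library, and only the Zero Flag is modelled, whereas a real cmp instruction updates other flags as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models the Zero Flag after `cmp a, b`.
 * cmp computes a - b, discards the result and only updates the flags. */
bool zero_flag_after_cmp(int32_t a, int32_t b)
{
    /* Unsigned subtraction avoids signed overflow in C; the low 32 bits
     * are the same as the ones the CPU computes. */
    uint32_t difference = (uint32_t)a - (uint32_t)b;
    return difference == 0;
}

/* Hypothetical helper: jne is taken when the Zero Flag is clear. */
bool jne_taken(bool zero_flag)
{
    return !zero_flag;
}

/* Hypothetical helper: je is taken when the Zero Flag is set. */
bool je_taken(bool zero_flag)
{
    return zero_flag;
}
```

With x and y both set to 5, zero_flag_after_cmp(5, 5) yields true, so jne_taken returns false and the call to printf is reached, which matches the behaviour described above.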
Using the plug-in r2dec, the command pdd decompiles the function at the current address. This results in the following pseudo code:
[0x0804840b]> pdd
#include <stdio.h>

int32_t main () {
    ecx = arg_4h;
    *(format) = 5;
    *(local_ch) = 5;
    eax = *(format);
    if (eax == *(local_ch)) {
        /* const char *format */
        printf ("%d equals %d\n", format);
    }
    eax = 0;
    ecx = *(local_4h);
    esp = ecx - 4;
    return eax;
}
This provides the same insight as the analysis of the assembly instructions, although the output is not fully correct. The printf function requires three arguments here: the literal string and the two integers that replace the %d format specifiers in the literal string. In the pseudo code, only one value is passed alongside the string. The plug-in is still under development, but it provides a valuable lesson: never fully trust a tool. Note that future updates might have fixed this issue, meaning that one may get a rather different result when following this article in the future.
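As a small sketch of what the disassembly actually does, the call can be written out with all three arguments. To make the result easy to inspect, this example uses snprintf to write into a buffer instead of printing. The helper name format_equals and its buffer parameters are assumptions made for this illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: writes the message into `buffer` instead of stdout.
 * Returns the number of characters written, excluding the terminator. */
int format_equals(char *buffer, size_t size, int x, int y)
{
    /* One format string plus one int per %d specifier: three arguments,
     * matching the three push instructions before the call to printf. */
    return snprintf(buffer, size, "%d equals %d\n", x, y);
}
```

In the disassembly, the arguments are pushed in reverse order (local_ch, then format, then the string). The sub esp, 4 instruction plus the three 4-byte pushes account for the 0x10 bytes that add esp, 0x10 releases after the call.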
The else-statement
Conditions can also be combined, as can be seen in the pseudo code below.
if [something] then
      [do something]
else
      [do something else]
The source code used above needs to be altered a bit to include the else-statement:
#include <stdio.h>

int main(int argc, char** argv) {
    int x = 5;
    int y = 5;
    if (x == y) {
        printf("%d equals %d\n", x, y);
    } else {
        printf("%d does not equal %d\n", x, y);
    }
    return 0;
}
If the variables x and y are not equal to one another, the condition of the if-statement is false and therefore the code that is encapsulated by the if-statement will not be executed. Because the only requirement for the else-statement to be reached is that the condition of the if-statement is false, the else-statement is taken when x does not equal y.
Upon disassembling the code with Radare2, the following disassembly is provided:
[0x0804840b]> pdf
/ (fcn) main 98
| main (int arg_4h);
| ; var char *format @ ebp-0x10
| ; var int local_ch @ ebp-0xc
| ; var int local_4h @ ebp-0x4
| ; arg int arg_4h @ esp+0x4
| ; DATA XREF from 0x08048327 (entry0)
| 0x0804840b lea ecx, [arg_4h] ; 4
| 0x0804840f and esp, 0xfffffff0
| 0x08048412 push dword [ecx - 4]
| 0x08048415 push ebp
| 0x08048416 mov ebp, esp
| 0x08048418 push ecx
| 0x08048419 sub esp, 0x14
| 0x0804841c mov dword [format], 5
| 0x08048423 mov dword [local_ch], 5
| 0x0804842a mov eax, dword [format]
| 0x0804842d cmp eax, dword [local_ch]
| 0x08048430 jne 0x804844a
| 0x08048432 sub esp, 4
| 0x08048435 push dword [local_ch]
| 0x08048438 push dword [format]
| 0x0804843b push str.d_equals__d ; 0x80484f0 ; "%d equals %d\n" ; const char *format
| 0x08048440 call sym.imp.printf ; int printf(const char *format)
| 0x08048445 add esp, 0x10
| 0x08048448 jmp 0x8048460
| 0x0804844a sub esp, 4
| 0x0804844d push dword [local_ch]
| 0x08048450 push dword [format]
| 0x08048453 push str.d_does_not_equal__d ; 0x80484fe ; "%d does not equal %d\n" ; const char *format
| 0x08048458 call sym.imp.printf ; int printf(const char *format)
| 0x0804845d add esp, 0x10
| 0x08048460 mov eax, 0
| 0x08048465 mov ecx, dword [local_4h]
| 0x08048468 leave
| 0x08048469 lea esp, [ecx - 4]
\ 0x0804846c ret
The condition of the if-statement remains the same, but the code that follows it has changed. First, the values x and y (or format and local_ch respectively) are set to the value 5. Then, the value of format is moved into the accumulator register EAX. The compare instruction then compares the value which resides in EAX to the value of local_ch (which corresponds to the variable y in the source code). If the values are not equal, the jump to 0x804844a is taken, which is the start of the else-statement. If x and y are equal, the two values and the literal string are pushed onto the stack, and the printf function is called. The mentioned instructions are given below.
0x0804841c mov dword [format], 5
0x08048423 mov dword [local_ch], 5
0x0804842a mov eax, dword [format]
0x0804842d cmp eax, dword [local_ch]
0x08048430 jne 0x804844a
0x08048432 sub esp, 4
0x08048435 push dword [local_ch]
0x08048438 push dword [format]
0x0804843b push str.d_equals__d ; 0x80484f0 ; "%d equals %d\n" ; const char *format
0x08048440 call sym.imp.printf ; int printf(const char *format)
0x08048445 add esp, 0x10
0x08048448 jmp 0x8048460
In the last line of the body of the if-statement, an unconditional jump is taken to the address 0x8048460, which skips the else-statement. At this address, the register EAX is set to 0, after which the main function returns (this equals the return 0 line in the C code).
0x08048460 mov eax, 0
[...]
0x0804846c ret
The else-statement prints a different string with the two provided integers. There is no jump at the end of the else-statement, since execution simply falls through to the instructions of the return 0 line that follows it.
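The jump structure of the if/else construct can be written back in C with goto statements, which makes the two jumps explicit. This is a sketch, not the compiler's output: the function name equals_branch_taken and the labels are hypothetical, and a boolean stands in for the two printf calls.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the body of the if-statement ran,
 * false when the body of the else-statement ran. */
bool equals_branch_taken(int x, int y)
{
    bool took_if_body;

    if (x != y)            /* cmp followed by jne: skip the if body */
        goto else_body;

    took_if_body = true;   /* if body (first printf in the listing) */
    goto end;              /* jmp over the else body */

else_body:
    took_if_body = false;  /* else body (second printf in the listing) */

end:
    return took_if_body;   /* shared code after both branches */
}
```

Note how only the if body needs a jump at its end: the else body is placed directly before the shared code, so it simply falls through.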
The altered code can also be decompiled with r2dec. Do note that this pseudo code contains the same printf parameter passing error as in the if-statement part of this article. Since r2dec is still under development, this bug will most likely be solved in later versions.
[0x0804840b]> pdd
#include <stdio.h>

int32_t main () {
    ecx = arg_4h;
    *(format) = 5;
    *(local_ch) = 5;
    eax = *(format);
    if (eax == *(local_ch)) {
        /* const char *format */
        printf ("%d equals %d\n", format);
    } else {
        /* const char *format */
        printf ("%d does not equal %d\n", format);
    }
    eax = 0;
    ecx = *(local_4h);
    esp = ecx - 4;
    return eax;
}
Operators
With the help of operators, it is possible to check multiple conditions in a single statement. Some examples of operators are &&, ||, ==, and !=. The meanings of these operators are, in order, and, or, equals, and does not equal. In this example, the operator || will be examined, but the concept remains the same for all operators. The compare instruction can only compare two values at a time. Therefore, a single if-statement with one or more operators will increase the complexity of the control flow in the assembly, as more compare instructions and jumps are needed. The code within the statement remains in only one place, as the jumps direct the execution flow to it.
The source code for the operator example is nearly the same as the else-statement code, but this time it also prints the x equals y line if y equals x – 3.
#include <stdio.h>

int main(int argc, char** argv) {
    int x = 5;
    int y = 5;
    if (x == y || (x - 3) == y) {
        printf("%d equals %d\n", x, y);
    } else {
        printf("%d does not equal %d\n", x, y);
    }
}
Radare2 provides the disassembly again, as can be seen below.
[0x0804840b]> pdf
/ (fcn) main 109
| main (int arg_4h);
| ; var char *format @ ebp-0x10
| ; var int local_ch @ ebp-0xc
| ; var int local_4h @ ebp-0x4
| ; arg int arg_4h @ esp+0x4
| ; DATA XREF from 0x08048327 (entry0)
| 0x0804840b lea ecx, [arg_4h] ; 4
| 0x0804840f and esp, 0xfffffff0
| 0x08048412 push dword [ecx - 4]
| 0x08048415 push ebp
| 0x08048416 mov ebp, esp
| 0x08048418 push ecx
| 0x08048419 sub esp, 0x14
| 0x0804841c mov dword [format], 5
| 0x08048423 mov dword [local_ch], 5
| 0x0804842a mov eax, dword [format]
| 0x0804842d cmp eax, dword [local_ch]
| 0x08048430 je 0x804843d
| 0x08048432 mov eax, dword [format]
| 0x08048435 sub eax, 3
| 0x08048438 cmp eax, dword [local_ch]
| 0x0804843b jne 0x8048455
| 0x0804843d sub esp, 4
| 0x08048440 push dword [local_ch]
| 0x08048443 push dword [format]
| 0x08048446 push str.d_equals__d ; 0x8048500 ; "%d equals %d\n" ; const char *format
| 0x0804844b call sym.imp.printf ; int printf(const char *format)
| 0x08048450 add esp, 0x10
| 0x08048453 jmp 0x804846b
| 0x08048455 sub esp, 4
| 0x08048458 push dword [local_ch]
| 0x0804845b push dword [format]
| 0x0804845e push str.d_does_not_equal__d ; 0x804850e ; "%d does not equal %d\n" ; const char *format
| 0x08048463 call sym.imp.printf ; int printf(const char *format)
| 0x08048468 add esp, 0x10
| 0x0804846b mov eax, 0
| 0x08048470 mov ecx, dword [local_4h]
| 0x08048473 leave
| 0x08048474 lea esp, [ecx - 4]
\ 0x08048477 ret
The code that changed is given below.
0x0804841c mov dword [format], 5
0x08048423 mov dword [local_ch], 5
0x0804842a mov eax, dword [format]
0x0804842d cmp eax, dword [local_ch]
0x08048430 je 0x804843d
0x08048432 mov eax, dword [format]
0x08048435 sub eax, 3
0x08048438 cmp eax, dword [local_ch]
0x0804843b jne 0x8048455
First, the values of x and y are compared, and if they are equal, the jump to the code within the if-statement is taken. The instruction je stands for Jump if Equal. If the two values are not equal, the next instruction is executed, which puts the value of x (or format, as Radare2 has named the variable) in the accumulator register EAX. After that, 3 is subtracted from EAX and the result is compared to y. If y equals x - 3, the jne jump is not taken and the next instruction is executed, which is located at 0x804843d. This is the code within the if-statement, the same address that the earlier je instruction jumps to. If the two values are not equal, a jump is taken to 0x8048455, where the else-statement's code resides.
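The two comparisons and their jumps can be mirrored in C with goto statements. The sketch below follows the order of the instructions in the listing. The function name or_condition_holds and the labels are hypothetical, and a boolean stands in for the two printf calls.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the body of the if-statement
 * would run for the condition (x == y || (x - 3) == y). */
bool or_condition_holds(int x, int y)
{
    bool result;

    if (x == y)            /* cmp followed by je: first operand is true */
        goto if_body;
    if (x - 3 != y)        /* sub eax, 3, then cmp and jne: both are false */
        goto else_body;

if_body:
    result = true;         /* first printf in the listing */
    goto end;              /* jmp over the else body */

else_body:
    result = false;        /* second printf in the listing */

end:
    return result;
}
```

The second comparison is only reached when the first one fails, which is the short-circuit behaviour of the || operator in C.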
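Decompiling the function with r2dec provides a higher-level overview, albeit with the printf parameter error in it. The control flow should be read with care as well: in this pseudo code, nothing is printed when the first comparison succeeds, whereas the disassembly shows that the je instruction jumps straight to the first printf call in that case.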
[0x0804840b]> pdd
#include <stdio.h>

int32_t main () {
    ecx = arg_4h;
    *(format) = 5;
    *(local_ch) = 5;
    eax = *(format);
    if (eax != *(local_ch)) {
        eax = *(format);
        eax -= 3;
        if (eax == *(local_ch)) {
            /* const char *format */
            printf ("%d equals %d\n", format);
        } else {
            /* const char *format */
            printf ("%d does not equal %d\n", format);
        }
    }
    eax = 0;
    ecx = *(local_4h);
    esp = ecx - 4;
    return eax;
}
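Only the || operator is examined in this article. As a sketch of how the same idea could look for &&, the function below uses goto statements in the way an unoptimised compiler would typically lay out the jumps. This is an assumption and is not taken from a compiled binary: the condition (x == y && y == z), the function name and_condition_holds and the labels are all made up for this illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the body of the if-statement
 * would run for the condition (x == y && y == z). */
bool and_condition_holds(int x, int y, int z)
{
    bool result;

    if (x != y)            /* first operand false: skip the second check */
        goto else_body;
    if (y != z)            /* second operand false */
        goto else_body;

    result = true;         /* if body: both comparisons succeeded */
    goto end;

else_body:
    result = false;        /* else body */

end:
    return result;
}
```

Compared to the || case, both conditional jumps point at the else body, and the if body is only reached by falling through both of them.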
Loops
Loops are used to repeat a specific piece of code without the need to write it multiple times. One could use a loop to iterate through a list, which is especially helpful when the size of said list is unknown. In this example, a loop which iterates 15 times is executed. The function printf is called in each iteration. The output equals This is loop [n]!, where [n] equals the iteration count. The source code is only a few lines and is given below.
#include <stdio.h>

int main(int argc, char** argv) {
    for (int i = 0; i < 15; i++) {
        printf("This is loop %d!\n", i);
    }
}
When running this code, the following output can be observed.
libra@x86:~/Desktop$ ./output
This is loop 0!
This is loop 1!
This is loop 2!
This is loop 3!
This is loop 4!
This is loop 5!
This is loop 6!
This is loop 7!
This is loop 8!
This is loop 9!
This is loop 10!
This is loop 11!
This is loop 12!
This is loop 13!
This is loop 14!
The disassembly uses jumps to return to the start of the loop until the set amount of 15 iterations has been completed. After that, the jump is no longer taken and the loop ends. In this example, there is no more code after the loop, resulting in the termination of the program.
[0x0804840b]> pdf
/ (fcn) main 68
| main (int arg_4h);
| ; var char *format @ ebp-0xc
| ; var int local_4h @ ebp-0x4
| ; arg int arg_4h @ esp+0x4
| ; DATA XREF from 0x08048327 (entry0)
| 0x0804840b lea ecx, [arg_4h] ; 4
| 0x0804840f and esp, 0xfffffff0
| 0x08048412 push dword [ecx - 4]
| 0x08048415 push ebp
| 0x08048416 mov ebp, esp
| 0x08048418 push ecx
| 0x08048419 sub esp, 0x14
| 0x0804841c mov dword [format], 0
| 0x08048423 jmp 0x804843c
| 0x08048425 sub esp, 8
| 0x08048428 push dword [format]
| 0x0804842b push str.This_is_loop__d ; 0x80484d0 ; "This is loop %d!\n" ; const char *format
| 0x08048430 call sym.imp.printf ; int printf(const char *format)
| 0x08048435 add esp, 0x10
| 0x08048438 add dword [format], 1
| 0x0804843c cmp dword [format], 0xe ; [0xe:4]=-1 ; 14
| 0x08048440 jle 0x8048425
| 0x08048442 mov eax, 0
| 0x08048447 mov ecx, dword [local_4h]
| 0x0804844a leave
| 0x0804844b lea esp, [ecx - 4]
\ 0x0804844e ret
The variable format corresponds to the integer i in the source code. At the address 0x08048423, an unconditional jump is taken to 0x0804843c, where the value of format is compared to 0xe, which equals 14 in decimal notation. The compiler has thus turned the condition i < 15 into the equivalent i <= 14. If format is less than or equal to 14 (jle stands for Jump if Less or Equal), the jump to 0x08048425 is taken and the body of the loop is executed once, until the compare instruction is reached again. In the body, the two required arguments are pushed onto the stack, the printf function is called, and the counter is increased by 1 using the add instruction at 0x08048438. As long as the counter (the format variable) is less than or equal to 14, the jump is taken. After 15 iterations (0 through 14), the loop ends and the code after it is executed. Since there is no code after the loop in this example, the program terminates.
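The layout of the loop in the disassembly, with the condition placed after the body and an initial jump to it, can be mirrored in C with goto statements. This is a sketch for illustration only: the function name count_loop_iterations and the labels are hypothetical, and a counter stands in for the call to printf.

```c
/* Hypothetical helper: runs the loop in the same order as the listing
 * and returns how many times the body was executed. */
int count_loop_iterations(void)
{
    int iterations = 0;
    int i = 0;              /* mov dword [format], 0 */

    goto condition;         /* jmp 0x804843c */

body:
    iterations++;           /* stands in for the call to printf */
    i++;                    /* add dword [format], 1 */

condition:
    if (i <= 14)            /* cmp dword [format], 0xe, then jle 0x8048425 */
        goto body;

    return iterations;      /* code after the loop */
}
```

Because the condition is checked before the first iteration, the body would not run at all if the counter started above 14.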
To contact me, you can e-mail me at [info][at][maxkersten][dot][nl], or DM me on BlueSky @maxkersten.nl.