Showing posts with label Reversing. Show all posts

Tuesday, June 17, 2008

Will the real Virtumonde please stand up?

It seems that quite a bit of malware is being classified as Vundo (Virtumonde) these days. With the volume of malware currently being distributed in dynamic link library form, it is not always easy to differentiate one family from another. Frequently these modules are statically linked with C and C++ runtimes, compression, and GUI libraries, which can slow analysis down. On top of all this embedded library code, Vundo's own code seems to be under constant development: it is frequently updated to fix bugs, add a new piece of functionality, or add more randomization to prevent signature recognition.

However, there is one construct that the developers behind the code seem to enjoy using. In almost every place where an event name, and sometimes a registry value name, is created, the name is generated by a function that stays similar between variants.

The function derives this name from an attribute of the infected computer: the serial number assigned to the "C:" drive volume when it was last formatted by the operating system. The serial number is then scrambled by one or more bitwise CPU instructions against a number selected by the programmer. The result of these operations is converted into a string and returned for use.

The recognition of this function can help positively ID a Vundo sample. The source code representation of this function would look similar to this:


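#include <windows.h>

/* Arbitrary constant; each variant seems to pick its own. */
#define arbitrary_vundo_number 0xFDEC

int generate_number(char *output)
{
    /* Preset a fallback value, as the real code does, in case the call fails. */
    DWORD volume_serial_number = 0x123;

    /* Only the serial number of the C: volume is requested. */
    GetVolumeInformationA("c:\\", NULL, 0,
                          &volume_serial_number, NULL, NULL, NULL, 0);

    volume_serial_number ^= arbitrary_vundo_number;

    /* Returns the number of characters written to output. */
    return wsprintfA(output, "%08x", volume_serial_number);
}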


Actual Vundo assembly code looks like this:

push esi ; nFileSystemNameSize
push esi ; lpFileSystemNameBuffer
push esi ; lpFileSystemFlags
push esi ; lpMaximumComponentLength
lea eax, [ebp+VolumeSerialNumber]
push eax ; lpVolumeSerialNumber
push esi ; nVolumeNameSize
push esi ; lpVolumeNameBuffer
push offset RootPathName ; "c:\\"
mov [ebp+VolumeSerialNumber], 123h
call ds:GetVolumeInformationA
xor [ebp+VolumeSerialNumber], 34D2121h
push [ebp+VolumeSerialNumber]
push offset a08x ; "%08x"
push [ebp+arg_0] ; LPSTR
call ds:wsprintfA
add esp, 0Ch
pop esi
leave
retn
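
To predict the name a given variant would generate on a lab machine, the same transformation can be sketched in Python. This is a minimal sketch and not code from the sample: the function names are made up for illustration, and the default key is the constant from the disassembly above, which we would expect to differ between variants.

```python
# Hypothetical helper names; the key 0x034D2121 comes from the listing above
# and is expected to change from one variant to the next.

def vundo_name(volume_serial, xor_key=0x034D2121):
    """Mimic the xor followed by "%08x" formatting seen in the disassembly."""
    return "%08x" % ((volume_serial ^ xor_key) & 0xFFFFFFFF)


def serial_from_name(name, xor_key=0x034D2121):
    """XOR is its own inverse, so the same key recovers the serial number."""
    return int(name, 16) ^ xor_key


if __name__ == "__main__":
    # 0x123 is the fallback value the sample stores before calling the API.
    print(vundo_name(0x123))
```

Because the operation is a plain XOR, an event or registry value name observed on an infected machine can be mapped back to a volume serial number once the variant's constant is known.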

Tuesday, June 10, 2008

Vundo variant appropriates Microsoft Research source code

For the past several years, the Vundo family of malware (also known as Virtumonde) has appeared high on AV vendors' prevalence lists -- this stuff is everywhere. To get there, the malware employs an aggressive set of tactics over the course of its distribution to evade AV and anti-spyware solutions. A close examination shows that some of its user-mode rootkit tactics use the Microsoft Research Detours library to hide its presence from security solutions. Below is a somewhat technical description.

First off, the Detours project out of Microsoft Research focuses on "Binary Interception of Win32 Functions". In other words, when a developer or malware writer wants to hook a function inline and insert their own code, they can intercept a Win32 function with code from the Detours library.
To use this code commercially, "Detours Professional 2.1 includes a license for use in production environments and the right to distribute detour functions in products...For information on licensing Detours Professional 2.1 contact Microsoft's IP Licensing Group at iplg@microsoft.com". Let's assume either that Microsoft never provided the Vundo developers with a license or that the Vundo developers never attempted to obtain a license for their "commercial" use.

One of Vundo's library components currently in the wild is injected into processes as a part of its attack. This component may in turn be detected by anti-spyware scanners using the EnumProcessModules API call, which would provide the scanner with a handle to the injected module. And this is where the abuse begins.
You can see the malicious Vundo hook in this screenshot, implementing the hook functionality from the Detours library. Basically, if a process calls EnumProcessModules, the code that Vundo appropriated will intercept the Win32 function and report that the module enumeration procedure failed. When the EnumProcessModules call fails, certain security scanners are unable to detect the Vundo component's presence:



How can Detours code be identified in this dll? Well, the source of the Detours library can be placed side-by-side with the unpacked and disassembled Vundo component. In many places, the sequence and order of instructions and data are unmistakably identical. For the sake of brevity, this post focuses on just a couple of examples that illustrate the point.

Here, the deadlisting for the Vundo function is on the left, and the matching Detours source code on the right. This chunk of Detours code is at the core of the hooking functionality within disasm.cpp of detours.lib. The source from the Detours library here is determining the length of the currently evaluated instruction and then copying the instruction to the trampoline buffer (this location is the place where the inlined Vundo rootkit function can call back into the original function without interception). The appropriated code on the left is compiler optimized, and it is a mirror image of the Detours logic on the right:



Here, in a similar fashion, we see Vundo functionality that was stolen from the Detours library calling the DetourCopyInstructionEx() function and an inlined detour_does_code_end_function() function. In this reversing illustration, the Vundo function is performing checks to ensure the target function's eligibility for interception. In other words, Vundo's appropriated Detours code is checking to see if the target function contains a select set of instructions that would prevent hooking:
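
The instruction-copying step described above can be modeled in a few lines of Python. This is a toy sketch under loud assumptions, not the Detours implementation: the length decoder below only understands a handful of one- and two-byte instructions (the real DetourCopyInstructionEx() handles the full x86 instruction set and fixes up relative operands), and the function names and addresses are invented for illustration.

```python
import struct

JMP_REL32_LEN = 5  # size of an x86 "jmp rel32" (opcode 0xE9 + 4-byte offset)


def toy_insn_len(code, offset):
    """Length of one instruction, for a tiny subset of x86 only (assumption)."""
    opcode = code[offset]
    if opcode in (0x55, 0x90):            # push ebp / nop
        return 1
    if opcode == 0x8B and code[offset + 1] >> 6 == 3:
        return 2                          # mov reg, reg (register-direct ModRM)
    raise ValueError("instruction not handled by this toy decoder")


def build_trampoline(code, target_addr, trampoline_addr):
    """Copy whole instructions until a jmp rel32 fits, then jump back."""
    copied = 0
    while copied < JMP_REL32_LEN:
        copied += toy_insn_len(code, copied)
    # rel32 is relative to the address of the instruction after the jmp.
    rel32 = (target_addr + copied) - (trampoline_addr + copied + JMP_REL32_LEN)
    return code[:copied] + b"\xE9" + struct.pack("<I", rel32 & 0xFFFFFFFF)


if __name__ == "__main__":
    # mov edi, edi / push ebp / mov ebp, esp / sub esp, 10h
    prologue = bytes.fromhex("8bff558bec83ec10")
    print(build_trampoline(prologue, 0x7C801000, 0x10000000).hex())
```

The point of measuring instruction lengths is that the hook must never split an instruction: the copy stops at the first instruction boundary that leaves room for the five-byte jump written over the start of the original function.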

Tuesday, June 3, 2008

Bancostrings

When does BCD0236E965582D56DD365E44BD764FA5DFD6CBF312BB124AA2563B5C2 mean ":: Bradesco Pessoa Fosica ::"? Only when CD30ABC0221E5486A23D0F619DB27FC50110504DB9D3DC357893D269E177CB2D1BD1758CCC77AA93ED3DBA190A7BD914B80F5254919C2DC0D471B02CC20260CC4CB2C73A5B really means "HSBC Bank Brasil S.A. -- Banco Muliplo -- No Brasil e no mundo, HSBC", of course.

A couple of previous posts provided insight into what clues strings provide when performing malware analysis, and a concise description of how to decrypt obfuscated strings in a static file using advanced IDA Pro functionality.

Here, we'll use a debugger to step through a malicious file in the lab and observe data as it is decoded by the malware itself. Sometimes, when speed is a priority and not all that many strings are involved, stepping through the decryption loop prior to writing an IDA script is another good approach to have in the toolkit.
We've started the executable within Ollydbg. No human-readable strings are visible to the analyst here, but a quick look at the text section following some unpacking reveals multiple arrays of garbled text. Also suspicious is that each string of unreadable, and probably encrypted, data is being passed by pointer to the same function. Most likely, this procedure includes the decryption loop that we are looking for. Each call to this procedure that is passed such a pointer is highlighted in red below:



We can review this loop, setting a breakpoint on the procedures that are passed these strings as a parameter. Somewhere along the way in here, the decrypted data is most likely written out to memory or as a hash. As we single step through the code (hitting F7), we'll watch for pushes, pops, and repeated movs instructions, and look for pointers to strings and data copies from esi to edi. We find an interesting loop here after the garbled text is pushed onto the stack. Notice that string data is being copied from esi to edi:



Following edi in the data dump displays the memory contents as they are written out and decrypted by multiple layers and loops. Setting a breakpoint here and running through the loop reveals the decrypted data. We can single step through this loop to evaluate the decryption algorithm.
Eventually this decrypted data is passed to another function via pointers on the thread stack. Now that we've run through the loops, we can identify a list of banks and web sites that our Portuguese-speaking friends in Brazil may recognize:



Having identified these strings within the malware, we craft a few custom-written, empty web pages with these strings as title bar content. We then open the html pages with Internet Explorer. We'll witness images stored within the malware being presented in the foreground of the browser, waiting for our login IDs and passwords. Here are a few related screenshots:




These strings helped lead us to identify another all too popular Brazilian banking password stealer. Done with these strings, off for a little samba and sun on the coast of Buzios!

Tuesday, May 20, 2008

Keeping strings real - Part II

In part 1 of keeping strings real, strings were chased around in a disassembler to provide insight into the functionality of a piece of malware. Part two investigates the instance where there seem to be no recognizable strings in the target at all.

When doing a quick skim of a malware file (in this case, ldr.exe md5:007571544614a7646e750a51ccaf2e9e), sometimes you encounter data that looks similar to the below image. In this particular instance, it seems that these strings are encrypted.

Encrypted strings make it very difficult to do quick analysis. Fortunately, there are a couple of options to get past this small road block. The string can be observed as it is passed around between functions, or debuggers can be used to halt execution when the target string data is accessed.


For this sample, the above strings were followed from the initial cross reference to a function that is self contained. It takes arguments, executes some code, and returns. The code has a loop, operates on individual bytes (reads from pointers into register halves), and performs a few additions. Is this a possible candidate for a string decryption? You betcha.

After walking through the function once in a debugger, it becomes obvious that this function decrypts the string to a different buffer. This is an excellent first step, but there is a massive number of strings in this file. It would be less than desirable to execute this function in a debugger and make note of the result for each and every encrypted string present. There has to be a faster and more elegant way to figure these out. Now what?


Enter Cryptanalysis. This particular function is not very large or complicated, so determining the algorithm used to reveal the strings should not take an unreasonable amount of effort. After determining the algorithm, it is possible to write a program or script to accept the encrypted string data and output the decrypted string.


Below is what the reversed function looks like.

This function accepts what looks like a null-terminated Pascal string. The first character in the string contains the length (0 to 255), followed by the ciphered string data, then a zero to indicate the end of the string.


The next step is to add the cipher key value to the first encrypted character in the string. This key value starts at 186 (or 0xBA in hexadecimal). On each loop pass, the key is increased by 2 and added to the next character in the string.


For instance, the character 'a' is represented by the number 97 (0x61). To encrypt this initial data based on the algorithm above, we would subtract 186 (0xBA). To decrypt it later, 186 (0xBA) is added to the encrypted data.


The result of this 97 - 186 subtraction is 167 (0xA7). This math looks funny, but it works this way when working with individual bytes and their associated range of 0 to 255 (unsigned).


This behavior is due to the wrap-around effect caused by an integer overflow. To see this in action on Windows, open calculator (calc.exe), change the view to scientific mode, then change the number system from decimal (Dec) to hexadecimal (Hex), lastly change the size from "Qword" to "Byte." Now you can type in 61 minus BA and the result is A7 (167).
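
For readers without calc.exe at hand, the same byte-sized wrap-around can be reproduced in Python by masking results to 8 bits. The helper names are ours, chosen only for this illustration.

```python
def sub_byte(a, b):
    """Subtract as an 8-bit register would: keep only the low byte."""
    return (a - b) & 0xFF


def add_byte(a, b):
    """Add as an 8-bit register would: keep only the low byte."""
    return (a + b) & 0xFF


if __name__ == "__main__":
    print(hex(sub_byte(0x61, 0xBA)))  # 'a' minus the first key value: 0xa7
    print(hex(add_byte(0xA7, 0xBA)))  # adding the key back restores 'a': 0x61
```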


Keeping the above math in mind, the algorithm can now be re-implemented using IDA's built in scripting language (IDC). The script will need to be passed the source string data, extract a byte, add the key to the byte, store the result, add 2 to the key, and repeat this process until all bytes in the string have been processed.


The Byte() function will be used to extract the byte from the "address" of the string's beginning found in IDA's disassembler window. The Message() function will display the deciphered byte in the message window, and the PatchByte() function will modify the representation of the byte inside of the disassembler window. (Note: PatchByte() can be commented out to prevent the script from actually modifying any data; the script will then simply print the result in the message window.)
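
Outside of IDA, the algorithm as described above can also be sketched in plain Python, which is handy for checking the logic against bytes copied out of a hex editor. This is our reconstruction from the description in this post, not the IDC script itself; the function names are invented, and the Pascal-style helper assumes the leading length byte is stored unencrypted.

```python
INITIAL_KEY = 0xBA  # 186, per the description above
KEY_STEP = 2        # the key grows by 2 on every loop pass


def decipher(cipher_bytes):
    """Add a rolling key (186, 188, 190, ...) to each byte, modulo 256."""
    key = INITIAL_KEY
    plain = bytearray()
    for value in cipher_bytes:
        plain.append((value + key) & 0xFF)
        key = (key + KEY_STEP) & 0xFF
    return bytes(plain)


def encipher(plain_bytes):
    """Inverse operation (subtract the rolling key); useful for test input."""
    key = INITIAL_KEY
    cipher = bytearray()
    for value in plain_bytes:
        cipher.append((value - key) & 0xFF)
        key = (key + KEY_STEP) & 0xFF
    return bytes(cipher)


def decipher_pascal(blob):
    """Assumes blob[0] is a plain length byte followed by the cipher data."""
    length = blob[0]
    return decipher(blob[1:1 + length])


if __name__ == "__main__":
    sample = encipher(b"kernel32.dll")
    print(sample.hex(), "->", decipher(sample))
```

Keeping an `encipher` counterpart around makes it easy to confirm the round trip before trusting the output on real sample data.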


The script representation of this algorithm reconstruction is found in the image below, and the idc script itself can be downloaded from our PC Tools ThreatFire forum, where you can log in and scroll down the thread for 186plus2_decipher.zip:

Now it is time for some fun. An encrypted string is selected in IDA for decoding and the script is launched. The result:




Keeping strings real - Part I

All malware researchers love strings. They allow us to gain valuable insights into the possible behavior of the sample being investigated. Even IT professionals, who do not research malware professionally, can make good use of these clues.

Here's a quick example of strings in a malware disassembly listing:

00403100 Security Troubleshooting.url
00403120 ot.ico
00403128 %s/soft/?c=%1.1d%d%1.1d
00403140 Online Security Guide.url
0040315C ts.ico
00403164 %s/test/?c=%1.1d%d%1.1d
0040317C Online Security Test.url
00403198 *.securemanaging.com
004031B0 *.safetyincludes.com
004031C8 *.securewebinfo.com
004031DC 85.255.117.158
004031EC 88.255.74.197
00403300 195.95.*.*
0040330C 194.187.*.*
00403318 turbocodec.com
00403328 flyvideonetwork.com
0040333C websoft-c.com
0040375C plus-codec.com
0040376C freerealitympegs.com
00403784 inc-codec.com
00403794 user_pref("browser.search.selectedEngine", "Search");
004037D0 user_pref("browser.search.selectedEngine"
00403840 \profiles.ini
00403850 Mozilla\Firefox
00403908 Software\Microsoft\Internet Explorer\New Windows\Allow
00403940 %sVersion\Internet Settings\ZoneMap\EscDomains\%s
004039A8 Domains\%s

Right off the bat, one might guess that there is probably something fishy going on with these domains in relation to Firefox and Internet Explorer settings. A quick Google search on some of these domains yields many results which are seemingly related to malware. If the search result is somewhat ambiguous, a researcher can always plug a string into ThreatExpert to find related malware behavior.
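
A listing like the one above does not require a disassembler. As a rough sketch, the following Python pulls runs of printable ASCII out of a blob of bytes and reports each with its address; the function name, the minimum length of 4, and the `base` parameter are our own choices for illustration, and wide (UTF-16) strings are ignored.

```python
import re


def ascii_strings(data, min_len=4, base=0):
    """Return (address, text) pairs for runs of printable ASCII in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(base + m.start(), m.group().decode("ascii"))
            for m in pattern.finditer(data)]


if __name__ == "__main__":
    blob = b"\x00\x01turbocodec.com\x00ab\x00\\profiles.ini\x00"
    for address, text in ascii_strings(blob, base=0x00403318):
        print("%08X %s" % (address, text))
```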

Searching for "securewebinfo.com" on ThreatExpert yields plenty of results. Most of the strings found in this particular sample match up very nicely to the results found, so it is reasonably safe to assume that this sample is probably a variant. However, if the search results were inconclusive, one of the next steps a malware researcher can take is to disassemble the file in IDA Pro.

What is this malware actually doing with those strings? We are glad you asked!

Below is the image of the strings in the disassembler. The following items are shown moving from left to right: the address in memory where the strings reside, the automatic name IDA gave this location, the string data itself, and last but not least, the cross references (XREFs).







Navigating to one of the cross references changes the view to an array of string pointers as seen in the image below. This array also contains a cross reference, but to a function this time.







The function seen below was labeled "modify_IEXPLORE_SecurityZones" as it was found to call sub-functions which modify the registry associated with Internet Explorer's Security Zones.


The last loop in this function, "AddAllowPopup_loop", executes once for each item in the domain_name_array. Each item in the array will be added to the AllowPopup registry key. The next time Internet Explorer is run, those domains will be allowed to display pop-up windows at will. This code confirms our suspicions of malicious behavior.
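
On the cleanup side, the effect of that loop can be checked by comparing the value names found under the Allow key with the domains embedded in the sample. The sketch below is an assumption-heavy illustration: it takes an already-collected list of value names rather than reading the registry, the function name is ours, and the pattern list simply reuses the wildcard domains from the strings listing, which may or may not be the exact contents of domain_name_array.

```python
from fnmatch import fnmatch

# Wildcard domains copied from the strings listing in this post.
SAMPLE_DOMAINS = [
    "*.securemanaging.com",
    "*.safetyincludes.com",
    "*.securewebinfo.com",
]


def suspicious_allow_entries(value_names, patterns=SAMPLE_DOMAINS):
    """Return the Allow-list entries that match one of the sample's domains."""
    hits = []
    for name in value_names:
        lowered = name.lower()
        # Match both the literal wildcard entry and concrete host names.
        if any(lowered == p or fnmatch(lowered, p) for p in patterns):
            hits.append(name)
    return hits


if __name__ == "__main__":
    print(suspicious_allow_entries(["www.securewebinfo.com", "example.com"]))
```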





Tuesday, May 6, 2008

AMTSO and CARO Workshop

The AV industry was busy this past week amongst the blooming tulips in Hoofddorp, the Netherlands. Both an AMTSO conference and a CARO workshop were held during the last three days of the week.

A large group of attendees arrived for the Wednesday all-day testing standards meeting, with more journalists in attendance than before. It was encouraging to see, because one of the AMTSO's formative goals has been to invite and include representatives from all parts of the computer security industry. Progress is being made toward a set of testing standards for anti-malware products for everyone involved.

The CARO workshop followed on Thursday and Friday, with presentations focusing on malware obfuscation from the AV industry's perspective (googling "datasecurity event caro" provides a link to the home page). The opening talk by Paul Ducklin from Sophos set the tone for most of the event -- legitimate compressors/packers are acceptable and good (according to a number of individuals in the AV scanner business), while software protection solutions like Themida and SVKP are unacceptable and evil (to a number of individuals in the AV scanner business).
It was interesting that while AV vendors and Ilfak Guilfanov of IDA Pro/Hex Rays spoke and gave presentations over the two days, none of the developers or vendors from Themida or ASProtect (a couple of software protection systems that were referred to in the presentations) were invited or presented their thoughts.

Even at the workshop, it seems that there remains disagreement on how the industry should handle software obfuscation, and there remains a sense that software obfuscation is a major source of problems for the AV industry. Whether it's due to difficulties in emulation, performance issues when unpacking, the complexities of the virtualization packers (where Sophos' Boris Lau showed that a single NOP instruction can be easily and inexpensively translated into over 50 virtual instructions) or simply disagreement over how to identify what is behind software protection, it continues to be a weakness for traditional AV scanners.
Just to give an idea of the volume of difficulties and tricks that researchers have to develop methods to deal with, Peter Ferrie's paper was presented by Mady Marinescu of Microsoft, and in it he enumerated over 50 anti-unpacking tricks commonly seen in packers and often seen in malware.
Presenters also included evaluations of the proportions of malware seen packed by specific packers and various approaches to dealing with them, including blacklisting. It seems that it is easier to include this approach in a scanner than to actually implement an unpacker in a scanner for all the different varieties of packers. Blacklisting is cheap and easy, but it is more prone to causing false positives, and decisions to blacklist are often debatable.
We will see what this turn away from extremely low false positive rates will do to the major advantage that the scanners had over behavioral based solutions.

From the perspective of an individual pushing a behavioral solution that solves for the difficulties that scanners have with obfuscation, it is somewhat easy to be critical of AV scanner products' inability to continue performing with such a low level of false positives and exacting matches in the face of ongoing obfuscation and "server-side polymorphism"/"rapid release" techniques currently used by malware distributors to evade the AV solutions. The complexity and difficulties are high for the guys trying to develop elegant and effective AV solutions to these problems.
We'll see more of this obfuscation topic, but from the "hacker" perspective, when Defcon's "Race To Zero" contest is held this fall.

Thursday, March 20, 2008

Common Hijack Habits Are Hard to Break

You just need to find the right point. Breakpoint, that is.

We've had a couple of recent posts that record the use of an injection technique quite commonly used by ITW malware. It has been used for years to evade personal firewalls. New code utilizing the same technique for a variety of solutions (grey, black, or white hat) continues to be posted. Proper prevention for this injection technique has a heightened longevity because of its popularity, and it underscores the usefulness of behavioral based security products. Let's take a look at some of the low level activity of the subject of yesterday's post.

Using a variety of monitoring tools, we can see that the software creates an Internet Explorer process in a suspended state. Eventually, that process is started, sends Yahoo Messenger spam off of the system, and performs a variety of other tasks. Let's use one of our favorite debuggers, Ollydbg, to identify the hijacking activity.

The dropper overwrites the entire code section of the iexplore.exe process after it starts the browser in a suspended state. We'll throw the first executable, up.exe, into Ollydbg. We search its list of imports, set a breakpoint on CreateProcessA and run the executable. These listings show the unusual command-line parameter provided to CreateProcess:






The stack shows the CREATE_SUSPENDED state of the new process:







The ProcessInfo structure that is passed back out of this call provides handles to both the newly created Internet Explorer process and its main thread, along with the process ID (PID) and thread ID (TID). These handles will be re-used later in the routine. For now, the hijacker will call GetThreadContext to copy out all of the values held in the registers of the currently suspended iexplore main thread. They will be used when the thread's execution is resumed:







We see the entire .text section of iexplore mercilessly overwritten and extended with a loop that calls WriteProcessMemory and VirtualProtect on ten separate occasions. It's a lot of work to hijack IE successfully!
In effect, this work completely overwrites Microsoft's code, making Internet Explorer just a shell for the injection code to work within:








Now that the executable code has been tediously copied into IE, the context of the suspended thread is set back to its original environment (actually, a small trick is used and just the context defaults are used) and the newly overwritten thread's execution is resumed:








What looks like a familiar browser process is not a browser at all.

If your security solution doesn't already stop an old habit like this one, you might want to find out why not.
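
To make the habit concrete, here is a rough sketch of how a behavioral monitor might recognize the sequence walked through above in a stream of API calls made by one process. Everything about the interface is assumed for illustration: the trace is a plain list of call names from a hypothetical monitoring tool, and SetThreadContext and ResumeThread are our guess at the calls behind the "context is set back" and "execution is resumed" steps.

```python
# Ordered steps of the hijack; call names are labels in a hypothetical trace.
HIJACK_STEPS = [
    "CreateProcess(CREATE_SUSPENDED)",
    "GetThreadContext",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
]


def looks_like_hijack(trace, steps=HIJACK_STEPS):
    """True if every step appears in the trace in order (gaps are allowed)."""
    position = 0
    for call in trace:
        if position < len(steps) and call == steps[position]:
            position += 1
    return position == len(steps)


if __name__ == "__main__":
    observed = [
        "CreateProcess(CREATE_SUSPENDED)", "GetThreadContext",
        "WriteProcessMemory", "VirtualProtect", "WriteProcessMemory",
        "SetThreadContext", "ResumeThread",
    ]
    print(looks_like_hijack(observed))
```

Matching on the order of calls rather than on file contents is what lets this kind of check survive repacking of the dropper.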

Tuesday, March 11, 2008

Snaps, Crackle, Pops or Get Your Wheatys

Some things about windbg are just great. But often, they come with a little bit of work.
For one, dll load analysis can be performed with ease, even on unusually crafted files. Like the kinds of files you would see from hackers and eventually malware authors. Want to review the entire flow of process creation on a malformed PE? No problem.
You can do it in a snap with windbg. Or rather, you can use windbg to observe and understand how the process loader performs its work, including "loader snaps" (which don't get a mention in Russinovich's Internals books). Unfortunately, I couldn't get the provided Gflag utility to enable "Show loader snaps" the way Matt Pietrek at wheaty.net described way back, when he showed off snappy output from his debugger. But his article is an inspiration for understanding loader internals. It detailed enabling loader snaps using the gflag utility and its results -- "captures detailed information about the loading and unloading of executable images and their supporting library modules....For per-process (image file): Whenever a DLL is loaded, this flag writes the loader contents (and data related to DLL loading) to the program debugger console". This article is a great source of information for trying to analyze these sorts of very unusual binaries.

Windbg can break extremely early in the load process, so we can choose a breakpoint within the dll loaded first into any process, even before kernel32.dll loads -- ntdll!LdrpInitialize. When the debugger breaks, we can set ntdll's ShowSnaps global flag to "1" early enough to see all the modules loading and their corresponding user-mode debug Snap messages:
0:000> dw ShowSnaps L1
7c97c121 0000
0:000> ew ShowSnaps 1
0:000> dw ShowSnaps L1
7c97c121 0001

Now we run our harmless sample and wait for the loader data we are looking for:
0:000> g
LDR: PID: 0xe38 started - 'C:\vx\tiny.exe'
LDR: NEW PROCESS
Image Path: C:\vx\tiny.exe (tiny.exe)
Current Directory: C:\Program Files\Debugging Tools for Windows\
Search Path: C:\vx;C:\WINDOWS\system32;C:\WINDOWS\system;C:\WINDOWS;.;C:\Program Files\Debugging Tools for Windows\winext\arcade;
LDR: LdrLoadDll, loading kernel32.dll from
ModLoad: 7c800000 7c8f5000 C:\WINDOWS\system32\kernel32.dll
LDR: ntdll.dll used by kernel32.dll
LDR: Snapping imports for kernel32.dll from ntdll.dll
LDR: LdrGetProcedureAddress by NAME - BaseProcessInitPostImport
[e38,74c] LDR: Real INIT LIST for process C:\vx\tiny.exe pid 3640 0xe38
[e38,74c] C:\WINDOWS\system32\ntdll.dll init routine 7C913156
[e38,74c] C:\WINDOWS\system32\kernel32.dll init routine 7C80B5AE
[e38,74c] LDR: ntdll.dll loaded - Calling init routine at 7C913156
[e38,74c] LDR: kernel32.dll loaded - Calling init routine at 7C80B5AE
LDR: LdrGetProcedureAddress by NAME - BaseQueryModuleData
LDR: \\66.93.68.6\z used by tiny.exe
LDR: Loading (STATIC, NON_REDIRECTED) \\66.93.68.6\z
ModLoad: 10000000 10000370 \\66.93.68.6\z
LDR: USER32.dll used by z
ModLoad: 7e410000 7e4a0000 C:\WINDOWS\system32\USER32.dll
LDR: GDI32.dll used by USER32.dll
ModLoad: 77f10000 77f57000 C:\WINDOWS\system32\GDI32.dll
LDR: KERNEL32.dll used by GDI32.dll
LDR: Snapping imports for GDI32.dll from KERNEL32.dll
LDR: LdrGetProcedureAddress by NAME - RtlDeleteCriticalSection
LDR: LdrGetProcedureAddress by NAME - RtlLeaveCriticalSection
LDR: LdrGetProcedureAddress by NAME - RtlEnterCriticalSection
LDR: LdrGetProcedureAddress by NAME - RtlSetLastWin32Error
LDR: LdrGetProcedureAddress by NAME - RtlGetLastWin32Error
LDR: ntdll.dll used by GDI32.dll
LDR: Snapping imports for GDI32.dll from ntdll.dll
LDR: USER32.dll used by GDI32.dll
LDR: Snapping imports for GDI32.dll from USER32.dll
LDR: Snapping imports for USER32.dll from GDI32.dll
LDR: KERNEL32.dll used by USER32.dll
LDR: Snapping imports for USER32.dll from KERNEL32.dll
LDR: LdrGetProcedureAddress by NAME - RtlReAllocateHeap
LDR: LdrGetProcedureAddress by NAME - RtlSizeHeap
LDR: LdrGetProcedureAddress by NAME - RtlSetLastWin32Error
LDR: LdrGetProcedureAddress by NAME - RtlGetLastWin32Error
LDR: LdrGetProcedureAddress by NAME - RtlAllocateHeap
LDR: LdrGetProcedureAddress by NAME - RtlFreeHeap
LDR: ntdll.dll used by USER32.dll
LDR: Snapping imports for USER32.dll from ntdll.dll
LDR: Snapping imports for z from USER32.dll
LDR: Snapping imports for tiny.exe from \\66.93.68.6\z

Snap and dll load data of all sorts is provided for further exploration and analysis. We can investigate exactly when and how each dll is loaded from within this malformed PE file with windbg's help.
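
With output this verbose, a small script helps pull out the interesting parts. The sketch below is our own illustration (function names included): it parses the "LDR: X used by Y" lines from a saved log into an import graph and flags modules loaded from UNC paths, such as the remote dll in the session above.

```python
import re
from collections import defaultdict

# Matches loader snap lines of the form "LDR: <module> used by <importer>".
USED_BY = re.compile(r"LDR: (\S+) used by (\S+)")


def import_graph(log_text):
    """Map each importer to the modules it uses (names lowercased)."""
    graph = defaultdict(list)
    for line in log_text.splitlines():
        match = USED_BY.search(line)
        if match:
            dependency, importer = (name.lower() for name in match.groups())
            if dependency not in graph[importer]:
                graph[importer].append(dependency)
    return dict(graph)


def remote_modules(graph):
    """Dependencies that were loaded from a UNC (network) path."""
    return sorted({dep for deps in graph.values() for dep in deps
                   if dep.startswith("\\\\")})


if __name__ == "__main__":
    log = "LDR: \\\\66.93.68.6\\z used by tiny.exe\nLDR: USER32.dll used by z\n"
    graph = import_graph(log)
    print(graph)
    print(remote_modules(graph))
```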

Russ Osterlund's write-up on the Win2k loader is an exhaustive source of information on the process of dll loading.

Wednesday, January 9, 2008

Microsoft MS08-001 Reversing

If you are not yet aware, Microsoft pushed out another couple of security updates this month and posted about it in their new "Microsoft Vulnerability Research and Defense" blog. Microsoft started it a couple of weeks ago, providing lower level technical information about the vulnerabilities they are fixing.
Be sure to install MS08-001 if you haven't already.

The first of the updates, MS08-001, provides reason for caution, because it allows for reliable exploitation. Surprisingly, we have not seen any public exploitation or even PoC just yet.
You can watch a great four minute video of MS08-001 patch analysis by the makers of Bindiff, a binary diffing tool used to uncover security vulnerabilities like this one. Grab your popcorn, bring a date, and head on over. I'll ruin the ending for you...of the nine functions changed in the tcpip component that was patched, they examine one function that iterates a list of structures and mistakenly performs a bad comparison. They even find some overwritable memory for a successful exploit!

Tuesday, January 8, 2008

Help.exe still not much of a helper

One of the highest hitting worms that ThreatFire encountered over the past week is a worm designed to target online game player logins by dropping a password stealer and rootkit components on infected systems. We previously blogged about the help.exe component that drops rkd.dll, amvo0.dll and amvo.exe, and now we observe many more variants that are repacked with some fairly sophisticated packer and code perversion technology.

The password stealers themselves are updated on various websites that we have observed moving locations throughout China, repacked for AV and emulation evasion purposes. We also see ongoing server side polymorphism with the dropper.

The executables all display very unusual static PE characteristics. First, the import directory contains the name of one dll (kernel32) and imports only three of its functions (LoadLibraryA, GetProcAddress, ExitProcess), the bare bones minimum that you need for a PE packer:

All of the section names are mangled, to further raise our suspicion:

And finally, the resource section is huge and unrecognizable to a simple resource section parser (hint -- it contains more executable code):
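
The first two static traits described in this post are easy to turn into a quick triage check. The sketch below assumes the import table and section names have already been extracted by some PE parser and are passed in as plain Python data; the function name and the "usual section name" pattern are our own rough choices, and the resource-section trait is left out because any size threshold would be a guess.

```python
import re

# The bare minimum a packer stub needs, as noted above.
STUB_IMPORTS = {"loadlibrarya", "getprocaddress", "exitprocess"}
# Rough pattern for conventional names such as .text, .data, .rsrc.
USUAL_SECTION = re.compile(r"^\.[a-z]{2,7}$")


def packer_indicators(imports, section_names):
    """imports: {dll name: [function names]}; section_names: list of str."""
    findings = []
    functions = {f.lower() for funcs in imports.values() for f in funcs}
    if len(imports) == 1 and functions <= STUB_IMPORTS:
        findings.append("minimal import table")
    odd = [name for name in section_names if not USUAL_SECTION.match(name)]
    if odd:
        findings.append("unusual section names: %r" % odd)
    return findings


if __name__ == "__main__":
    stub = {"kernel32.dll": ["LoadLibraryA", "GetProcAddress", "ExitProcess"]}
    print(packer_indicators(stub, ["\x07x!", "q#\x01"]))
```

Neither indicator proves anything on its own, since legitimate packers produce the same picture; they are only a reason to look closer.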

Unfortunately, this incessant rate of change results in a low rate of AV scanner detection:

If you are seeing a popup like this one, go ahead and quarantine the thing:


Monday, December 31, 2007

Reversing a suspicious dll continued

In a post earlier this month, I presented steps for unpacking and restoring the IT/IAT of a suspicious BHO for analysis purposes. In that case, it was packed with a tool called "Upack", otherwise known as the "Ultimate PE Packer" by its author Dwing. Upack often is used on executable files around 40kb in size. It compresses the file's contents with the LZMA algorithm and adds an unpacking stub to the target file for self-decompressing at runtime.
In other words, to make a file smaller for download and delivery without requiring a decompression utility like WinZip or WinRar to already be installed on another system at runtime, an author can compress their executable creation with this tool.
This posting will work with the PE file that was recreated from that previous work.

Here are some of the steps we used to work on this file, leaving off at the last step to identify some behaviors of this malicious file:
Change PE file to .exe in PE header, rename dll to exe extension
Load into Ollydbg
Find OEP (original entry point) -- pretty easy with Upack
Break at oep and dump file from memory to disk
Fixup IAT with ImpRec and write to dumped file
Rename fixed file and modify PE header back to dll
Load into IDA Pro 5.1 with the IDA Python plugin installed...
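
The first and sixth steps above involve flipping the file between dll and exe in the PE header. One way to do that, sketched here in Python, is to clear or set the IMAGE_FILE_DLL bit in the file header's Characteristics field. We are assuming that this bit is the field in question (a hex editor or PE editor does the same job), and the function names are ours.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics bit that marks a PE file as a DLL


def characteristics_offset(pe):
    """Locate the COFF Characteristics field inside a PE image."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ header")
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]
    if pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # 4-byte signature, then Characteristics sits 18 bytes into the COFF header.
    return e_lfanew + 4 + 18


def set_dll_flag(pe, is_dll):
    """Return a copy of the image with IMAGE_FILE_DLL set or cleared."""
    patched = bytearray(pe)
    offset = characteristics_offset(patched)
    flags = struct.unpack_from("<H", patched, offset)[0]
    flags = flags | IMAGE_FILE_DLL if is_dll else flags & ~IMAGE_FILE_DLL
    struct.pack_into("<H", patched, offset, flags)
    return bytes(patched)
```

A typical use would be to read the dll from disk, call `set_dll_flag(data, False)`, and write the result out with an .exe extension before loading it into the debugger; calling it again with `True` on the dumped and fixed file restores the dll flag.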

When we load this file into IDA Pro, the disassembler now can provide a listing that can be used to reverse engineer the component's functionality. Without properly unpacking the file and fixing up the imports, the disassembler cannot analyze the code.
However, the listing doesn't immediately reveal much about the component's activity. Knowing that this component is a BHO helps identify key areas for reversing progress. We do see calls to ATL helper functions like "AtlInternalQueryInterface" and "AtlComPtrAssign", which leave clues about COM programming within the component. The location of these calls can lead us further down the control flow to locations where COM calls can be further analyzed and easily understood. Joe Stewart published information about reversing OLE, but this code is more complex than a common SubmitHook trojan.
Frank Boldewin's Python scripts come in handy for walking through these COM calls -- the listing now reveals a section where the code obtains the "document" interface within the web browser and enumerates its connection points. We can set memory breakpoints on these sections for further analysis, and when we visit various banking web sites, we can see that the BHO is building an event sink:

















Once the event sink is set, GetKeyState is then called on "KEY_DOWN" events. The component can check each individual keystroke as it is hit. And it appears that the only keystrokes being checked are the ones coming from the userid and pass input fields.

So, we've got a dll that identifies URLs of banks and other financial institutions and, after parsing and identifying an "interesting" URL, constructs an event sink attached to very specific fields within the browser's web page -- namely, userid and password input fields. This ActiveX component will log these keystrokes and send them off the system. The component calls "HttpSendRequestA" to send off the banking usernames and passwords it just collected from these fields. I think that we've found an interesting piece of malware, quite possibly a password stealer for banking websites. We'll add more technical detail to this post as time permits.
It helps to be able to dump this file and modify it for static analysis.

Tuesday, December 18, 2007

Shellcode analysis -- download n' exec

In a previous post, I mentioned that we could use C code to analyze some shellcode currently being posted in the wild by malicious web site operators.

These malicious websites are delivering malware by exploiting several Windows based vulnerabilities. The websites attack visitors by targeting vulnerabilities in .ani file parsing, .wmf file parsing, and rtsp content-type string parsing in the QuickTime plugin.

In our labs, we visit these web sites with vulnerable systems, allowing the pages to compromise the systems. We then analyze the techniques being used. Let's take a quick look at a major part of the attack -- the shellcode within the delivered malformed wmf file. We'll take a look at the low level data content of the malformed file itself:





















After seeing a lot of these malformed files, you can spot the shellcode right away. I did in the above image after a quick visual scan, but sometimes details of the file format need to be known to find the shellcode on the first try.
We copy out the string of shellcode hex data into a C-style string, like this one:
"\x83\xec\x10\xd9\xee\xd9\x74\x24\xf4\x58\x33\xc9\xb1\xdb..."

I copy it into the buffer in the C file from the previous post, and the assignment will look like this:
unsigned char shellcode[] = "\x83\xec\x10\xd9\xee\xd9\x74\x24\xf4\x58\x33\xc9\xb1...";
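
Typing out the escapes by hand gets tedious for longer samples. As a rough sketch, a small helper along these lines could turn raw bytes carved out of the malformed file into the C-style string used above. The name bytes_to_c_literal is hypothetical, and how the bytes get carved out of the file is not shown.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format n raw bytes as a quoted C string literal made of \x escapes.
 * Returns the number of characters written (not counting the final NUL),
 * or 0 if out is too small. The output needs 4 * n + 3 bytes.
 */
size_t bytes_to_c_literal(const unsigned char *buf, size_t n,
                          char *out, size_t out_size)
{
    size_t i, pos = 0;

    if (out_size < 4 * n + 3)
        return 0;

    out[pos++] = '"';
    for (i = 0; i < n; i++) {
        /* Every byte becomes exactly four characters: \xNN */
        pos += (size_t)sprintf(out + pos, "\\x%02x", (unsigned)buf[i]);
    }
    out[pos++] = '"';
    out[pos] = '\0';
    return pos;
}
```

Because every byte is written as an escape, no hex escape can accidentally swallow a following literal character.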

I compile it using gcc, but you can use the cl.exe Microsoft compiler if you would like -- any C compiler should be fine. I've never seen a problem with substituting one for another:
C:\sh\>gcc sh3ll.c -o sh3ll.exe

The compiler emits an expected warning that can be ignored, and now we have an executable to work with. We'll run it in Olly to its entry point, and then search for the beginning of the shellcode string in memory. When we find it, we'll set a memory access breakpoint on that memory location and then let the process run to that point by hitting F9.
When the debugger arrives at the start of the shellcode, it shows us a very strange listing: a "jno" instruction followed by a bunch of "cdq" instructions:


















We hit F7 a few times and notice "xor byte ptr ds:[eax+12], 99", followed by a loopd instruction that takes us back a few lines. This is an xor decoder loop. The shellcode is encoded because the exploit targets a buffer overflow (BoF), which usually means it attacks a string handling flaw. Any "00" (null) byte in the code would end the string copy early and truncate the shellcode, most likely crashing it, as explained in chaps 3, 7, 9.
We also notice that ecx is set to 0xdb at 0040200e, meaning that this loop will decode the subsequent 219 bytes of data:










We can continue stepping through the code with f7, watching the decoding taking place, until ecx decrements to zero. When it finishes, we step through a bit more slowly.
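
The decoder loop we just stepped through can be sketched in a few lines of C. This is only an illustration of the technique, not a transcription of the shellcode: the key 0x99 and the length 0xdb (219) come from the listing above, while the function names and the single forward pass over the buffer are assumptions.

```c
#include <stddef.h>

/*
 * XOR every byte of buf with key, in place. Running it twice with the same
 * key restores the original bytes, so one routine both encodes and decodes.
 */
void xor_buffer(unsigned char *buf, size_t len, unsigned char key)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] ^= key;
}

/* Return 1 if buf holds a 0x00 byte, which would end a C-string copy. */
int has_null_byte(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        if (buf[i] == 0x00)
            return 1;
    return 0;
}
```

With the values from this sample, the decode step would be xor_buffer(encoded, 0xdb, 0x99). An encoded byte can only be null when the original byte equals the key, which is why an attacker picks a key value that does not occur in the payload.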
Stepping into the instructions with F7 now reveals the code searching for kernel32's location in the process space using the common and reliable technique of parsing the PEB and its initialization-order module list. It then searches for the LoadLibraryA, ExitThread, and WinExec Win32 API functions. It loads urlmon and finds URLDownloadToFileA. These calls all tell us that this shellcode's functionality is download and execute -- and we can observe the URL strings that the code is communicating with.
Download and execute shellcode like this happens to be some of the most prevalent shellcode that we see served up by malicious web sites.

Hope that you learned a few things about the sorts of techniques we can use to analyze shellcode and its behaviors. Let me know what you think of it!

Monday, December 17, 2007

Tool for shellcode analysis

Here's some favorite C code that I use to reverse engineer shellcode that I collect from malicious files, malicious web sites and attacking network traffic:


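#include <stdio.h>

/* Paste the collected shellcode between the quotes as a C-style string. */
unsigned char shellcode[] = "";

int main()
{
    void (*c)();                 /* pointer to a function taking no arguments */
    printf("Shellcode it is!\n");
    *(int*)&c = shellcode;       /* aim it at the buffer; gcc warns here (32-bit x86 only) */
    c();                         /* jump into the shellcode */
    return 0;
}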


Basically, the code creates a buffer that stores your collected shellcode, declares a pointer to a function that takes no arguments and returns nothing, points that pointer at the beginning of the buffer and transfers control to it, just like an attacker's exploit. Drop the hex into the array as a C-style string, compile it, and toss it into Olly for stepping and analysis!
We'll look at a current example from a site in the wild in an upcoming post.

Wednesday, December 5, 2007

Unpacking a suspicious dll -- top to bottom

Fyi, this writeup is geared to satisfy curiosities about technical stuff, to start responding to some of the interest expressed over at our forum. You have been warned...

We use Ollydbg for all sorts of things around here. It's an outstanding tool. In fact, Olly himself found some spare time and is releasing a new version soon. He's got the pre-alpha version 2 code available on his website.

His debugger is a very useful tool for reversing user-mode software. When we're looking to get to the bottom of a suspicious component, one way is to fire up olly and get started. Unfortunately, there are challenges to that approach. Sometimes, we need to understand what a dll or other component is doing as well, and sometimes those dlls and other components are packed.
There are other tools that we use, and this post will survey the steps for using them while unpacking a dll...you can find this sort of information all over the web, but the writing styles sometimes make understanding the content very difficult.
Some of the fine reverse engineering tools available are
Ollydbg
LordPE
Import Reconstructor
IDA Pro

In our labs, we have a suspicious dll to examine. Apparently, it was installed as a BHO into Internet Explorer:





















When you load this dll into Olly, the tool reports that its listing of the binary's instructions is most likely inaccurate. IDA Pro can't disassemble the binary either.
So we can use a couple of tools to help identify if this executable has been tampered with. One popular tool is PEiD. PEiD detects "Upack" as the packer used here, and usually is pretty accurate. You can also take a peek with ProtectionID.
Upack is a very simple packer, used to compress executables, and makes file examination only somewhat difficult. It employs no anti-debugging tricks to be concerned with. Here is PEiD in action, identifying the file as packed with Upack by Dwing:























If we want to load it into olly and dump it for full unpacking, one way to start the unpacking process is to simply rename the file extension to "exe" and modify a flag in its PE header so that windows loads the file as an exe, not a dll. You can take a course from a reverser like Jason Geffner on deobfuscation and read all the PE documents, then perform the math, pop open Ultraedit or hexedit and manually edit the file's PE header. Or you can run LordPE on the file and simply deselect the "Dll" checkbox under its file characteristics:
























After you save your modification, load up the file into Olly and identify the program's original entry point, or OEP. This work can be time consuming when learning about a new packer. But Upack is a simple packer. It's much like UPX, the industry standard, but it uses the LZMA compression algorithm. A reverser might notice that the first instruction of the unpacker is "pushad", followed by a call instruction:

















The easiest thing to do would be to scan the rest of this section for a matching "popad" instruction followed by a jmp to the beginning of the lzma decompressed code. When we do that, we find a popad (a restore of all the register values that were pushed onto the stack at the beginning of the unpacker stub) followed immediately by a jmp to the .Upack section that was previously empty:























At this point, we can hit "F7" to step into this new code section, use Olly's "Analyse" function and voila, we see
push ebp
mov ebp, esp
and we are most likely at the dll's original entry point (OEP):






















Great! Now, using LordPE again we can dump the file to disk and fix up the Imports with ImpRec. Here's a view of LordPE options for attaching to a process and dumping an individual module to disk:











Now that we have the image dumped to disk, we can use Import Reconstructor to attach to the dll's process as it is suspended at its OEP, find the import address table in memory and then fixup the dumped image on disk:























We have to provide ImpRec with the OEP. Hopefully it then can find the Import Directory and IAT for us, and with UPack, it reliably completes the fixup for us. Clicking on "Fix dump" and selecting the image dumped by LordPE will provide us with an unpacked file that we can next throw into IDA Pro for disassembly and analysis, which will be another post:























Hope that satisfies some of the curiosities of our forum readers. Next, we'll take a look at some of the malicious behaviors this dll performs.


Note- This example worked through one of the simplest packers out there, Upack. For more information on unpacking tricks, you can find a couple of awesome lists of tips and tricks related to anti-debugging/anti-reversing at openrce and in Mark Vincent Yason's Blackhat paper.
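
For the curious, the "scan for a popad followed by a jmp" step from this walkthrough can be sketched in C. This is a rough illustration under a few assumptions: the stub's bytes have already been copied into a buffer, the jump is the 5-byte near form (opcode 0xe9), and the function name find_popad_jmp is made up. A plain byte scan like this can also match data, so treat a hit as a candidate OEP to confirm in the debugger.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_POPAD   0x61  /* popad: restore the registers saved by pushad */
#define OP_JMP_REL 0xe9  /* jmp rel32: near jump with a 32-bit displacement */

/*
 * Scan code[0..len) for popad immediately followed by jmp rel32.
 * base is the virtual address of code[0]. On a hit, stores the jump
 * target (a candidate OEP) in *target and returns the offset of the popad.
 * Returns -1 if the pattern is not found.
 */
long find_popad_jmp(const uint8_t *code, size_t len, uint32_t base,
                    uint32_t *target)
{
    size_t i;

    for (i = 0; i + 6 <= len; i++) {
        if (code[i] == OP_POPAD && code[i + 1] == OP_JMP_REL) {
            /* rel32 is little-endian and relative to the next instruction. */
            uint32_t rel = (uint32_t)code[i + 2] |
                           ((uint32_t)code[i + 3] << 8) |
                           ((uint32_t)code[i + 4] << 16) |
                           ((uint32_t)code[i + 5] << 24);
            *target = base + (uint32_t)i + 6 + rel;
            return (long)i;
        }
    }
    return -1;
}
```

The target is computed with unsigned 32-bit arithmetic, so a backward jump (a negative displacement) wraps around to the right address without any special casing.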

Tuesday, November 27, 2007

Online games and false positives

Online games have always had the problems of cheats, password stealers and bots. Volumes of information have been written on the topic, including Hoglund and McGraw's published material. In response, game developers at studios like Blizzard Entertainment and Amped have developed ways to unexpectedly "govern" the software that is running on their users' systems, and ways to "harden" their software against reverse engineering attempts. For better or worse, these "tools" have turned into somewhat intrusive tools that peek into everything on the system and prevent RE activity using methods similar to those used by malware writers.

Sometimes, these defenses cause problems for the software security industry. You can see from today's virustotal signature-based scan results that our friends in the Philippines trying to play "Tantra" might be interrupted by their game's security software:


























These problems cropped up with today's binaries, and have cropped up in the past. In August, AVG already was detecting the "tantrum.exe" component as a virus with its generic packer detections: Regarding Virus "obfustat.iiy" On Wr Ph, Problem Fixed
The problem, in part, for the AV signature-based products seems to be the packer. The packer that Amped is using, Molebox, is polymorphic and provides some difficulties for black, grey and white hat reversers trying to peek into the code behind their tantrum.exe component. Malware writers and distributors in the recent past have used Molebox to evade detection and make their creations more difficult to reverse engineer. You might notice that the screenshot above shows that Ikarus detects the component as "Rbot".

For behavioral-based security products, a problem arises when these components, which have very similar file characteristics to malware that we've seen, exhibit behaviors similar to malware. For example, this Tantra game component injects itself into operating system components in the same way as backdoors like Bifrost and other trojans.

For now, it seems that these problems will be ongoing. The game developers need to protect their games the best that they can, and security software products need to be as sensitive as possible.