Showing posts with label Obfuscation. Show all posts

Thursday, July 3, 2008

Return of Rustock?

Return is a powerful concept in many ways. In literature, return can touch on the limits of faith, love, loyalty, friendship, fidelity and mortality.

Homer's Ulysses wanders for years, returning to his home and his family in disarray. Initially, the only witness to recognize Ulysses in his home is his old dog Argus, faithfully waiting for his master's return over those 20 years: "As soon as he saw Odysseus standing there, he dropped his ears and wagged his tail, but he could not get close up to his master. When Odysseus saw the dog on the other side of the yard, dashed a tear from his eyes...But Argos passed into the darkness of death, now that he had seen his master once more."

Edward FitzGerald's "The Rubaiyat of Omar Khayyam" speculates on the importance of understanding the inability to return:
"Then to the lip of this poor earthen Urn
I lean'd, the Secret of my Life to learn:
And Lip to Lip it murmur'd -- "While you live
Drink! -- for, once dead, you never shall return""

Unfortunately, in our last round of spambots, we find lots of return. However, these returns do not provide deep insight or wistful second comings. Instead, they serve to obfuscate the functionality of the rootkit driver component ("pgasghjd.sys") that appears to be the newest project of one of the Rustock creators:
C:\progz\NewWork2\driver\objfre\i386\driver.pdb

Return is a powerful computing concept, and an important part of any CPU instruction set. The "RET" or "Return from procedure" instruction "transfers control to a return address located on the top of the stack".
These returns are used in an unusual way in the unpacking stub of the driver, avoiding making standard calls early in the routine. Here is the driver's entry point.



Notice the push of a hard-coded offset and the immediate return. This unusual sequence of assembly instructions simply pushes an address onto the stack. When the "ret" or "retn" executes, it pops that address and control flows to the new offset, much as a direct jump would. This sequence can be used as an effective emulator evasion trick.
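To make the control flow concrete, here is a toy model in C. It is only a sketch: the four-opcode instruction set, the `run` function and the fixed-size stack are invented for illustration and have nothing to do with the driver's real code. It shows that a push followed by a return lands on the same target as a plain jump.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set, invented for this illustration only. */
enum op { OP_PUSH, OP_RET, OP_JMP, OP_HALT };

struct insn {
    enum op op;
    size_t operand;   /* target index, used by OP_PUSH and OP_JMP */
};

/* Runs the toy program from index 0 and returns the index of the
 * OP_HALT that was reached, or (size_t)-1 if the program is malformed. */
size_t run(const struct insn *prog, size_t len)
{
    size_t stack[16];
    size_t sp = 0;      /* number of entries on the toy stack */
    size_t pc = 0;      /* index of the next instruction */
    size_t steps = 0;   /* guard against endless loops */

    assert(prog != NULL);

    while (pc < len && steps++ < 1000) {
        const struct insn *i = &prog[pc];
        switch (i->op) {
        case OP_PUSH:               /* push a hard-coded "address" */
            if (sp == 16)
                return (size_t)-1;
            stack[sp++] = i->operand;
            pc++;
            break;
        case OP_RET:                /* pop the top of the stack into pc */
            if (sp == 0)
                return (size_t)-1;
            pc = stack[--sp];
            break;
        case OP_JMP:                /* ordinary direct jump */
            pc = i->operand;
            break;
        case OP_HALT:
            return pc;
        }
    }
    return (size_t)-1;
}
```

Feeding `run` the program `{PUSH 3, RET, HALT, HALT}` and the program `{JMP 3, HALT, HALT, HALT}` ends on index 3 in both cases. A tool that only follows calls and jumps would miss the first form.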

These returns do not provide anything all that valuable. Instead, they help to produce unwanted spam, clogging global network pipes and peddling "male enhancement" drugs. The messages are crass and vain, and include a link to a couple of these "drug" peddling web sites. Obscene messages are not reproduced here, but here are a few examples:
"Give your chick a night to remember"
"Make sure you don't get left out of the action at parties"
"Fantastic results guaranteed"

Some returns come with really bad literature.

Tuesday, June 17, 2008

Will the Real Virtumonde Please Stand Up?

It seems that quite a bit of malware is being classified as Vundo (Virtumonde) these days. With the volume of malware currently being distributed in dynamic link library form, it is not always easy to differentiate one from another. Frequently these modules are statically linked with C and C++ runtimes, compression, and GUI libraries, which can slow analysis down. In addition to all this embedded library code, Vundo's code seems to be under constant development: it is frequently updated to fix bugs, add a new piece of functionality, or add more randomization to prevent signature recognition.

However, there is one construct that the developers behind the code seem to enjoy using. In almost every place where an event name, and sometimes a registry value name, is created, the name is generated by a function that is similar between variants.

The function derives this name from an attribute of the infected computer. The attribute is the serial number assigned to the "C:" drive volume when it was last formatted by the operating system. The serial number is then scrambled by one or more bitwise CPU instructions against a constant selected by the programmer. The result of these operations is converted into a string and returned for use.

The recognition of this function can help positively ID a Vundo sample. The source code representation of this function would look similar to this:


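#include <windows.h>

/* Placeholder constant; each variant uses its own value. */
#define ARBITRARY_VUNDO_NUMBER 0xFDEC

/* Writes an 8-digit hex name derived from the C: volume serial number
   into output and returns the number of characters written. */
int generate_number(char *output)
{
    /* Initialized so the value is defined even if the API call fails. */
    DWORD volume_serial_number = 0;

    /* Only the serial number is requested; every other output is NULL.
       The result of the call is not checked. */
    GetVolumeInformationA("c:\\", NULL, 0,
                          &volume_serial_number, NULL, NULL, NULL, 0);

    volume_serial_number ^= ARBITRARY_VUNDO_NUMBER;

    return wsprintfA(output, "%08x", volume_serial_number);
}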


Actual Vundo assembly code looks like this:

    push    esi                             ; nFileSystemNameSize
    push    esi                             ; lpFileSystemNameBuffer
    push    esi                             ; lpFileSystemFlags
    push    esi                             ; lpMaximumComponentLength
    lea     eax, [ebp+VolumeSerialNumber]
    push    eax                             ; lpVolumeSerialNumber
    push    esi                             ; nVolumeNameSize
    push    esi                             ; lpVolumeNameBuffer
    push    offset RootPathName             ; "c:\\"
    mov     [ebp+VolumeSerialNumber], 123h
    call    ds:GetVolumeInformationA
    xor     [ebp+VolumeSerialNumber], 34D2121h
    push    [ebp+VolumeSerialNumber]
    push    offset a08x                     ; "%08x"
    push    [ebp+arg_0]                     ; LPSTR
    call    ds:wsprintfA
    add     esp, 0Ch
    pop     esi
    leave
    retn
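
To experiment with the name derivation away from an infected machine, the XOR-and-format step can be separated from the Windows API call. The following is a minimal sketch under that assumption. The function name `vundo_name_from_serial` and the macro `VUNDO_XOR_KEY` are ours, not taken from the sample, and the key is simply the constant seen in the listing above; other variants are expected to use other values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* XOR constant from the disassembly above (34D2121h). Assumed to be
 * specific to that one variant. */
#define VUNDO_XOR_KEY 0x034D2121u

/* Formats the name that this variant would derive from a given volume
 * serial number. `out` needs room for 8 hex digits plus the terminator.
 * Returns the number of characters written, or -1 if `out` is too small. */
int vundo_name_from_serial(uint32_t serial, char *out, size_t out_len)
{
    assert(out != NULL);

    if (out_len < 9)
        return -1;

    return snprintf(out, out_len, "%08x",
                    (unsigned int)(serial ^ VUNDO_XOR_KEY));
}
```

Given the serial number of a suspect machine's C: volume, this yields the string to look for among event names and registry values. Note also that the assembly stores 123h in the variable before the call. If GetVolumeInformationA fails and leaves the variable untouched, which seems likely but is not something we have confirmed, the function above with a serial of 0x123 gives the resulting name.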

Tuesday, June 3, 2008

Bancostrings

When does BCD0236E965582D56DD365E44BD764FA5DFD6CBF312BB124AA2563B5C2 mean ":: Bradesco Pessoa Fosica ::"? Only when CD30ABC0221E5486A23D0F619DB27FC50110504DB9D3DC357893D269E177CB2D1BD1758CCC77AA93ED3DBA190A7BD914B80F5254919C2DC0D471B02CC20260CC4CB2C73A5B really means "HSBC Bank Brasil S.A. -- Banco Muliplo -- No Brasil e no mundo, HSBC", of course.

A couple of previous posts provided insight into what clues strings provide when performing malware analysis, and a concise description of how to decrypt obfuscated strings in a static file using advanced IDA Pro functionality.

Here, we'll use a debugger to step through a malicious file in the lab and observe data as it is decoded by the malware itself. Sometimes, when speed is a priority and not all that many strings are involved, stepping through the decryption loop prior to writing an IDA script is another good approach to have in the toolkit.
We've started the executable within OllyDbg. No human-readable strings are visible to the analyst here, but a quick look at the text section following some unpacking reveals multiple arrays of garbled text. Also suspicious is that each string of unreadable, or probably crypted, data is being passed by pointer to the same function. Most likely, this procedure includes the decryption loop that we are looking for. Each call that passes a pointer to this same procedure is highlighted in red below:



We can review this loop by setting a breakpoint on the procedures that are passed these strings as a parameter. Somewhere along the way in here, the decrypted data is most likely written out to memory or as a hash. As we single step through the code (hitting F7), we'll watch for pushes, pops and repeated movs instructions, and look for pointers to strings and data copies from esi to edi. We find an interesting loop here after the garbled text is pushed onto the stack. Notice that string data is being copied from esi to edi:



Following edi in the data dump displays the memory contents as they are written out and decrypted by multiple layers and loops. Setting a breakpoint here and running through the loop reveals the decrypted data. We can single step through this loop to evaluate the decryption algorithm.
Eventually this decrypted data is passed to another function via pointers on the thread stack. Now that we've run through the loops, we can identify a list of banks and web sites that our Portuguese-speaking friends in Brazil may recognize:



Having identified these strings within the malware, we craft a few custom-written, empty web pages with these strings as title bar content. We then open the HTML pages with Internet Explorer. We witness images stored within the malware being presented in the foreground of the browser, waiting for our login IDs and passwords. Here are a few related screenshots:




These strings helped lead us to identify another all too popular Brazilian banking password stealer. Done with these strings, off for a little samba and sun on the coast of Buzios!

Tuesday, May 20, 2008

Keeping strings real - Part II

In part 1 of keeping strings real, strings were chased around in a disassembler to provide insight into the functionality of a piece of malware. Part two investigates the instance where there seem to be no recognizable strings in the target at all.

When doing a quick skim of a malware file (in this case, ldr.exe md5:007571544614a7646e750a51ccaf2e9e), sometimes you encounter data that looks similar to the below image. In this particular instance, it seems that these strings are encrypted.









Encrypted strings make it very difficult to do quick analysis. Fortunately, there are a couple of options to get past this small roadblock. The string can be observed as it is passed around between functions, or debuggers can be used to halt execution when the target string data is accessed.


For this sample, the above strings were followed from the initial cross reference to a function that is self contained. It takes arguments, executes some code, and returns. The code has a loop, operates on individual bytes (reads from pointers into register halves), and performs a few additions. Is this a possible candidate for a string decryption? You betcha.

After walking through the function once in a debugger, it becomes obvious that this function decrypts the string to a different buffer. This is an excellent first step, but there is a massive number of strings in this file. It would be less than desirable to execute this function in a debugger and make note of the result for each and every encrypted string present. There has to be a faster and more elegant way to figure these out. Now what?


Enter Cryptanalysis. This particular function is not very large or complicated, so determining the algorithm used to reveal the strings should not take an unreasonable amount of effort. After determining the algorithm, it is possible to write a program or script to accept the encrypted string data and output the decrypted string.


Below is what the reversed function looks like.

















This function accepts what looks like a null-terminated Pascal string. The first character in the string contains the length (0 to 255), followed by the ciphered string data, then a zero to indicate the end of the string.


The next step is to add the cipher key value to the first encrypted character in the string. This key value starts at 186 (or 0xBA in hexadecimal). On each loop pass, the key is increased by 2 and added to the next character in the string.


For instance, the character 'a' is represented by the number 97 (0x61). To encrypt this initial data based on the algorithm above, we would subtract 186 (0xBA). To decrypt it later, 186 (0xBA) is added to the encrypted data.


The result of this 97 - 186 subtraction is 167 (0xA7). This math looks funny, but it works this way when working with individual bytes and their associated range of 0 to 255 (unsigned).


This behavior is due to the wrap-around effect caused by an integer overflow. To see this in action on Windows, open calculator (calc.exe), change the view to scientific mode, then change the number system from decimal (Dec) to hexadecimal (Hex), lastly change the size from "Qword" to "Byte." Now you can type in 61 minus BA and the result is A7 (167).
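
The same wrap-around can be reproduced in a few lines of C. This is a small sketch of the arithmetic only; the function names `encrypt_byte`, `decrypt_byte` and `check_worked_example` are ours and do not come from the sample.

```c
#include <assert.h>
#include <stdint.h>

/* Encrypting one byte: subtract the key. Converting the result back to
 * uint8_t keeps only the low 8 bits, so it wraps modulo 256. */
uint8_t encrypt_byte(uint8_t plain, uint8_t key)
{
    return (uint8_t)(plain - key);
}

/* Decrypting one byte: add the key back, again modulo 256. */
uint8_t decrypt_byte(uint8_t cipher, uint8_t key)
{
    return (uint8_t)(cipher + key);
}

/* Checks the worked example from the text. */
void check_worked_example(void)
{
    assert(encrypt_byte('a', 0xBA) == 0xA7);  /* 97 - 186 wraps to 167 */
    assert(decrypt_byte(0xA7, 0xBA) == 'a');  /* 167 + 186 wraps to 97 */
}
```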


Keeping the above math in mind, the algorithm can now be re-implemented using IDA's built-in scripting language (IDC). The script will need to be passed the source string data, extract a byte, add the key to the byte, store the result, add 2 to the key, and repeat this process until all bytes in the string have been processed.


The Byte() function will be used to extract the byte from the "address" of the string's beginning found in IDA's disassembler window. The Message() function will display the deciphered byte in the message window, and the PatchByte() function will modify the representation of the byte inside of the disassembler window. (Note: PatchByte() can be commented out to prevent the script from actually modifying any data; the script will then simply print the result in the message window.)


The script representation of this algorithm reconstruction is found in the image below, and the idc script itself can be downloaded from our PC Tools ThreatFire forum, where you can log in and scroll down the thread for 186plus2_decipher.zip:
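
For readers without IDA at hand, the same algorithm can be sketched as plain C. This is not the IDC script mentioned above, just a minimal re-statement of the steps described in this post. The function name `decipher_186plus2` is ours, and we assume the key is byte-sized, so it wraps at 256 the same way the data bytes do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INITIAL_KEY 0xBAu   /* 186, the starting key described in the text */
#define KEY_STEP    2u      /* the key grows by 2 for every byte */

/* Deciphers a length-prefixed string: in[0] holds the length and
 * in[1..length] hold the ciphered bytes. Writes the plaintext plus a
 * terminating NUL into out. Returns the length, or -1 if out is too small. */
int decipher_186plus2(const uint8_t *in, char *out, size_t out_len)
{
    assert(in != NULL && out != NULL);

    size_t len = in[0];
    if (out_len < len + 1)
        return -1;

    uint8_t key = INITIAL_KEY;
    for (size_t i = 0; i < len; i++) {
        /* Add the key to the ciphered byte; the cast wraps modulo 256. */
        out[i] = (char)(uint8_t)(in[1 + i] + key);
        key = (uint8_t)(key + KEY_STEP);
    }
    out[len] = '\0';
    return (int)len;
}
```

As a hand-made example, the bytes 03 A7 A6 A5 00 decipher to "abc": A7 + BA gives 61 ('a'), A6 + BC gives 62 ('b'), and A5 + BE gives 63 ('c'), each after wrapping at 256.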


















Now it is time for some fun. An encrypted string is selected in IDA for decoding and the script is launched. The result:




Keeping strings real - Part I

All malware researchers love strings. They allow us to gain valuable insights into the possible behavior of the sample being investigated. Even IT professionals, who do not research malware professionally, can make good use of these clues.
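
For anyone who has not used a strings tool before, the idea is simple: scan the file's bytes for runs of printable characters. Here is a minimal sketch in C, assuming plain ASCII only (no UTF-16) and a caller-chosen minimum length. The function name `find_strings` and its parameters are ours and are not taken from any particular tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Scans buf for runs of at least min_len printable ASCII bytes.
 * Each run is printed to out as "offset text" unless out is NULL.
 * Returns the number of runs found. */
size_t find_strings(const unsigned char *buf, size_t len, size_t min_len,
                    FILE *out)
{
    assert(buf != NULL || len == 0);
    assert(min_len > 0);

    size_t count = 0;
    size_t run_start = 0;
    size_t run_len = 0;

    /* Go one step past the end so a run touching the end is flushed too. */
    for (size_t i = 0; i <= len; i++) {
        if (i < len && isprint(buf[i])) {
            if (run_len == 0)
                run_start = i;
            run_len++;
            continue;
        }
        /* A non-printable byte (or the end of the buffer) closes the run. */
        if (run_len >= min_len) {
            count++;
            if (out != NULL)
                fprintf(out, "%08zx %.*s\n", run_start, (int)run_len,
                        (const char *)(buf + run_start));
        }
        run_len = 0;
    }
    return count;
}
```

A minimum length of 4 or so filters out most accidental runs. A disassembler reports virtual addresses rather than the raw file offsets printed here.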

Here's a quick example of strings in a malware disassembly listing:

00403100 Security Troubleshooting.url
00403120 ot.ico
00403128 %s/soft/?c=%1.1d%d%1.1d
00403140 Online Security Guide.url
0040315C ts.ico
00403164 %s/test/?c=%1.1d%d%1.1d
0040317C Online Security Test.url
00403198 *.securemanaging.com
004031B0 *.safetyincludes.com
004031C8 *.securewebinfo.com
004031DC 85.255.117.158
004031EC 88.255.74.197
00403300 195.95.*.*
0040330C 194.187.*.*
00403318 turbocodec.com
00403328 flyvideonetwork.com
0040333C websoft-c.com
0040375C plus-codec.com
0040376C freerealitympegs.com
00403784 inc-codec.com
00403794 user_pref("browser.search.selectedEngine", "Search");
004037D0 user_pref("browser.search.selectedEngine"
00403840 \profiles.ini
00403850 Mozilla\Firefox
00403908 Software\Microsoft\Internet Explorer\New Windows\Allow
00403940 %sVersion\Internet Settings\ZoneMap\EscDomains\%s
004039A8 Domains\%s

Right off the bat, one might guess that there is probably something fishy going on with these domains in relation to Firefox and Internet Explorer settings. A quick Google search on some of these domains yields many results which are seemingly related to malware. If the search result is somewhat ambiguous, a researcher can always plug a string into ThreatExpert to find related malware behavior.

Searching for "securewebinfo.com" on ThreatExpert yields plenty of results. Most of the strings found in this particular sample match up very nicely to the results found, so it is reasonably safe to assume that this sample is probably a variant. However, if the search results were inconclusive, one of the next steps a malware researcher can take is to disassemble the file in IDA Pro.

What is this malware actually doing with those strings? We are glad you asked!

Below is the image of the strings in the disassembler. The following items are shown moving from left to right: the address in memory where the strings reside, the automatic name IDA gave this location, the string data itself, and last but not least, the cross references (XREFs).







Navigating to one of the cross references changes the view to an array of string pointers as seen in the image below. This array also contains a cross reference, but to a function this time.







The function seen below was labeled "modify_IEXPLORE_SecurityZones" as it was found to call sub-functions which modify the registry associated with Internet Explorer's Security Zones.


The last loop in this function, "AddAllowPopup_loop", executes once for each item in the domain_name_array. Each item in the array will be added to the AllowPopup registry key. The next time Internet Explorer is run, those domains will be allowed to display pop-up windows at will. This code confirms our suspicions of malicious behavior.





Wednesday, May 14, 2008

Agent again, this time undetected

Several interesting surges in malware activity are showing up today. The most widespread is a large increase over the past 24 hours in an old friend that's been labelled "Trojan.Agent". The filename that we are seeing the most of is "wingmmesc.exe", and it continues to run rampant without much in the way of AV detection, even from the new and improved engines meant to detect suspicious obfuscation:




We are investigating its spread and its packing techniques. While the outer layer was packed with UPX, another layer of protection needs to be peeled back, which may explain the low AV detections. In the past, this sort of stuff was spread via emails with "enticing" (often pornographic) messages with links to URLs, like hxxp://aliodsf . com / video.exe. We'll get back with more detail.

Update...It appears to be related to the Sality family, because we're seeing lots of familiar Sality "WINEUJE.EXE" activity related to the downloader, a worm that's run around for a long time now, especially in Asia. It attempts to download .gif files from "kukutrustnet888.info" and "microupdate14.info", both domains that we've seen from this family before. We'll rename this one to a more appropriate Sality label, and more AV detections should begin to pick up, now that we've uploaded it to VirusTotal for sharing.

Tuesday, May 6, 2008

AMTSO and CARO Workshop

The AV industry was busy this past week amongst the blooming tulips in Hoofddorp, the Netherlands. Both an AMTSO conference and a CARO workshop were held during the last three days of the week.

A large group of attendees arrived for the Wednesday all-day testing standards meeting, with more journalists in attendance than before. It was encouraging to see, because one of the AMTSO's formative goals has been to invite and include representatives from all parts of the computer security industry. Progress is being made toward a set of testing standards for anti-malware products for everyone involved.

The CARO workshop followed on Thursday and Friday, with presentations focusing on malware obfuscation from the AV industry's perspective (googling "datasecurity event caro" provides a link to the home page). The opening talk by Paul Ducklin from Sophos set the tone for most of the event -- legitimate compressors/packers are acceptable and good (according to a number of individuals in the AV scanner business), while software protection solutions like Themida and SVKP are unacceptable and evil (to a number of individuals in the AV scanner business).
It was interesting that while AV vendors and Ilfak Guilfanov of IDA Pro/Hex-Rays spoke and gave presentations over the two days, none of the developers or vendors from Themida or ASProtect (a couple of software protection systems that were referred to in the presentations) were invited or presented their thoughts.

Even at the workshop, it seems that there remains disagreement on how the industry should handle software obfuscation, and there remains a sense that software obfuscation is a major source of problems for the AV industry. Whether it's due to difficulties in emulation, performance issues when unpacking, the complexities of the virtualization packers (where Sophos' Boris Lau showed that a single NOP instruction can easily and inexpensively be translated into over 50 virtual instructions) or simply disagreement over how to identify what is behind software protection, it continues to be a weakness for traditional AV scanners.
Just to give an idea of the volume of difficulties and tricks that researchers have to develop methods to deal with, Peter Ferrie's paper was presented by Mady Marinescu of Microsoft, and in it he enumerated over 50 anti-unpacking tricks commonly seen in packers and often seen in malware.
Presenters also included evaluations of the proportions of malware seen packed by specific packers and various approaches to dealing with them, including blacklisting. It seems that it is easier to include this approach in a scanner than to actually implement an unpacker for all the different varieties of packers. Blacklisting is cheap and easy, but it is more prone to causing false positives, and decisions to blacklist are often debatable.
We will see what this turn away from extremely low false positive rates will do to the major advantage that the scanners had over behavioral based solutions.

From the perspective of an individual pushing a behavioral solution that solves for the difficulties that scanners have with obfuscation, it is somewhat easy to be critical of AV scanner products. They struggle to keep delivering such a low level of false positives and such exacting matches in the face of ongoing obfuscation and the "server-side polymorphism"/"rapid release" techniques currently used by malware distributors to evade AV solutions. The complexity and difficulties are high for the guys trying to develop elegant and effective AV solutions to these problems.
We'll see more of this obfuscation topic, but from the "hackers'" perspective, when DEF CON's "Race To Zero" contest is held this fall.