Use only Ida Pro to analyze Lab05-01.dll.
Lab summary questions
-
Considering malware analysis as a whole, how important is it to a general aspiring analyst to complete this chapter’s exercises?
As important as the rest of the book thus far. Being able to use Ida to navigate through the binary and answer general questions about an executable is an imperative skill for malware analysts.
-
What did doing the lab exercises teach that was not gained from reading the chapter?
Tons of practice using Ida. Also, it helped me appreciate how quickly it is to accedentally misrepresent certain portions of disassembly (via accedentally converting instructions to constants, or strings to code, or etc). However, this can usually be undone, and “undefining” the data is very helpful in a lot of cases. (U undefines data.)
Challenges encountered in the labs
Ida pro is needed for a couple of questions of this lab, and unfortunately, I don't have that.
Furthermore, I struggled with the symbolic constant naming portion of the lab simply because a DLL referenced in the book didn't seem to exist within my virtual environment. The problem was sidestepped via looking directly at MSDN documentation. See below for details on where the problem was encountered.
Lab Walkthrough (includes answers!)
First thing I did was change the settings to include addresses in graph mode, as was suggested by the authors of the book.
5.1
-
What is the address of
DllMain
?This one is easy enough. Just look in the functions window, scroll down to DllMain, and click it.
0x1000D02E
. -
Use the Imports window to browse to
gethostbyname
. Where is the import located?Imports is a tab at the top of the disassembly window. Going in there and hitting ctrl+f to search for
gethostbyname
is how I arrived at this answer. -
How many functions call
gethostbyname
?Ah, we need to find all cross references to
gethostbyname
and count the number of unique functions in them. . Do this by hitting x whilegethostbyname
is selected.I count 5 unique functions.
This looks like good information, but what’s that strange letter in the second column? What do the letters stand for? This helpful post provided me insight into the three types of cross references.
- r: read
- w: write
- p: pointer
-
Focusing on the call to
gethostbyname
located at 0x100001757, can you figure out which DNS request will be made?I had a minor panic attack when I didn’t see address 10001757 above, but then I noticed if we do some simple math we find that 1001656 + 101 gives us that number. Woo, ADDITION
eax
is being pushed as a parameter. If you hover over the operand for mov, it says “[this is RDO]pics.practicalmalwareanalys…”. Then 0xD is added to this, which pushes the offset to the beginning of the URL.pics.practicalmalwareanalysis.com is the DNS request being made.
-
How many local variables has IDA Pro recognized for the subroutine at 0x10001656?
Let’s count em! I believe every variable in the list is a local variable, and there are 24 of these. But wait! One of the elements of this list is lpThreadParameter, which is actually a parameter. So, 24-1=23 local variables.
-
How many parameters has IDA Pro recognized for the subroutine at 0x10001656?
Looking at how the function is called, I see one parameter: lpThreadParameter. 1
-
Use the Strings window to locate the string \cmd.exe /c in the disassembly. Where is it located?
0x10095B34
-
What is happening in the area of code that references \cmd.exe /c?
Double click the string, then click the address of this string in the disassembler and find cross references. Find that there is only one:
Proceed to click this reference to see what’s going on and Ida helps us identify quickly that it’s a check to see whether cmd.exe or command.exe is the correct command to run command prompt on whatever version of Windows this malware is attacking.
-
In the same area, at 0x100101C8, it looks like dword_1008E5C4 is a global variable that helps decide which path to take. How does the malware set dword_1008E5C4? (Hint: Use dword_1008E5C4’s cross-references.)
If we look at the xrefs of this, there are three. Two are reads, and one is a write. The write is the one we are interested in because writing the variable implies that’s where it’s set to something. So if we look at where that happens…:
So, that area is populated with whatever is in
eax
after functionsub_10003695
runs. Usuallyeax
contains return values, so we want to investigate what this function does.This function is a wrapper for the Microsoft function GetVersionExA, which can quickly be identified (via a google search) as a function that returns the operating system version running on the machine.
-
A few hundred lines into the subroutine at 0x1000FF58, a series of comparisons use memcmp to compare strings. What happens if the string comparison to robotwork is successful (when memcmp returns 0)?
Look at the strings, identify where “robotwork” is, cross reference it, find it’s used here:
So if the comparison returns 0, then eax will be 0. Test sets the flags, and then JNZ jumps if not zero, meaning the jump will be taken (the green path) if the zero flag is NOT set. The question asks what happens if memcmp returns 0, which would result in the zero flag being set, which means this jump will NOT be taken, so we follow the red path.
The red path takes us to a guaranteed calling of sub_100052A2 with parameter “s”. This function appears to query a registry key to find some data value labeled as Robot_Worktime.
-
What does the export PSLIST do?
After briefly looking over the functions called from within PSLIST I believe PSLIST is looking through a set of processes trying to find one to inject code with. (I see “OpenProcess” in each function guaranteed to be called from within PSLIST).
-
Use the graph mode to graph the cross-references from sub_10004E79. Which API functions could be called by entering this function? Based on the API functions alone, what could you rename this function?
After finding the function and getting its disassembly onscreen, I went to view -> graphs -> Xrefs from: and obtained this graph.
strlen, sprintf, GetSystemDefaultLangId, malloc, send, free
I’d name it something like
SendSystemLanguage
. -
How many Windows API functions does DllMain call directly? How many at a depth of 2?
Four direct calls: Strncpy, createThread, strlen, _strnicmp.
In order to view calls at a depth of two, we can go to view -> graphs -> User Xref chart and choose a recurive depth of 2. However, this doesn’t seem to have any affect on Ida demo and i’m not sure why. The information could be derived manually, however, from doing xref charts on each of the above listed functions, so I’m not going to spend more time on it.
-
At 0x10001358, there is a call to Sleep (an API function that takes one parameter containing the number of milliseconds to sleep). Looking backward through the code, how long will the program sleep if this code executes?
A pointer to the string “[This is CTI]30” is moved into
eax
. Then, 0xD is added to it, which moves forward the address to point directly to the “30” part of the string. Then, atoi converts it to a number and stores it ineax
. Up next, 0x3E8 is multiplied into the 30 ineax
. I clicked this hex number and hit h to turn it into a decimal number, which is 1000. So, 30*1000 is 30,000 milliseconds, which is 30 seconds. -
At 0x10001701 is a call to socket. What are the three parameters?
Do a ‘goto’ and go to that address. Ida lists what the three parameters are:
- Protocol: 6
- Type: 1
- af: 2
All we need to do is identify what those numbers refer to. Pull up the microsoft documentation for
socket
and we find that:- Protocol of 6 refers to IPPROTO_TCP, which (shockingly) configures the socket to use the TCP protocol.
- Type of 1 means SOCK_STREAM, which is ‘a socket type that provides sequenced, reliable, two-way connection-based byte streams with an OOB data transmission mechanism.’
- af (address family) of 2 means AF_INET– IPv4 address families.
-
Using the MSDN page for socket and the named symbolic constants functionality in IDA Pro, can you make the parameters more meaningful? What are the parameters after you apply changes?
We can make the parameters more meaningful, if Ida Demo actually found the right library to load. It’s slightly inconvenient to look at MSDN documentation, but it gets the job done, so I’m going to move on.
-
Search for usage of the in instruction (opcode 0xED). This instruction is used with a magic string VMXh to perform VMware detection. Is that in use in this malware? Using the cross-references to the function that executes the in instruction, is there further evidence of VMware detection?
If we do a search -> text and click find all occurances and search for the text
in
, we find that in the resulting list there is only onein
instruction. If we go to that location and cross reference the function that contains the instruction, we find that three functions call it: InstallRT, InstallSA, and InstallSB. The functions all look very similar. There certainly is evidence of VMWare detection. -
Jump your cursor to 0x1001D988. What do you find?
We find information interpreted as characters by Ida.
-
If you have the IDA Python plug-in installed (included with the commercial version of IDA Pro), run Lab05-01.py, an IDA Pro Python script provided with the malware for this book. (Make sure the cursor is at 0x1001D988.) What happens after you run the script?
I don’t have Ida Pro :/
But the book states that the obfuscated string would be decoded to reveal a backdoor.
-
With the cursor in the same location, how do you turn this data into a single ASCII string?
You’d hit the A key to turn it into a readable string after running the plugin.
-
Open the script with a text editor. How does it work?
I’ll omit this question since I cannot figure it out without doing the above two parts legitimately.