Lab summary questions
-
Considering malware analysis as a whole, how important is it to a general aspiring analyst to complete this chapter’s exercises?
This lab may prove to be less challenging for those already aquainted with C programming, but it was still very helpful because of the way it starts off very easy and builds up. I’d argue that this lab was the most fun one I’ve done so far.
-
What did doing the lab exercises teach that was not gained from reading the chapter?
This lab helps fortify the necessity of paying attention to details, and knowing exactly what the functionality of MSDN functions are.
Challenges encountered in the labs
I didn’t encounter too many challenges in this lab, but I believe that was due to my knowledge of C programming. It significantly sped up how fast I was able to gather the semantic meaning of the program because I didn’t need to look up nearly as many functions or parameters as I otherwise would have.
Lab Walkthrough (includes answers!)
6.1
-
What is the major code construct found in the only subroutine called by
main
?It’s an ‘if’ condition checking whether or not we’re connected to the internet.
-
What is the subroutine located at
0x40105F
?sub_40105F
is called if the code finds there is an internet connection. The first function call in this _stbuf(), which appears to be called in preparation for sending content to stdout. This is enough for me to believe that this function is simply a print function. -
What is the purpose of this program?
The purpose of this program is to print whether or not an internet connection can be made.
6.2
-
What operation does the first subroutine called by
main
perform?The first subroutine called by main is
sub_401000
, which appears to be the exact same subroutine we analyzed in the problem above (i.e. the one that checks to see if we have an internet connection). -
What is the subroutine located at
0x40117F
?This is printf (same answer to 6.1.2).
-
What does the second subroutine called by
main
do?sub_401040
, the second subroutine, phones home looking to get a command, presumably to execute on the infected machine.The function opens a handle to an internet resource (in this case practicalmalwareanalysis.com/cc.htm), and then, if that succeeds, heads to loc_40109d to read 0x200 bytes into a buffer using the InternetReadFile function.
We see here that it moves in the dereferenced pointer to the buffer into ecx (i.e. the first character of the buffer), and then compares it to ascii 60. Ascii 60 corresponds to the “<” character, which is the first character of an HTML tag. Immediately after that, it compares to ascii 33, which is the “!” character. Then, it looks for 0x2d or 45 in decimal, which is a “-“ character. It does this twice. If at any time it detects characters that are not these, in this order, it breaks to loc_40111D, which yields an error.
So it’s looking for the string “<!–”. This is how a comment in an HTML document is started. We can conclude it’s looking for an HTML comment.
-
What type of code construct is used in this subroutine?
This looks like a switch nested within an if-statement.
Note from after reading the book: the answer they were looking for was an array. There is an array at work here as well, as one can tell from the above screenshot. Bytes of data one byte apart from each other, and referenced in terms of ebp, are sequentially compared, which is indicative of an array.
-
Are there any network-based indicators for this program?
Absolutely. We see that this block of data gets pushed as a parameter to InternetOpenUrlA:
And we can see what the rest of the data is by clicking szUrl.
http://www.practicalmalwareanalysis.com/cc.htm
is what the network fellas should look for.The book also mentions that we can look for the user agent that this program uses, which makes sense.
-
What is the purpose of this malware?
The purpose of this malware is to phone home and get commands to execute on the infected machine.
If the code passes the HTML comment check, then it moves the character immediately proceeding the comment into al (which is eax’s lowest 8 bits) and returns. Then, that single character is moved into the pointer pointing to the memory address contained within ebp+var_8. This character is then pushed to be in a function that prints a single character immediately before a command, which is indicative that there are a predefined set of commands, each corresponding to a single character, that this malware can run. The purpose of the malware is to get a single character command and do something with it.
6.3
-
Compare the calls in
main
to Lab 6.2’smain
method. What is the new funtion called frommain
?The new function appears to be shortly after the parsing actions. I presume this function will execute the command received and parsed from the internet.
-
What parameters does this new function take?
Function sub_401130 takes two arguments:
-
LpExistingfileName, which is, I believe, the name of itself. The
argv
pointer is equivalent toargv[0]
, andargv[0]
is conventionally the name of the program being executed. -
a magic character. From analysis of 6.2, we are highly confident that this particular character corresponds to a command to execute.
-
-
What major code construct does this function contain?
Ida has identified a switch. BEAUTIFUL.
-
What can this function do?
The function puts the character into ecx and subtracts 97. 97 corresponds to ascii ‘a’. It then jumps to a default case if the resulting number is above 4 (i.e. if the character command was not a, b, c, d, or e).
We see now why it subtracted 97– the resulting number is used as an index in a 5-case jump table.
The cases are:
- Create directory
C:\\Temp
- Copy itself into directory
C:\\Temp
, renaming itselfcc.exe
(but if the file exists it will not overwrite the existing file) - Delete file
C:\\Temp\\cc.exe
- Modify the
Software\\Microsoft\\Windows\\CurrentVersion\\Run
registry key. It adds a new servicemalware
and sets it to executeC:\\Temp\\cc.exe
. This establishes persistence on the infected machine. - Sleep for 100000. The sleep function accepts an integer that represents milliseconds, so it sleeps for 100000/1000 = 100 seconds.
- (Default case) Print an error message that says it's not a valid command.
- Create directory
-
Are there any host-based indicators for this malware?
Most definitely.
- From case 1 above: if
C:\Temp
exists, you may be infected. - From case 2 above: if
C:\Temp\cc.exe
exists, you are almost surely infected. - From case 3 above: if the registry has added
Malware
as a service that executesC:\Temp\cc.exe
, you’re certainly infected.
- From case 1 above: if
-
What is the purpose of this malware?
The purpose of this malware is to copy itself into a folder and establish persistence, depending on commands it receives from the internet. If the internet cannot be reached, it terminates. My guess is switch case 4 will be a cooler evil payload in 6.4.
6.4
-
What’s the difference between the calls made from the
main
method in Labs 6-3 and 6-4?They are at different addresses, I believe. I’m not going to put too much time into comparing the two, because being able to compare things is a skill I already have.
-
What new code construct has been added to
main
?Ah, now we have the same functionality wrapped in a loop. It will wait 60 seconds every iteration, and execute 1440 times (presuming it’s connected to the internet).
-
What’s the difference between this lab’s parse HTML function and those of the previous labs?
Different user agent?
After reading book: Ah yes, it builds a separate user agent using
sprintf
. -
How long will this program run? (Assume it’s connected to the internet.)
It runs 1440 times * 60 seconds. That’s 1440 minutes, which is 24 hours.
-
Are there any new network-based indicators for this malware?
A different user agent that’ll follow the format of
Internet Explorer 7.50/pma%d
, where the number substituted for %d is the iteration number that the malware is on. This can be used as input to the command & control server to serve a HTML file with a different command depending on what iteration the malware is on. -
What’s the purpose of this malware?
The same as previous samples, but upgraded grab MULTIPLE commands from the command and control server back home.