Pages

Tuesday 3 December 2019

In depth analysis of an infostealer: Raccoon


General Info


Raccoon is a malware written in C++. It came to my attention while looking at my twitter feed and spotting a tweet from @tkanalyst. I was not aware of it, and as a malware analyst working at a sandbox company(tria.ge), I wanted immediately to analyze it and develop signatures. Also, I have not any background in Threat Intel or attribution, so the name was chosen due to @tkanalyst tweet.
The sample that was analyzed has the following information:
  • MD5 HASH (Packed) : 126ed436b3531dd857b25b9da2c80462
  • MD5 HASH (Unpacked): 3367E9FC3CDBE03D65460E5BF86EE16B
  • Raccoon Version: 1.2
Generally, the sample is a typical infostealer malware. It checks for the existence of various types of applications such as browsers, email clients, coin wallets and attempts to steal their data by reading their configuration files or databases. The execution of the malware is closely related with the configuration that the CnC server will send, thus there is an obstacle during the dynamic analysis if the CnC domain is down. In our case this was solved by writing a module in Fakenet-NG and emulating the responses of the CnC(Figure 1).

Figure 1



Static Analysis

While I am not that fond of malware written in C++ for obvious reasons, Raccoon was not that complicated - it does not have any ANTI - methods, and its execution is straight forward. With the help of FLIRT signatures if correctly applied, the static analysis can become a lot easier. While spending some time doing static analysis, I noticed some patterns for string decryption. The majority of the strings are encrypted with a combination of bitwise NOT/XOR (depending on the sample)(Figure 2,3). To make my life easier and to practice my IDApython skills, I created a script in order to search and decrypt these strings: [7]. Some main points from the script:
  • It has two major functions responsible for reading ASM instructions and gathering the data and decrypting it.
  • It is based on pre-defined regex for deciding whether there is potential encrypted data. There is often overlap between the addresses which is solved by checking the decrypted strings and deciding which one is valid.
  • It's really easy to fix a decrypted string that it is wrong or was overwritten by another pattern - you just call one of the two available functions having as a parameter the address of the instruction that one believes to move data needed for the decryption of the string.
Figure 2

Figure 3

Dynamic Analysis


Starting to analyze the malware dynamically, the malware first checks for a mutex, in order to determine if an instance of it is already running. Specifically, the mutex's name is a result of a string decryption and concatenation with the current user's name. If the mutex does not exist, it is created and the function returns 1, else it returns 0. (Figure 4)

Figure 4

Immediately after that, the malware check the privilege that is executed with. Specifically, with the help of the token is determined whether the process is run from Local System group. If that's the case, then a snapshot of running processes is acquired, and it will try to duplicate a token(with higher privileges) from another one in order to call CreateProcessWithTokenW API, restarting with higher privileges.(Figure 5)

Figure 5

As it was mentioned in Cybereason's post[1], the malware checks the locale of the system against various other values such as: Russian, Ukrainian, Belarussian, Kazakh, Kyrgyz, Armenian, Tajik, and Uzbek.
In order to continue the execution, the malware needs to get its JSON config. The CnC server serving the config does not exist inside the sample - instead, the sample dynamically acquires its CnC via another request. The samples firstly proceeds into decrypting a RC4 key (1@zFg08*@45) which is further used to decrypt a URL and sample's Config ID. (Figure 6)

Figure 6

There are 2 hardcoded strings, encrypted with RC4 and encoded with base64 encoding .They also have multiple newline and space characters (probably to break static tool?). These strings are the URL of the first domain and the Config ID of the current sample. (Figure 7)
Figure 7


In the current sample, a request is performed towards a drive.google.com url followed, by a lookup in the response headers in order to locate two substrings: '.txt";filename*=UTF-8' and 'attachment;filename='(Figure 8,10). Their values are the RC4 encrypted CnC that is ought to respond later with a valid JSON configuration. (Figure 9). It should be noted that, the key RC4 key for decrypting the CnC domain is different than the one used before, but it is hardcoded in the sample (7effd829b15db71f1e5431670f17da25).

Figure 8

Figure 9

Figure 10


After that, it's time for the UUID of the infected workstation to be built. This is done by getting the machine GUID, user's name and the previously encrypted config in the sample all together concentrated. The parameter is encoded with base64 encoding and a POST request is performed to the previously decrypted CnC domain. (Figure 11, 12)

Figure 11



Figure 12


The malware ensures that a response is valid by either checking for the existence of the string 'Wrong config id' or by the string 'url'. Also, if the response does not contain the 'Wrong config id' but somehow contains the string 'url', will later fail during the parsing of the configuration. (C++ exceptions). (Figure 13)

Figure 13

If the JSON config is valid, then the the value of 'url' json property is acquired. Also, a folder is created in TEMP with named 'TrashCan' which was not used during execution. (Figure 14). It should be also noted that, all the file operations are performed via transactions, something that Fumiko[5], another malware researcher has described in one of his blog posts.

Figure 14

Another check of properties in the confiq occurs, for 'config' and 'mask'. If succesfull located the sample continues to create a string 'C:\\Users\\user\\AppData\\Local\\Temp\\Log.zip' for future usage. Another property check happens, for 'attachment_url'. If it exists, its value will be acquired, which in our case is a .dll. The malware will start preparing the ground to download and load the particular library.As it is common with Raccoon, a string is decrypted in memory and in this case it is 'sqlite3.dll'. Later, a full TEMP path will be built in order to be used by UrlDownloadToFileA to download and save the file.( Figure 15)

Figure 15

Then it proceeds into checking the value of the 'history_is_enabled' property and begins its first stealing operation - loading the dropped sqlite3.dll library getting the data from Chrome-like browsers by searching in specific folders. This is done by iterating an array of structures with size 18h, containing 4 offsets to various paths like Login Data, Cookies, Web Data and User Data for various browsers descendant of Chrome. The index is saved in ESI register while accessing the various members of the structures holding the information about the browsers. (Figure 16, 17)

Figure 16

Figure 17

Next, the sample attempts to steal all the data associated with Internet Explorer. This is done by calling three different functions, each aimed at stealing different data such as auto complete information, http basic authentication passwords stored in credentials store and finally data from Vault. The methods used here are known to the public and were detailed explained here[2]. (Figure 18)
Figure 18

Lastly, another string is decrypted in memory 'libraries' and again its existence is checked against the response. If the property exists then the its value will be acquired resulting in a URL containing additional libraries. The sample will attempt to perform its next stealing operation targeting Firefox - like browsers by downloading the additional libraries, loading them and resolving the needed APIs in order to steal the data. It should be noted that, the path that the additional libraries were extracted is added as a value in the enviromental variable 'PATH'. (Figure 19)

Figure 19

Firstly the malware checks if the "C:\\Users\\user\\AppData\\Local\\Temp\\AdLibs\\nss3.dll" (Figure 20 ) exists in disk and if not, it then proceeds into downloading the set of libraries to the path "C:\\Users\\user\\AppData\\Local\\Temp\\AdLibs\\ff-funcs.zip" and unzip the libraries at "C:\\Users\\user\\AppData\\Local\\Temp\\AdLibs\\". Finally, the 'nss3.dll' library will be loaded and the associated APIs will be resolved in order to be used in the next function (Figure 21). The malware attempts to steal History, Signons, cookies, places.sqlite by looping again against an array of structures holding information for Firefox-like browsers.
Figure 20


Figure 21

Finishing with the browsers, the sample proceeds into stealing email data associated with Outlook browser. This was covered by Cybereason's blogspot so we will proceed into the next stealing operation - taking data from FoxMail client. It is thourogly checks for the existence of( Figure 22):
  1. D:\\Program Files\\Foxmail 7.2\\Storage
  2. D:\\Program Files (x86)\\Foxmail 7.2\\Storage
  3. D:\\Foxmail 7.2\\Storage
  4. C:\\Program Files\\Foxmail 7.2\\Storage
  5. C:\\Program Files (x86)\\Foxmail 7.2\\Storage
  6. C:\\Foxmail 7.2\\Storage
and collects all the relative data.
Figure 22


Finishing from stealing data from Email clients, next thing is to collect information from the infected workstation. The value of the 'IP' property is being parsed from the JSON response and then a function is responsible for gathering and writing to a file named 'machineinfo.txt' information. The installed programs are being determined by looping through the entries of 'SOFTWARE\\WOW6432Node\\Microsoft\\Windows\\CurrentVersion\\Uninstall'(Table 1) . The information that is collected and written is the following (Figure 23, 24):

Property
Value
Raccoon's version
Hardcoded
Build Time:
Hardcoded
Bot ID
Concatenation of strings (SOFTWARE\Microsoft\Cryptography\ MachineGuid + '_' + GetComputerNameA()
System Language
GetLocaleInfoA()
Username
GetUserNameA()
External IP
(Returned by the configuration)
Windows version
SOFTWARE\Microsoft\Windows NT\CurrentVersion\ ProductName
System arch
GetSystemWow64DirectoryW()
CPU
CPUID
RAM (MB Used etc etc)
GlobalMemoryStatusEx()
Display Devices
EnumDisplayDevicesA()
Screen Resolution
GetSystemMetrics()
Installed APPs
SOFTWARE\WOW6432Node\Microsoft\Windows\CurrentVersi on\Uninstall
(Table 1)

Figure 23


Figure 24

The malware also has the capability of taking a screenshot[1]. If the 'is_screen_enabled' property exists in the JSON config and its value is 1, then the malware will take a screenshot and saved it as screen.png in TEMP. (Figure 25)
Figure 25

Last but not least the sample looks for various crypto coin wallets and attempt to steal their data. There is a general search in the APPDATA for files named as 'wallet.dat' and after that, famous wallets are targeted such as (Figures 26,27):
  • Electrum
  • Ethereum Wallet
  • Exodus
  • Jaxx
  • Monero
  • Bither
Figure 26

Figure 27

Finally, the sample is preparing for the exfiltration phase. That means collecting all the information that was written to Temp and zipping them up to a zip file named 'Log.zip'. The following files are searched up to be included in the zip and are products of the previous attempts to steal data:
  1. password.txt
  2. CC.txt
  3. browsers\\firefox_cookie.txt
  4. browsers\\firefox_urls.txt
  5. browsers\\chrome_urls.txt
  6. browsers\\chrome_cookie.txt
  7. browsers\\chrome_autofill.txt
  8. browsers\\ie_autofill.txt
  9. browsers\\ie_ftp_data.txt
  10. mails\\outlook.txt
  11. mails\\thunderbird.txt
  12. mails\\foxmail.txt
  13. Wallets\\Electrum
  14. Wallets\\Ethereum
  15. Wallets\\Exodus
  16. Wallets\\Jaxx
  17. Wallets\\Monero
  18. machineinfo.txt but included in the zip as System Info.txt
The additional libraries that were dropped in disk are deleted, and so all the files that were included in the zip file. Before the sample deletes itself and terminate its execution[1] it does something interesting: it checks for the existence of property 'loader_urls' in the JSON config. If it exists, then the sample will generate a random 10 letter name, part of an executable path.(Picture). This will be the location that the executable will be downloaded from the URL, which is the value of the 'loader_urls'. The executable then will be executed. The file will be executed with the ShellExecuteW Windows API. (Figure 28)

Figure 28

Lastly, the malware deletes itself from the infected workstation by executing 'cmd.exe /C ping 1.1.1.1 -n 1 -w 3000 > Nul & Del /f /q "%s"' as it was stated in Cybereason's blogspot[1]. One thing that comes in mind immediately after finishing the analysis is - did we miss to locate the way that the malware is acquiring persistence in the system, or it does not have any persistence method at all? In the next blogspot, we will discuss and analyze the PE file which was downloaded earlier and the way it is enforcing a persistence across the system. (Figure 29)

Figure 29

Evolution

As it is normal for that kind of malware, there was a new version while this article was written. Another malware researcher Fumiko, was kind enough to point me to another Raccoon sample found in his tracker ( MD5:121f7cba18bcb38e68bd4fc4f2e71815 ). During a quick static analysis by running our IDAPython script, it was revealed that there was indeed a new version, specifically called '[Raccoon Stealer] - v1.3.2 UC-International Release'(Figure 30)

Figure 30

While a responsible analyst would take a closer look, diff the functions in order to discover new changes etc etc, we are of the lazy type. So instead of all that, all the strings from a sample with version 1.2 were dumped to a .txt file and diffed against all the strings from the new version. This resulted in the following :
  • Some new targets were added and specifically FileZilla (Figure 31)
  • There was some new SQLITE queries added (maybe to support newer browser versions?) (Figure 32)
Figure 31


Figure 32

In order to confirm our findings, we would have to execute the malware and monitor specific API calls to verify the above. What's the point of working in a sandboxing company if not using the sandbox for that kind of things (Well, apart from malware classification)? Executing the malware[8] and inspecting the logs revealed that indeed the samples is checking for that kind of paths. (Figure 33) Also, a new directory called (TempDir-Extended) is created and the two files are potentially stored there. The new directory also exists in our diffing thus further verifying the validity of our results. (Figure 34)

Figure 33


Figure 34

From a quick look of the static code and based on the execution logs taken from the sandbox execution we concluded that:
  • The content of the wallets that were stolen is stored in new files based on the type of the coin ( trevor, ledge ) also in a new path but will be added to the Log.zip file with the same name.
  • There seems to be a change in the way that the dumped passwords are stored in the associate .txt files based on static code compare to the older versions. (Couldn't verify that as I have a VM without pre-configured data)
  • Seems that in the System Info.txt was added the information of the computer's name. This was later verified by inspecting the dropped .txt file before being deleted by the sample. (Figure 35)

Figure 35

Conclusion

Raccoon is a infostealer capable of performing a variety of actions, justifying its price and its heavy usage from a variety of criminals. From the above analysis, one must remember that:
  • The CnC domain is acquired dynamically - there is an HTTP request beforehand to get the CnC encrypted with RC4 ( It is not hardcoded in the sample)
  • The credentials that were grabbed are saved in TEMP folder with specific names - easy to keep in mind during a IR assessment.
  • In version 1.2/1.3.2 there is not a persistence method - in this particular case thought, the response did have an EXE to be executed which would create a scheduled task but in general, there isn't one.
  • Some numeric constants did not change - if we carefully examine the code, most of the tags used during the completion of the machineinfo.txt file are 128bit constants hardcoded in the sample. Apart from the constant used to define the new version of the malware, the other ones are the same. ( With the addition of one used to decrypt 'ComputerName' string ). (Figure 36)

Figure 36

Lastly, there are some more things to figure out and improve during the analysis of this family such as:
  • There is not a clear explanation for the width property in the JSON - the same applies for the mask property too. It could be a placeholder for a future capability maybe?
  • By inspecting strings, it was revealed that the author is using a famous open source JSON[6] library for C++, and specifically the version 3.4.0. There was an attempt to produce a .lib file in order to use IDA's way of producing FLIRT signatures and make the analysis easier but was not successful. ( There were problems compiling a .hpp header with template definitions and no useful information was generated. )
  • There was not further exploration of the properties of the JSON thus there is no guarantee that this analysis covered all the potential capabilities of the malware.

Special Thanks:

  • xorsthingsv2 for taking the time to review the analysis and the doc.
  • Fumiko, for showing me the new sample
  • @tkanalyst, for posting the raccoon samples

Appendix

Configurations (Table 2):

MD5 HASH
Version
CnC Response
f7bcb18e5814db9fd51d0ab05f2d7ee9
V1.2
{"url":"http://34.89.185.248/file_handler/file.php?hash=252c0d60af493e46d25e7da5e10207c77b5627de&js=1f192856af8a097533d9b8f13e1d168af9f585ca&callback=http://34.89.185.248/gate","attachment_url":"http://34.89.185.248/gate/sqlite3.dll","libraries":"http://34.89.185.248/gate/libs.zip","ip":"xxx","config":{"masks":null,"loader_urls":null},"is_screen_enabled":0,"is_history_enabled":0,"depth":3}
6556a3467ec8e58756af772aa72da99f
V1.2
{"url":"http://34.77.197.252/file_handler/file.php?hash=7a48136f8f459660ec43988e0eb8bf0f77a00f0d&js=2de257efd687492ea3537ea0beed2f3e026413c7&callback=http://34.77.197.252/gate","attachment_url":"http://34.77.197.252/gate/sqlite3.dll","libraries":"http://34.77.197.252/gate/libs.zip","ip":"xxx","config":{"masks":null,"loader_urls":null},"is_screen_enabled":0,"is_history_enabled":0,"depth":3}
121f7cba18bcb38e68bd4fc4f2e71815
V1.3.2
{"url":"http://34.76.145.229/file_handler/file.php?hash=48b77b41f7e1cb233dc4592900244912bdfe7892&js=429835ce099536a23c41ea48c6907cc616a76273&callback=http://34.76.145.229/gate","attachment_url":"http://34.76.145.229/gate/sqlite3.dll","libraries":"http://34.76.145.229/gate/libs.zip","ip":"xxx","config":{"masks":null,"loader_urls":null},"is_screen_enabled":1,"is_history_enabled":1,"depth":3}
80072d5f4bfa1ff22c87be610438792e
V1.2
{"url":"http://34.65.76.39/file_handler/file.php?hash=27c70127350a34268baf46dc23eb4e09fd24f547&js=a044f29dbf33cf8013c2cb40b27fa7e8c07cc4d7&callback=http://34.65.76.39/gate","attachment_url":"http://34.65.76.39/gate/sqlite3.dll","libraries":"http://34.65.76.39/gate/libs.zip","ip":"xxx","config":{"masks":null,"loader_urls":null},"is_screen_enabled":0,"is_history_enabled":0,"depth":3}
126ed436b3531dd857b25b9da2c80462
V1.2
{"url":"http://35.197.207.160/file_handler/file.php?hash=2dfe29b8560662cbd03e409e04c32eb0a3e65028&js=47de3ce52e822b60cd7e21a1d31dfb7a9a904ddf&callback=http://35.197.207.160/gate","attachment_url":"http://35.197.207.160/gate/sqlite3.dll","libraries":"http://35.197.207.160/gate/libs.zip","ip":"xxx","config":{"masks":null,"loader_urls":["http://185.161.210.244/signed.exe"]},"is_screen_enabled":0,"is_history_enabled":0,"depth":3}

(Table 2)

References

[0] https://support.microsoft.com/en-au/help/243330/well-known-security-identifiers-in-windows-operating-systems
[1] https://www.cybereason.com/blog/hunting-raccoon-stealer-the-new-masked-bandit-on-the-block
[2] https://securityxploded.com/iepasswordsecrets.php
[3] https://www.codeproject.com/Articles/1167943/The-Secrets-of-Internet-Explorer-Credentials