by Sousan Yazdi  |  Jan 16, 2014  |  Filed in: Security Research

Sousan Yazdi, Junior Antivirus Analyst
Margarette Joven, Antivirus Manager
Special Technical Contribution by Liang Huang, Senior Antivirus Analyst

CryptoLocker is a family of ransomware trojans that emerged in late 2013. It targets Microsoft Windows systems and is renowned for taking its victims' files hostage by encrypting the files on the victim's computer. The victim is then shown a message stating that the only way to recover the encrypted files is to pay a ransom: in most cases around 300 USD, 300 EUR, or the rough equivalent in the digital currency Bitcoin. Victims are often told that they have 72 hours to pay in order to obtain the decryption key needed to get their files back; if they don't pay, they're told, the key will be deleted and the files will be irretrievable.

When this trojan is first executed on the victim's computer, it copies itself to the user's Application Data folder under a randomized name, then creates an autorun registry key so that the copy is executed every time the infected user logs on. It then attempts to connect to one of its command-and-control (C&C) servers in order to obtain an RSA-2048 key that will be used to encrypt its target files.

Since this ransomware starts encrypting files only after it has established a connection to a C&C server, one way to help prevent its damaging payload from running is to block the malware from communicating with those servers. The challenge is that CryptoLocker uses a Domain Generation Algorithm (DGA) to generate a list of potential C&C domains, which it then tries one by one in the hope of finding a server that is online.

There are various aspects of CryptoLocker that can be analyzed and discussed; however, in this blog post, we will focus solely on the details of the DGA that it uses.

The Domain Generation Algorithm

Domain generation algorithms were introduced a few years ago to make it more difficult for security researchers and law enforcement agencies to take down the C&C servers that communicate with infected computers. Instead of relying on a static list of domains hard-coded in the malware body, which can often be found easily and then added to blacklists, the malware uses a DGA that can rapidly create thousands of candidate domain names. The DGA used in CryptoLocker is relatively simple, but it is obscured by a long series of loops and jumps designed to confuse reverse engineers and malware analysts.

The Overview

To begin, CryptoLocker generates its pseudo-random key, then calls its DGA from within a loop that can create up to 0x3E8 (1000) domain names. Once a domain name has been generated, the malware calls InternetConnect with the generated name as a parameter. If the connection is successful, it jumps out of the loop. If not, it waits 1000 milliseconds and continues with the next iteration. The key that is passed to the DGA is incremented on each iteration.

Figure 1. Loop that calls the DGA, using the generated key as a parameter.
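
The control flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the malware's code: `generate_domain` and `try_connect` are hypothetical callables standing in for the DGA and for the InternetConnect call, the 32-bit wrap of the key is an assumption, and the injectable `sleep` parameter exists only to make the sketch easy to exercise.

```python
import time
from typing import Callable, Optional

MAX_ATTEMPTS = 0x3E8        # 1000 candidate domains, per the loop bound in the text
RETRY_DELAY_SECONDS = 1.0   # 1000 ms pause after a failed attempt


def find_active_server(
    key: int,
    generate_domain: Callable[[int], str],  # hypothetical stand-in for the DGA
    try_connect: Callable[[str], bool],     # hypothetical stand-in for InternetConnect
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try up to MAX_ATTEMPTS generated domains; return the first that answers."""
    for _ in range(MAX_ATTEMPTS):
        domain = generate_domain(key)
        if try_connect(domain):
            return domain                   # connection succeeded: leave the loop
        sleep(RETRY_DELAY_SECONDS)          # wait before trying the next candidate
        key = (key + 1) & 0xFFFFFFFF        # key is incremented (assumed 32-bit)
    return None                             # no candidate responded
```

From a defender's point of view, the useful property of this loop is that the set of candidates is bounded: for a given starting key and date there are at most 1000 names to consider.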

Building The Key

As shown in the screenshot above, the key is needed before calling the DGA. The process of building this key comes in four stages.

Stage 1: The Initial Seed

Like many other random generators, CryptoLocker's DGA requires seed and key values. The key used for each iteration of the loop is generated from an initial seed, which is taken from one of two APIs: QueryPerformanceCounter or GetTickCount. CryptoLocker calls QueryPerformanceCounter first and, if the call succeeds, saves the retrieved value as the seed. If not, it calls GetTickCount, whose return value (the number of milliseconds that have elapsed since the system was started) is saved as the seed instead. This initial seed is stored in the ECX register.

Figure 2. Getting the initial seed.
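
The fallback logic for choosing the seed can be sketched as below. The two callables are portable stand-ins for the Windows APIs, not the real calls, and truncating the value to 32 bits is an assumption based only on the seed being held in ECX; the post does not say which part of the performance counter is kept.

```python
import time
from typing import Callable, Optional


def pick_initial_seed(
    query_performance_counter: Callable[[], Optional[int]],
    get_tick_count: Callable[[], int],
) -> int:
    """Prefer the high-resolution counter; fall back to the tick count."""
    counter = query_performance_counter()
    if counter is not None:
        return counter & 0xFFFFFFFF       # assumption: only the low DWORD is kept
    return get_tick_count() & 0xFFFFFFFF  # milliseconds since system start


# Portable stand-ins for the two Windows APIs (for illustration only).
def fake_qpc() -> Optional[int]:
    return time.perf_counter_ns()


def fake_tick_count() -> int:
    return int(time.monotonic() * 1000)   # milliseconds, like GetTickCount


seed = pick_initial_seed(fake_qpc, fake_tick_count)
```

Because both sources depend on how long the machine has been running, the seed differs from one infected computer to the next.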

Using this initial seed value, an array of seeds is created.

Stage 2: Generating The Array Of Seeds

This array of seeds has a constant size of 0x270 (624) DWORDs. The initial seed value is stored at the beginning of the array, at an address that we will call Address_Start_Array. A combination of SHR, XOR, IMUL, and ADD operations is then applied to compute each subsequent seed value stored in the array. The loop exits once it has computed 0x270 DWORDs, ending at an address that we will call Address_End_Array.

Figure 3 below shows the disassembled code that builds this array.

Figure 3. Disassembled code that builds the array of seeds.

Figure 4 shows the equivalent pseudo-code.

Figure 4. Pseudo-code that builds the array of seeds.
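
The array size of 624 DWORDs and the mix of SHR, XOR, IMUL, and ADD match the seeding routine of the widely used Mersenne Twister generator (MT19937). The sketch below shows that standard routine in Python. It is an assumption that the pseudo-code in Figure 4 is equivalent to it, since the post does not name the algorithm; the multiplier 1812433253 and the 30-bit shift come from the MT19937 reference implementation, not from this post.

```python
from typing import List

ARRAY_SIZE = 0x270  # 624 DWORDs, as stated in the text


def build_seed_array(initial_seed: int) -> List[int]:
    """Fill a 624-element array from one seed, MT19937-style (assumed)."""
    state = [0] * ARRAY_SIZE
    state[0] = initial_seed & 0xFFFFFFFF          # seed goes in the first slot
    for i in range(1, ARRAY_SIZE):
        prev = state[i - 1]
        # SHR + XOR, then IMUL by the constant, then ADD the index; keep 32 bits.
        state[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
    return state


seeds = build_seed_array(5489)  # 5489 is the reference implementation's default seed
```

Every element depends only on the one before it, so the entire array is determined by the initial seed.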

To generate the KEY value that will be used in the DGA, the values in this array of seeds are then accessed by another function, which we will discuss next.

Stage 3: Processing The Array of Seeds

Processing starts at the first element of the array and continues up to index 0xE3 (227). The computation, shown in Figure 5, uses a combination of ADD, XOR, AND, and SHR operations.

Figure 5. Pseudo-code that processes the array of seeds.

The operand (Const_1 + ((Temp_a & 1) * 4)) shown in Figure 5 above always resolves to one of two constant values:

  • [Const_1] = 0
  • [Const_1 + 4] = 0x9908B0DF

Also shown in Figure 5 is the operand [(Address_Start_Array + 0x630) + (index*4)]. This refers to the DWORDs from index 0x18C of the array through the last array element.

Further computations are performed on the rest of the array, from element 0xE3 onward, and their results are used in other portions of the malware. However, since this blog post focuses on the DGA, we will skip those computations and go directly to the code that computes the KEY.
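
This stage resembles the first part of the state-regeneration ("twist") step of MT19937: the constant 0x9908B0DF and the two-entry lookup selected by the low bit are both characteristic of it. The sketch below assumes that resemblance holds. The masks 0x80000000 and 0x7FFFFFFF come from the MT19937 reference code, not from this post. The look-ahead offset is left as a parameter because reference MT19937 uses 397, whereas the post describes the operand as starting at index 0x18C (396), which may simply reflect a different loop-index base in the disassembly.

```python
from typing import List

MAG01 = (0x00000000, 0x9908B0DF)  # the two constants at [Const_1] and [Const_1 + 4]
UPPER_MASK = 0x80000000           # assumption: taken from reference MT19937
LOWER_MASK = 0x7FFFFFFF           # assumption: taken from reference MT19937
FIRST_PASS = 0xE3                 # 227 elements are handled in this stage


def process_first_pass(state: List[int], offset: int = 397) -> None:
    """Update state[0] .. state[0xE2] in place, mixing in an element `offset` ahead."""
    for i in range(FIRST_PASS):
        # AND + ADD: top bit of this element joined with the low 31 bits of the next.
        temp = (state[i] & UPPER_MASK) + (state[i + 1] & LOWER_MASK)
        # XOR + SHR: combine with the look-ahead element and the selected constant.
        state[i] = state[i + offset] ^ (temp >> 1) ^ MAG01[temp & 1]


demo = [0] * 0x270
demo[0], demo[1], demo[397] = 0x80000000, 1, 0xFFFFFFFF
process_first_pass(demo)
```

Because the two masks cover disjoint bits, the ADD in this sketch gives the same result as the OR used in the reference code.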

Stage 4: Computing the DWORD value

In this last stage, the following set of calculations is applied to the second element of the array.

Figure 6. Pseudo-code for computing the KEY.

The value obtained here is the KEY that is passed to the DGA, as seen in Figure 1 above.

Generating Domain Names

The DGA in CryptoLocker uses four values: the KEY, the current day, the current month, and the current year. The first value, the KEY, has been obtained from the calculations discussed above; the last three values are retrieved from a call to the GetSystemTime API.

All four values go through a series of mathematical operations, the results of which are then passed to the loop that creates the domain name.
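
One detail worth keeping in mind when reproducing the date inputs is that GetSystemTime reports the date in UTC rather than local time. A minimal Python sketch of collecting the three date values the same way follows; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def current_date_inputs(now: Optional[datetime] = None) -> Tuple[int, int, int]:
    """Return (day, month, year) in UTC, the time base GetSystemTime uses."""
    if now is None:
        now = datetime.now(timezone.utc)
    else:
        now = now.astimezone(timezone.utc)  # normalize any supplied time to UTC
    return now.day, now.month, now.year


# Example: 23:30 on Jan 16 at UTC-8 is already Jan 17 in UTC.
pacific = timezone(timedelta(hours=-8))
example = current_date_inputs(datetime(2014, 1, 16, 23, 30, tzinfo=pacific))
```

Anyone pre-computing candidate domains for blocking should therefore use the UTC date, otherwise the list can be off by a day around midnight.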

The Key

The KEY value is modified using a combination of MUL, SHR, IMUL, and ADD operations; we will call the result NewKey. Figure 7 shows the algorithm for this calculation.

Figure 7. Calculation for NewKey.
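
The constants in Figure 7 are not reproduced in the text. As background, one common reason to see exactly this MUL, SHR, IMUL, ADD sequence in compiled code is a modulo by a constant, which compilers implement as a multiply by a "magic" reciprocal, a shift, and a multiply-and-add to recover the remainder. The sketch below illustrates that idiom under the unconfirmed assumption that the divisor is 1000 (chosen only because it matches the 1000-iteration loop); it should not be read as a transcription of Figure 7.

```python
MAGIC_DIV_1000 = 0x10624DD3  # ceil(2**38 / 1000); assumed, not taken from the post
NEG_1000 = 0xFFFFFC18        # -1000 as a 32-bit two's-complement value


def mod_1000_like_compiler(key: int) -> int:
    """Compute key % 1000 the way optimized 32-bit code typically does."""
    key &= 0xFFFFFFFF
    high = (key * MAGIC_DIV_1000) >> 32              # MUL: upper half of the 64-bit product
    quotient = high >> 6                             # SHR: quotient is now key // 1000
    return (quotient * NEG_1000 + key) & 0xFFFFFFFF  # IMUL + ADD: key - 1000 * quotient


samples = [0, 999, 1000, 1001, 123456789, 0xFFFFFFFF]
results = [mod_1000_like_compiler(k) for k in samples]
```

Recognizing this pattern can save time when reading disassembly, since four instructions collapse into a single `%` in the reconstructed pseudo-code.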

The Current Day

The value for the current day is also transformed, using only SHL and XOR operations. We will call the result DayKey.

Figure 8. Calculation for DayKey.

The Current Month

The current month is likewise modified using SHL and XOR operations. We call the new value MonthKey.

Figure 9. Calculation for MonthKey.

The Current Year

Last, but not least, the current year is also modified. The operations involved in this calculation are ADD, SHL, and XOR. We call the resulting value YearKey.

Figure 10. Calculation for YearKey.

The String Length

The length of the string for each server name is determined through the following computations:

Figure 11. Determining the length of the server name.

Generating the Name

The parameters calculated above are then used in the loop below, which generates the letters that form the server name.

Figure 12. Generating the letters that form the server name.

Getting the TLD

The top-level domain (TLD) that is added to the server name is chosen from a list of seven strings. Recall from Figure 1 that the DGA is called within a loop. For each iteration of this loop, CryptoLocker goes through the list of TLDs in the same order. However, which string comes first in that order depends on the NewKey value obtained from the calculation in Figure 7.

The TLD strings are in Unicode format and are stored in the malware body in encrypted form. Once CryptoLocker has determined which string to start with, it begins building its TLD list by decrypting the strings using simple ADD and XOR operations.

Figure 13. Building the TLD list.

Below are the strings used for the TLD, as well as the order that CryptoLocker follows:

Figure 14. List of TLDs.

Since the decrypted TLD string is in wide-character format, it is first passed to the WideCharToMultiByte API to convert it to multibyte form. The generated server name is then concatenated with a "." character, followed by the converted TLD string. The completed domain name is then ready to be passed to the InternetConnect API to test whether the server is active, as seen in Figure 1.
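
The TLD handling can be sketched as follows. The seven TLD strings here are placeholders, since the real list appears only in Figure 14, and the starting position is passed in directly because the post does not spell out how it is derived from NewKey. Reading "the same order" as a cyclic rotation is also an assumption. Decoding UTF-16LE bytes is used as a rough analogue of the WideCharToMultiByte call.

```python
from typing import List

# Placeholder TLDs in wide-character (UTF-16LE) form; not the real strings.
PLACEHOLDER_TLDS_UTF16 = [f"tld{i}".encode("utf-16-le") for i in range(7)]


def tld_order(start: int, tlds: List[bytes]) -> List[bytes]:
    """Keep the same cyclic order every time; only the starting position changes."""
    return [tlds[(start + i) % len(tlds)] for i in range(len(tlds))]


def build_domain(server_name: str, tld_wide: bytes) -> str:
    """Convert the wide-character TLD and append it to the name after a dot."""
    tld = tld_wide.decode("utf-16-le")  # rough analogue of WideCharToMultiByte
    return server_name + "." + tld


ordered = tld_order(3, PLACEHOLDER_TLDS_UTF16)
example = build_domain("examplename", ordered[0])
```

With a starting position of 3, the placeholder list is visited as tld3, tld4, tld5, tld6, tld0, tld1, tld2.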

Some examples of the domain names generated by CryptoLocker are as follows:

Wrapping Up

CryptoLocker made waves as one of the most damaging pieces of malware of the past year. Since this malware is known to spread through email, always be cautious when opening email attachments or clicking on links. Keep your antivirus software updated and your systems patched. And last, but not least, regularly back up your data to an offline medium (such as a removable hard drive) for easier restoration of files.
