Cryptanalysis of the Sasfis Registry Key

by Doug Macdonald
March 10, 2010 at 10:10 am

Recently I’ve been working on an analysis of Sasfis botnet communications. During the tests I noticed that when the bot installs itself, it adds a registry key named “idid”, with some random looking data in it. The data was added under the name “url0”, so it seemed like it must be an encrypted URL. Here is an example from one of the bot variants:

Key Name:          HKEY_CLASSES_ROOT\idid

Name:            url0

00000000   1e 9b 6d d8  89 e6 c4 50  7f fd 13 6b  fa e2 f4 17

00000010   1a 80 78 cc  d6 bb c4 55  73 b5 07 77  a4 81 3a 71

00000020   a4 98 ba d8  2c 85 17 ad  ce c0 b1 a5  9f c8 07 0b

But what URL could this be, if it is one? Most of these bytes are not in the normal text range, so it would have to be encrypted. Even when there was no network connection, the url0 data was added, so I knew it must be hard coded into the bot. From the tests I had been doing, I also knew that the bot contained a hard coded URL for its Command and Control server. So it seemed possible that the C&C URL was encrypted here, but of course I would have to prove that.

The first 16 bytes of the url0 values, from six bot tests, with their test identifiers (T3, M2 etc.), are listed below. The list is sorted by the opening bytes. They fall into two groups where the first seven bytes are identical. The T2 data is slightly different from the ones below it, but the one different byte (f1) could be the result of an encryption error.

T3   1e 9b 6d d8  89 e6 c4 50  7f fd 13 6b  fa e2 f4 17

M2   1e 9b 6d d8  89 e6 c4 5f  60 ff 12 7b  bd ea f3 4c

T2   f1 9b 20 62  fc 48 d0 3e  27 fc 1d f7  94 5a ff 3f

T1   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

M1   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

M5   f8 9b 20 62  fc 48 d0 2b  2a fd 17 e2  87 46 ea 7e

Looking at this, it seems fairly likely that each group was encrypted with the same key. And if these are URLs, the seven common bytes at the beginning of each line could be “http://”, if we are on the right track.
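The grouping can also be checked mechanically. The sketch below is only an illustration: the names `samples` and `matching_bytes` are made up here and do not come from the bot or from any tool. It counts how many of the first seven byte positions agree for each pair of samples. If the encryption is a simple XOR with a shared key (still an assumption at this point), equal plaintext bytes in the same position give equal ciphertext bytes, so a long run of matching bytes hints at both a common key and a common plaintext prefix.

```python
from itertools import combinations

# First 16 bytes of each url0 value, copied from the list above.
samples = {
    "T3": bytes.fromhex("1e 9b 6d d8 89 e6 c4 50 7f fd 13 6b fa e2 f4 17"),
    "M2": bytes.fromhex("1e 9b 6d d8 89 e6 c4 5f 60 ff 12 7b bd ea f3 4c"),
    "T2": bytes.fromhex("f1 9b 20 62 fc 48 d0 3e 27 fc 1d f7 94 5a ff 3f"),
    "T1": bytes.fromhex("f8 9b 20 62 fc 48 d0 32 3c fc 17 f1 91 51 ea 3f"),
    "M1": bytes.fromhex("f8 9b 20 62 fc 48 d0 2a 2e fc 11 f9 81 1a f6 74"),
    "M5": bytes.fromhex("f8 9b 20 62 fc 48 d0 2b 2a fd 17 e2 87 46 ea 7e"),
}


def matching_bytes(a: bytes, b: bytes, n: int = 7) -> int:
    """Count positions among the first n where a and b hold the same byte."""
    return sum(1 for x, y in zip(a[:n], b[:n]) if x == y)


# Keep every pair that agrees in at least 6 of the first 7 positions.
close_pairs = {
    (name_a, name_b): matching_bytes(a, b)
    for (name_a, a), (name_b, b) in combinations(samples.items(), 2)
    if matching_bytes(a, b) >= 6
}

for (name_a, name_b), count in close_pairs.items():
    print(f"{name_a} / {name_b}: {count} of 7 leading bytes match")
```

The threshold of 6 rather than 7 is a deliberate choice so that T2, with its one odd byte, still lands in the second group.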

The obvious move at this point is to test this theory. We can start with the first row of hex data from the T3 and M2 tests, recover the key for T3 using the hard coded URL for that variant, then find out if the key is correct by decrypting M2 with it. The worksheet below shows the hard coded URL and the url0 registry data for T3 in the first two lines. The text line at the bottom shows the URL as characters, and the plain line above it gives the equivalent hex bytes. The key line is empty for now.

T3 http://gnfdt.cn/loader/bb.php

00000000   1e 9b 6d d8  89 e6 c4 50  7f fd 13 6b  fa e2 f4 17 (encrypted in reg)

key

plain      68 74 74 70  3a 2f 2f 67  6e 66 64 74  2e 63 6e 2f (url in hex format)

text       h  t  t  p   :  /  /  g   n  f  d  t   .  c  n  /  (known URL)

We will assume that the key was XORed with the plaintext to produce this encryption. That is the most likely case, but if we are wrong it will be necessary to try some other methods. From this basis we will now XOR the encrypted and plain bytes to recover the key.

T3 http://gnfdt.cn/loader/bb.php

00000000   1e 9b 6d d8  89 e6 c4 50  7f fd 13 6b  fa e2 f4 17 (encrypted in reg)

key        76 ef 19 a8  b3 c9 eb 37  11 9b 77 1f  d4 81 9a 38 (recovered key)

plain      68 74 74 70  3a 2f 2f 67  6e 66 64 74  2e 63 6e 2f (url in hex format)

text       h  t  t  p   :  /  /  g   n  f  d  t   .  c  n  /  (known URL)
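The same key recovery can be written as a few lines of Python. This is a minimal sketch that assumes, as we do in the worksheet, that the bot used a plain byte-wise XOR; the helper name `xor_bytes` is invented for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position (stops at the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


# First row of the T3 url0 value and the URL hard coded in that variant.
t3_cipher = bytes.fromhex("1e 9b 6d d8 89 e6 c4 50 7f fd 13 6b fa e2 f4 17")
t3_url = b"http://gnfdt.cn/loader/bb.php"

# If ciphertext = plaintext XOR key, then ciphertext XOR plaintext = key.
key = xor_bytes(t3_cipher, t3_url[:16])
print(key.hex(" "))
# 76 ef 19 a8 b3 c9 eb 37 11 9b 77 1f d4 81 9a 38
```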

Now we have some key bytes, but there is no proof that they are real. To prove that, we can use the key bytes to decrypt M2. The result is below. Part of the URL that is hard coded into the M2 bot has been revealed.

M2 http://hqdedikit.com/mld/bb.php

00000000   1e 9b 6d d8  89 e6 c4 5f  60 ff 12 7b  bd ea f3 4c (encrypted in reg)

key        76 ef 19 a8  b3 c9 eb 37  11 9b 77 1f  d4 81 9a 38 (recovered key)

plain      68 74 74 70  3a 2f 2f 68  71 64 65 64  69 6b 69 74 (decrypted hex)

text       h  t  t  p   :  /  /  h   q  d  e  d   i  k  i  t  (decrypted text)
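The cross-check against M2 is just as short. In this sketch the key bytes are the ones recovered from T3, and the only assumption is again that the scheme is a plain XOR.

```python
# Key bytes recovered from T3, and the first row of the M2 url0 value.
key = bytes.fromhex("76 ef 19 a8 b3 c9 eb 37 11 9b 77 1f d4 81 9a 38")
m2_cipher = bytes.fromhex("1e 9b 6d d8 89 e6 c4 5f 60 ff 12 7b bd ea f3 4c")
m2_url = b"http://hqdedikit.com/mld/bb.php"  # hard coded in the M2 bot

# XOR each ciphertext byte with the key byte in the same position.
plain = bytes(c ^ k for c, k in zip(m2_cipher, key))
print(plain.decode("ascii"))       # http://hqdedikit
print(m2_url.startswith(plain))    # True
```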

So our case is proved: the hard coded URL is the one hidden in the registry key. We can easily extend this through the rest of the encrypted data to show the whole URL and remove any lingering doubt.

But what would we do if each bot variant had its own key? The method above would not work, but there are other ways to approach this problem. One way is to check whether this is a repeating key encryption system. Such systems are very common, and if this is one, we can make comparisons within a single URL instead of using two as we did above.

Let’s try this method with T3. The simple way is to use the whole URL to find as many key bytes as possible, then look for repetitions.

T3 http://gnfdt.cn/loader/bb.php

00000000   1e 9b 6d d8  89 e6 c4 50  7f fd 13 6b  fa e2 f4 17

key        76 ef 19 a8  b3 c9 eb 37  11 9b 77 1f  d4 81 9a 38

plain      68 74 74 70  3a 2f 2f 67  6e 66 64 74  2e 63 6e 2f

text       h  t  t  p   :  /  /  g   n  f  d  t   .  c  n  /

00000010   1a 80 78 cc  d6 bb c4 55  73 b5 07 77  a4 81 3a 71

key        76 ef 19 a8  b3 c9 eb 37  11 9b 77 1f  d4

plain      6c 6f 61 64  65 72 2f 62  62 2e 70 68  70

text       l  o  a  d   e  r  /  b   b  .  p  h   p

Here we can see that the key starts to repeat at the start of the second row. So the key length is 16 bytes, and again we have proved that the registry key holds the hard coded URL. Decrypting the next byte at the end provides a little bonus: 0x81 XOR 0x81 = 0x00, the null terminator for the string. Decryption from this point onward exposes bytes that appear to be random.
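The search for the repetition can be sketched in Python as well. The helper `smallest_period` is a name invented for this example, and the approach assumes a fixed-length key that simply repeats. Note that with a 29-byte URL only the first 13 key positions are seen twice, so the check is strong evidence rather than a complete proof.

```python
# Full url0 value for T3 (three rows of 16 bytes) and its known URL.
cipher = bytes.fromhex(
    "1e 9b 6d d8 89 e6 c4 50 7f fd 13 6b fa e2 f4 17 "
    "1a 80 78 cc d6 bb c4 55 73 b5 07 77 a4 81 3a 71 "
    "a4 98 ba d8 2c 85 17 ad ce c0 b1 a5 9f c8 07 0b"
)
url = b"http://gnfdt.cn/loader/bb.php"

# Key bytes for every position covered by the known URL.
keystream = bytes(c ^ p for c, p in zip(cipher, url))


def smallest_period(stream: bytes):
    """Return the smallest shift at which the stream matches itself, or None."""
    for p in range(1, len(stream)):
        if all(stream[i] == stream[i + p] for i in range(len(stream) - p)):
            return p
    return None


period = smallest_period(keystream)
key = keystream[:period]

# Decrypt the whole value with the repeating key and stop at the first null.
plain = bytes(c ^ key[i % period] for i, c in enumerate(cipher))
end = plain.index(0)
print(period, plain[:end].decode("ascii"))
# 16 http://gnfdt.cn/loader/bb.php
```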

But now consider another scenario: what would we do if we had no idea what the encrypted URLs were? If we have bots with different URLs using the same key, the problem is not beyond solution. To demonstrate I will use the data from T1 and M1, from the other key group. It turns out, in the end, that only the first two lines of hex are needed for this, so the example below will not show the third line.

First we need to locate the key repetition. We can try “http://” at the start to find the first seven key bytes. With these key bytes we can decrypt at different locations until some URL-like text appears. The bot code probably processed this as DWORDs, so we will take a shortcut by checking at four byte intervals, and use only the first four key bytes for each decryption. If this fails we will have to try decrypting at different intervals, possibly even at every byte. The “?” marks below indicate decrypted bytes outside the normal text range, which we would not expect in a URL.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff     90 ef 54 12  90 ef 54 12

plain      68 74 74 70  3a 2f 2f     ac 13 43 e3  01 be be 2d

text       h  t  t  p   :  /  /      ?  ?  C  ?   ?  ?  ?  -

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  90 ef 54 12  90 ef 54 12  90 ef 54 12

plain      63 6e 2f 6d  3a ec 84 2f  b7 51 5c ea  15 db 10 95

text       c  n  /  m   :  ?  ?  /   ?  Q  \  ?   ?  ?  ?  ?
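The trial decryptions in this worksheet can be automated. The sketch below slides the four key bytes derived from “http” along the T1 data in four byte steps and keeps only the positions where all four decrypted bytes are printable ASCII. The printable test is a crude stand-in for “URL-like”, and the names `key4`, `printable` and `hits` are assumptions made for the example.

```python
# First two rows of the T1 url0 value.
t1 = bytes.fromhex(
    "f8 9b 20 62 fc 48 d0 32 3c fc 17 f1 91 51 ea 3f "
    "f3 81 7b 7f aa 03 d0 3d 27 be 08 f8 85 34 44 87"
)

# Key bytes implied by guessing that the value starts with "http".
crib = b"http"
key4 = bytes(c ^ p for c, p in zip(t1, crib))  # 90 ef 54 12


def printable(data: bytes) -> bool:
    """True if every byte is in the printable ASCII range."""
    return all(0x20 <= b <= 0x7E for b in data)


# Try the four key bytes at every DWORD boundary after the first.
hits = {}
for offset in range(4, len(t1), 4):
    chunk = bytes(c ^ k for c, k in zip(t1[offset:offset + 4], key4))
    if printable(chunk):
        hits[offset] = chunk.decode("ascii")

print(hits)  # {16: 'cn/m'}
```

A single hit at offset 16 is what we would expect from a 16-byte repeating key.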

The true decryption appears to be “cn/m”, at the start of the second row. None of the others is even close. So it looks like we have found the key repetition and the key length (16 bytes). With this information we can set up our worksheet, with the known key bytes and the decryptions they give us filled in. It can be seen below, where the decrypted parts confirm our work so far.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff

plain      68 74 74 70  3a 2f 2f

text       h  t  t  p   :  /  /

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff

plain      63 6e 2f 6d  6c 64 2f

text       c  n  /  m   l  d  /

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff

plain      68 74 74 70  3a 2f 2f

text       h  t  t  p   :  /  /

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff

plain      74 2f 6c 6f  61 64 65

text       t  /  l  o   a  d  e

Now we need to extend the URL text parts to uncover more key bytes. In other words we need to make some good guesses, but because the structure of URLs is well known to us, this should not be too difficult.

Notice that the second text line under T1 starts with “cn/mld/”. This looks like a “.cn” top level domain, so the “.” must be the last byte of the first line. Let’s fill it in there and apply the key byte we get to the last column of every line.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff                           11

plain      68 74 74 70  3a 2f 2f                           2e

text       h  t  t  p   :  /  /                            .

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff                           11

plain      63 6e 2f 6d  6c 64 2f                           96

text       c  n  /  m   l  d  /                            ?

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff                           11

plain      68 74 74 70  3a 2f 2f                           65

text       h  t  t  p   :  /  /                            e

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff                           11

plain      74 2f 6c 6f  61 64 65                           00

text       t  /  l  o   a  d  e
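The fill-in step used in this worksheet can be sketched as a small helper. Everything named here (`KEY_LEN`, `apply_guess`, `show`) is hypothetical and only mirrors what we do by hand: a guessed plaintext at some offset gives key bytes, and those key bytes are then applied to every line of every sample. Unknown positions print as “_”, and any decrypted byte outside the text range, including the null terminator, prints as “?”.

```python
KEY_LEN = 16  # key length suggested by the repetition found earlier

samples = {
    "T1": bytes.fromhex(
        "f8 9b 20 62 fc 48 d0 32 3c fc 17 f1 91 51 ea 3f "
        "f3 81 7b 7f aa 03 d0 3d 27 be 08 f8 85 34 44 87"
    ),
    "M1": bytes.fromhex(
        "f8 9b 20 62 fc 48 d0 2a 2e fc 11 f9 81 1a f6 74 "
        "e4 c0 38 7d a7 03 9a 2d 6a f2 1a be 85 5c e8 11"
    ),
}


def apply_guess(key, cipher, offset, guess):
    """Fill in the key slots implied by `guess` being the plaintext at `offset`."""
    for i, p in enumerate(guess):
        key[(offset + i) % KEY_LEN] = cipher[offset + i] ^ p


def show(key, cipher):
    """Decrypt with the partial key: '_' = key unknown, '?' = non-text byte."""
    out = []
    for i, c in enumerate(cipher):
        k = key[i % KEY_LEN]
        if k is None:
            out.append("_")
        else:
            b = c ^ k
            out.append(chr(b) if 0x20 <= b <= 0x7E else "?")
    return "".join(out)


key = [None] * KEY_LEN
apply_guess(key, samples["T1"], 0, b"http://")  # known start of a URL
apply_guess(key, samples["T1"], 15, b".")       # the "." in front of "cn"

for name, cipher in samples.items():
    print(name, show(key, cipher))
# T1 http://________.cn/mld/________?
# M1 http://________et/loade________?
```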

Now we have some more decrypted bytes. There is a null at the end of M1, which must be the URL string terminator. There is also a non-text byte (0x96) at the end of T1, but let’s ignore that one for now. It may be junk from beyond the end of the URL string, and we will know soon enough if this was a bad guess. At the end of the first M1 line the text character is an “e”, so that we now have “et/loade”. This looks like it must be “.net/loader”, so next we will fill this in and decrypt some more.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff 5f                  34 98 11

plain      68 74 74 70  3a 2f 2f 6d                  65 72 2e

text       h  t  t  p   :  /  /  m                   e  r  .

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff 5f                  34 98 11

plain      63 6e 2f 6d  6c 64 2f 62                  00 dc 96

text       c  n  /  m   l  d  /  b                      ?  ?

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff 5f                  34 98 11

plain      68 74 74 70  3a 2f 2f 75                  2e 6e 65

text       h  t  t  p   :  /  /  u                   .  n  e

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff 5f                  34 98 11

plain      74 2f 6c 6f  61 64 65 72                  68 70 00

text       t  /  l  o   a  d  e  r                   h  p

There is nothing very obvious here, but at the end of the second row of M1 we have “hp”. This looks like it could be “.php”, so let’s try that next.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff 5f           90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 6d           61  64 65 72 2e

text       h  t  t  p   :  /  /  m            a   d  e  r  .

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff 5f           90  f5 34 98 11

plain      63 6e 2f 6d  6c 64 2f 62           68  70 00 dc 96

text       c  n  /  m   l  d  /  b            h   p     ?  ?

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff 5f           90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 75           69  74 2e 6e 65

text       h  t  t  p   :  /  /  u            i   t  .  n  e

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff 5f           90  f5 34 98 11

plain      74 2f 6c 6f  61 64 65 72           2e  70 68 70 00

text       t  /  l  o   a  d  e  r            .   p  h  p

This looks good, and now we have some good hints. In T1, the first line reads “//m” followed by three unknown bytes and “ader.”, which looks like “//m?loader.”, and in the second line another “.php” is developing. We can put these in.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff 5f     90 78 90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 6d     6c 6f 61  64 65 72 2e

text       h  t  t  p   :  /  /  m      l  o  a   d  e  r  .

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff 5f     90 78 90  f5 34 98 11

plain      63 6e 2f 6d  6c 64 2f 62     2e 70 68  70 00 dc 96

text       c  n  /  m   l  d  /  b      .  p  h   p     ?  ?

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff 5f     90 78 90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 75     6c 69 69  74 2e 6e 65

text       h  t  t  p   :  /  /  u      l  i  i   t  .  n  e

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff 5f     90 78 90  f5 34 98 11

plain      74 2f 6c 6f  61 64 65 72     62 62 2e  70 68 70 00

text       t  /  l  o   a  d  e  r      b  b  .   p  h  p

Now, in the second line of M1, we have “bb.php”, and it looks like this also appears in “mld/b?.php” in the second line of T1. With this we can fill in the last missing byte.

T1

00000000   f8 9b 20 62  fc 48 d0 32  3c fc 17 f1  91 51 ea 3f

key        90 ef 54 12  c6 67 ff 5f  45 90 78 90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 6d  79 6c 6f 61  64 65 72 2e

text       h  t  t  p   :  /  /  m   y  l  o  a   d  e  r  .

00000010   f3 81 7b 7f  aa 03 d0 3d  27 be 08 f8  85 34 44 87

key        90 ef 54 12  c6 67 ff 5f  45 90 78 90  f5 34 98 11

plain      63 6e 2f 6d  6c 64 2f 62  62 2e 70 68  70 00 dc 96

text       c  n  /  m   l  d  /  b   b  .  p  h   p     ?  ?

M1

00000000   f8 9b 20 62  fc 48 d0 2a  2e fc 11 f9  81 1a f6 74

key        90 ef 54 12  c6 67 ff 5f  45 90 78 90  f5 34 98 11

plain      68 74 74 70  3a 2f 2f 75  6b 6c 69 69  74 2e 6e 65

text       h  t  t  p   :  /  /  u   k  l  i  i   t  .  n  e

00000010   e4 c0 38 7d  a7 03 9a 2d  6a f2 1a be  85 5c e8 11

key        90 ef 54 12  c6 67 ff 5f  45 90 78 90  f5 34 98 11

plain      74 2f 6c 6f  61 64 65 72  2f 62 62 2e  70 68 70 00

text       t  /  l  o   a  d  e  r   /  b  b  .   p  h  p
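The whole chain of guesses can be replayed in one short script. This is a sketch with invented names (`apply_guess`, `guesses`, `decrypt_url`), and it assumes the 16-byte repeating XOR key found above. Each guess is checked against the key bytes already known, so a bad guess raises an error instead of silently overwriting earlier work.

```python
KEY_LEN = 16

T1 = bytes.fromhex(
    "f8 9b 20 62 fc 48 d0 32 3c fc 17 f1 91 51 ea 3f "
    "f3 81 7b 7f aa 03 d0 3d 27 be 08 f8 85 34 44 87"
)
M1 = bytes.fromhex(
    "f8 9b 20 62 fc 48 d0 2a 2e fc 11 f9 81 1a f6 74 "
    "e4 c0 38 7d a7 03 9a 2d 6a f2 1a be 85 5c e8 11"
)


def apply_guess(key, cipher, offset, guess):
    """Derive key bytes from a plaintext guess; reject clashes with known bytes."""
    for i, p in enumerate(guess):
        slot = (offset + i) % KEY_LEN
        k = cipher[offset + i] ^ p
        if key[slot] is not None and key[slot] != k:
            raise ValueError(f"guess {guess!r} at {offset} contradicts key byte {slot}")
        key[slot] = k


# The guesses from the worksheets, in the order they were made.
guesses = [
    (T1, 0, b"http://"),
    (T1, 15, b"."),
    (M1, 13, b".net/loader"),
    (M1, 27, b".php"),
    (T1, 9, b"loader."),
    (T1, 25, b".php"),
    (T1, 24, b"b"),
]

key = [None] * KEY_LEN
for cipher, offset, guess in guesses:
    apply_guess(key, cipher, offset, guess)


def decrypt_url(cipher, key):
    """Decrypt with the repeating key and cut the string at the null terminator."""
    plain = bytes(c ^ key[i % KEY_LEN] for i, c in enumerate(cipher))
    return plain.split(b"\x00", 1)[0].decode("ascii")


print(bytes(key).hex(" "))   # 90 ef 54 12 c6 67 ff 5f 45 90 78 90 f5 34 98 11
print(decrypt_url(T1, key))  # http://myloader.cn/mld/bb.php
print(decrypt_url(M1, key))  # http://ukliit.net/loader/bb.php
```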

So even if the URLs are unknown, we can still decrypt them if bots with different URLs use the same key. In fact all of the pairs from this group {T1-M1, M1-M5, and T1-M5} can be solved without any really difficult guessing, and using all three makes it much easier. Even when it is not clear what text to fill in next, we can always try different guesses until we find the right one.
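As a quick check that the other members of this group really share the key, we can apply the recovered key to the first rows of T2 and M5 from the list near the top. Only the first 16 bytes of those values are shown in this article, so the sketch below decrypts just that much. The variable names are illustrative.

```python
# Key recovered from T1 and M1.
key = bytes.fromhex("90 ef 54 12 c6 67 ff 5f 45 90 78 90 f5 34 98 11")

# First rows of the remaining url0 values in this key group.
first_rows = {
    "T2": bytes.fromhex("f1 9b 20 62 fc 48 d0 3e 27 fc 1d f7 94 5a ff 3f"),
    "M5": bytes.fromhex("f8 9b 20 62 fc 48 d0 2b 2a fd 17 e2 87 46 ea 7e"),
}

decoded = {
    name: bytes(c ^ k for c, k in zip(row, key)).decode("ascii")
    for name, row in first_rows.items()
}

for name, text in decoded.items():
    print(name, text)
# T2 attp://ablegang.
# M5 http://tomorrrro
```

Both rows come out as readable text. The first character of T2 decrypts to “a” instead of “h”, which fits the single odd byte (f1) noted earlier.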

Of course the weaknesses in this encryption could have been avoided, or at least reduced. For example, not re-using keys would have helped. What we may be seeing here is evidence that, like many computer users, bot herders don’t take security as seriously as they should.

Author bio: Doug MacDonald has over eight years of experience in antivirus research and development. He holds an M.S. in Electronic Engineering.
