mardi 20 février 2018

Fun with function names. Where resolving goes wild.

Last night, I was looking through C-code and ARM assembly.
I was wondering myself: When a binary calls a function inside a shared lib, how the linker knows where the code resides in the library?
The second question was: can we change the name of functions in a binary and in a library and that everything works after?
And third: can we use some fancy characters in function names? Like changing color of the xterm when doing an objdump or gdb the binary? You know ANSI escape codes? What if we put ANSI escape code in function name?

1/ Start smoothly

My computer is currently a raspberry pi. Everything here has been tested under this architecture. It should work everywhere, but your mileage may vary.

Let's take an example:

mitsurugi@raspi:~/resolv_func/$ cat libpoem.h
int this_is_an_external_func_in_a_lib();
mitsurugi@raspi:~/resolv_func/$ cat libpoem.c
/* Compile with gcc -shared -o libpoem.so libpoem.c */
#include <stdio.h>

int this_is_an_external_func_in_a_lib() {
 puts("ARM disassembly");          //5
 puts("Reading symbol resolving"); //7
 puts("In the cold of night");     //5
 return 42;
}
mitsurugi@raspi:~/resolv_func/$ cat proj.c
/* gcc -o proj -Wl,-rpath=. -L. -I. -l poem proj.c */
#include "libpoem.h"

int main() {
 int ret;
 ret=this_is_an_external_func_in_a_lib();
 return ret;
}
mitsurugi@raspi:~/resolv_func/$

We can compile and run this binary:

mitsurugi@raspi:~/resolv_func/blog$ gcc -shared -o libpoem.so libpoem.c
mitsurugi@raspi:~/resolv_func/blog$ gcc -o proj -Wl,-rpath=. -L. -I. -l poem proj.c
mitsurugi@raspi:~/resolv_func/blog$ ldd proj
 linux-vdso.so.1 (0x7efd9000)
 libpoem.so => ./libpoem.so (0x76f33000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76e30000)
 /lib/ld-linux-armhf.so.3 (0x76f57000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj 
ARM disassembly
Reading symbol resolving
In the cold of night
mitsurugi@raspi:~/resolv_func/blog$

The dynamic linker search for the lib in the current path (which is not really secure, but out of the scope of this blogpost).

This binary runs fine, as expected.

2/ Symbol resolution

The question is, how does the binary knows where to look for the this_is_an_external_func_in_a_lib() call? It's obviously related to string comparison:


mitsurugi@raspi:~/resolv_func/blog$ strings proj libpoem.so | grep external
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Well, if we have the string this_is_an_external_func_in_a_lib in the binary and the library, maybe because they are associated?

Proof: if you alter one of these strings, the program doesn't work anymore:

mitsurugi@raspi:~/resolv_func/blog$ sed s/this_is_an_external_func_in_a_lib/AAAA_is_an_external_func_in_a_lib/g proj > proj2
mitsurugi@raspi:~/resolv_func/blog$ chmod +x proj2
mitsurugi@raspi:~/resolv_func/blog$ ldd proj2
 linux-vdso.so.1 (0x7ed45000)
 libpoem.so => ./libpoem.so (0x76eed000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76dea000)
 /lib/ld-linux-armhf.so.3 (0x76f11000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj2
./proj2: symbol lookup error: ./proj2: undefined symbol: AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

The same happens if you change the function in the library:

mitsurugi@raspi:~/resolv_func/blog$ mv libpoem.so libpoem.so.ori
mitsurugi@raspi:~/resolv_func/blog$ sed s/this_is_an_external_func_in_a_lib/AAAA_is_an_external_func_in_a_lib/g libpoem.so.ori > libpoem.so
mitsurugi@raspi:~/resolv_func/blog$ chmod +x libpoem.so
mitsurugi@raspi:~/resolv_func/blog$ ldd proj                //this is the unaltered binary
 linux-vdso.so.1 (0x7ea81000)
 libpoem.so => ./libpoem.so (0x76ede000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76ddb000)
 /lib/ld-linux-armhf.so.3 (0x76f02000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj 
./proj: symbol lookup error: ./proj: undefined symbol: this_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Seems logic. it search a function by its name.

But wait, what if we change names in BOTH? Would it work? Binary calls for AAAA_is_an_external_func_in_a_lib(), linker will step through all library linked, find libpoem.so, open it, read functions names, fint it and call it. Does it works?

mitsurugi@raspi:~/resolv_func/blog$ ./proj2 
./proj2: symbol lookup error: ./proj2: undefined symbol: AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Still a fail, although we have the same name in library and binary:

mitsurugi@raspi:~/resolv_func/blog$ strings proj2 libpoem.so | grep external_func
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

3/ Read The Freaky Manual (If it exists...)

When you search something, you can read the manual. But in that case, it won't help because there is no manual.
When you google for symbol resolution, you'll end up with a lot of blog post talking about PLT/GOT stuff. Very interesting (yes, read them, it's very valuable), but there is still magic in those blogposts. (In french: https://www.segmentationfault.fr/linux/role-plt-got-ld-so/ ).

And how those blog posts explains how resolution is made?


In the previous blogspot, it just says: "it's a long and complicated code, but in the end, you get the address". I don't like magic in computing.

4/ No magic. Just show me. 

Here are the main links which could help you:
 I'll try to summarize things. First, we have a hash section in ELF files:

mitsurugi@raspi:~/resolv_func/blog$ readelf -x .gnu.hash libpoem.so

Hex dump of section '.gnu.hash':
  0x00000118 03000000 08000000 02000000 06000000 ................
  0x00000128 890020b1 00c44289 08000000 0c000000 .. ...B.........
  0x00000138 0f000000 00af34e8 4245d5ec dea1eacc ......4.BE......
  0x00000148 bbe3927c beda571b d871581c b98df10e ...|..W..qX.....
  0x00000158 76543c94 ead3ef0e 59ef9779          vT<.....Y..y

mitsurugi@raspi:~/resolv_func/blog$

This sections contains a header, bloom filters, and hashes. Libc developers wants to run binary fast. When you solve symbols, you have to step through each symbols and make a strcmp. This is slow. Developers add lots of improvements.

I wrote a parser of .gnu.hash sections (values are displayed both in little and big endian):

mitsurugi@raspi:~/resolv_func/blog$ ./hashparse.py libpoem.so
*** Get GNU HASH section for libpoem.so
[+] Ok, one line. Good
[+] GNU HASH mapping fits perfectly disk and memory layout
    starting at 0x00000118
    and size is 0x00004c long
*** Extracting .gnu.hash
*** Parsing...
[+] Header
3 hash buckets              //we'll use this number later
8 symndx
2 bloom masks
6 bloomshift (minimum 6)
[+] Part 2 - bloom masks
 Mask 0 : 0xb1200089L  | 0x890020b1L
 Mask 1 : 0x8942c400L  | 0xc44289
[+] Part 3 - N Buckets of hash
 Bucket 0 : 0x8  | 0x8000000
 Bucket 1 : 0xc  | 0xc000000
 Bucket 2 : 0xf  | 0xf000000
[+] Part 4 - Hashes
 Hash 0 : 0xe834af00L  | 0xaf34e8
 Hash 1 : 0xecd54542L  | 0x4245d5ec
 Hash 2 : 0xcceaa1deL  | 0xdea1eaccL      //pay attention to this hash
 Hash 3 : 0x7c92e3bb  | 0xbbe3927cL
 Hash 4 : 0x1b57dabe  | 0xbeda571bL
 Hash 5 : 0x1c5871d8  | 0xd871581cL
 Hash 6 : 0xef18db9  | 0xb98df10eL
 Hash 7 : 0x943c5476L  | 0x76543c94
 Hash 8 : 0xeefd3ea  | 0xead3ef0eL
 Hash 9 : 0x7997ef59  | 0x59ef9779
mitsurugi@raspi:~/resolv_func/blog$

4/1/ First speedup: Hash table.

 For quickly find object in a list, use hashtable. Hashtable are a convenient way to sort and find items in a list. The hash function used in the resolver is the djbx33a one:

static uint_fast32_t
dl_new_hash (const char *s)
{
  uint_fast32_t h = 5381;
  for (unsigned char c = *s; c != '\0'; c = *++s)
    h = h * 33 + c;
  return h & 0xffffffff;
}

We can calculate easily the hash of our function:

mitsurugi@raspi:~/resolv_func/blog$ ./dl_new_hash.py this_is_an_external_func_in_a_lib
[+] Calculating hash for this_is_an_external_func_in_a_lib
Output is 0xCCEAA1DF
mitsurugi@raspi:~/resolv_func/blog$

We can find our hash in the .gnu.hash section: 0xcceaa1de (minus the lower bit, but it's nonsignificant when solver compares hashes, although I spent too much time on this detail).

So, if you change the name of the function and its associated hash, it should work? No, not so easily. This is an hash table, you have to get the same bucket. Long story short, your (new_hash % nbuckets) should be equal to (old_hash % nbuckets). nbuckets equals 3 in this library. Let's work with this number:
  • this_is_an_external_func_in_a_lib : hash(func)%3 = 0xCCEAA1DF%3 = 0
  • AAAA_is_an_external_func_in_a_lib : hash(func)%3 = 0xEEA9C6CB%3 = 1 -> Not the same bucket, won't work
  • BAAA_is_an_external_func_in_a_lib : hash(func)%3 = 0xFFE18ACC%3 = 0 -> Good.

So, we change name of the functions, and change hash with 0xFFE18ACC. Will it work? Still not, one last change to do.

4/2/ Second speedup added: Bloom filter

Using hashes is a big speedup, but libc maintenairs adds another big boost: bloom filter. The goal of this is to quickly reject unknown symbols. This bloom filter is made of another hash, and is used as a fast rejection process. If bloom filter fails, the symbol is not in the file. If bloom filter pass, it maybe or maybe not in the file. Apparently, this causes a huge speedup in symbol resolution. That's clever, but I have to change my fnction name.

If you want to bypass this bloom filter, you can recalculate it. Or you can put all bits to 1 which means: always pass. I'm not a programmer, I want things to work the way I want. So let put all bits to 1, and don't try to recalculate anything.

And after the bloom filter change, it will works, because the linker will say:
  • does it pass the bloom filter? Yes
  • does it have an hash? Yes
  • -n the hash bucket, does a function with the same name exists? Yes
  • --> Symbol resolution is done, code is here, work your way.

4/3/ First win:

We have to change function name: Easy, we use BAAA_is_an_external_func_in_a_lib
We have to break bloom filter: Easy, put all bits to 1
We have to change hash value: Easy, just take care of the bucket.

After an hexediting (All bytes have been changed by hand):


mitsurugi@raspi:~/resolv_func/blog$ readelf -x .gnu.hash libpoem.so

Hex dump of section '.gnu.hash':
  0x00000118 03000000 08000000 02000000 06000000 ................
  0x00000128 ffffffff ffffffff 08000000 0c000000 ................
  0x00000138 0f000000 00af34e8 4245d5ec cc8ae1ff ......4.BE......
  0x00000148 bbe3927c beda571b d871581c b98df10e ...|..W..qX.....
  0x00000158 76543c94 ead3ef0e 59ef9779          vT<.....Y..y

mitsurugi@raspi:~/resolv_func/blog$

Look bloom filter (all bits are 1), and hash change.
And now, it works like a charm!


mitsurugi@raspi:~/resolv_func/blog$ ./proj2
ARM disassembly
Reading symbol resolving
In the cold of night
mitsurugi@raspi:~/resolv_func/blog$ gdb -q proj2 
Reading symbols from proj2...(no debugging symbols found)...done.
gdb$ disass main
Dump of assembler code for function main:
   0x000006ac <+0>: push {r7, lr}
   0x000006ae <+2>: sub sp, #8
   0x000006b0 <+4>: add r7, sp, #0
   0x000006b2 <+6>: blx 0x584 <BAAA_is_an_external_func_in_a_lib@plt>
   0x000006b6 <+10>: str r0, [r7, #4]
   0x000006b8 <+12>: ldr r3, [r7, #4]
   0x000006ba <+14>: mov r0, r3
   0x000006bc <+16>: adds r7, #8
   0x000006be <+18>: mov sp, r7
   0x000006c0 <+20>: pop {r7, pc}
End of assembler dump.
gdb$

As you can see, I'm calling the function BAAA_is_an_external_func_in_a_lib(), and it works.

mitsurugi@raspi:~/resolv_func/blog$ strings proj2 libpoem.so | grep BAAA
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

We know how to change a function name inside a binary and its lib without breaking anything!

5/ Now the fun part!

Ok, let's write a quick python patcher, called, patch.py
You can use anything in the range \x01-\xff for function name. Changing a character in a function is not fun. We can be good boyz (or girlz) and use internationalization. Write UTF-8, and be happy with it. But do you know that your xterm interprets escape sequence? \e]34; will print everything in black. Let write black on black and confuse reversers.

5/1/ Fun with ANSI escape code

 we can use a function containing ansi escape code. Ansi escape code can be used to send BEEP, blink characters, change xterm name, change colors, and so on. Here is the fun part, where we change the xterm title when printing the function:

Fun, but can we do better? Ansi escape code can go backward.
So, we can overwrite function name:

Reading symbols from crack...(no debugging symbols found)...done.
(gdb) disass main
Dump of assembler code for function main:
   0x00000688 <+0>: push {r7, lr}
   0x0000068a <+2>: add r7, sp, #0
   0x0000068c <+4>: blx 0x53c <calling@plt>
   0x00000690 <+8>: movs r3, #0
   0x00000692 <+10>: mov r0, r3
   0x00000694 <+12>: pop {r7, pc}
End of assembler dump.
(gdb) q
mitsurugi@raspi:~/resolv_func/blog$

and, the library says:

mitsurugi@raspi:~/resolv_func/blog$ nm libcrack.so | grep ' T '
000004fc T calling
0000050a T calling
00000518 T calling
00000528 T _fini
000003fc T _init
mitsurugi@raspi:~/resolv_func/blog$

Three functions with the same name?!?! Which one is the good one? You can spend a lot of time in this crackme with static analysis only.

Those fuctions are different. Their name is A\x1b[1Dcalling, B\x1b[1Dcalling and C\x1b[1Dcalling. The \x1b[1D is the sequence backward of 1 char, so it overwrites the first char.

5/2/ Fun with IDA

You can play with IDA. IDA doesn't recognize characters and replace them with _. How in the world would you debug a binary
calling functions ____() and ____() and ____() which are different?
I think there is a lot of improvements here, I'll try to make another blogpost with funny sequences.


6/ The End

I think this blogpost is waaaaay too long, so I'll finish it here. Code will be posted to github, it's just a python script which patch address in binary.

Today not possible. Tomorrow possible.
0xMitsurugi

mardi 30 janvier 2018

Solving a CTF chall the [academic|hardest] way (FIC2018)

My previous articles on solving a crackme has gained some attention, so I'm doing the next one (and the last, I promise). This time, I'll explain how to solve a crackme based on a VM. There is a lot more of asm than previous solutions :-)

  1. http://0x90909090.blogspot.fr/2018/01/solving-ctf-chall-easylazy-way-fic2018.html
  2. http://0x90909090.blogspot.fr/2018/01/solving-ctf-chall-hardgood-way-fic2018.html
  3. http://0x90909090.blogspot.fr/2018/01/solving-ctf-chall-crazyomg-way-fic2018.html
This is a more academic blogpost where I'll try to explain how to understand the logic behind the VM and the crackme.

1/ A bit of technic first.


Basically, when you implement a VM, you have to create a virtual CPU. This virtual CPU will have its own registers, memory, CPU flags. This virtual CPU will fetch, decode and execute instructions. Instructions are sequence of bits (for simplification, imagine a byte), and instructions can take 0 to N arguments.
if in pseudo code we want to make 13 xor 37, we can imagine this sequence instructions:
  • PUT 13 in register (say, R1)
  • PUT 37 in another register (say, R2)
  • XOR R1 with R2
this is just encoding after it. If PUT is encoding with a 0x42, register by their numbers, and XOR is encoded as a 0xff, the logical sequence will be:
  • 0x42 0x13 0x1
  • 0x42 0x37 0x2
  • 0xff 0x1 0x2
Easy. That's just conventions. The program is: 0x421301423702ff0102

And the CPU will work with this . Instruction pointer is at offset 0x00. 
  • Fetch: 0x42
  • Decode: that's a push. It takes 2 arguments: value, then register. Increase instruction pointer by 3..
  • Execute: moving value at (instruction pointer +1) to register (instruction pointer +2).

Next:
  • Fetch 0x42
  • and so on..

So if you want to break a VM, you have to learn where the instruction pointer is, where the registers are stored, and how to decode assembly. You have to figure out that 0x42 is a push in the previous example. How? That's the difficulty.

Now, back on our crackme. This is a VM. So, we have the program which emulate a CPU. So, we have to find a big loop: the fetch-decode-execute loop. Once found, you'll know where the instruction pointer is, and where are the instructions.

Next, you'll have to understand the instructions. Once done, this is even more easy: understand program logic, break it, solve the chall, gain points.

2/ Find where things takes place

take time to read the assembly, and follow the dots 1,2,3..


mitsurugi@dojo:~/chall/FIC2018/v24$ gdb -q a.out
Reading symbols from a.out...(no debugging symbols found)...done.
gdb$ disass main
Dump of assembler code for function main:
   0x0000000000400530 <+0>: push   %rbx
   0x0000000000400531 <+1>: mov    $0x1000,%edi
   0x0000000000400536 <+6>: callq  0x400510 <malloc@plt>
   0x000000000040053b <+11>: or     $0xffffffff,%edx
   0x000000000040053e <+14>: test   %rax,%rax
   0x0000000000400541 <+17>: mov    %rax,0x2023d0(%rip)        # 0x602918 <stack>
   0x0000000000400548 <+24>: je     0x40059a <main+106>

//Here is a big loop. The fetch-decode-execute one, probably.
//We read something at 0x602914 (regs+20)
1.   0x000000000040054a <+26>: movslq 0x2023c3(%rip),%rax        # 0x602914 <regs+20>
     0x0000000000400551 <+33>: cmpb   $0xee,0x601440(%rax)       //if equals to 0xee goto end
     0x0000000000400558 <+40>: je     0x40058c <main+92>
3.   0x000000000040055a <+42>: mov    $0x602800,%ebx             //ebx will get increments from 0x10 to 0x10
     0x000000000040055f <+47>: mov    (%rbx),%edx                
     0x0000000000400561 <+49>: movslq 0x2023ac(%rip),%rax        # 0x602914 <regs+20>
     0x0000000000400568 <+56>: test   %edx,%edx  
     0x000000000040056a <+58>: je     0x400582 <main+82>         
4.   0x000000000040056c <+60>: movzbl 0x601440(%rax),%eax        //we fetch the byte @0x601440+rax
     0x0000000000400573 <+67>: cmp    %edx,%eax                  //if eax==edx we call something. That's decode part.
     0x0000000000400575 <+69>: jne    0x40057c <main+76>
     0x0000000000400577 <+71>: xor    %eax,%eax
5.   0x0000000000400579 <+73>: callq  *0x8(%rbx)                 //the call. Probably the execute part.
     0x000000000040057c <+76>: add    $0x10,%rbx
     0x0000000000400580 <+80>: jmp    0x40055f <main+47>         
     0x0000000000400582 <+82>: inc    %eax                       
//The regs+20 gets increased one by one -> so we step in the VM code probably.
2.   0x0000000000400584 <+84>: mov    %eax,0x20238a(%rip)        # 0x602914 <regs+20> 
     0x000000000040058a <+90>: jmp    0x40054a <main+26>

//from here, this is the end of the program
   0x000000000040058c <+92>: mov    0x202385(%rip),%rdi        # 0x602918 <stack>
   0x0000000000400593 <+99>: callq  0x4004c0 <free@plt>
   0x0000000000400598 <+104>: xor    %edx,%edx
   0x000000000040059a <+106>: mov    %edx,%eax
   0x000000000040059c <+108>: pop    %rbx
   0x000000000040059d <+109>: retq   
End of assembler dump.
gdb$ 

We almost understand how this VM works.

  • The instruction pointer is at regs+20 (0x602914), we fetch the instruction at 0x601440+the value in regs+20.
  • The byte is read, then compared to something on 0x602800, 0x602810, 0x602820 and so on. We say this is the decode part.
  • Then, the callq rbx+0x8 is the execute part.

Fetch, decode, execute.

We know how the virtual CPU works. Lets dive into details. First, what do we have around the instruction pointer:


gdb$ x/16wx 0x601440
0x601440 <g_data>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601450 <g_data+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601460 <g_data+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601470 <g_data+48>: 0x00000000 0x00000000 0x00000000 0x00000000
gdb$

We have a lot of 00 (a NOP maybe?). What next?
gdb$ x/160wx 0x601440
0x601440 <g_data>: 0x00000000 0x00000000 0x00000000 0x00000000
 (snip ... snip ...snip)
0x601570 <g_data+304>: 0x00000000 0x0001e155 0x0c0d0300 0x0000e255
0x601580 <g_data+320>: 0x0cf20000 0x0000bb33 0xddf20000 0xcc1300bb
0x601590 <g_data+336>: 0x000000bb 0xbbdd0100 0xbbcc3700 0x00000000
0x6015a0 <g_data+352>: 0x00bbdd01 0x00bbccd3 0x01000000 0x3d00bbdd
0x6015b0 <g_data+368>: 0x0000bbcc 0xdd010000 0xccc000bb 0x000000bb
0x6015c0 <g_data+384>: 0xbbdd0100 0xbbccde00 0x00000000 0x00bbdd01
0x6015d0 <g_data+400>: 0x00bbccab 0x01000000 0xad00bbdd 0x0000bbcc
0x6015e0 <g_data+416>: 0xdd010000 0xcc1d00bb 0x000000bb 0xbbdd0100
0x6015f0 <g_data+432>: 0xbbccea00 0x00000000 0x00bbdd01 0x00bbcc13
0x601600 <g_data+448>: 0x01000000 0x3700bbdd 0x0000bbcc 0xaa010000
0x601610 <g_data+464>: 0x000000bb 0xbb33f200 0x00000001 0x02bbaa7a
0x601620 <g_data+480>: 0xf3000000 0x0003bb33 0x66600000 0x990100ab
0x601630 <g_data+496>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x601640 <g_data+512>: 0x000000bb 0xbb33f400 0x00000001 0x02bbaab2
0x601650 <g_data+528>: 0xf5000000 0x0003bb33 0x664e0000 0x990100ab
0x601660 <g_data+544>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x601670 <g_data+560>: 0x000000bb 0xbb33f600 0x00000001 0x02bbaab4
0x601680 <g_data+576>: 0xf7000000 0x0003bb33 0x66bb0000 0x990100ab
0x601690 <g_data+592>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x6016a0 <g_data+608>: 0x000000bb 0xbb33f800 0x00000001 0x02bbaae6
0x6016b0 <g_data+624>: 0xf9000000 0x0003bb33 0x66d40000 0x990100ab

The first non-zero byte is 0x55. This is probably the beginning of the code.
Now the decode part, if we look at what we have in 0x602800:
gdb$ x/20wx 0x602800
0x602800 <vm_func>: 0x00000011 0x00000000 0x00400696 0x00000000
0x602810 <vm_func+16>: 0x00000099 0x00000000 0x00400a40 0x00000000
0x602820 <vm_func+32>: 0x00000022 0x00000000 0x00400798 0x00000000
0x602830 <vm_func+48>: 0x00000033 0x00000000 0x00400697 0x00000000
0x602840 <vm_func+64>: 0x00000044 0x00000000 0x00400799 0x00000000
gdb$ x/x 0x00400696                  //What is there?
0x400696 <vm_ret>: 0x77058bc3   // oooh, the beginning of vm_ret :)
gdb$ x/x 0x00400a40
0x400a40 <vm_jnz>: 0x1ece058b   //and vm_jnz, and the others ^_^
gdb$

Ok. So the program reads a byte in the g_data part. Then it calls a function depending on this byte.

That's really, really a good point. We have a byte, and a function. Doesn't take long to understand that this the assembly:

  • 0x11 is vm_ret  RETURN
  • 0x99 is vm_jnz  JUMP if NON ZERO
  • 0x22 is vm_cll  CALL
  • 0x33 is vm_mov  MOVE
  • 0x44 is vm_push PUSH
  • 0x55 is vm_ecl  ??
  • 0x66 is vm_cmp  COMPARE
  • 0x77 is vm_jmp  JUMP
  • 0x88 is vm_jzz  JUMP if ZERO
  • 0xaa is vm_mvp  ?? move? pointer maybe? 
  • 0xbb is vm_and  AND
  • 0xcc is vm_add  ADD
  • 0xdd is vm_xor  XOR
  • 0x00 is NOP (we guessed it)
  • 0xee is END (we guessed it also)

Ladies and gentlemen, the asm of the VM.

3/ Let see what happens

So, the first byte is vm_ecl. In order to quickly run the binary, we break at 0x000000000040054a only if $rax!=0


gdb$ b * 0x0000000000400573 if $rax!=0
Breakpoint 3 at 0x400573
gdb$ c
Continuing.
gdb$ info reg rax
rax            0x55 0x55
gdb$ disass vm_ecl
Dump of assembler code for function vm_ecl:
   0x0000000000400d28 <+0>: mov    eax,DWORD PTR [rip+0x201be6]        # 0x602914 <regs+20>
   0x0000000000400d2e <+6>: lea    edx,[rax+0x1]
   0x0000000000400d31 <+9>: mov    DWORD PTR [rip+0x201bdd],edx        # 0x602914 <regs+20>
   0x0000000000400d37 <+15>: movsxd rdx,edx
   0x0000000000400d3a <+18>: mov    dl,BYTE PTR [rdx+0x601440]
   0x0000000000400d40 <+24>: cmp    dl,0xe1
   0x0000000000400d43 <+27>: je     0x400d6f <vm_ecl+71>
   0x0000000000400d45 <+29>: cmp    dl,0xe2
   0x0000000000400d48 <+32>: je     0x400de2 <vm_ecl+186>
   0x0000000000400d4e <+38>: cmp    dl,0xe0
   0x0000000000400d51 <+41>: jne    0x400e55 <vm_ecl+301>
(...)
   0x0000000000400d6a <+66>: call   0x400520 <exit@plt>
(...)
   0x0000000000400ddd <+181>: jmp    0x4004d0 <write@plt>
(...)
   0x0000000000400e50 <+296>: jmp    0x4004e0 <read@plt>

Well, a switch case. If next byte is 0xe1 0xe2 or 0xe0, this function behaves differently. We have read, write and exit function in it. That should be for input/output. Let's step over for the moment, and see what's happening:
gdb$ stepo
Temporary breakpoint 4 at 0x40057c
ENTER PASS :
gdb$

That's it. Let's go back to the VM disassembly a bit. We had:

0x601570 <g_data+304>: 0x00000000 0x0001e155 0x0c0d0300 0x0000e255
0x601580 <g_data+320>: 0x0cf20000 0x0000bb33 0xddf20000 0xcc1300bb

Put in right order: 55 e1 01 00 00 03 0d 0c 55 e2 00 00 f2 cf 33 bb... 55 is I/O, e1 seems to be output, numbers after are unknown (adress of the string probably), and next instructions should be 55 e2 (waiting for input). Let see the next instruction:

gdb$ info reg rax
rax            0x0c 0x0c
gdb$

Next instructions is 0x0c ?? As if the instruction pointer missed a step (?).
In our case, that's not really important because 0xc is not a valid instruction, so it will loop around all vm_functions, the iterate, then read 0x55. Let continue, stepover the vm_ecl function:

gdb$ stepo
Temporary breakpoint 5 at 0x40057c
ABCDEFABCDEF                       //entered myself
0x0000000000400580 in main ()
gdb$
gdb$ x/x 0x602914
0x602914 <regs+20>: 0x00000143    //offset of the instruction pointer
gdb$ x/wx 0x601440+0x143
0x601583 <g_data+323>: 0x00bb330c    //the instruction pointer
gdb$

and once again, the 0x0c invalid instruction. vm_ecl doesn't increment the instruction pointer to the next instruction. The VM is built on a way that it doesn't matter, as long as the instruction is invalid... This is kind of a bug!

Let's fast forward a bit, until a 0xdd instruction (XOR):

Now, a bit of refactoring, this is just the VM assembly extracted from g_data:
0xdd 0xbb 0x00 0x13                //vm_xor
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01 //vm_add
0xdd 0xbb 0x00 0x37                //vm_xor
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01 //vm_add
0xdd 0xbb 0x00 0xd3
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01
0xdd 0xbb 0x00 0x3d
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01
0xdd 0xbb 0x00 0xc0
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01

Seeing a pattern? 0xbb shoud be an offset to somewhere, XOR is the key, and we slide this offset one by one, in pseudo code, it becomes:

xor(pass[i], 0x13)
i = i+1
xor(pass[i], 0x37)
i = i+1
etc...

We extract the key: 0x1337d33dc0 just by reading the vm assembly.

And what about the instruction pointer? Does it point to the right instruction after a vm_xor?

gdb$ info reg rax
rax            0xcc 0xcc
gdb$

yep, it works, so the vm_xor instruction advance the instruction pointer right.

The next steps are to understand other vm_XXX instructions, where data is stored, what is done with it, and so on.

Just step through the function and mark all known addresses (base address, offsets, registers, CPU flags, and you'll quickly be able to reverse any VM code. Follow the vm_cmp instruction, learn where are the offsets, and compare yourself the bytes.

4/ Why the crackme accepts more than one solution?

As we saw, sometimes, the instruction pointer is not incremented to the next instruction. If the instruction is illegal, nothing happen. But if the instruction pointer falls on a known instruction, a different behavior is done.
0xF4b found that the vm_mov is also buggy, and an 0xbb instruction is called (vm_and) instead of the vm_cmp, and the JNZ is never called afterwards.

5/ Conclusion

Thank you for scrolling this far ;-) Learn to pwn crackme.
If you want the a.out file to play with it, drop me a DM or email.

Those who want to do will find a way. 
Those who don't want to do search an excuse.
0xMitsurugi

dimanche 28 janvier 2018

Solving a CTF chall the [crazy|OMG] way (FIC2018)

This is the third blogpost about the same crackme, you can read the first two ones here:
This time, an extremely simple solution. You thought that pin was almost cheating? Get prepared to see worse.

1/ Basic recon

mitsurugi@dojo:~/chall/FIC2018/v24$ ls -l a.out 
-rwxr-xr-x 1 mitsurugi mitsurugi 15568 janv. 24 14:30 a.out
mitsurugi@dojo:~/chall/FIC2018/v24$ strings a.out 
/lib64/ld-linux-x86-64.so.2
libc.so.6
exit
read
malloc
__libc_start_main
write
free
__gmon_start__
GLIBC_2.2.5
tPHc
fffff.
[]A\A]A^A_
;*3$"
FAILEDnWINnENTER PASS :              //look this line
 (... snip snip snip ...)
mitsurugi@dojo:~/chall/FIC2018/v24$

So, we imagine that FAILED means fail and WIN is the winning message, right?

That's all we need to know! 
Yes. gdb? nope. asm? nope. reversing capabilities? nope. Lazinest? A lot.

2/ Hey, you like surprises and python?

you know angr, right? If not, check this awesome program. It can explore binaries, instrument them, modify them on the fly, explore all paths, and all by itself!

It blews my minds me on this:

#! /usr/bin/env python
# You are not judged because you fall. 
# You are judged by the way you get up after a fall.
#                          0xMitsurugi

import angr, datetime

print "Starting"
start = datetime.datetime.now()
#Loading the binary
proj = angr.Project('./a.out')
#Create a simulation manager
simgr = proj.factory.simgr()
#We search the word WIN somewhere in the file descriptor 1 (standard output)
simgr.explore(find=lambda s: "WIN" in s.posix.dumps(1))

#angr works hard here ...

#Let's see which input produce a 'WIN' in output
s = simgr.found[0]
# file descriptor 0 is standard input
flag = s.posix.dumps(0)

print "The flag is: %s " % flag
#At least we know this challenge is buggy..
print "\nWe know this challenge is bugged :("
print "Flag in hex is: %s" % (flag.encode('hex'))
end = datetime.datetime.now()
print "Time used: %s" % (end - start)

Basically we tell angr to open the binary, and explore it (like fuzzing, but better :) ) until it found the word "WIN" in the standard output. And then, we print the standard input which generates this output. Sounds crazy?

And, as you guess, in only 7 minutes, without any prior knowledge:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./solver.py 
Starting
The flag is: iWaseMyTime 

We know this challenge is bugged :(
Flag in hex is: 6957617300654d7954696d65
Time used: 0:07:22.198219
mitsurugi@dojo:~/chall/FIC2018/v24$

Without the bug, angr would have found the good flag, but it remains impressive: the angr solution works. The only thing to know is that the binary prints WIN for victory, which could be found with a strings command..

Ready? Prepare yourself!
0xMitsurugi

Solving a CTF chall the [hard|good] way (FIC2018)

Hope you liked my last blogpost http://0x90909090.blogspot.fr/2018/01/solving-ctf-chall-easylazy-way-fic2018.html , this is the same binary, with a different analysis.

This crackme is simple enough to use it for learning purpose. I'm taking it for another round of reversing. This time, it's gdb-fu! In the CTF, this was my first approach, but pin was faster :-)

1/ Basics

We remember the function name, vm_xor and vm_cmp. Other function looks more like standard operation (JNZ, JMP, call, and so on).
The binary is not stripped, and we see a variable called 'regs'. We guess that's the registers of the VM. While running under gdb, we can print their contents with x/8wx &regs


2/ XOR part

In gdb, we have breakpoints, and we can execute commands at each breakpoints. We'll break at vm_xor, and see what's happening. I don't use any gdbinit scripts because it's a VM, and my gdbscripts are meant to be used in conjunction with known CPUs (ARM/Intel). The goal here is to count how many times the vm_xor function is called, and see if we can gain some insights of what is going on:

mitsurugi@dojo:~/chall/FIC2018/v24$ gdb -q -nx ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) b * vm_xor 
Breakpoint 1 at 0x400c9e
(gdb) commands 
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>echo vm_xor is called\n
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :first try

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
FAILEDn[Inferior 1 (process 4165) exited normally]
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :1

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
FAILEDn[Inferior 1 (process 4169) exited normally]
(gdb) 

Ok, so we understand that vm_xor is called 12 times, whichever the size of the PASS is.

We hope now that vm_xor will works like the XOR: it takes two args from registers. Let's inspect this:

(gdb) commands 1
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>x/8wx &regs
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456ABCDEF

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f2 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000014b 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f3 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000156 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f4 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000161 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f5 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000016c 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f6 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000177 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f7 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000182 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f8 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000018d 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f9 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000198 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fa 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001a3 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fb 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001ae 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fc 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001b9 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fd 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001c4 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4291) exited normally]
(gdb)

Ok, so we don't see any of our PASS in register. The first one seems to progress one by one, and the 6th one makes progression. It could be relative address, or offset, or anything. Let dive into the vm_xor function. It must have an XOR operation, and it could be interesting to see the operands of the command:

(gdb) disassemble vm_xor 
Dump of assembler code for function vm_xor:
   0x0000000000400c9e <+0>: mov    0x201c70(%rip),%eax        # 0x602914 <regs+20>
   0x0000000000400ca4 <+6>: lea    0x1(%rax),%edx
   0x0000000000400ca7 <+9>: mov    %edx,0x201c67(%rip)        # 0x602914 <regs+20>
   0x0000000000400cad <+15>: movslq %edx,%rdx
   0x0000000000400cb0 <+18>: mov    0x601440(%rdx),%dl
   0x0000000000400cb6 <+24>: cmp    $0xab,%dl
   0x0000000000400cb9 <+27>: jne    0x400cf0 <vm_xor+82>
   0x0000000000400cbb <+29>: lea    0x2(%rax),%edx
   0x0000000000400cbe <+32>: add    $0x3,%eax
   0x0000000000400cc1 <+35>: mov    %eax,0x201c4d(%rip)        # 0x602914 <regs+20>
   0x0000000000400cc7 <+41>: cltq   
   0x0000000000400cc9 <+43>: movslq %edx,%rdx
   0x0000000000400ccc <+46>: movzbl 0x601440(%rax),%eax
   0x0000000000400cd3 <+53>: movzbl 0x601440(%rdx),%edx
   0x0000000000400cda <+60>: mov    0x602900(,%rax,4),%eax
   0x0000000000400ce1 <+67>: movslq 0x602900(,%rdx,4),%rdx
   0x0000000000400ce9 <+75>: xor    %al,0x601440(%rdx)         //HERE
   0x0000000000400cef <+81>: retq   
   0x0000000000400cf0 <+82>: cmp    $0xbb,%dl
   0x0000000000400cf3 <+85>: jne    0x400d27 <vm_xor+137>
   0x0000000000400cf5 <+87>: lea    0x2(%rax),%edx
   0x0000000000400cf8 <+90>: add    $0x3,%eax
   0x0000000000400cfb <+93>: mov    %eax,0x201c13(%rip)        # 0x602914 <regs+20>
   0x0000000000400d01 <+99>: cltq   
   0x0000000000400d03 <+101>: movslq %edx,%rdx
   0x0000000000400d06 <+104>: movzbl 0x601440(%rdx),%edx
   0x0000000000400d0d <+111>: movslq 0x602900(,%rdx,4),%rcx
   0x0000000000400d15 <+119>: mov    0x601440(%rcx),%dl
   0x0000000000400d1b <+125>: xor    0x601440(%rax),%dl         //and HERE 
   0x0000000000400d21 <+131>: mov    %dl,0x601440(%rcx)
   0x0000000000400d27 <+137>: retq   
End of assembler dump.
(gdb)

Easy, disable breakpoint 1, and create two more:

(gdb) disable 1
(gdb) b * 0x0000000000400ce9
Breakpoint 2 at 0x400ce9
(gdb) commands
Type commands for breakpoint(s) 2, one per line.
End with a line saying just "end".
>echo first XOR in vm_xor\n
>info reg al
>x/x 0x601440+$rdx
>c
>end
(gdb) b * 0x0000000000400d1b
Breakpoint 3 at 0x400d1b
(gdb) commands
Type commands for breakpoint(s) 3, one per line.
End with a line saying just "end".
>echo second XOR in vm_func\n
>x/x 0x601440+$rax
>info reg dl
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :ABCDEF123456

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x60158e <g_data+334>: 0x13
dl             0x41 65                    //seems our key

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x601599 <g_data+345>: 0x37
dl             0x42 66                    //Yup it is

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015a4 <g_data+356>: 0xd3                  //So this one is the xor key
dl             0x43 67

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015af <g_data+367>: 0x3d
dl             0x44 68

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015ba <g_data+378>: 0xc0
dl             0x45 69

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015c5 <g_data+389>: 0xde
dl             0x46 70

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015d0 <g_data+400>: 0xab
dl             0x31 49

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015db <g_data+411>: 0xad
dl             0x32 50

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015e6 <g_data+422>: 0x1d
dl             0x33 51

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015f1 <g_data+433>: 0xea
dl             0x34 52

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015fc <g_data+444>: 0x13
dl             0x35 53

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x601607 <g_data+455>: 0x37
dl             0x36 54
FAILEDn[Inferior 1 (process 4575) exited normally]
(gdb) 

Well, only the second XOR is used, but it's not really important at this point. The important point is to see that each of our PASS has been XORed with a constant string (you can repeat to veroify this).

So, we have an XOR key, if we copy it we have: 1337d33dc0deabad1dea1337

(1337d33dc0de?? 1337d34dc0de would have sound better, I think. Another bug? ^_^ )

Now we have to find the expected solution, because PASS^key=solution and with simple math, we can say that: PASS = key ^ solution.

3/ CMP part

Well, we use the same technics. Let's break on vm_cmp and see what happens in registers:

(gdb) disable 2
(gdb) disable 3
(gdb) b * vm_cmp 
Breakpoint 4 at 0x400852
(gdb) commands 
Type commands for breakpoint(s) 4, one per line.
End with a line saying just "end".
>x/8wx &regs
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456ABCDEF

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000022 0x0000007a 0x00000005 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4746) exited normally]
(gdb) 
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :ABCDEF123456

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000052 0x0000007a 0x00000075 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4751) exited normally]
(gdb) 

Interesting. We see that register 1 changes, and register 2 stays the same.
One more check, because the key is 1337d33dc0deabad1dea1337:
  •  '1' XOR '0x13' => 0x22
  •  'A' XOR '0x13' => 0x52
So, we know that register 1 is PASS XOR key and register 2 is solution.

But, have you spotted something else really weird?

Register 3 changes too!!
And it don't take a lot of time to understand that register 3 holds the "PASS XOR key":
  •  '2' XOR '0x37' => 0x05
  •  'B' XOR '0x37' => 0x75
We know how to extract all bytes of the solution.
Once again, we use the gdb commands function to force register to have the good values, and let print this value.

(gdb) commands
Type commands for breakpoint(s) 4, one per line.
End with a line saying just "end".
>x/8wx &regs
>set *(char *) 0x602900 = *0x602904
>set *(char *) 0x602908 = *0x60290c
>c
>end
(gdb) r 
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456123456

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000022 0x0000007a 0x00000005 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000007a 0x0000007a 0x00000060 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001f5 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000e0 0x000000b2 0x00000009 0x0000004e
0x602910 <regs+16>: 0x00000000 0x0000021b 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000b2 0x000000b2 0x0000004e 0x0000004e
0x602910 <regs+16>: 0x00000000 0x00000225 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000f5 0x000000b4 0x000000e8 0x000000bb
0x602910 <regs+16>: 0x00000000 0x00000255 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000009a 0x000000e6 0x0000009f 0x000000d4
0x602910 <regs+16>: 0x00000000 0x0000027b 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000e6 0x000000e6 0x000000d4 0x000000d4
0x602910 <regs+16>: 0x00000000 0x00000285 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000002e 0x00000049 0x000000de 0x00000083
0x602910 <regs+16>: 0x00000000 0x000002ab 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000049 0x00000049 0x00000083 0x00000083
0x602910 <regs+16>: 0x00000000 0x000002b5 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000026 0x0000007e 0x00000001 0x00000052
0x602910 <regs+16>: 0x00000000 0x000002db 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000007e 0x0000007e 0x00000052 0x00000052
0x602910 <regs+16>: 0x00000000 0x000002e5 0x00603010 0x00000000
WINn[Inferior 1 (process 5104) exited with code 0377]
(gdb)


Ok, job done, we just have to copy bytes from register 2 and 4. vm_cmp has two times two bytes to compare, but function is called twice. I really should dig inside this func to understand how this work :)

solution is: 7a60b24eb4bbe6d449837e52

4/ Victory


#! /usr/bin/python
# First you have to be hit to know how to defend.
#                                     0xMitsurugi

key='1337d33dc0deabad1dea1337'.decode('hex')
sol='7a60b24eb4bbe6d449837e52'.decode('hex')

sol=[]
for i in range(len(sol)):
    c=chr(ord(sol[i]) ^ ord(key[i]))
    sol.append(c)

print ''.join(sol)

And without any surprise:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.py 
iWasteMyTime
mitsurugi@dojo:~/chall/FIC2018/v24$

Job done, time to drink beer for victory.
I have not failed 700 times, I have not failed once.
I have succeeded in proving those 700 ways will not work.
When I have eliminated the ways that will not work,
I will find the way that will work 
0xMitsurugi

vendredi 26 janvier 2018

Solving a CTF chall the [easy|lazy] way (FIC2018)

During the FIC2018, there was a CTF. One of the challenge was reverse, and we were given an a.out file.

1/ First steps:


mitsurugi@dojo:~/chall/FIC2018/v24$ ls -l a.out
-rwxr-xr-x 1 mitsurugi mitsurugi 15568 Jan 24 14:30 a.out
mitsurugi@dojo:~/chall/FIC2018/v24$ file a.out 
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4cde05f176c1b36218ba56a4f1fb1249ad1a8c1b, not stripped
mitsurugi@dojo:~/chall/FIC2018/v24$ ./a.out 
ENTER PASS :aaaa
FAILEDnmitsurugi@dojo:~/chall/FIC2018/v24$ 

Ok, so it's a crackme. Let's toy a little bit with it:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./a.out 
ENTER PASS ::aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
FAILEDnmitsurugi@dojo:~/chall/FIC2018/v24$ aaaaaaaaaaaaaaaaaaaa
bash: aaaaaaaaaaaaaaaaaaaa: command not found
mitsurugi@dojo:~/chall/FIC2018/v24$

Weird. It seems that it take only 12 chars in consideration:
mitsurugi@dojo:~/chall/FIC2018/v24$ ./a.out 
ENTER PASS :123456123456abcde
FAILEDnmitsurugi@dojo:~/chall/FIC2018/v24$ abcde
bash: abcde: command not found
mitsurugi@dojo:~/chall/FIC2018/v24$

Ok, time to dig in:

mitsurugi@dojo:~/chall/FIC2018/v24$ nm -A a.out 
a.out:00000000006011c8 d _DYNAMIC
a.out:00000000006013a0 d _GLOBAL_OFFSET_TABLE_
a.out:0000000000400ee0 R _IO_stdin_used
a.out:                 w _ITM_deregisterTMCloneTable
a.out:                 w _ITM_registerTMCloneTable
a.out:                 w _Jv_RegisterClasses
a.out:00000000004011a8 r __FRAME_END__
a.out:00000000006011c0 d __JCR_END__
a.out:00000000006011c0 d __JCR_LIST__
a.out:00000000006028e0 D __TMC_END__
a.out:00000000006028e0 B __bss_start
a.out:0000000000601400 D __data_start
a.out:0000000000400650 t __do_global_dtors_aux
a.out:00000000006011b8 t __do_global_dtors_aux_fini_array_entry
a.out:0000000000601408 D __dso_handle
a.out:00000000006011b0 t __frame_dummy_init_array_entry
a.out:                 w __gmon_start__
a.out:00000000006011b8 t __init_array_end
a.out:00000000006011b0 t __init_array_start
a.out:0000000000400ed0 T __libc_csu_fini
a.out:0000000000400e60 T __libc_csu_init
a.out:                 U __libc_start_main@@GLIBC_2.2.5
a.out:00000000006028e0 D _edata
a.out:0000000000602920 B _end
a.out:0000000000400ed4 T _fini
a.out:0000000000400488 T _init
a.out:000000000040059e T _start
a.out:00000000006028e0 b completed.6661
a.out:0000000000601400 W data_start
a.out:00000000004005d0 t deregister_tm_clones
a.out:                 U exit@@GLIBC_2.2.5
a.out:00000000006028f0 B flags
a.out:0000000000400670 t frame_dummy
a.out:                 U free@@GLIBC_2.2.5
a.out:0000000000601440 D g_data
a.out:0000000000400530 T main
a.out:                 U malloc@@GLIBC_2.2.5
a.out:                 U read@@GLIBC_2.2.5
a.out:0000000000400610 t register_tm_clones
a.out:0000000000602900 B regs
a.out:0000000000602918 B stack
a.out:0000000000400ab9 t vm_add
a.out:0000000000400bed t vm_and
a.out:0000000000400798 t vm_cll
a.out:0000000000400852 t vm_cmp
a.out:0000000000400d28 t vm_ecl
a.out:0000000000602800 D vm_func
a.out:0000000000400963 t vm_jmp
a.out:0000000000400a40 t vm_jnz
a.out:00000000004009c7 t vm_jzz
a.out:0000000000400697 t vm_mov
a.out:0000000000400b6a t vm_mvp
a.out:0000000000400799 t vm_psh
a.out:0000000000400696 t vm_ret
a.out:0000000000400c9e t vm_xor
a.out:                 U write@@GLIBC_2.2.5
mitsurugi@dojo:~/chall/FIC2018/v24$

That's really, really interesting. Function names let think this is a VM. The function vm_xor leads us to imagine that the input will be XORED, then compared, thanks to vm_cmp function.

We have two ways for solving this:

  1. bruteforce solution
  2. doing it in a clean way

This is a CTF, lets work dirty.

2/ Straight to the winning point

The strategy is this one: let try by bruteforce any character and count how many instructions this program will do before saying the password is bad. If we have one (or more) good characters, the program will run longer to check other characters. Easy.

pin is a program which can count instructions. The inscount library is in the source tree. Inscount can count how many instructions a program will compute during its lifetime. Compile it, and use it.

mitsurugi@dojo:~/chall/FIC2018/v24$ ./pin-3.5-97503-gac534ca30-gcc-linux/pin -t pin-3.5-97503-gac534ca30-gcc-linux/source/tools/ManualExamples/obj-intel64/inscount0.so -- ./a.out
ENTER PASS :aaaa
mitsurugi@dojo:~/chall/FIC2018/v24$ cat inscount.out 
Count 191674
mitsurugi@dojo:~/chall/FIC2018/v24$ ./pin-3.5-97503-gac534ca30-gcc-linux/pin -t pin-3.5-97503-gac534ca30-gcc-linux/source/tools/ManualExamples/obj-intel64/inscount0.so -- ./a.out
ENTER PASS :bbbb
mitsurugi@dojo:~/chall/FIC2018/v24$ cat inscount.out 
Count 191674
mitsurugi@dojo:~/chall/FIC2018/v24$

Ok, two wrong password use the same numbers of instructions.

This is the shell script:

#! /bin/bash
# If there is a will, there is a way
#                        0xMitsurugi

pass=$1

for char in a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H J K L M N O P Q R S T U V W X Y Z
do
  ./pin-3.5-97503-gac534ca30-gcc-linux/pin -t pin-3.5-97503-gac534ca30-gcc-linux/source/tools/ManualExamples/obj-intel64/inscount0.so -- ./a.out <<< $pass$char
  cat inscount.out | tr -d '\n'
  echo "  "$pass$char
done


Let's industrialize it:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh
ENTER PASS :FAILEDnCount 191674  a
ENTER PASS :FAILEDnCount 191674  b
ENTER PASS :FAILEDnCount 191674  c
ENTER PASS :FAILEDnCount 191674  d
ENTER PASS :FAILEDnCount 191674  e
ENTER PASS :FAILEDnCount 191674  f
ENTER PASS :FAILEDnCount 191674  g
ENTER PASS :FAILEDnCount 191675  h
ENTER PASS :FAILEDnCount 191964  i
ENTER PASS :FAILEDnCount 191674  j
ENTER PASS :FAILEDnCount 191674  k
ENTER PASS :FAILEDnCount 191675  l
ENTER PASS :FAILEDnCount 191675  m
ENTER PASS :FAILEDnCount 191675  n
ENTER PASS :FAILEDnCount 191675  o
ENTER PASS :FAILEDnCount 191674  p
ENTER PASS :FAILEDnCount 191674  q
ENTER PASS :FAILEDnCount 191674  r
ENTER PASS :FAILEDnCount 191690  s
ENTER PASS :FAILEDnCount 191674  t
ENTER PASS :FAILEDnCount 191674  u
ENTER PASS :FAILEDnCount 191674  v
ENTER PASS :FAILEDnCount 191674  w
ENTER PASS :FAILEDnCount 191674  x
ENTER PASS :FAILEDnCount 191674  y
ENTER PASS :FAILEDnCount 191674  z
ENTER PASS :FAILEDnCount 191674  A
ENTER PASS :FAILEDnCount 191674  B
ENTER PASS :FAILEDnCount 191674  C
ENTER PASS :FAILEDnCount 191674  D
ENTER PASS :FAILEDnCount 191674  E
ENTER PASS :FAILEDnCount 191674  F
ENTER PASS :FAILEDnCount 191674  G
ENTER PASS :FAILEDnCount 191674  H
ENTER PASS :FAILEDnCount 191674  J
ENTER PASS :FAILEDnCount 191674  K
ENTER PASS :FAILEDnCount 191674  L
ENTER PASS :FAILEDnCount 191674  M
ENTER PASS :FAILEDnCount 191674  N
ENTER PASS :FAILEDnCount 191690  O
ENTER PASS :FAILEDnCount 191674  P
ENTER PASS :FAILEDnCount 191674  Q
ENTER PASS :FAILEDnCount 191674  R
ENTER PASS :FAILEDnCount 191674  S
ENTER PASS :FAILEDnCount 191674  T
ENTER PASS :FAILEDnCount 191674  U
ENTER PASS :FAILEDnCount 191674  V
ENTER PASS :FAILEDnCount 191674  W
ENTER PASS :FAILEDnCount 191674  X
ENTER PASS :FAILEDnCount 191674  Y
ENTER PASS :FAILEDnCount 191674  Z
mitsurugi@dojo:~/chall/FIC2018/v24$

It seems that we have some artefacts (???). Let's reduce the output by taking the biggest numbers only, and iterate until we got the solution:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh | sort | tail -3
ENTER PASS :FAILEDnCount 191690  Q
ENTER PASS :FAILEDnCount 191690  f
ENTER PASS :FAILEDnCount 191964  i
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh i | sort | tail -3
ENTER PASS :FAILEDnCount 191981  iZ
ENTER PASS :FAILEDnCount 191993  iS
ENTER PASS :FAILEDnCount 192931  iW
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iW | sort | tail -3
ENTER PASS :FAILEDnCount 192931  iWo
ENTER PASS :FAILEDnCount 192946  iWO
ENTER PASS :FAILEDnCount 193218  iWa
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWa | sort | tail -3
ENTER PASS :FAILEDnCount 193221  iWao
ENTER PASS :FAILEDnCount 193221  iWar
ENTER PASS :FAILEDnCount 194724  iWas           //it looks like an english sentence
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWas | sort | tail -3
ENTER PASS :FAILEDnCount 194724  iWasx
ENTER PASS :FAILEDnCount 194724  iWasy
ENTER PASS :FAILEDnCount 194724  iWasz          //??? Maybe not
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWasz | sort | tail -3
ENTER PASS :FAILEDnCount 194739  iWaszR
ENTER PASS :FAILEDnCount 194739  iWaszn
ENTER PASS :FAILEDnCount 195689  iWasze
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWasze | sort | tail -3
ENTER PASS :FAILEDnCount 195690  iWaszeY
ENTER PASS :FAILEDnCount 195690  iWaszeZ
ENTER PASS :FAILEDnCount 195979  iWaszeM
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWaszeM | sort | tail -3
ENTER PASS :FAILEDnCount 195980  iWaszeMz
ENTER PASS :FAILEDnCount 195995  iWaszeMn
ENTER PASS :FAILEDnCount 196945  iWaszeMy
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWaszeMy | sort | tail -3
ENTER PASS :FAILEDnCount 196962  iWaszeMye
ENTER PASS :FAILEDnCount 196962  iWaszeMym
ENTER PASS :FAILEDnCount 197236  iWaszeMyT
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWaszeMyT | sort | tail -3
ENTER PASS :FAILEDnCount 197236  iWaszeMyTz
ENTER PASS :FAILEDnCount 197264  iWaszeMyTE
ENTER PASS :FAILEDnCount 198201  iWaszeMyTi
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWaszeMyTi | sort | tail -3
ENTER PASS :FAILEDnCount 198201  iWaszeMyTiz
ENTER PASS :FAILEDnCount 198202  iWaszeMyTil
ENTER PASS :FAILEDnCount 198491  iWaszeMyTim
mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.sh iWaszeMyTim | sort | tail -3
ENTER PASS :FAILEDnCount 198492  iWaszeMyTimn
ENTER PASS :FAILEDnCount 198492  iWaszeMyTimo
ENTER PASS :WINnCount 195256  iWaszeMyTime
mitsurugi@dojo:~/chall/FIC2018/v24$

And here, I was WTF?? "iWaszeMyTime" ?? This one didn't flag on the platform.

Curiously:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./a.out 
ENTER PASS :iWas#eMyTime
WINnmitsurugi@dojo:~/chall/FIC2018/v24$ 
mitsurugi@dojo:~/chall/FIC2018/v24$ ./a.out 
ENTER PASS :iWas*eMyTime
WINnmitsurugi@dojo:~/chall/FIC2018/v24$ 
mitsurugi@dojo:~/chall/FIC2018/v24$


I thing you already find the real answer: 'iWasteMyTime'.

This is really a strange side effect of the binary. I don't understand why it validate any strings.
In CTF, you run for flag, you don't dissect binaries :-)

Maybe the challenge is buggy ^_^

Let me give you a taste of my steel.
~Soulcalibur III - Mitsurugi