mercredi 21 octobre 2015

Playing with RSA and python


In a challenge, I was confronted with RSA. Usual stuff, which means you have to break a key, and decipher data. This blogpost is just a cookbook of what I used. This is not how to break a RSA key, this is more how you use RSA with python in order to have better understanding of what you can do with python.

A bit of theory, at first

A public key is composed of 2 numbers: n and e
A private key is composed of 2 numbers: n and d

With a private key, you decipher messages, with public key you cipher messages.

n is the results of two prime number, p and q.
e is a coprime of 'n', which means the GCD of n and e is 1.
d is choosen as d*e % (p-1)(q-1)=1

Confused? Read the excellent paper here:

Creating RSA keys manually

Let's choose some numbers in order to create an RSA key:
  • p=47 (prime)
  • q=13 (prime)
  • n=p*q=611
  • m=(p-1)*(q-1)=552
  • e=5 because PGCD(552,5)=1
  • d=221 because (1+2×552)÷5=221 which is not a decimal number.
  • (still confused? read again Paj's paper)

Go, python, go:

Let's follow the documentation:

We can create an RSA key with our numbers, encrypt data with public key and decrypt data with private key:
$ ipython
In [1]: from Crypto.PublicKey import *
In [2]: key=RSA.construct((611L,5L,221L,47L,13L))
In [3]: key.publickey().encrypt('Z','x')[0]
Out[3]: '\x01\xe0'
In [4]: key.decrypt('\x01\xe0')
Out[4]: 'Z'

  • The encrypted message here is really short, it must be shorter than the key.
  • The second parameter of encrypt() function is mandatory, but is useless, put whatever you want.
  • key represent the private key. If you want the public key, you have to specify it with key.publickey().

And of course, you can import/export keys in PEM/DER format:

Ok, and when you have only the public key?

That's (not so) easy. Let's replay with our really weak key:
$ ipython
In [1]: from Crypto.PublicKey import *
In [2]: pubkey=RSA.construct((611L,5L))
In [3]: pubkey.has_private()
Out[3]: False
In [4]: pubkey.e
Out[4]: 5L
In [5]: pubkey.n
Out[5]: 611L

You can print out the 'n' parameter and try to find the prime numbers.. Once done, just verify the 'e' parameter and find the 'd' param. Next, build the private key, and key.decrypt() everything.

611 = 47*13  (that's the hard part when n is big)
for i in range(1,25):
    if (((1.0+i*m)/e) % 1)==0:
        print (1+i*m)/e


221  which is the 'd' parameter

And you have everything to create a private key.

And with openssl?

$ ipython
In [1]: from Crypto.PublicKey import *
In [2]: key=RSA.construct((611L,5L,221L,47L,13L))

In [3]: print key.exportKey()

Copy paste the key in a file called privkey.pem, and:
mitsurugi@mitsu:~$ printf '\x01\xe0' | openssl rsautl -decrypt -inkey privkey.pem -raw

lundi 5 octobre 2015

Hi gdb, this is python : break a VM crackme with class!

I like to play crackmes and one of my favorite tool is (of course) gdb. The title of this post is a tribute to an 0vercl0k's blogpost. Few days ago, I was trying to break a VM crackme (no spoil, no link).

Once you figure you are facing a VM, you are confronted with two CPU: the real one, where every gdb instructions make sense (it was x86), and the emulated one, where you have to deal with it to get the juicy part of the crackme.
At first, I spent a lot of time to step through all the x86 instructions and memory to read the instruction pointer, the opcode and the internal registers of the VM.

And I was thinking: Is there a way to emulate commands in gdb where a stepi could step efficiently in the VM, where info reg would dump the registers of the VM, and so on. This blogpost explain how to do it in a very simple and convenient manner. This is not a generic way to break VM, this is more how python can help you to break VM.

1/ Hi gdb this is python

Since a long time, you can call python script from gdb:

or from a source script:
(gdb) source /path/to/
You can also execute any gdb command within python and grab the result for analysis:
(gdb) python gdb.execute("info reg %s"%("eax"))
eax            0x1    1

But you can do a lot more. Python ca define new command in gdb, as defined in documentation (and excellent paper from 0vercl0k):
mitsurugi@mitsu:~/crackmes/VM$ cat
import gdb

class HelloWorld(gdb.Command):
    """Greet the world"""
    def __init__(self):
        super(HelloWorld, self).__init__("mitsurugi", gdb.COMMAND_SUPPORT)

    def invoke(self, args, from_tty):
        print "The name's 0xMitsurugi\nRemember it!"

mitsurugi@mitsu:~/crackmes/VM$ gdb -q -nx ./VM.bin
Reading symbols from ./VM.bin...(no debugging symbols found)...done.
(gdb) source ./
(gdb) mitsurugi
The name's 0xMitsurugi
Remember it!

And you can use anything in the invoke function, like:
  • gdb.execute("gdb command")
  • python evaluation

2/ Go VM, go!

When I was breaking this VM, I've quickly found a pattern:
  • adress 0x0804843b
  • read a byte at [esi]
  • explode it in 5 bytes
  • do stuff
  • return to 0x0804843b
Another finding was that 8 bytes where used as temporary storage, at adresse 0x08049a84 (think registers).

What I need now is to have new gdb command that will do:
  • stepi ==> named walk
  • info reg ==> named registry
  • print opcode ==> named  opcode
  • x/5i $eip ==> named lookahead
with this superset of command, I would navigate through the VM code like if it was native.

3/ python with a touch of class

For the clarity of the blogpost, all code has been moved at the end.

3/1/ Walk

Ok, the first one is very easy to do, named Walk. We just have to create aclasse doing a :

and check if we stop at the right breakpoint. We can add a lot of debug code, verify if the breakpoint is present, if there are another ones, and so on, but for simplicity, a single breakpoint and a single line do the trick.

3/2/ Register

We have 8 registers of one byte long. The goal here is to print them on the screen. I've choose to do it really simply:
The invoke function within the class Registry:
registry(gdb.execute("x/8bx 0x08049a84",to_string=True))
and the registry function doing the pretty printing. The eight registers appears as R1, R2,... R8

3/3/ Opcodes

Opcode is an interesting one. We learned that the VM read one byte at a time, splitted in 5 byte. First version of opcode just explode the byte:
 print "Opcode is: "+ opc + " ",
 a,b,c,d,e = exploding(opc)
 print "value 0%d-0%d-0%d-0%d-0%d" % (a,b,c,d,e)

this saved me a lot of time. Everything depends of these 5 bytes, and patterns can be revealed easily.

The next work was to disass these opcodes. I filled slowly this function with what I've found:
if opc=='00': print 'NOP'
if opc=='c3': 
    print 'JUMP $esi+ %s - %s" %(get_esi(addr+"+1"),get_esi(addr+"+2"))
if c==6:
    print 'MOV %s to R%d' % (get_esi(addr+"+1"),b)

and so on. It's really convenient, because you can hot reload the python script while working:
(gdb) opcode
Opcode is 00
value 00-00-00-00-00
(gdb) source ./   #now telling that 00 is a NOP:
(gdb) opcode
Opcode is 00 NOP
value 00-00-00-00-00

There are two way to disassemble opcodes: reversing the x86 part, or just watching register before/after the operation.

3/4/ lookahead

The lookahead is equivalent to x/5i $eip. This is achieved easily, by calling opcode() function, with a cursor telling if we have to walk to next byte, or skip some. That's just a reuse of function opcode() and disassemble().

4/ Breaking VM with class

The challenge quickly resume with using walk, opcode, registry to find what opcode do, and run after run, we get a better understanding of what's going on. This is live debugging session, just after the first opcodes where discovered. Note how the VM code reveals itself, and disassembled opcodes are printed:

mitsurugi@mitsu:~/crackmes/VM$ gdb -q -nx ./VM.bin
Reading symbols from ./VM.bin...(no debugging symbols found)...done.
(gdb) b * 0x0804843b
Breakpoint 1 at 0x804843b
(gdb) r
Starting program: /home/mitsurugi/crackmes/VM/VM.bin
Please crack Me :toto

Breakpoint 1, 0x0804843b in ?? ()
(gdb) source ./
(gdb) opcode
Opcode is: 0xc3  JUMP $esi+ 0x01 - 0x00
value 03-00-03-00-00
(gdb) walk
(gdb) lookahead
Opcode is 0x00
     == > NOP
Opcode is 0x00
     == > NOP
Opcode is 0x26
     == > MOV 0x20 to R4
Opcode is 0x3e
     == > MOV 0x00 to R7
Opcode is 0x01
     == > Mov 0x01-0x42 to R1 and R0
(gdb) registry
    R0      R1      R2      R3      R4      R5      R6      R7
    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb) walk
(gdb) walk
(gdb) walk
(gdb) walk
(gdb) registry
    R0      R1      R2      R3      R4      R5      R6      R7
    0x00    0x00    0x00    0x00    0x20    0x00    0x00    0x00
(gdb) lookahead
Opcode is 0x01
     == > Mov 0x01-0x42 to R1 and R0
Opcode is 0x87
     == > Unknown Opcode
Opcode is 0x3c
     == > Unknown Opcode
Opcode is 0x02
     == > Unknown Opcode
Opcode is 0x03
     == > Unknown Opcode
(gdb) walk

(gdb) registry
    R0      R1      R2      R3      R4      R5      R6      R7
    0x42    0x01    0x00    0x00    0x20    0x00    0x00    0x00

And that's waaaaay simpler than trying to use gdb commands. Understanding the VM was a piece of cake after that: we just focus on the important things, and we totally abstract the underlying CPU and instructions.

5/ Conclusion

After winning and reading other solutions, I figure that there were a lot of other ways to break it (Next in my TODO list: learn pin). It wasn't very complicated, but I've had a lot of fun with this VM crackme; Writing python was overkill for this one, but it will help me for beating next VM crackmes.

gdb + python can be really, really powerfull. For those who wondering, the VM was emulating a Z80 CPU, but I didn't recognizing the instructions ^_^ 

6/ Show me the code or GTFO

This code can not be read as a school example of coding. "It just works" (tm)

import gdb

#Main BP of the VM: 0x0804843b

def lookahead():
    for i in range(0,5):
        print "cursor is at:",
        print "Opcode is %s" % (opcode)
        print "\t == > %s" % (val)

def disassemble(addr):
    if opc=='0x00':
        o= 'NOP'
    if opc=='0xc3':
        o= 'JUMP $esi+ %s - %s' % (get_esi(addr+"+2"),get_esi(addr+"+1"))
    if c==6:
        o= 'MOV %s to R%d' % (get_esi(addr+"+1"),b)
    if c==1:
        o= 'Mov %s-%s to R1 and R0' % (get_esi(addr+"+2"),get_esi(addr+"+1"))
    if o=="": o="Unknown Opcode"
    return o,cursor

def exploding(opc):    
    """Found by reversing"""
    a = int(opc, 16) >> 6
    b = int(opc, 16) >> 3 & int ("0x7",0)
    c = int(opc, 16) & int("0x7",0)
    d = b & 1
    e = b >> 1
    return a,b,c,d,e

def get_esi(n):
    """esi can be seen as a global var, so we don't care passing it"""
    byte=gdb.execute("x/bx "+n,to_string=True)
    return byte.split()[1].rstrip()

def registry(regs):
    #One of them is Z flags, should print it
    for i in range(0,8):
        print '\tR%d' % (i),
    print regs.split(':')[1].rstrip()

def walk():
    gdb.execute('c', to_string=True)
    vv = gdb.execute("info reg $eip",to_string=True).split('x')[2]
    if vv.rstrip() != '804843b':
        print "Walk another time, we didn't reach the BP"

def opcode():
    print "Opcode is: "+ opc + " ",
    print disassemble("$esi")[0]
    print "value 0%d-0%d-0%d-0%d-0%d" % (exploding(opc))

#And now, the classes

class Opcode(gdb.Command):
    """Dump the Opcode"""
    def __init__(self):
        super(Opcode, self).__init__("opcode", gdb.COMMAND_SUPPORT)
    def invoke(self, args, from_tty):

class Walk(gdb.Command):
    """Walk one step"""
    def __init__(self):
        super(Walk, self).__init__("walk", gdb.COMMAND_SUPPORT)
    def invoke(self, args, from_tty):

class Registry(gdb.Command):
    """Pretty print the VM registry"""
    def __init__(self):
        super(Registry, self).__init__("registry", gdb.COMMAND_SUPPORT)
    def invoke(self, args, from_tty):
        registry(gdb.execute("x/8bx 0x08049a84",to_string=True))

class Lookahead(gdb.Command):
    """Pretty print the VM registry"""
    def __init__(self):
        super(Lookahead, self).__init__("lookahead", gdb.COMMAND_SUPPORT)
    def invoke(self, args, from_tty):


vendredi 3 juillet 2015

No one expect command execution !

Unix is a beautiful world where your shell gives you the power of launching any command you like. But sometimes, command can be used to launch another commands, and that's sometimes unexpected.

For example, here is how you can launch arbitrary command from tar:
But with a little research, it seems there is ton of way to achieve this with popular unix commands. The rules are simple:
  • don't launch the command from the shell
  • use a side effect of another command
  • have the command executed
I used a little script, called
mitsurugi@mitsu:~/tmp$ cat
#! /bin/bash
echo "The name's 0xMitsurugi!"
echo "Remember it!"

If it's printed on the script, you win. Here we go, without any ordering:

1/ tcpdump

$ tcpdump -n -i lo -G1 -w /dev/null -z ./
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
The name's 0xMitsurugi!
Remember it!
The name's 0xMitsurugi!
Remember it!
^C6 packets captured
12 packets received by filter
0 packets dropped by kernel
Because each packet (-G), we rotate the file (-w) with the specified program (-z).

2/ tar

$ tar c a.tar -I ./ a
tar: a.tar: Cannot stat: No such file or directory
The name's 0xMitsurugi!
Remember it!
tar: Exiting with failure status due to previous errors
Because you can specify any compress program (-I) you want on command line for tar and we don't care for errors.

3/ zip

$ zip a -T -TT ./
The name's 0xMitsurugi!
Remember it!
test of OK

zip is a nice boy which can autotest its zip file (-T), and test decompression with another program (-TT).

4/ ftp (and many others...)

A lot of program can give you access back to shell, I choose ftp because it's an old unix utility.
$ ftp
ftp> ! ./
The name's 0xMitsurugi!
Remember it!

and also, vi, gdb, etc.. you can even escape an ssh session with ~^Z

5/ man

This one is just for fun, but you have to give full path of program. The -P switch change the default pager program.
$ man -P /tmp/ man
The name's 0xMitsurugi!
Remember it!


6/ git (and man, and...)

If you have access to environment variable, there is a lot of fun to do:
$ export PAGER=./
$ git -p help
The name's 0xMitsurugi!
Remember it!

And yes, it works with "man" the same way, andu surely many other which define a PAGER env variable.

But wait, there's more to do with git. If you can write to any directory in $PATH, you can do:
$ export PATH=/tmp:$PATH
$ ln -sf /tmp/ /tmp/git-help
$ git --exec-path=/tmp help
The name's 0xmitsurugi
remember it!

Although I don't have any idea to play with, I'm sure someone will find out :-) For an unknown reason, the --exec-path must be in $PATH in order to be executed.

7/ The most unexpected: bash $HOME variable

And yes, there is a way for subshells:
$ pwd
$ ls -la .bashrc
lrwxrwxrwx 1 mitsurugi mitsurugi    8 juin  19 14:03 .bashrc ->
$ export HOME=.
$ bash
The name's 0xMitsurugi!
Remember it!

I was almost 100% sure this would fail if I hadn't tried it.

8/ awk (and many others)

That's too easy when you have a "system" command:
$ awk 'BEGIN {system("./")}'
The name's 0xMitsurugi!
Remember it!

and I'm sure there are a lot more.

That's all for today, I'm sure there are many more, don't be shy and send me (twitter, mail, blog) your best unexpected command execution :-)

jeudi 2 avril 2015

Let's go fishing with fake sstic challenge

Yesterday, as I was idling on the interwebs, I noticed a challenge on twitter:

The URL is .

Yesterday was the 1st april, fool's day. I didn't expected much of that, but clicked on the link anyway:


We noticed immediately the Salted__ which is an openssl header, we know that it is followed by the 8 bytes of salt, and after this header, the encrypted message. Documentation says that nothing can be known of the underlying cipher or mode used from this header.

So, if we want to break by bruteforce this encryption, we have to guess the encryption scheme and the password. And we are the 1st of april, so it could just be pure random in order to laugh at us wasting CPU cycles.

But, if you watch closely the picture, you can clearly see a pattern repeating.

That's unusual. Good crypto should be undistinguishable from random. So, either this is a fake, either this is bad crypto. And we know one bad crypto mode: ECB, where identical cleartext gives identical crypt.

Let's continue with this idea. File containing a lot of identical data are uncompressed images, as of BMP. BMP is made of an header, then groups of three bytes representing a pixel (Red Green Blue value in hex).

We have a file of  3000144 bytes. 16 bytes are from openssl header, some bytes from BMP header, and the pixels. If we imagine a 1000x1000 image , we need 300000 bytes of pixel data. Let's skip those 144 bytes and watch at that.

In two lines of shell:


We can see something :-)

If we reverse the image with convert -flip, we can better read the email and we're done.

Please, SSTIC commitee, can you reserve a place for me? (I'll pay it, it's just to be sure to have one ;) )

EDIT: it was just too easy. No need to bruteforce, first try, win:


mardi 20 janvier 2015

Let's have (not) fun with afl

And that's a part two for an extreme exploitation of my previous bug found with afl. It was a crash in cabextract.

And it's a double fail.

1/ First fail
This crash is just a null pointer dereference. There is a lot of documentation in internet for it, and in that case, that's just not interesting at all. In order to be exploitable, you have to map an executable page at offset 0 and trigger the crash.

Here is a nice doc talking about null deref. Here is another blogspot explaining how to exploit null deref in linux kernel (now it's protected).

Short story short: "In the realm of userland applications, exploiting them usually requires being able to somehow control the target's allocations until you get page zero mapped, and this can be very hard."

2/ Second fail
This bug is known and corrected! You can see a nice explanation in debian bugtracker.

So,  nothing to do here, and moving for next target!


mercredi 14 janvier 2015

Let's have some fun with afl

1/ Let's play

Let's play with the fuzzer afl. I'll write some blogposts about my research with afl. This can be seen as a diary of a fuzzer :-)

2/ At first, choose your target

I think that afl has been pretty much used on usual unix binaries (compressor, file format, and so on), so I did a
$ ls /usr/bin

and choose one of them in order to fuzz. afl is great at fuzzing programs which take a file as argument and don't produce too much output. There is a lot of them which can be fuzzed.

That's easy, just grab the sources of the binary you want to fuzz, recompile it using afl-gcc (or afl-g++), and roll!
Sometimes, it's better to desactivate checksum computations: fuzzing goes faster, fuzzing goes farther (doesn't stop for a bad checksum) and in case of a crash, you can easily recalculate the checksum and see if crash is still here.

After fuzzing a lot of software, I managed to reach two crashes:

That's enough for a good start. Those two crashes will be investigated.

I don't look at hangs: my computer do a lot of things (loadavg > 1 most of the time), and afl has an aggressive value of 50ms for saying that program has stopped. So, there is a lot of false positive. And I'm mostly interested in crashes anyway.

3/ Keep focused

Afl comes with an handy helper call Peruvian were rabbit. Just feed afl-fuzz with some known files causing a crash and it will make some variation around it in order to see if some new crashes appears.

$ cd ..
$ mkdir crashes crashes/findings crashes/testcases
$ cp fuzzy/findings/crashes/id\:000001* crashes/testcases/case1
$ cp fuzzy/findings/crashes/id\:000002* crashes/testcases/case2
$ cd crashes
$ afl-gcc -C -i testcases/ -o findings/ unnamed -t @@

And unsurprisingly, number of crashes grow very fast:

I have now 255 files causing crashes.

What's really interesting in my case is that the unmodified program (the one doing checksum) segfaults too :-) It writes a message "bad checksum", but segfault like a boss:

mitsurugi@mitsu:~/fuzz$ ./unnamed-no-cksum -t case1
Segmentation fault
mitsurugi@mitsu:~/fuzz$ ./unnamed-ori -t case1
    failed (checksum error)
Segmentation fault

mitsurugi@mitsu:~/fuzz$ ./unnamed-no-cksum -t case1
Segmentation fault

mitsurugi@mitsu:~/fuzz$ dmesg | tail -1
[784784.901989] unnamed-no-cksum[29344]: segfault at 0 ip   (null) sp bf89d38c error 14 in unnamed-no-cksum[8048000+57000]
And it seems promising, because it crash on a bad EIP value (and not a losy read or write in forbidden location). My senseï said:

"If you manage to control EIP, you'll soon manage to control the world"

4/ Let's investigate

I have a good lead, the next step is to force EIP to a choosen value (0x41414141 for example) instead of 0x00000000.

The crash appears here:
=> 0x804daf0:    call   DWORD PTR [eax+0x40]
   0x804daf3:    cmp    eax,0x3
   0x804daf6:    je     0x804db54

And the memory layout around eax is:
gdb$ x/12dwx $eax + 0x30
0x80618b0:   0x0804a580    0x0804a560    0x00000000    0x0000eaea
0x80618c0:   0x00000000    0x00000000    0x080601e8    0x08060080
0x80618d0:   0x00000000    0x080618dc    0x080618dc    0x00000000

unfortunately, I don't have 0xeaea followed by some 0x00 in source file. So I think I'll have to dig really further.

I also have 255 files, I choose 10 of them, all of them crash with EIP=0x00000000. It's time to industrialize the checkings.

That's all for today :-)

mardi 6 janvier 2015

Is fuzzing obsolete? No, afl rocks!

1/ Intro

Is fuzzing obsolete? I've read an interesting presentation given at NosuchCon about analyzing execution of binaries.

One slide was saying that fuzzing is over:

So, let's try it with afl, or american fuzzy lop. As everybody knows, american fuzzy lop is a rabbit, but few knows that it's a really powerful fuzzer made by lcamtuf.

Afl can instrumentate the code in order to follow some path, instead of relying on chance only.

2/ Dumb mode, before afl

If I rely with chance only, with an initial buffer "good", and with 1000 try/seconds, I will have to wait (2^32/1000/3600/24) = 49 days to trigger a crash. That's not really efficient.

3/ AFL powerful execution mode

In order to use afl, you have to recompile your binaries with afl-gcc (wrapper around gcc). It's easy: download afl, compile it, then add in your path the build directories like:

export PATH=$PATH:/home/mitsurugi/fuzz/afl-0.88b/
export AFL_PATH=/home/mitsurugi/fuzz/afl-0.88b/

(protip: your PATH should be set before afl, i.e. PATH=afl-dir/:$PATH won't work)

and rocks:

Now the code is instrumented, we just have to launch afl-fuzz:

And in less than an hour (!!!) afl trigger the crash:

That's incredible. Dumb fuzz was expecting 49 days.

4/ conclusion

So, I think that we can say that fuzzing is not over yet :-) Let's start afl everywhere in every progs!

I've currently trigger a crash in a common unix utility, but it's not exploitable (working on it, but it's hard, I can write some 0 in a not really arbitrary location :-( ).

What I miss so much is the same fuzzer for networking utility. Kind of :
printf 'GET / HTTP/1.1\r\nHost: localhost\r\n\r\n' > tescases/request
afl-fuzz -i testcases/ -o findings httpd @@