Sunday, October 27, 2013

Reverse Engineering Automation using Pydbg - I

Pydbg is an open source Python debugger. I've been using Pydbg for a while to automate many of the boring parts of reverse engineering. In this post I will share one technique I sometimes use for crash debugging.

So, suppose that after fuzzing an application we found an interesting crash, and we want to get to the root cause of it. Finding the root cause of a crash can sometimes be very difficult, because the corruption may have happened inside one function while the access violation shows up inside a different function.

I'm not saying this technique is the most efficient way to trace application flow, but sometimes I find it very helpful.

Here I will use a sample crash of "SampleParser.exe". We have two files: one is the file that crashes SampleParser.exe, and the other is the base file that our fuzzer modified. So we have a reproducible crash and we want to reach the vulnerable function causing it. Before we start reverse engineering, we must have a clear view of the call graph of the application. Obviously it has many functions, and not all of them are involved in parsing input files. So first we will narrow down our RE scope to the functions that take part in parsing the input file.

Finding out Subroutines taking part in File parsing:

To find out which functions take part in parsing the input file, we need a list of all the functions that belong to SampleParser.exe. How do we get that list? Here IDA Pro can help us. Open the executable in IDA Pro and, once it is completely loaded, you can get the list of all functions from the Functions window. (If the application loads an arbitrary DLL at run time for parsing the input file, you need to open that DLL file with IDA Pro.)


Now just copy all the functions from there, put them in a text file, and save it as ida-export.txt. Now we will run the application inside pydbg and do the following things.
  1. From the text file ida-export.txt, we will take only those function addresses whose names contain "sub_" and set a breakpoint on each of them.
  2. Next we will run the program with pydbg.
  3. If any breakpoint is hit, we will print the value of EIP in the command prompt.
Here is the Python script to do the above steps. (Make sure you have named the function list ida-export.txt.)
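
A minimal sketch of these three steps could look like the following. It assumes each exported line contains an IDA auto-generated name of the form sub_<hex address>, that pydbg is installed, and that the target sits at the hypothetical path SampleParser.exe. The names load_addresses, on_breakpoint and trace are made up for this sketch.

```python
import re

TARGET = "SampleParser.exe"   # hypothetical path to the target binary
EXPORT = "ida-export.txt"     # function list copied from IDA Pro


def load_addresses(lines):
    """Return the addresses encoded in IDA's auto-generated sub_XXXX names."""
    addresses = []
    for line in lines:
        match = re.search(r"\bsub_([0-9A-Fa-f]+)\b", line)
        if match:
            addresses.append(int(match.group(1), 16))
    return addresses


def on_breakpoint(dbg):
    # Step 3: print EIP every time one of our breakpoints is hit.
    from pydbg.defines import DBG_CONTINUE
    print("0x%08x" % dbg.context.Eip)
    return DBG_CONTINUE


def trace(target=TARGET, export=EXPORT):
    # pydbg is imported here so load_addresses() can be used without it.
    from pydbg import pydbg
    with open(export) as handle:
        addresses = load_addresses(handle)
    dbg = pydbg()
    dbg.load(target)
    for address in addresses:
        # Step 1: one breakpoint per sub_ function.
        dbg.bp_set(address, handler=on_breakpoint)
    # Step 2: run the program under the debugger.
    dbg.run()
```

Calling trace() starts the target under the debugger. The addresses are taken from the sub_ names themselves, so this only works as long as the module is loaded at the same base address IDA used.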


Time for The First Run:

So when we start the application, we will get a few function addresses in the command prompt. These are the functions responsible for starting up the application. For obvious reasons we are not interested in these functions.

But we will simply copy the list from the command window and paste it into an Excel sheet for later analysis.


Now the application is running and our debugger is watching it. So we will open the base file using SampleParser.exe. (The base file is the file that our fuzzer modified; this file should not crash the application.)

Now in our command window we will see some new functions getting called. After a few seconds, when the file is completely loaded inside SampleParser.exe, the application will be idle and you will not see any new calls in the command window. Roughly, we can say these are the functions responsible for parsing the input file. Again we will copy the list from the command prompt and paste it into the next column of the Excel sheet.

So now we have an Excel sheet like this.


And we have a rough list of functions responsible for parsing/loading the input file.

Now we rerun the target application using the above script, and this time we open the input file that was causing the crash. Again we will get a list of functions, but after a few seconds the target application will crash. Now we copy the new list of functions from the command prompt and paste it into the next column of the Excel sheet. So the final Excel sheet looks like this:


In the above Excel sheet we can see that we ran the application three times and the function calls up to the blue line are common to all three runs. At this point the application has started and is running.

Below the blue line, a few functions are marked in green. We see these calls when we feed a normal file to the application and it starts parsing the file. So we can roughly say these are the functions responsible for parsing.

In the next column, the function marked in red is where SampleParser.exe crashed with an access violation. We can see that the third run shares a few functions with the second one, which tells us that it started parsing the file but crashed suddenly due to some error.

From this analysis we now have 5 functions involved in parsing the input file, and we are pretty sure that one of them did something wrong that led to the access violation.

0x004057d0
0x004057c0
0x004014a0
0x00402470
0x00404210

So now we have narrowed down our analysis scope to only 5 functions.
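
The Excel comparison can also be sketched in a few lines of Python. This assumes the addresses printed during each of the three runs were saved as lists. The addresses below are made-up placeholders, not the ones from SampleParser.exe, and the names unique and classify_runs are hypothetical.

```python
def unique(addresses):
    """Drop repeated addresses but keep the order of first appearance."""
    seen = set()
    ordered = []
    for address in addresses:
        if address not in seen:
            seen.add(address)
            ordered.append(address)
    return ordered


def classify_runs(startup_run, base_run, crash_run):
    """Separate start-up noise from the functions hit while opening a file."""
    startup = set(startup_run)
    # Functions that only appear once the base file is opened.
    parsing = [a for a in unique(base_run) if a not in startup]
    # Functions hit while opening the crashing file, before the crash.
    crash_path = [a for a in unique(crash_run) if a not in startup]
    return {
        "parsing": parsing,
        "crash_path": crash_path,
        # Last traced function entered before the crash: a first candidate.
        "last_before_crash": crash_path[-1] if crash_path else None,
    }


# Placeholder addresses, invented for illustration only.
startup_run = [0x1000, 0x1010]
base_run = [0x1000, 0x1010, 0x2000, 0x2010, 0x2020]
crash_run = [0x1000, 0x1010, 0x2000, 0x2010]

result = classify_runs(startup_run, base_run, crash_run)
print([hex(a) for a in result["parsing"]])
print([hex(a) for a in result["crash_path"]])
```

The last function entered before the crash is only a starting point, since the corruption may have happened in one of the earlier functions on the list.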

Fuzzing Facebook for $$$ using Burpy

Fuzzing is an automated, or sometimes semi-automated, software security/bug testing technique that allows us to find different types of bugs with very little effort. It involves providing invalid, unexpected, or random data to the inputs of a computer program. The best thing about fuzzing is that you spend a few sleepless nights writing your fuzzer, and after that you just sleep while your fuzzer brings you bugs (or money sometimes ;)) without any further effort.


Months back I blogged about two XSRF vulnerabilities (this & this) that I found in the Twitter application. Both XSRF bugs were found using a web application fuzzer I developed, which I named Burpy. I've already shared that tool; you can download it from my GitHub page.

The tool is pretty simple, straightforward and flexible. You can easily write your own application-specific modules to perform various test cases. Burpy takes a Burp Suite log as input and parses all the requests and responses from the Burp log XML file. After parsing the log, it performs various tests depending on the modules you've provided. In this tool I've also included a raw HTTP request manipulation library, rawweb.py. You can easily manipulate or modify raw HTTP requests using this library, and it's very simple. In this post, I am going to show how I wrote a Facebook-specific plugin for Burpy to fuzz the Facebook application.

So, as with Twitter, my target on Facebook was to find broken XSRF protection in an automated manner using Burpy.


As I've already said, Burpy takes a Burp Suite log as input. To produce one, I opened up Burp proxy and started surfing the Facebook application randomly with a test account. Meanwhile, Burp Suite was capturing all the requests and responses in the background.

After that, I exported the captured log from Burp. Burpy's job is to parse the Burp log and get a list of all the HTTP requests present in it (the normal client requests, without any modification). Depending on the modules provided, it then modifies/manipulates those raw HTTP requests (using rawweb.py) and re-sends them to the web server. Depending on the responses, Burpy generates an HTML report powered by Twitter Bootstrap. A sample report with a single issue is here.
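
The parsing step can be illustrated with a short sketch. This is not Burpy's own code: it assumes the export is shaped like Burp's saved-items XML, where each item element carries a request element that may be base64-encoded, and extract_requests is a hypothetical name. The sample request and host are placeholders.

```python
import base64
import xml.etree.ElementTree as ET


def extract_requests(xml_text):
    """Return the raw HTTP requests stored in a Burp-style XML export."""
    requests = []
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        node = item.find("request")
        if node is None or node.text is None:
            continue
        raw = node.text
        if node.get("base64") == "true":
            # Assumption: encoded requests are flagged with base64="true".
            raw = base64.b64decode(raw.strip()).decode("latin-1")
        requests.append(raw)
    return requests


# Placeholder request, invented for illustration only.
sample_request = "GET /home.php HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
encoded = base64.b64encode(sample_request.encode("latin-1")).decode("ascii")
sample_xml = (
    '<items><item><request base64="true">%s</request></item></items>' % encoded
)

requests = extract_requests(sample_xml)
print(len(requests))
```

Each string returned this way is one unmodified client request, ready to be handed to whatever module does the manipulation.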

So I wrote a Burpy plugin that removes the XSRF token parameter from every raw HTTP request found in the Burp log and re-fires them at the server one by one. Like Twitter, Facebook behaves in the same manner every time you send a request to the server without the XSRF token (the XSRF token name is constant for the main Facebook application: fb_dtsg). If you send a request without this token, most of the time Facebook will give you back an error saying "Sorry, something went wrong - Please try closing and re-opening your browser window", or it will respond with a 500 Server Error status code.

Now if we remove the XSRF token from each and every request that the client-side code sends to the Facebook server, and we don't find the above-mentioned error in the server response, then we can say there is a chance that XSRF token validation is not properly implemented. But the Facebook application sends thousands of GET and POST requests, so it's almost impossible to do this manually for each and every request.

So I wrote this tiny Burpy plugin to automate this test case. It simply checks whether XSRF token validation is present on the server side by removing the XSRF token from a request and replaying it. Since the Facebook application always returns a generic error message for an XSRF error, the plugin reports a positive whenever this error is absent from the response after the token has been removed.
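
The core of this check can be sketched with plain string handling. The sketch does not use Burpy's plugin interface or rawweb.py. It assumes the token travels as an ordinary name=value pair in the query string or a form-encoded ASCII body, and it leaves out the replay step. The names strip_param and looks_unprotected, and the request path in the example, are made up.

```python
TOKEN = "fb_dtsg"                            # XSRF token name from the post
ERROR_TEXT = "Sorry, something went wrong"   # generic error quoted in the post


def _drop(encoded, name):
    """Remove every name=value pair called `name` from an a=1&b=2 string."""
    pairs = [p for p in encoded.split("&") if p and p.split("=", 1)[0] != name]
    return "&".join(pairs)


def strip_param(raw_request, name=TOKEN):
    """Return the raw request with `name` removed from query string and body."""
    head, sep, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    path, _, query = target.partition("?")
    query = _drop(query, name)
    target = path + ("?" + query if query else "")
    lines[0] = " ".join([method, target, version])
    body = _drop(body, name)
    for index, line in enumerate(lines):
        # Keep Content-Length in sync with the shortened body.
        if line.lower().startswith("content-length:"):
            lines[index] = "Content-Length: %d" % len(body)
    return "\r\n".join(lines) + sep + body


def looks_unprotected(status, response_body):
    """True when the tokenless replay shows neither of the usual errors."""
    return status != 500 and ERROR_TEXT not in response_body


# Hypothetical request, invented for illustration only.
raw = (
    "POST /ajax/example.php?a=1&fb_dtsg=AAA HTTP/1.1\r\n"
    "Host: www.facebook.com\r\n"
    "Content-Length: 21\r\n"
    "\r\n"
    "fb_dtsg=AAA&name=test"
)
stripped = strip_param(raw)
print(stripped)
```

A True result from looks_unprotected only flags a request as suspicious. Each hit still has to be checked by hand.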


After running this 10-15 times I had gathered a few suspicious request/response pairs from the Burpy report. Most of them were technically XSRF but not very critical or harmful. But I was able to track down about 8-10 pretty serious issues.

So from the next day I started reporting them to the Facebook Security Team using their online form.

After a few weeks I started getting responses back from the Facebook security team. Unfortunately, some of the bugs did not qualify for the Facebook Bug Bounty program: a few were previously known to them and a few were not very critical.





etc..
etc..
etc..

Bad Luck..huhhh..:(


But 2013 is not so unlucky. A few bugs did qualify for a reward.......:) :) :)



So at last I've earned some money from Facebook, and the best thing was that the Facebook guys mentioned my name on their White Hat Hall of Fame page, which is definitely a great CV builder.