Sunday, December 10, 2017


In December 2017, my colleague Yakun Zhang and I delivered a talk on VMWare escapes at the Blackhat Europe 2017 Briefings. Blackhat Europe is an annual information security conference; in 2017 it ran from December 4 to December 7 at ExCeL London, located at 1 Western Gateway, London E16 1XL. We talked about reverse engineering VMWare, attacking hypervisor isolation and some virtual machine escape attacks against VMWare.

Talk Abstract

Virtual machine escape is the process of breaking out of the virtual machine and interacting with the host operating system. VMWare recently fixed several bugs in their products that allowed malicious code to escape the sandbox. Some of these issues were exploited and reported during exploitation contests, while others were reported individually by researchers. For obvious reasons, details of these bugs are undisclosed. This paper presents a case study of VMWare VM escape vulnerabilities based on the analysis of different patches released by VMWare in the recent past.

The advisories published by VMWare in the last few months reveal that security researchers are targeting many attack surfaces. To summarize, the attack surfaces are as follows:

A) RPC Request handler.
B) Virtual Printer.
C) VMWare Graphics Implementation.

Looking at the vulnerabilities fixed in the VMWare RPC layer, we see several CVEs (CVE-2017-4901, CVE-2016-7461 etc.) fixing security issues there. This talk will cover the end-to-end RPC implementation in VMWare Workstation, from the VMWare Backdoor in the guest OS to the different RPC command handlers in the host OS. We will uncover some of the fixed bugs in the VMWare RPC layer by performing binary diffing on VMWare Workstation binaries. This talk will also showcase some of the PoCs developed from different VMware Workstation patches.

VMWare's EMF file handler is one of the most popular attack surfaces when it comes to guest-to-host escape. VMSA-2016-0014 fixed several security issues in the EMF file handling mechanism. EMF is the spool file format used for printing by Windows, and an EMF file is composed of many EMR data structures. In VMware, the COM1 port is used by the guest to interact with the host printing proxy. When a request to print an EMF file comes from the guest, TPView.dll on the host renders the page; it holds the actual code that parses every EMR structure in the EMF file. In our talk, we will dive deep into this attack surface and uncover some of the vulnerabilities fixed in this area recently by performing binary diffing on VMWare Workstation binaries.

VMSA-2017-0006 resolved several security vulnerabilities in the Workstation and Fusion graphics implementation that allow guest-to-host escape. These vulnerabilities were mostly present in the VMWare SVGA implementation. In this section of our talk we will cover the implementation of the VMWare virtual GPU by reverse engineering different guest components (vmx_fb.dll - VMware SVGA II Display Driver, vmx_svga.sys - VMware SVGA II Miniport) as well as the host component (vmware-vmx.exe) where the virtualized GPU code lives. The VMware virtual GPU provides several memory ranges which are used by the guest OS to communicate with the emulated device. These memory ranges are the 2D frame buffer and the FIFO memory queue. In the FIFO memory queue, we write the commands that we want the GPU to process. The way VMWare handles and processes these commands is error-prone. This talk will uncover some of these bugs in the SVGA command processing code and try to understand the anatomy of the issues by bin-diffing VMWare binaries.




Sunday, October 8, 2017

BruCON 2017: Browser Exploits? Grab ’em by the Collar!

In October this year I delivered a talk about reliable detection of browser exploits at BruCON 2017. BruCON is an annual security and hacker conference providing two days of an interesting atmosphere for open discussions of critical infosec issues, privacy, information technology and its cultural/technical implications on society. Organized in Belgium, BruCON offers a high quality line up of speakers, security challenges and interesting workshops.

Talk Abstract:

APT has become a hot topic in enterprise IT today. Web browsers are among the software that most often falls victim to APT attacks, and their attack surface is getting bigger every day.

TCP Live Stream Injection is a technique that we have seen abused by various Internet Service Providers and router vendors for decades. Using this technique, ISPs and router vendors intercept HTTP traffic and silently inject arbitrary data into HTTP responses. This is usually done by injecting arbitrary JavaScript code into the actual HTTP response body in real time. When the injected JavaScript code reaches the client browser, it performs various operations such as loading advertisements, information gathering etc.

This paper presents a generic browser exploit detection technique that uses the same Live Network Stream Code Injection technique to reliably catch browser exploits. The detection system is completely agentless and capable of detecting various techniques used in modern browser exploitation. Unlike other host-based intrusion prevention systems, it requires no OS API hooking, DLL injection or code injection in the browser process to generically detect and block browser exploits.

Slides & Video Demos:

Talk Video:



Saturday, May 20, 2017

OpenXMolar - A MS OpenXML Format Fuzzing Framework

i) OpenXMolar v 1.0

OpenXMolar is a Microsoft Open XML file format fuzzing framework, written in Python.

ii) Motivation Behind OpenXMolar

MS OpenXML office files are widely used, and the attack surface is huge due to the complexity of the software that supports the OpenXML format (example: Microsoft Office). Office Open XML files are zipped, XML-based files. I could not find any easy-to-use OpenXML auditing tool or framework on the internet that gives software security auditors a platform on which they can write their own test cases, tweak the internal structure of Open XML files and run fuzz tests.

Hence OpenXMolar was developed. With it, software security auditors can focus only on writing test cases that tweak OpenXML internal files (XML and others), while the framework takes care of the rest, such as unpacking and packing OpenXML files, error handling, etc.

iii) Dependencies

OpenXMolar is written and tested on Python v2.7. OpenXMolar uses the following third-party libraries:

winappdbg / pydbg

A debugger is an essential part of any fuzzer. Open X-Molar supports two Python debuggers: winappdbg and pydbg. Installing pydbg on Windows can be painful, and the pydbg code base is not well maintained, hence winappdbg support was added to Open X-Molar. It's recommended that you use winappdbg.


Since we feed random yet valid data into the target application during fuzzing, the target application reacts in many different ways. It may throw different errors through different pop-up windows. To continue the fuzzing process, the fuzzer must handle these pop-up error windows properly. OpenXMolar uses PyAutoIt, a Python binding for AutoItX3.dll, to suppress the different application pop-up windows.

crash_binning is part of the Sulley framework. It is used to dump crash information and is only required when you have selected pydbg as the debugger.


xmltodict is not a core part of Open X-Molar. The XML string mutation module under FileFormatHandlers\ was written using the xmltodict library.

iv) Architecture:

On a high level, OpenXMolar can be divided into a few components.

This is the core component of the tool and is responsible for many important things, such as the main fuzzing loop.

This component mostly handles processing of OpenXML documents, such as packing and unpacking OpenXML files, mapping them in memory, converting OpenXML documents to Python data structures, etc.

PopUp/Error Message Handlers:

This component suppresses/kills unwanted pop-ups that appear during fuzzing.


An OpenXML file may contain various files like XML files, binary files etc. FileFormatHandlers are basically a collection of mutation scripts, responsible for handling the different files found inside an OpenXML document and mutating them.

One bundled tool decompresses the OpenXML files provided in the folder "OpenXMolar\BaseOfficeDocs\OpenXMLFiles" and outputs a Python list of the files present in each OpenXML file. It accepts comma-separated file extensions, which is useful when you are targeting a specific set of files present in an OpenXML document.

A second tool summarizes the crashes found during the fuzzing process in tabular format. Its output should look like this:

[Screenshot: crash summary table]

v) Configuration File Walk through

The default configuration file is well commented and explains all of its parameters. Please review it thoroughly before running the fuzzer to avoid unwanted errors.

vi) Writing your Open XML internal File Mutation Scripts:

As said earlier, an OpenXML file package may contain various files like XML files, binary files etc. FileFormatHandlers are basically a collection of mutation scripts, responsible for handling the different files found inside an OpenXML document and mutating them. Generating effective test cases is the most important step in any fuzz testing process.

The motive behind OpenXMolar was to give security auditors an easy and flexible platform on which fuzz testers can write their own test cases for OpenXML files. When it comes to effective OpenXML format fuzzing, the main part is how we mutate the different files (*.xml, *.bin etc.) present inside the OpenXML package (which is zip-like). To give users an idea of how file format handlers are written, two file format handlers are provided with this fuzzer; however, they are very dumb in nature and not very effective.

Any file format handler module should have the following structure:

# Import whatever you want.
class Handler():  # The class name should always be 'Handler'
    def __init__(self):
        pass

    def Fuzzit(self, actual_data_stream):
        # A function called Fuzzit must be present in the Handler class
        # and it should return fuzzed data/xml string/whatever.
        # Note: The data types of actual_data_stream and data_after_mutation should always be the same.
        data_after_mutation = actual_data_stream  # replace with your own mutation logic
        return data_after_mutation
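To make the skeleton above more concrete, here is a minimal sketch of a handler that replaces one character of a text part with a long run of 'A's. The mutation strategy, the optional `seed` argument and the filler lengths are my own illustrative choices, not something the framework prescribes, and the sketch assumes the framework passes the part in as a text string and instantiates `Handler` without arguments.

```python
import random


class Handler():  # the class name the framework looks for
    def __init__(self, seed=None):
        # A private RNG makes runs repeatable when a seed is given
        # (an assumption of this sketch, not a framework requirement).
        self.rng = random.Random(seed)

    def Fuzzit(self, actual_data_stream):
        # Replace one randomly chosen character with a long run of 'A's.
        if not actual_data_stream:
            return actual_data_stream
        pos = self.rng.randrange(len(actual_data_stream))
        filler = 'A' * self.rng.choice([16, 256, 4096])
        data_after_mutation = (actual_data_stream[:pos] + filler +
                               actual_data_stream[pos + 1:])
        return data_after_mutation  # same type as the input (a string)
```

A smarter handler would parse the XML and mutate attribute values or element nesting instead of raw characters, but the interface stays the same.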

Once your file format handler module is ready, place the *.py file in the FileFormatHandlers// folder and add the handler entry and its associated file extension to the configuration file.


vii) Adding More Pop-up / Error Window Handlers

The default file provided with Open X-Molar has handlers for a few of the most common pop-up / error windows of MS Word, MS Excel and PowerPoint. Using the AutoIt Window Info tool you can add more pop-up / error window handlers to it. One example is given below.
[Screenshot: PowerPoint error pop-up window]

So, to handle the error pop-up window shown in the screenshot, the following lines need to be added:

if "PowerPoint found a problem with content"  in autoit.win_get_text('Microsoft PowerPoint'):
 autoit.control_click("[Class:#32770]", "Button1")

viii) The First Run

This fuzzer is well tested on 32-bit and 64-bit Windows platforms (32-bit Office process). All the required libraries are distributed with this fuzzer in the 'ExtDepLibs/' folder. Hence, if you have installed Python v2.7, you are good to go.

To verify that everything is in the right place, it is better to run Open X-Molar against the default Microsoft XPS Viewer (C:\Windows\System32\xpsrchvw.exe) the first time. Place any *.oxps file in '\BaseOfficeDocs\OpenXMLOfficeFiles' and run the fuzzer, which accepts one command line argument: the configuration file.


[Warning] Pydbg was not found. Which is required to run this fuzzer. Install Pydbg First. Ignore if you have winappdbg installed.

   ____                    __   ____  __       _
  / __ \                   \ \ / /  \/  |     | |
 | |  | |_ __   ___ _ __    \ V /| \  / | ___ | | __ _ _ __
 | |  | | '_ \ / _ \ '_ \    > < | |\/| |/ _ \| |/ _` | '__|
 | |__| | |_) |  __/ | | |  / . \| |  | | (_) | | (_| | |
  \____/| .__/ \___|_| |_| /_/ \_\_|  |_|\___/|_|\__,_|_|
        | |
        An MS OpenXML File Format Fuzzing Framework.
        Author : Debasish Mandal (

[+] 2017:05:05::23:11:23 Using debugger :  winappdbg
[+] 2017:05:05::23:11:23 POP Up killer Thread started..
[+] 2017:05:05::23:11:24 Loading base files in memory from :  BaseOfficeDocs\UnpackedMSOpenXMLFormatFiles
[+] 2017:05:05::23:11:24 Loading File Format Handler for extension :  xml =>
[+] 2017:05:05::23:11:24 Loading File Format Handler for extension :  rels =>
[+] 2017:05:05::23:11:24 Loading File Format Handler Done !!
[+] 2017:05:05::23:11:24 Starting Fuzzing
[+] 2017:05:05::23:11:25 Temp cleaner started...
[+] 2017:05:05::23:11:25 Cleaning Temp Directory...

ix) Open X-Molar in Action

Here is a very short video on running fuzztest on MS Office Word:

x) Fuzzing Non-OpenXML Applications :

Due to its flexible structure, this fuzzer can also be used to fuzz other Windows applications. You just need to do the following:

  • Add the target application binary (exe) and extension to APP_LIST in the configuration file
  • Change OpenXMLFormat to False in the configuration file
  • Write your own file format mutation handler and place it in the FileFormatHandlers/ folder
  • Add the new FileFormatHandler to FILE_FORMAT_HANDLERS in the configuration file
  • Provide some base files in the OtherFileFormats/ folder
  • Add custom error / pop-up window handlers using the Au3Info tool, if required

And you're good to go.

xi) Few More Points about OpenXMolar:

Fuzzing Efficiency: To maximize fuzzing efficiency, OpenXMolar doesn't read the provided base files from disk again and again. While starting up, it loads all base files into memory, converts them into easy-to-manage Python data structures and mutates them straight from memory.
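The load-once, mutate-in-memory idea can be sketched with nothing but the standard library. This is not OpenXMolar's actual code: the helper names `load_package` and `build_package` and the tiny stand-in package are hypothetical, and the sketch only relies on the fact that an OpenXML file is a zip archive.

```python
import io
import zipfile


def load_package(raw_bytes):
    """Unpack a zip-based package into {member_name: bytes}, entirely in memory."""
    with zipfile.ZipFile(io.BytesIO(raw_bytes)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def build_package(parts):
    """Repack a {member_name: bytes} dict into zip bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name, data in parts.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Build a tiny stand-in package, then mutate copies of it without touching disk.
base = build_package({'word/document.xml': b'<w:t>hello</w:t>'})
parts = load_package(base)        # done once, at start-up
mutated = dict(parts)             # cheap shallow copy for each test case
mutated['word/document.xml'] = parts['word/document.xml'].replace(b'hello', b'A' * 64)
test_case = build_package(mutated)
```

Only the final `test_case` bytes have to be written out for the target application; the base parts stay untouched in memory for the next iteration.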

Auto identification of internal files of an OpenXML package: An Open XML file package may contain various files like XML files, binary files etc. OpenXMolar can identify internal file types and, based on that, choose a mutation script and mutate them. Please refer to the default file (Param : AUTO_IDENTIFY_INTERNAL_FILE_FORAMT) for details.

xii) TODO

Improve Fuzzing Speed
New Feature / Bugs ->

xiii) Licence

This software is licenced under New BSD License although the following libraries are included with Open X-Molar and are licensed separately.

xiv) Source 

The source code is available here :

Thursday, December 24, 2015

IEFuzz - A Static Internet Explorer Fuzzer

Today I'm sharing an IE fuzzer which was developed almost from scratch. Like much other software, browsers can be fuzzed in two ways: a) static and b) dynamic.

Dynamic browser fuzzers are very popular because of their speed, since they are written purely in JavaScript. However, one common problem software security auditors face while fuzzing browsers dynamically is crash reproduction. You have to be very careful while crafting your JS browser fuzzer (placing logging code in the right places), otherwise crashes will not be reproducible.

The other option is a static fuzzer. If you are fuzzing browsers using static test cases, in 99% of cases 'a crash' == 'a reproducible crash'.

How does 'IEFuzz' work?

  1. Launch IE
  2. Attach a debugger (pydbg) to 'iexplore.exe' to monitor any type of crash (both in parent and child processes).
  3. Generate a test case (html + javascript).
  4. Load the test case locally as a file (file://c:/fuzzer/testcases/temp.html) in IE using win32com.
  5. If there is no crash, generate a new html test case and reload it using win32com. (Note: we are not closing and re-opening IE here. We are just refreshing the same page, but the code/content of the page is different every time, which saves a significant amount of time.)
  6. In case of any kind of access violation, save the test case to a separate folder and kill IE completely.
  7. Go to step 1
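The control flow of these steps can be sketched as a small driver loop. This is only an outline under my own assumptions: the four callables are hypothetical stand-ins for the pydbg and win32com work, so the sketch shows the ordering of the steps and not IEFuzz's real code.

```python
def fuzz_loop(generate, load_and_check, save_crash, restart, max_cases):
    """Drive the generate/load/check cycle; every callable is supplied by the caller."""
    crashes = 0
    restart()                          # steps 1-2: launch the target, attach the debugger
    for _ in range(max_cases):
        case = generate()              # step 3: build html + javascript
        if load_and_check(case):       # steps 4-5: load the page, True if it crashed
            save_crash(case)           # step 6: keep the reproducer
            crashes += 1
            restart()                  # step 7: start over with a fresh process
    return crashes


# Dry run with fake callables: any case equal to 'boom' counts as a crash.
cases = iter(['a', 'boom', 'b'])
saved = []
found = fuzz_loop(lambda: next(cases), lambda c: c == 'boom',
                  saved.append, lambda: None, 3)
```

Because the page is only reloaded between non-crashing cases, `restart` runs once at the start and once per crash, which is where the time saving mentioned in step 5 comes from.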

This static IE fuzzer is written in Python, and the following modules were used:

  1. pywin32com - Load / Reload *.html Test Cases
  2. pydbg - Monitor IE for Access Violation / Guard Page Violation.
  3. paimei - For crash dump generation.

Required Configuration Changes in IE

To run this fuzzer you have to make the following changes in IE:

1. This fuzzer loads the test cases locally as .html files (e.g. file://c:/fuzzer/testcases/temp.html), so you must turn off IE's ActiveX warning prompt by following the instructions below.

Tools (menu) -> Internet Options -> Security (tab) -> Custom Level (button) -> Disable Automatic prompting for ActiveX controls.

2. You also need to disable IE protected mode to be able to control Internet Explorer using Python 'win32com'. Please be aware of the risks.

 -> Internet Options -> Security -> Trusted Sites    : Low
 -> Internet Options -> Security -> Internet         : Medium + unchecked Enable Protected Mode
 -> Internet Options -> Security -> Restricted Sites : unchecked Enable Protected Mode

Writing Test Cases:

You can write your own static test case generator for this fuzzer in Python and place it inside the /TestCases folder. For your reference, one sample is provided in 'TestCases/'. While writing test cases, remember that the module should have a 'TestCase' class with a 'getFinalTestCase()' method, which should return the entire html page.

In a dynamic fuzzer, the attributes of different html elements are extracted from the objects and fuzzed on the fly at runtime. Since this is a static fuzzer, we can predefine the html elements and their attributes for our test cases as a Python dict:

attr = {'CANVAS':['height','width','getContext', ... , ... , ... ]}

For this attribute list generation, one JavaScript application is provided here : MiscTools/Generate_Elements_Dict.html
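As an illustration of the interface described above, here is a minimal sketch of a test case module. The `TestCase` class and `getFinalTestCase()` method names come from the text; everything else (the two-element `attr` subset, the value pool, the optional `seed`) is a hypothetical choice of mine, and I am assuming the fuzzer instantiates the class without arguments.

```python
import random

# Hypothetical, hand-written subset of an element/attribute dictionary.
attr = {'CANVAS': ['height', 'width'], 'INPUT': ['value', 'size']}


class TestCase():
    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def getFinalTestCase(self):
        # Pick one element and assign an odd value to each of its attributes.
        tag = self.rng.choice(sorted(attr))
        lines = ['var e = document.createElement("%s");' % tag,
                 'document.body.appendChild(e);']
        for name in attr[tag]:
            value = self.rng.choice([-1, 0, 0x7fffffff, '"' + 'A' * 100 + '"'])
            lines.append('e.%s = %s;' % (name, value))
        script = '\n'.join(lines)
        return '<html><body><script>\n%s\n</script></body></html>' % script
```

Since the whole page is generated up front and saved as a file, any crash comes with the exact html that triggered it.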

Source Code:

Source code of IEFuzz is available for download @ my github page.


This software is licenced under BEER WARE licence although the following libraries are included with 'IEFuzz' and are licensed separately.

Running This Fuzzer:

One video demo is available here, on how to run this fuzzer and reproduce crashes.

Happy Fuzzing :) :)

Tuesday, February 17, 2015

Walking Heap Using Pydbg

I'm a big fan of pydbg. Although it has many awesome features, it also has a few limitations. One of them is the lack of control over the process heap. For a long time I had been thinking of writing something that makes heap manipulation, parsing and traversal using pydbg a little easier for reverse engineers. So last weekend I finally wrote a couple of small Python scripts which can parse Windows 7 process heaps on the fly.
In this blog post I'm going to share one of them.

This is the simplest implementation of HeapWalk() API based on pydbg. Heap walk API enumerates the memory blocks in the specified heap. If you are not very familiar with HeapWalk() API this page has a very good example in C++.

Right now the best available tool for heap analysis is windbg. The script I'm going to share does something similar to windbg's "!heap -a 0xmyheaphandle" command.

You can use the function HeapWalk() [@ Line 103] as a breakpoint handler in your pydbg script. The example below does exactly that.

First, I run an application (on 32-bit Windows 7) which calls the user32!MessageBoxA API somewhere.

After that, I attach my pydbg script to that process, set a breakpoint at user32!MessageBoxA and register HeapWalk() as the breakpoint handler.

Now, whenever the application calls the MessageBoxA API, our breakpoint handler HeapWalk() is invoked and starts traversing all the available process heaps and their segments.

Script 1:

The output of this script will be something similar:

Since this script gives you the addresses and sizes of all heap blocks, you now have more control over the process heap. You should be able to search for strings, data, bytes or pointers in process heaps very easily.

Thank you for reading. Hope you've enjoyed :)


Wednesday, January 28, 2015

qHooK - Not Just a Win32 API Hooking Script

Hello everyone. Hope everyone is doing well. I'm posting something after a long gap. Sometimes it's easier to build something on your own than to find something similar that has already been developed. I'm not sure whether a script or tool that does the same already exists in the wild, but I definitely needed one that could reduce the effort of analysing unknown exploits and shellcode. I developed this tool a year and a half ago, mainly to analyse shellcode, exploits, etc. When I wrote it there was no name; I just gave it the name "qHooK" before writing this post :)

So what does it do?

It's a very simple and straightforward Python script (dependent on pydbg) which hooks user-defined Win32 APIs in any process and prepares a CSV report with various interesting information that can help a reverse engineer track down and analyse unknown exploit samples / shellcode. Please refer to the demo video.

qHooK Final CSV Report:

Video Demo(With Voice):

I guess I've become pretty lazy so I'm not going to write each and every thing about this script. Just adding a video demo (with voice) which explains few real life scenarios, where it may help you. Sorry about my weak voice. My laptop mic sucks :( cant help.

Source Code:

Saturday, July 5, 2014

Releasing Stupid v0.1 - The Dumbest File Format Fuzzer (Python+Pydbg)

I developed Stupid in late 2011 to automate fuzzing and fault detection for different file formats (mainly music/video players etc.). I've been receiving many emails from readers asking me to release a PoC of a Python + pydbg fuzzer, so today I'm very happy to make this small yet effective fuzzer open to everyone. It is very much a prototype, and I recommend rewriting or modifying the test case generator subroutine to make the fuzzer more effective.

Happy fuzzing guys. If you are lucky enough to find any zero day using this fuzzer, you can drop me a TY email or buy me a beer in return if we meet someday :)

Source Code:

Stupid source code is available @


This software is licenced under a Beerware licence although the following libraries are included with Stupid and are licensed separately.

  • pydbg
  • paimei -

Running this Fuzzer:

Stupid was developed and tested with Win32 Python 2.7 (x86), so it's recommended to use the same version of Python. Also make sure pydbg (x86) is installed on the system.

You need to provide the target application binary path (.exe) and at least one base file to run this fuzzer. You can modify the configuration section as per your requirements.

Test Case Generation:

The mutate() routine is responsible for generating test cases from the given base files. It has two sub-parts:
  • Bitflip
  • Random Byte Flip

You may want to change / modify these routines to make this fuzzer more effective. ;)
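For readers who want a starting point, the two mutation styles can be sketched as below. This is not the mutate() routine shipped with Stupid; the function names, the `count` parameter and the explicit `rng` argument are assumptions made for this example.

```python
import random


def bit_flip(data, rng):
    """Flip one randomly chosen bit in a copy of data."""
    buf = bytearray(data)
    if buf:
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def random_byte_flip(data, rng, count=4):
    """Overwrite up to `count` randomly chosen bytes with random values."""
    buf = bytearray(data)
    for _ in range(min(count, len(buf))):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


rng = random.Random(2014)            # fixed seed so a run can be repeated
base = bytes(64)                     # stand-in for the contents of a base file
case_a = bit_flip(base, rng)
case_b = random_byte_flip(base, rng)
```

Both functions keep the file length unchanged, which tends to get further into a parser than truncation does; format-aware mutations are where the real improvements would come from.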


To monitor the target application for different types of crashes (access violations), Stupid uses pydbg (Python debugger). It also uses "utils" of framework to collect crash information, which can be used later to identify and distinguish interesting app crashes. A sample crash synopsis file is shown below.

Reproducing Crashes:

Crash files and crash information can be found in "Crashes" folder which can be used to reproduce app crashes.

Saturday, April 19, 2014

Attacking Audio "reCaptcha" using Google's Web Speech API

I had a fun project some months back, where I had to deal with digital signal processing and low-level audio processing. I was never interested in DSP and other control-system stuff, but when the question is about breaking things, everything becomes interesting :) . In this post I'm going to share one technique to fully or partially bypass the reCaptcha test. This is not actually a vulnerability; it's better to call it "abuse of functionality".

Disclaimer : Please remember this information is for Educational Purpose only and should not be used for malicious purpose. I will not assume any liability or responsibility to any person or entity with respect to loss or damages incurred from information contained in this article.

1. What is Captcha

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. The term CAPTCHA (for Completely Automated Public Turing Test To Tell Computers and Humans Apart) was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University.

2. What is Re-captcha

reCAPTCHA is a free CAPTCHA service by Google, that helps to digitize books, newspapers and old time radio shows. More details can be found here.

3. Audio reCaptcha

reCAPTCHA also comes with an audio test to ensure that blind users can freely navigate.

4. Main Idea: Attacking Audio reCaptcha using Google's Web Speech API Service

5. Google Web Speech API

Chrome has a really interesting new feature: the HTML5 speech input API. Using it, a user can talk to the computer through a microphone and Chrome will interpret the speech. This feature is also available on Android devices. If you are not aware of this feature, you can find a live demo here.

I was always very curious about Chrome's speech recognition API. I tried to sniff the API/voice traffic using Wireshark, but this API uses SSL. :(

So I started browsing the Chromium source code repo, and finally found exactly what I wanted.

It's pretty simple. First the audio is collected from the mic and posted to a Google web service, which responds with a JSON object containing the results. The URL which handles the request is :

Another important thing is that this API only accepts the flac audio format.

6. Programmatically Accessing Google Web Speech API (Python)

The Python script below was written to send a flac audio file to the Google Web Speech API and print out the JSON response.

./ hello.flac

Accessing Google Web Speech API using Python
Author : Debasish Mandal


import httplib
import sys

print '[+] Sending clean file to Google voice API'
f = open(sys.argv[1], 'rb')  # FLAC is binary data, so read it in binary mode
data = f.read()
f.close()
google_speech = httplib.HTTPConnection('')
google_speech.request('POST','/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US',data,{'Content-type': 'audio/x-flac; rate=16000'})
print google_speech.getresponse().read()

7. Thoughts on complexity of reCaptcha Audio Challenges

While dealing with audio reCaptcha, you may have noticed that it gives basically two types of audio challenges. One is pretty clean and simple; the percentage of noise is very low in this type of challenge.

The other one is very noisy and difficult even for a human to guess. A constant hiss and overlapping voices make it really hard to crack, even for a human. You may want to read this discussion on the complexity of audio reCaptcha.

In this post I will mainly cover the technique / tricks to solve the easier one using the Google Speech API. I've tried several approaches to solve the complex one, but as I've already said, it's very hard to guess the digits even for a human :( .

8. Cracking the Easy Captcha Manually Using Audacity and Google Speech API

Google reCaptcha allows users to download audio challenges in mp3 format, and the Google Web Speech API accepts audio in flac format. If we simply convert the mp3 audio challenge to flac format with a frame rate of 16000, it does not work :( . The Google Chrome speech-to-text API does not respond to this sound.

But after some experiments and head scratching, it was found that we can actually make the Google Web Speech API convert the easy captcha challenge to text for us if we process the audio challenge a little. In this section I will show how this audio manipulation can be done using Audacity.

To verify this manually, I'm first going to use a tool called Audacity to make the necessary changes to the downloaded mp3 file.

Step 1: Download the challenge as mp3 file.
Step 2: Open the challenge audio in Audacity.

Step 3: Copy the sound of the first spoken digit from the main window and paste it into a new window. Now we have only one spoken digit.

Step 4: From the Effect menu, repeat it once. (Now it should speak the same digit twice.)

For example, if the main challenge is 7 6 2 4 6, we now have audio containing only the first digit, 7, spoken twice.

Step 5: Export the updated audio in WAV format.
Step 6: Now convert the wav file to flac format using the sox tool and send it to the Google speech server using the Python script posted in section 6. We will see something like this.

Note: In some cases a little amplification might be required if the voice is too quiet.

debasish@debasish ~/Desktop/audio/heart attack/final $ sox cut_0.wav -r 16000 -b 16 -c 1 cut_0.flac lowpass -2 2500
debasish@debasish ~/Desktop/audio/heart attack/final $ python cut_0.flac 

Great! As you can see, the first digit of the audio challenge has been resolved by Google Speech. :) :) :) In the same manner we can solve the entire challenge. In the next section we will automate the same thing using Python and its wave module.
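The copy-and-repeat edit from steps 3 and 4 can be approximated with Python's wave module once the challenge is available as a WAV file. This is a minimal sketch: the helper name `repeat_segment` is hypothetical, and the start and end frame offsets are assumed to be already known (finding where each digit starts is a separate problem).

```python
import io
import wave


def repeat_segment(src, dst, start_frame, end_frame, times=2):
    """Copy frames [start_frame, end_frame) from src and write them `times` times to dst."""
    with wave.open(src, 'rb') as win:
        params = win.getparams()
        win.setpos(start_frame)
        segment = win.readframes(end_frame - start_frame)
    with wave.open(dst, 'wb') as wout:
        wout.setparams(params)             # the frame count in the header is fixed up on close
        wout.writeframes(segment * times)


# Self-contained demo on an in-memory WAV: 100 mono 16-bit frames at 16 kHz.
src = io.BytesIO()
with wave.open(src, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(bytes(range(200)))       # 200 bytes = 100 two-byte frames
src.seek(0)
dst = io.BytesIO()
repeat_segment(src, dst, 10, 30)           # frames 10..29, written twice
```

With real files, `src` and `dst` can simply be file names, and the resulting WAV can be converted to flac with sox as shown above.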

9. Automation using Python and its WAVE Module

Before we jump into processing raw WAV audio using the low-level Python API, it's important to have some idea of how digital audio actually works. In the process above we extracted the loudest voices using Audacity, but to do it automatically using Python, we must have some understanding of how digital audio is actually represented in numbers.

9.1. How is audio represented with numbers

There is an excellent stackoverflow post which explains this. In short, audio is nothing but vibration. Typically, we're talking about vibrations of air between approximately 20Hz and 20,000Hz, which means the air is moving back and forth 20 to 20,000 times per second. If we measure that vibration and convert it to an electrical signal using a microphone, we get an electrical signal whose voltage varies in the same waveform as the sound. In our pure-tone hypothetical, that waveform will match that of the sine function.

Now, we have an analogue signal, the voltage. Still not digital. But we know this voltage varies between (for example) -1V and +1V. We can, of course, attach a volt meter to the wires and read the voltage. Arbitrarily, we'll change the scale on our volt meter: we'll multiply the volts by 32767. It now calls -1V -32767 and +1V 32767. Oh, and it'll round to the nearest integer.

Now that we have a set of signed integers, we can easily draw a waveform from the data:

X axis -> Time
Y axis -> Amplitude (signed integers)

Now, we attach our volt meter to a computer and instruct the computer to read the meter 44,100 times per second. Add a second volt meter (for the other stereo channel), and we have the data that goes on an audio CD. This format is called stereo 44,100 Hz, 16-bit linear PCM. And it really is just a bunch of voltage measurements.
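The volt meter analogy can be tried out in a few lines of Python. This is a sketch under simple assumptions: a pure sine tone peaking at 1V, one channel, and a helper name (`sine_pcm16`) that I made up for the example.

```python
import math


def sine_pcm16(freq_hz, duration_s, rate=44100, peak_volts=1.0):
    """Sample a sine 'voltage' `rate` times per second and scale it to 16-bit integers."""
    n = int(round(rate * duration_s))
    samples = []
    for i in range(n):
        volts = peak_volts * math.sin(2 * math.pi * freq_hz * i / rate)
        samples.append(int(round(volts * 32767)))   # -1V..+1V maps to -32767..32767
    return samples


samples = sine_pcm16(441, 0.01)   # 0.01 s of a 441 Hz tone, i.e. 441 meter readings
```

A 441 Hz tone repeats every 100 readings at this rate, so the list climbs from 0 to the positive peak a quarter of a period later, falls to the negative peak, and so on. Packing these integers as little-endian 16-bit values gives exactly the kind of frames a WAV file stores.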

9.2. WAVE File Format walk through using Python

As an example, let's open up a very small wav file with a hex editor.


9.3. Parsing the same WAV file using Python

The wave module provides a convenient interface to the WAV sound format. It does not support compression/decompression, but it does support mono/stereo. Now we are going to parse the same wav file using the Python wave module and relate the result to what we have just seen in the hex editor.

Let's write a python script:

import wave
f = wave.open('sample.wav', 'r')
print '[+] WAV parameters ',f.getparams()
print '[+] No. of Frames ',f.getnframes()
for i in range(f.getnframes()):
    single_frame = f.readframes(1)
    print single_frame.encode('hex')
f.close()

Line 1: Imports the Python wave module.
Line 2: Opens up the sample.wav file.
Line 3: getparams() returns a tuple (nchannels, sampwidth, framerate, nframes, comptype, compname), equivalent to the output of the get*() methods.
Line 4: getnframes() returns the number of audio frames.
Line 5,6,7: Iterates through all the frames present in the sample.wav file and prints them one by one.
Line 8: Closes the opened file.

Now if we run the script we will find something like this:

[+] WAV parameters (1, 2, 44100, 937, 'NONE', 'not compressed')
[+] No. of Frames 937
[+] Sample 0 = 62fe    <- Sample 1
[+] Sample 1 = 99fe   <- Sample 2
[+] Sample 2 = c1ff    <- Sample 3
[+] Sample 3 = 9000
[+] Sample 4 = 8700
[+] Sample 5 = b9ff
[+] Sample 6 = 5cfe
[+] Sample 7 = 35fd
[+] Sample 8 = b1fc
[+] Sample 9 = f5fc
[+] Sample 10 = 9afd
[+] Sample 11 = 3cfe
[+] Sample 12 = 83fe
[+] ....
and so on,

It should make sense now. The first line gives the number of channels, sample width, frame/sample rate, total number of frames and so on, which is exactly what we saw in the hex editor (Section 9.2). After that it starts printing the frames/samples, which also match what we have seen in the hex editor. Each sample is 2 bytes long because the audio is 16-bit; with 8-bit audio each sample would be only one byte. We can use the getsampwidth() method to determine this. Also, getnchannels() tells us whether it's mono or stereo.

Now it's time to decode each frame of that file. The samples are stored little-endian, so we will modify the Python script a little to get the exact value of each frame. We can use the Python struct module to decode the frame values into signed integers.

import wave
import struct

f = wave.open('sample.wav', 'r')
print '[+] WAV parameters', f.getparams()
print '[+] No. of Frames', f.getnframes()
for i in range(f.getnframes()):
    single_frame = f.readframes(1)
    sint = struct.unpack('<h', single_frame)[0]  # little-endian signed 16-bit
    print '[+] Sample', i, '=', single_frame.encode('hex'), '->', sint
f.close()

This script will print something like this:

[+] WAV parameters (1, 2, 44100, 937, 'NONE', 'not compressed')
[+] No. of Frames 937
[+] Sample 0 = 62fe -> -414
[+] Sample 1 = 99fe -> -359
[+] Sample 2 = c1ff -> -63
[+] Sample 3 = 9000 -> 144
[+] Sample 4 = 8700 -> 135
[+] Sample 5 = b9ff -> -71
[+] Sample 6 = 5cfe -> -420
[+] Sample 7 = 35fd -> -715
[+] Sample 8 = b1fc -> -847
[+] Sample 9 = f5fc -> -779
[+] Sample 10 = 9afd -> -614
[+] Sample 11 = 3cfe -> -452
[+] Sample 12 = 83fe -> -381
[+] Sample 13 = 52fe -> -430
[+] Sample 14 = e2fd -> -542

Now we can see that we have a set of positive and negative integers. You should be able to connect the dots with what I explained in section 9.1.

If we plot these positive and negative values on a graph, we will find the complete waveform. Let's do it using the Python matplotlib module.

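import wave
import struct
import matplotlib.pyplot as plt

data_set = []
f = wave.open('sample.wav', 'r')
print '[+] WAV parameters', f.getparams()
print '[+] No. of Frames', f.getnframes()
for i in range(f.getnframes()):
    single_frame = f.readframes(1)
    sint = struct.unpack('<h', single_frame)[0]
    data_set.append(sint)  # collect every decoded sample
f.close()

plt.plot(data_set)  # X axis: sample index (time), Y axis: amplitude
plt.show()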

This should produce the following graph:

You must be familiar with this type of graph. This is what you see on SoundCloud, though theirs is definitely more complex.

So now we have a clear understanding of how audio is represented in numbers. This will make it easier to understand how the Python script in the next section actually works.

9.4. Python Script

In this section we will develop a script which automates the steps we did using Audacity in Section 8. The Python script below tries to extract the loud voices from the input WAV file and writes each one to a separate WAV file.

Once the main challenge is broken into parts, we can easily convert them to FLAC format and send each part of the challenge to the Google Speech API using the Python script shared in section 6.

9.5. Demo:

10. Attempt to Crack the Difficult(noisy) audio challenge

So we have successfully broken the easy challenge. Now it's time to give the difficult one a try, so I started with one noisy captcha challenge. You can see the matplotlib plot of that noisy audio challenge below.

In the above figure we can see the presence of a constant hiss. One of the standard ways to analyze sound is to look at the frequencies that are present in a sample. The standard way of doing that is with a discrete Fourier transform, computed using the fast Fourier transform (FFT) algorithm. What this basically does, in this case, is take a sound signal and isolate the frequencies of the sine waves that make up that sound.

10.1. Signal Filtering using Fourier Transform

Let's get started with a simple example. Consider a signal consisting of a single sine wave, s(t)=sin(w∗t). Let the signal be subject to white noise n which is added in during measurement, Smeasured(t)=s(t)+n. Let F be the Fourier transform of Smeasured. By setting the value of F to zero for frequencies above and below w, the noise can be reduced. Let Ffiltered be the filtered Fourier transform. Taking the inverse Fourier transform of Ffiltered yields Sfiltered(t).

The way to filter that sound is to set the amplitudes of the FFT values around X Hz to 0. In addition to filtering this peak, it's better to remove the frequencies below the human hearing range and above the normal human voice range. Then we recreate the signal via an inverse FFT.
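The filtering idea above can be sketched in a few lines of Python 3. This is not one of the scripts from the project; it is a minimal illustration that uses a naive O(n²) discrete Fourier transform so that it runs with the standard library only (in practice an FFT routine such as numpy.fft would be used instead). The function names, the 800 Hz sample rate and the 50 Hz / 300 Hz test tones are all assumptions made up for the example.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform (an FFT computes the same values faster)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Inverse transform, returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def band_pass(signal, sample_rate, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz], then transform back."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bin k and bin n-k describe the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if freq < low_hz or freq > high_hz:
            spectrum[k] = 0
    return inverse_dft(spectrum)

# Toy signal: a 50 Hz tone plus a weaker 300 Hz "hiss", sampled at 800 Hz.
sample_rate = 800
n = 80
clean = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(n)]
noisy = [c + 0.5 * math.sin(2 * math.pi * 300 * t / sample_rate)
         for t, c in enumerate(clean)]

filtered = band_pass(noisy, sample_rate, 40, 60)
print(max(abs(a - b) for a, b in zip(filtered, clean)))  # residual error after filtering
```

In this toy case the unwanted tone sits in a single bin, so the filter removes it almost perfectly. Real captcha audio is much messier, because hiss and overlapping voices share frequencies with the digits we want to keep.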

I have written a couple of scripts which successfully remove the constant hiss from the audio file, but the main challenge is the overlapping voices. Overlapping voices make it very difficult even for a human to guess the digits. Although I was not able to crack any of the given difficult challenges using the Google Speech API, I've still shared a few noise removal scripts (using the Fourier transform).

These scripts can be found on the GitHub project page. There is tons of room for improvement in all of these scripts.

11. Code Download

All the code I've written during this project is hosted here:

12. Conclusion

When I reported this issue to the Google security team, they confirmed that this mechanism is working as intended. The more difficult audio patterns are triggered only when abuse/non-human interaction is suspected. So, as per the email communication, nothing is going to be changed to stop this.

Thanks for reading. I hope you have enjoyed it. Please drop me an email/comment in case of any doubt or confusion.

13. References

Sunday, March 16, 2014

In-Memory Kernel Driver(IOCTL)Fuzzing using Python

I'm sharing one of my kernel driver IOCTL fuzzers, which operates completely from user land. To run this script you need to know at least one process which sends IOCTLs to the target device you are fuzzing.

This script is very simple and straightforward. It basically operates in two modes: an in-memory fuzzing mode and a logging mode.

In fuzzing mode it attaches itself to the given user mode process and hooks DeviceIoControl!Kernel32. After that, whenever DeviceIoControl is called by the process, it fuzzes the input/output buffer length, input buffer content etc. in memory, and at the same time logs the actual and mutated buffer length/content to an XML log file, which can be helpful while reproducing OS crashes.

When running in logging mode it tries to dump every I/O control code, I/O buffer pointer and I/O buffer length that the given process is sending to the kernel mode device. This XML log can be used to fuzz any driver further.
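To give a feel for what "mutate and log" means here, below is a small standalone Python 3 sketch. It is not the code of the tool itself and does no hooking at all; it only shows one possible way to mutate an intercepted input buffer and record both versions as XML. The function names, the mutation strategy, the XML layout and the example buffer are all assumptions made for illustration.

```python
import random
import xml.etree.ElementTree as ET

def mutate_request(in_buf, rng):
    """Return (mutated_bytes, claimed_length) for one intercepted request (hypothetical)."""
    mutated = bytearray(in_buf)
    # Overwrite roughly a quarter of the bytes with random values.
    for _ in range(len(mutated) // 4):
        mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    # Pick a length that may deliberately disagree with the real buffer size.
    claimed_len = rng.choice([0, max(0, len(mutated) - 1), len(mutated) + 1, 0xFFFFFFFF])
    return bytes(mutated), claimed_len

def log_entry(ioctl_code, in_buf, mutated, claimed_len):
    """Build one XML record holding the original and the mutated request."""
    entry = ET.Element('ioctl', code=hex(ioctl_code))
    ET.SubElement(entry, 'original', length=str(len(in_buf))).text = in_buf.hex()
    ET.SubElement(entry, 'mutated', length=str(claimed_len)).text = mutated.hex()
    return ET.tostring(entry, encoding='unicode')

rng = random.Random(1234)                 # fixed seed so a crash can be replayed
original = bytes.fromhex('32000000ee020000')  # example 8-byte input buffer
mutated, claimed_len = mutate_request(original, rng)
record = log_entry(0x10000, original, mutated, claimed_len)
print(record)
```

Seeding the random generator and logging the original next to the mutated data is what makes a crash reproducible later: the same record can be replayed against the driver.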


This tool can be downloaded from my GitHub page: iofuzz

Source Code:

Friday, February 28, 2014

Reversing A Tiny Built-In Windows Kernel Module [Journey from Kernel32 to HAL]

Hello readers. Hope you are doing great. In this post I am going to explore our very own Windows kernel a little bit by reverse engineering a built-in kernel module. If you have ever developed a kernel driver/module for Windows, it will be very easy for you to follow. If you are not very familiar with how device drivers work, there are some really good resources to get started with. So let's get started.

But before we start reversing the core component, take a look at the diagram below.

In the above picture, the green elements are user mode components (Ring3). The diagram shows how apps.exe (a user mode application) calls the kernel mode driver. We will try to cover each and every section of the diagram and reverse them, to understand how things work.

Choosing a Target Driver to Reverse:

It's always better to start with something simple. In this article we will reverse the Beep driver. The Beep driver component provides the beep driver in the beep.sys file. This component also provides some supporting registry information. This is probably the smallest built-in kernel module in Windows. It has only 6 routines.

First Step : Building apps.exe

We will start with section 1, using some very basic C code. [beep.c]

#include <windows.h>

int main(){
     Beep( 50, 750 );  /* frequency in Hz, duration in ms */
     return 0;
}

The code is very simple and straightforward. You can see it's calling the Beep function, which produces a beep sound. The Beep function resides in Kernel32.dll. After compiling the code you should get a beep.exe file. Running beep.exe should generate a beep sound.

BOOL WINAPI Beep(
  _In_  DWORD dwFreq,
  _In_  DWORD dwDuration
);

Locating the Driver\Device:

From the screenshot below we can see that the Beep driver has created a device called "Beep". You can also see a lot of other information, such as the major functions supported by this driver.

Sniffing all I/O Request Packets (IRP) to Beep Device using IRP Tracker utility:

IRP Tracker is a very cool and powerful utility. It can sniff the Ring3/Ring0 gateway and show details of the messages passed between a user mode process and a kernel driver. Using this tool we are going to sniff all the requests which beep.exe is actually sending from ring3 to ring0 to produce the beep sound.

OK, to start sniffing we need to give the tool the name of the driver we want to sniff. Go to File and select driver, then provide the driver name. In this case we are only interested in sniffing the "Beep" driver. Now we are sniffing all messages between user and kernel, so it's time to execute the beep.exe we just compiled. When you execute beep.exe you will see a few new entries in the IRP Tracker window.

Now if you look at the IRP Address Sequence Number column, you will see the first entry is NtCreateFile() and the last entry is NtClose(). In between them you can see NtDeviceIoControlFile() getting called. If you look at the major function column you will find "DEVICE_CONTROL". If you look in MSDN you will find that this API is used to send IOCTL codes from user land to a kernel driver.

NTSTATUS WINAPI NtDeviceIoControlFile(
  _In_   HANDLE FileHandle,
  _In_   HANDLE Event,
  _In_   PIO_APC_ROUTINE ApcRoutine,
  _In_   PVOID ApcContext,
  _Out_  PIO_STATUS_BLOCK IoStatusBlock,
  _In_   ULONG IoControlCode,
  _In_   PVOID InputBuffer,
  _In_   ULONG InputBufferLength,
  _Out_  PVOID OutputBuffer,
  _In_   ULONG OutputBufferLength
);

The IRP Tracker utility can also show us the IOCTL code the user mode application is sending to the kernel driver.

We can see it's sending IOCTL code 0x10000 (BEEP_SET) to that device. Keep this in mind. We are going to come back to this in a minute.
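As a side note, an IOCTL code is not an arbitrary number. Windows builds it with the CTL_CODE macro from four fields: device type, required access, function number and transfer method. The Python 3 sketch below splits 0x10000 back into those fields; the helper name decode_ioctl is made up for this example, and the bit layout is the one used by CTL_CODE in the Windows headers as far as I know.

```python
def decode_ioctl(code):
    """Split a 32-bit IOCTL code into its CTL_CODE fields (hypothetical helper)."""
    return {
        'device_type': (code >> 16) & 0xFFFF,  # bits 31..16
        'access': (code >> 14) & 0x3,          # bits 15..14
        'function': (code >> 2) & 0xFFF,       # bits 13..2
        'method': code & 0x3,                  # bits 1..0
    }

fields = decode_ioctl(0x10000)
print(fields)
```

For 0x10000 only the device type field is non-zero: device type 1 (FILE_DEVICE_BEEP in the Windows headers), function 0, method 0 (METHOD_BUFFERED) and access 0 (FILE_ANY_ACCESS).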

Reversing Beep() [Kernel32.dll]

To reverse the Beep routine, let's load Kernel32.dll in IDA Pro. After it's loaded, let's jump into the Beep routine, and you should see something like this.

We can see it's trying to communicate with the Beep device, \\Device\\Beep, by calling NtCreateFile. When communicating with a kernel mode driver, any user land application uses NtCreateFile. If it's successful, this function returns a handle to the target device. Using that handle we can read from / write to that device. We will come to this later.

If we go a little further inside Beep!Kernel32, we see it verifying a few of the parameters passed to it, after which it calls NtDeviceIoControlFile().

I hope you are able to connect the dots now. If you remember, it's the same sequence of function calls you saw in the IRP Tracker utility. We already know that NtDeviceIoControlFile is used for sending IOCTL codes to a kernel driver.

Reversing ntdll.dll [NtDeviceIoControlFile()]

Now we will have a closer look at the call to NtDeviceIoControlFile. We have seen in MSDN that the 6th parameter of NtDeviceIoControlFile is the IOCTL code.

NTSTATUS WINAPI NtDeviceIoControlFile(
  _In_   HANDLE FileHandle,
  _In_   HANDLE Event,
  _In_   PIO_APC_ROUTINE ApcRoutine,
  _In_   PVOID ApcContext,
  _Out_  PIO_STATUS_BLOCK IoStatusBlock,
  _In_   ULONG IoControlCode,
  _In_   PVOID InputBuffer,
  _In_   ULONG InputBufferLength,
  _Out_  PVOID OutputBuffer,
  _In_   ULONG OutputBufferLength
);

Let's verify that in the NtDeviceIoControlFile function routine.

Hope you are able to connect the dots now. In the above image you can see that it first loads 10000h into the ebx register and then passes it to NtDeviceIoControlFile(). This is the same IOCTL code we saw in the IRP Tracker utility.

Now let's attach Immunity Debugger to beep.exe and set a breakpoint at NtDeviceIoControlFile(). After setting the breakpoint, if we continue execution, we should break at this point, as shown in the screenshot below.

If you look at the entry point of the NtDeviceIoControlFile() routine, you will see the instructions below.


From this sequence of instructions we can understand that it's probably going for a system call. If we go further and follow the CALL DWORD PTR DS:[EDX] instruction, we get something like this.


From this INT 2E it's now absolutely clear that it's going to invoke a software interrupt. EAX holds 0x42 at this point, so we can say this is the system call number. We can verify this with any SSDT dumping utility. In this case I've used the IceSword tool to dump the System Service Descriptor Table.

You can see system call 0x42 points to the kernel version of NtDeviceIoControlFile(), and it resides in the main Windows kernel component, ntoskrnl.exe. After the interrupt is invoked, the OS switches to kernel mode to execute a system service. KiSystemService is going to take the call.
The ‘int’ instruction causes the CPU to execute a software interrupt, i.e. it will go into the Interrupt Descriptor Table at index 2e and read the Interrupt Gate Descriptor at that location. The CPU switches automatically to the kernel-mode stack and saves the user-mode program’s SS, ESP, EFLAGS, CS and EIP registers on it.

More Details Here

Breaking the Beep Driver(Beep.sys):

So far we have seen how a user land application sends a request to the kernel driver. Now we will see how the kernel driver actually processes the request and acts accordingly. For this we will reverse Beep.sys, which is the main driver PE file.

After loading the driver into IDA you should first see the DriverEntry subroutine. We know DriverEntry is the first routine called after a driver is loaded. Since it's responsible for initializing the driver, we should find all the IOCTL handler functions in DriverEntry. The first thing you will see is IoCreateDevice() getting called.

NTSTATUS IoCreateDevice(
  _In_      PDRIVER_OBJECT DriverObject,
  _In_      ULONG DeviceExtensionSize,
  _In_opt_  PUNICODE_STRING DeviceName,
  _In_      DEVICE_TYPE DeviceType,
  _In_      ULONG DeviceCharacteristics,
  _In_      BOOLEAN Exclusive,
  _Out_     PDEVICE_OBJECT *DeviceObject
);

The IoCreateDevice() routine creates a device object for use by a driver. You should see a call to this function in every driver's DriverEntry routine.

Now we have successfully created our \Device\Beep device. Going further into DriverEntry we get a structure like this.

Equivalent C code will be something like this:

DriverObject->DriverStartIo = sub_1051A;
DriverObject->DriverUnload = DriverUnload;
DriverObject->MajorFunction[0] = sub_1046A;
DriverObject->MajorFunction[2] = sub_104B8;
DriverObject->MajorFunction[14] = sub_10400;
DriverObject->MajorFunction[18] = sub_10354;

More practical idea about IRP Major Functions can be found here

Digging further into all the IRP handlers mentioned above (sub_xxxx), I found that sub_10354 is actually responsible for handling all IoControls. So we can conclude:

DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = Beephandler;
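When reading such a listing it helps to translate the numeric MajorFunction indexes into their IRP_MJ_* names. The small Python 3 sketch below does that for the four indexes seen above. The numeric values are the ones defined in the WDK headers as far as I know; the dictionaries and variable names are just for illustration.

```python
# IRP major function codes (values as defined in the WDK headers),
# limited to the indexes that appear in the DriverEntry listing.
IRP_MAJOR_FUNCTIONS = {
    0x00: 'IRP_MJ_CREATE',
    0x02: 'IRP_MJ_CLOSE',
    0x0E: 'IRP_MJ_DEVICE_CONTROL',
    0x12: 'IRP_MJ_CLEANUP',
}

# Handlers as they appear in the decompiled DriverEntry, keyed by index.
handlers = {0: 'sub_1046A', 2: 'sub_104B8', 14: 'sub_10400', 18: 'sub_10354'}

labelled = {index: (IRP_MAJOR_FUNCTIONS[index], name) for index, name in handlers.items()}
for index in sorted(labelled):
    print('MajorFunction[%2d] %-22s -> %s' % ((index,) + labelled[index]))
```

Going by these header values, index 14 is IRP_MJ_DEVICE_CONTROL and index 18 is IRP_MJ_CLEANUP, so it is worth double-checking in your own disassembly which sub_ routine sits at which index before naming the handlers.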

Now let's jump into sub_10354 in IDA Pro and have a look at what it's trying to do. Inside sub_10354 you should see something like this.

In the bigger picture you should notice calls to the functions below.

KeRemoveDeviceQueue, KfRaiseIrql, IoAcquireCancelSpinLock, IoReleaseCancelSpinLock etc. Right now we are not very interested in any of the above functions. If you are interested you can explore MSDN. The call we are interested in is HalMakeBeep.

Call To HalMakeBeep

Reversing the HAL.dll:

Now we will jump into the HalMakeBeep routine, which resides in hal.dll.

The Windows Hardware Abstraction Layer (HAL) is implemented in Hal.dll. Hardware abstractions are sets of routines in software that emulate some platform-specific details, giving programs direct access to the hardware resources. They often allow programmers to write device-independent applications by providing standard Operating System (OS) calls to hardware. Each type of CPU has a specific instruction set architecture or ISA. One of the main functions of a compiler is to allow a programmer to write an algorithm in a high-level language without having to care about CPU-specific instructions. Then it is the job of the compiler to generate a CPU-specific executable. The same type of abstraction is made in operating systems, but OS APIs now represent the primitive operations of the machine, rather than an ISA. This allows a programmer to use OS-level operations (i.e. task creation/deletion) in their programs while still remaining portable over a variety of different platforms. [Source : Wiki]

So to look at the assembly of HalMakeBeep we have to load hal.dll in IDA. After loading hal.dll we jump into the HalMakeBeep routine. You should see a lot of inline assembly inside this function.

Every PC has an internal speaker. It can generate beeps of different frequencies. We control the speaker by providing a frequency number which defines the pitch of the beep, then turning the speaker on for the duration of the beep.

Here the frequency number we provide is nothing but a counter value. The computer uses it to determine how long to wait between sending pulses to the internal speaker. More clearly, a smaller frequency number causes the pulses to be sent quicker, resulting in a higher pitch. The frequency number tells the PC how many timer clock cycles to wait before sending another pulse.

We communicate with the speaker controller mainly using IN and OUT instructions. Below are the steps for generating a beep:

  1. First we need to send the value 182 to port 43h. This will actually set the speaker up.
  2. Next thing is, sending the frequency number to port 42h. Since this is an 8-bit port, we must use two OUT instructions to do this. Send the least significant byte first, then the most significant byte.
  3. After that, to start the beep sound, bits 1 and 0 of port 61h must be set to 1. Since the other bits of port 61h have other uses, they must not be modified. Therefore, you must use an IN instruction first to get the value from the port, then do an OR to set the two bits, then use an OUT instruction to send the new value to the port.
  4. Pause for the duration of the beep.
  5. We can turn off the beep by resetting bits 1 and 0 of port 61h to 0. Remember that since the other bits of this port must not be modified, you must read the value, set just bits 1 and 0 to 0, then output the new value.
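The steps above can be sketched in Python 3 as a pure simulation: nothing is written to real hardware, the code only computes which values would go to which port. It assumes the standard PC timer input clock of roughly 1.193182 MHz, from which the "frequency number" (counter value) is derived by division. The function names and the example port 61h value are made up for this illustration, and this is not the actual HalMakeBeep code.

```python
PIT_CLOCK_HZ = 1193182  # assumed input clock of the PC timer chip, in Hz

def beep_port_writes(freq_hz, port_61h_value):
    """Return (counter, [(port, value), ...]) needed to start a beep (simulation only)."""
    counter = PIT_CLOCK_HZ // freq_hz          # smaller counter -> higher pitch
    if not 1 <= counter <= 0xFFFF:
        raise ValueError('frequency does not fit in a 16-bit counter')
    writes = [
        (0x43, 182),                           # step 1: set the speaker timer up
        (0x42, counter & 0xFF),                # step 2: least significant byte first
        (0x42, (counter >> 8) & 0xFF),         #         then most significant byte
        (0x61, port_61h_value | 0x03),         # step 3: set bits 1 and 0, keep the rest
    ]
    return counter, writes

def speaker_off_value(port_61h_value):
    """Step 5: clear bits 1 and 0 of port 61h while leaving the other bits alone."""
    return port_61h_value & 0xFC

# 0x30 stands in for whatever an IN from port 61h would have returned.
counter, writes = beep_port_writes(1000, 0x30)
for port, value in writes:
    print('OUT %02Xh, %02Xh' % (port, value))
print('to stop: OUT 61h, %02Xh' % speaker_off_value(0x33))
```

For a 1000 Hz tone the counter is 1193, which is sent as the two bytes A9h and 04h. Step 4 (pausing for the duration of the beep) happens between the last write above and the final write that turns the speaker off.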

So now if we look at the HalMakeBeep routine it should make sense. You can see that it's doing the same thing just described above.

Thanks for reading. Hope you have enjoyed this post. If you believe I did something wrong anywhere, or I've missed something, please drop me an email or comment below.



Wiki :