Basic Reverse Engineering with GDB

January 13, 2012

In computing, debugging is the process of locating and fixing or bypassing bugs (errors) in program code or in the engineering of a hardware device. Debugging is a fundamental part of exploit development. When you are writing an exploit, you need to be able to execute the code in your target application in a variety of ways, so that you have enough control to monitor the code and memory closely when needed. You may want to run normally at one point, go step by step through each individual instruction at another, and sometimes run quickly to a particular point and take control once that point is reached.
Luckily, all of this is possible with a debugger, using breakpoints and the various methods for stepping through code. This article describes the most common features of GDB. First we will take a simple C program and compile it, and after that we will break it down with GDB.

GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed.

GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act:

Start your program, specifying anything that might affect its behavior.
Make your program stop on specified conditions.
Examine what has happened, when your program has stopped.
Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.

After some basic debugging we will use some portable Linux-based tools to gather more information about a Linux executable.

So here we will debug this simple C program using gdb.

#include<stdio.h>
#include<wchar.h>
int my_function(wchar_t *a)
{
    return wprintf(a);
}
int main()
{
    return my_function(L"Hello World!\n");
}

First of all we will use the gcc compiler to compile the C program.

debasish@debasish-desktop:~$ nano MYprog.c
debasish@debasish-desktop:~$ gcc -o MYprog MYprog.c
MYprog.c:2:18: warning: extra tokens at end of #include directive
debasish@debasish-desktop:~$
debasish@debasish-desktop:~$ ./MYprog
Hello World!
debasish@debasish-desktop:~$ ^C

So we have successfully compiled our C program and it is working fine.

Now we will debug this program with the gdb debugger, using the following commands.

debasish@debasish-desktop:~$ gdb MYprog
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/debasish/MYprog...(no debugging symbols found)...done.
(gdb)

At this point gdb has loaded the program but has not started running it yet.
We will use the command "start" to begin the debugging session. It sets a temporary breakpoint in main and runs the program until it gets there.

(gdb) start
Temporary breakpoint 1 at 0x804841a
Starting program: /home/debasish/MYprog

Temporary breakpoint 1, 0x0804841a in main ()
(gdb)

We can see that the breakpoint is at 0x0804841a.

Now we will use the command "layout asm" to see the assembly code in order.

You should get a window like this.

0x804841a                       and    $0xfffffff0,%esp
0x804841d                       sub    $0x10,%esp
0x8048420                       movl   $0x80484f0,(%esp)
0x8048427                       call   0x8048404
0x804842c                       leave
0x804842d                       ret
0x804842e                       nop
0x804842f                       nop
0x8048430 <__libc_csu_fini>     push   %ebp
0x8048431 <__libc_csu_fini+1>   mov    %esp,%ebp
0x8048433 <__libc_csu_fini+3>   pop    %ebp
0x8048434 <__libc_csu_fini+4>   ret
0x8048435                       lea    0x0(%esi,%eiz,1),%esi
0x8048439                       lea    0x0(%edi,%eiz,1),%edi
0x8048440 <__libc_csu_init>     push   %ebp
0x8048441 <__libc_csu_init+1>   mov    %esp,%ebp
0x8048443 <__libc_csu_init+3>   push   %edi
0x8048444 <__libc_csu_init+4>   push   %esi
0x8048445 <__libc_csu_init+5>   push   %ebx

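The address shown at the extreme left is the virtual address. In the live window, gdb marks the instruction where the program is stopped with a ">" sign. Here that is 0x804841a, which is inside our main function, just after its first few setup instructions.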

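The first instruction, and $0xfffffff0,%esp, rounds the stack pointer down to a 16-byte boundary. The next instruction is
sub $0x10,%esp
This subtracts 0x10 (16 in decimal) from ESP, which reserves 16 bytes on the stack.
The movl instruction that follows takes the value $0x80484f0 and stores it at the top of the stack, where it serves as the argument for the call. Remember that the stack grows downward in memory, which is why space is reserved by subtracting from ESP.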
More interestingly, the value $0x80484f0 in the movl instruction is the starting address of the string Hello World.
To verify that, we can use this command.

(gdb) printf "%s\n",0x80484f0

This prints only the first character of our string, an H.
The string is made of wide characters, and printf's %s treats the memory as an ordinary char string, so it stops at the first zero byte and shows just "H".

Each wide character occupies 4 bytes here, so adding 4 to this address gives the next character.

Continuing in steps of 4 bytes gives the full string "Hello World!".

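Step-by-step execution of assembly instructions is very important when trying to understand the flow of any program. We can do this with the "si" command. "si" is short for "stepi" (step instruction). Each time si is entered, gdb executes exactly one machine instruction, starting with the one at which the program is currently stopped, and it follows calls into the called function.

"cont" (short for "continue") is another gdb command. It resumes execution and runs the rest of the instructions until the next breakpoint or the end of the program.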

When working with a debugger, it is very important to keep an eye on the state of the stack and the registers at the same time. In graphical debuggers such as Immunity Debugger or OllyDbg on Windows you can monitor them easily, but in a command-line debugger it is not that easy.
Whenever you want to check the contents of a register, you can use the command "print". The command "info registers" shows all the general-purpose registers at once.
So to check the value held in EAX we enter

"print $eax"

There is more to gdb. Hopefully I will write another article on it.
Another tool that can be very useful for reverse engineering Linux programs is "hexdump".

Running hexdump with the -C option dumps the raw bytes of the executable in hex alongside their ASCII form. This is the view we usually get in the lower left corner of Immunity Debugger or OllyDbg.

If you want to see only the first 16 bytes of the executable, you can use the -n option.

For example

hexdump -C -n 16 MYprog

This prints the first 16 bytes of the executable, which are the start of its header.
The command "file" can also be used to retrieve some useful information about an executable. For more detail, use readelf:

readelf -h MYprog

This command gives the header information of the executable in detail, including the program entry point.

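ndisasm, the disassembler that ships with NASM, is another cool tool available on Ubuntu. With it you can actually disassemble the binary.
ndisasm -u -o 0x[entry-point] -e 0x320 MYprog | less

The -u option selects 32-bit mode and -o sets the address at which the listing starts. The -e option skips the first 0x320 bytes, which are nothing but the header part.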

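But if you look closely, you can see that this is not the code we have just seen in gdb.

The reason is that this is the entry point. The code here is startup code that the application runs to set up the stack before main is called.

Once the stack has been set up, the following instructions push 0x8048417 and call 0x8048358. In the usual startup code, the value pushed last is the address of main and the call goes to the C library's startup routine, which in turn calls main. That fits what we saw in gdb: 0x8048417 lies 3 bytes before the breakpoint address 0x804841a.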

08048395 51                push ecx
08048396 56                push esi
08048397 6817840408        push dword 0x8048417
0804839C E8B7FFFFFF        call dword 0x8048358

Look at the screen shot [red marked]. After the NOP padding we can see the code we have just seen in gdb.

These were the most fundamental steps of debugging a Linux application. I hope it was helpful. I will try to write more on gdb later on.