Several days ago we received, from an IDA user, a small harmless executable (test00.exe) that could not be debugged in IDA. Breakpoints would not break, and the program would run out of control, as if the debugger was too slow to catch it. When we first loaded the program in IDA, it complained that it could not find the imports section. That type of situation is frequent with protected executables, packed worms, etc.
The second remarkable thing is that the entry point jumps... nowhere. Addresses marked in red usually reflect a location that IDA can't resolve.

This code attempts to prevent disassembly and, as a result, the default load parameters are not appropriate. This type of obfuscated code demonstrates the problems inherent in a one-click approach.
But what if we investigate a bit further, for example by loading the file in manual mode? In this mode the user can specify which sections of the file should be loaded. To be on the safe side, let's load all sections. Let's also uncheck the 'make imports section' checkbox to avoid the "missing imports" message.

Once we have answered the questions about each section of the file we will get this listing: much better!

Now that we have gotten rid of the unresolved address, we can analyze
the program. The first instruction of our executable is a jump, and it
jumps to the program header: loc_400158. Hmmmm,
the program header is not supposed to contain any code, but this program abuses
the conventions and jumps to it. An interesting side effect results from the
fact that the program header is read-only. That could explain
why breakpoints can't be put there.
Anyway, let's see how the program works. We see that the program loads a
pointer into ESI which gets immediately copied to EBX:
HEADER:00400158 mov esi, offset off_40601C
HEADER:0040015D mov ebx, esi
(Ctrl-O converted the hexadecimal number in the first instruction
to a label expression)
Later the value of EBX is used to call a subroutine:
HEADER:00400169 call dword ptr [ebx]
Calls like this are frequent in the listing, so let's find
out which function is called and what it does. Apparently a pointer to the function is located
here:
__u_____:0040601C off_40601C dd offset __ImageBase+130h
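The arithmetic behind this indirect call can be sketched outside IDA. The snippet below is a standalone Python illustration, not IDC and not an IDA API: the `memory` dictionary and `read_dword` helper are hypothetical stand-ins for the process memory, and the image base of 0x400000 is an assumption inferred from the listing.

```python
import struct

# Assumed image base, inferred from __ImageBase+130h in the listing.
IMAGE_BASE = 0x400000

# Toy memory holding only the dword at off_40601C (hypothetical stand-in
# for the real process memory).
memory = {0x40601C: struct.pack("<I", IMAGE_BASE + 0x130)}


def read_dword(ea):
    """Read a little-endian 32-bit value from the toy memory."""
    return struct.unpack("<I", memory[ea])[0]


esi = 0x40601C            # mov esi, offset off_40601C
ebx = esi                 # mov ebx, esi
target = read_dword(ebx)  # call dword ptr [ebx] transfers control here

print(hex(target))
```

Under these assumptions the call lands inside the program header, at an offset of 130h from the image base.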
If we click on __ImageBase, what we'll see is an array of
dwords. IDA represented the program header as an array which is incorrect in
our case. We undefine the array (hotkey U), go back to the pointer (hotkey Esc)
and follow the pointer again. This time we will end up at the address 0x400130
which should contain a function. We are sure of that because the instruction
at 0x400169 calls 0x400130 indirectly. We press P (create procedure or function)
to tell IDA that there should be a function at the current address. While the
function is now on screen, we only have half of it! It seems that the
person who wrote that program wanted to obfuscate it and split
the function into several pieces. IDA knows how
to deal with such fragmented functions and displays information about the other
function parts on the screen:
But the listing only shows references to the other parts. It would be nice
to have the whole function on one page. There is a special command to help
us: the command that generates flow charts of functions; its hotkey is
F12. This command is especially useful for fragmented functions like
ours because all pieces of the function will be on the screen:
It might be interesting to display the flow chart of the main
function (very long function, keep scrolling!):
A quick glance at the flow chart reveals that there is
only one exit from the function at its "ret" instruction (0x4001FA).
We could put a breakpoint there and let the program run. Now, before
we do that, let's repeat that it is not a good
idea to run untrusted code on your computer. It is much better
to have a separate "sandbox" machine
for such tests, for example using the remote
debugging facilities IDA offers. Therefore, IDA displays a warning
when a new file is about to be started under the debugger: ignore it at your
own risk.
Since the breakpoint is located in the program header,
and the program header is write-protected by the system, we cannot
use a plain software breakpoint. We have to use a hardware breakpoint:
first press F2 to create a breakpoint, then right click and select "edit breakpoint" to change
it to a hardware breakpoint on the "execution" event:
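Why the breakpoint type matters here can be shown with a toy model. This is a standalone Python sketch under simplifying assumptions, not a description of how IDA implements breakpoints: the `Page` class and both helper functions are hypothetical. It relies only on two general x86 facts: a software breakpoint patches the one-byte INT3 opcode (0xCC) over the instruction, while a hardware breakpoint stores the address in one of the four debug address registers (DR0 to DR3) and leaves memory untouched.

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode


class Page:
    """Toy memory page with a write-protection flag (hypothetical model)."""

    def __init__(self, data, writable):
        self.data = bytearray(data)
        self.writable = writable

    def write(self, offset, value):
        if not self.writable:
            raise PermissionError("page is write protected")
        self.data[offset] = value


def set_software_breakpoint(page, offset):
    """Try to patch INT3 over the instruction byte; True on success."""
    try:
        page.write(offset, INT3)
        return True
    except PermissionError:
        return False


def set_hardware_breakpoint(debug_registers, address):
    """Store the address in the first free debug register; True on success."""
    for i, slot in enumerate(debug_registers):
        if slot is None:
            debug_registers[i] = address
            return True
    return False


header = Page(b"\xC3", writable=False)  # 0xC3 is the "ret" opcode
dr = [None] * 4                         # models DR0..DR3

sw_ok = set_software_breakpoint(header, 0)
hw_ok = set_hardware_breakpoint(dr, 0x4001FA)
print(sw_ok, hw_ok)
```

In this model the software breakpoint fails because the patch cannot be written, while the hardware breakpoint succeeds and the "ret" byte stays intact.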
After having set the breakpoint, we start the debugger
by pressing F9. When we reach the breakpoint, the program will
be unpacked into the 'MEW' segment. We jump there and convert
everything to code (the fastest way to do this is to press F7 at the
breakpoint).
Now we have a very nice listing but with one major
problem: it is ephemeral. As soon as we stop
the debugging session, the listing will disappear.
The reason is of course that the listing displays the
memory contents and that the memory will cease to exist when the process
dies. It would be nice to be able to save the memory into the database
and continue the analysis without the debugger. We will think about adding
that feature to future versions of IDA, but meanwhile we'll have to
do it manually. By "manually" we
do not mean copying bytes one by one onto paper, of course. We can
use the built-in IDC language to achieve this.
There are two things to be saved because they will disappear when the
debugger stops: the memory contents and the imported function names.
The memory contents can be saved by using the following short script:
auto fp, ea;
fp = fopen("bin", "wb");
for ( ea=0x401000; ea < 0x406000; ea++ )
  fputc(Byte(ea), fp);
fclose(fp); /* flush and close the dump file */
When the script has run, we will have a file named "bin" on the disk. It will
contain the bytes from the "MEW" segment. As you can see, I hardcoded the hexadecimal
addresses: after all, it is a disposable script intended to be run once.
We have to save the imported function names too. Look at the call at 0x401002,
for example:
MEW:00401002 call sub_4012DC
If we want to know the name of the called function, we press Enter several times
to follow the links and finally get the name:
kernel32.dll:77E7AD86
kernel32.dll:77E7AD86 kernel32_GetModuleHandleA: ; CODE XREF: sub_4012DC↑j
kernel32.dll:77E7AD86 ; DATA XREF: MEW:off_402000↑o
kernel32.dll:77E7AD86 cmp dword ptr [esp+4], 0
When we quit the debugger, the kernel32.dll segment will disappear from the listing
along with all its names, instructions, functions, everything. We have to copy
the function names before that:
auto ea, name;
for ( ea=0x401270; ea < 0x4012e2; ea=ea+6 )
{
  name = Name(Dword(Dfirst(ea))); /* get name */
  name = substr(name, strstr(name, "_")+1, -1); /* drop the prefix */
  MakeName(ea, name);
}
Now that we have run those scripts, we may stop the
debugger (press Ctrl-F2) and copy back the memory contents. The "Load
additional binary file" command
in the File, Load menu is the way to go:
Please note that it is not necessary to create a segment,
it already exists (clear the "create segments" flag). Also, the address is
specified in paragraphs, i.e. it is shifted to the right by 4.
Load the file, press P at 0x401000 and voila, you have a nice listing:
The rest of the analysis is a pleasant and agreeable task left
to the reader as... you guessed it.
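Before reloading the dump, the constants used in the two scripts and in the load dialog can be double-checked outside IDA. The following is a standalone Python sketch, not IDC; the helper names `paragraph` and `strip_module_prefix` are hypothetical, and the paragraph value assumes that "shifted to the right by 4" means the linear address divided by 16.

```python
# Range dumped by the first script.
SEG_START, SEG_END = 0x401000, 0x406000
# Loop bounds and step of the second script.
STUB_START, STUB_END, STUB_SIZE = 0x401270, 0x4012E2, 6


def paragraph(ea):
    """Segment value for a linear address: the address shifted right by 4."""
    return ea >> 4


def strip_module_prefix(name):
    """Drop everything up to and including the first underscore."""
    return name[name.find("_") + 1:]


dump_size = SEG_END - SEG_START
stubs = list(range(STUB_START, STUB_END, STUB_SIZE))

print(hex(paragraph(SEG_START)))    # value for the load dialog
print(hex(dump_size))               # expected size of the "bin" file
print(len(stubs), hex(stubs[-1]))   # number of stubs and the last one
print(strip_module_prefix("kernel32_GetModuleHandleA"))
```

The last address visited by the renaming loop is 0x4012DC, which is the sub_4012DC called at 0x401002, so the loop bounds cover the stub shown in the example.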