Hex Rays
Hex Rays Blog —  State of the art code analysis

Decompiling floating point

It is a nice feeling when, after long nights of debugging, your software finally runs and produces meaningful results. Another milestone is when other users start to use it and obtain useful results. Usually this period is very busy: lots of new bugs are discovered and fixed, and unforeseen corner cases are handled. Then another period starts, when users come back for more copies, bring more ideas, request more functionality, etc. This is what is happening with the decompiler now, and I feel it is time to update you with the latest news.

In short, things are going well. We can currently handle floating point instructions for Borland and Visual Studio, and some GCC-generated code. Problems remain (especially with optimized code), but we are making good progress. Below are a couple of samples. The first one is very simple. The following assembly function:

_my_sincos      proc near

arg_0           = qword ptr  8

                push    ebp
                mov     ebp, esp
                fld     [ebp+arg_0]
                fsincos
                fxch    st(1)
                fmul    st, st
                fxch    st(1)
                fmul    st, st
                faddp   st(1), st
                fsqrt
                mov     esp, ebp
                pop     ebp
                retn
_my_sincos      endp

is converted into the following one-liner:

long double __cdecl my_sincos(double a1)
{
  return sqrt(sin(a1) * sin(a1) + cos(a1) * cos(a1));
}

Pretty simple, you may say… Well, here’s a longer one (sorry for the length of the assembler listing, please scroll down):

[email protected]@[email protected] proc near

var_40          = qword ptr -40h
var_38          = qword ptr -38h
var_30          = qword ptr -30h
var_28          = qword ptr -28h
var_20          = qword ptr -20h
var_18          = qword ptr -18h
var_10          = qword ptr -10h
var_8           = qword ptr -8
arg_0           = qword ptr  8
arg_8           = qword ptr  10h

                push    ebp
                mov     ebp, esp
                sub     esp, 28h
                mov     eax, dword ptr [ebp+arg_8+4]
                push    eax             ; int
                mov     ecx, dword ptr [ebp+arg_8]
                push    ecx             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+38h+var_38]
                call    [email protected]@[email protected]
                add     esp, 10h
                fstp    [ebp+arg_0]
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+30h+var_30]
                call    [email protected]@[email protected]
                add     esp, 8
                mov     dword ptr [ebp+arg_8], eax
                mov     dword ptr [ebp+arg_8+4], edx
                mov     edx, dword ptr [ebp+arg_8+4]
                push    edx             ; int
                mov     eax, dword ptr [ebp+arg_8]
                push    eax             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+38h+var_38]
                call    [email protected]@[email protected]
                add     esp, 10h
                fstp    [ebp+arg_0]
                mov     ecx, dword ptr [ebp+arg_8+4]
                push    ecx             ; int
                mov     edx, dword ptr [ebp+arg_8]
                push    edx             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+38h+var_38]
                call    [email protected]@[email protected]
                add     esp, 10h
                fstp    [ebp+arg_0]
                mov     eax, dword ptr [ebp+arg_8]
                mov     ecx, dword ptr [ebp+arg_8+4]
                mov     dword ptr [ebp+var_8], eax
                mov     dword ptr [ebp+var_8+4], ecx
                mov     edx, dword ptr [ebp+var_8+4]
                mov     dword ptr [ebp+var_10+4], edx
                and     dword ptr [ebp+var_8+4], 7FFFFFFFh
                fild    [ebp+var_8]
                and     dword ptr [ebp+var_10+4], 80000000h
                mov     dword ptr [ebp+var_10], 0
                fild    [ebp+var_10]
                fchs
                faddp   st(1), st
                fcomp   [ebp+arg_0]
                fnstsw  ax
                test    ah, 41h
                jnz     short loc_F0A
                mov     eax, dword ptr [ebp+arg_8+4]
                push    eax             ; int
                mov     ecx, dword ptr [ebp+arg_8]
                push    ecx             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+38h+var_38]
                call    [email protected]@[email protected]
                add     esp, 10h
                fstp    [ebp+arg_0]
                jmp     short loc_F2D
; ---------------------------------------------------------------------------

loc_F0A:                                ; int
                push    0
                push    4D2h            ; int
                mov     edx, dword ptr [ebp+arg_8+4]
                push    edx             ; int
                mov     eax, dword ptr [ebp+arg_8]
                push    eax             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+40h+var_40]
                call    [email protected]@[email protected]
                add     esp, 18h
                fstp    [ebp+arg_0]

loc_F2D:
                mov     ecx, dword ptr [ebp+arg_8+4]
                push    ecx             ; int
                mov     edx, dword ptr [ebp+arg_8]
                push    edx             ; int
                sub     esp, 8
                fld     [ebp+arg_0]
                fstp    [esp+38h+var_38]
                call    [email protected]@[email protected]
                add     esp, 10h
                movzx   eax, al
                test    eax, eax
                jz      short loc_F83
                mov     ecx, dword ptr [ebp+arg_8]
                mov     edx, dword ptr [ebp+arg_8+4]
                mov     dword ptr [ebp+var_18], ecx
                mov     dword ptr [ebp+var_18+4], edx
                mov     eax, dword ptr [ebp+var_18+4]
                mov     dword ptr [ebp+var_20+4], eax
                and     dword ptr [ebp+var_18+4], 7FFFFFFFh
                fild    [ebp+var_18]
                and     dword ptr [ebp+var_20+4], 80000000h
                mov     dword ptr [ebp+var_20], 0
                fild    [ebp+var_20]
                fchs
                faddp   st(1), st
                fstp    [ebp+var_28]
                jmp     short loc_F89
; ---------------------------------------------------------------------------

loc_F83:
                fld     [ebp+arg_0]
                fstp    [ebp+var_28]

loc_F89:
                fld     [ebp+var_28]
                mov     esp, ebp
                pop     ebp
                retn
[email protected]@[email protected] endp

The above code is translated into:

double __cdecl ld_ull_test(double a1, __int64 a2)
{
  double v2; // [email protected]
  double v4; // [sp+18h] [bp-28h]@5
  double v5; // [sp+48h] [bp+8h]@1
  double v6; // [sp+48h] [bp+8h]@1
  double v7; // [sp+48h] [bp+8h]@2
  unsigned __int64 v8; // [sp+50h] [bp+10h]@1

  v6 = ld_ull_add(a1, a2);
  v8 = ld_ull_cvt(v6);
  v2 = ld_ull_sub(v6, v8);
  v5 = ld_ull_mul(v2, v8);
  if ( (double)v8 <= v5 )
    v7 = ld_ull_calc(v5, v8, 1234i64);
  else
    v7 = ld_ull_div(v5, v8);
  if ( ld_ull_cmpeq(v7, v8) )
    v4 = (double)v8;
  else
    v4 = v7;
  return v4;
}
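One detail worth pointing out: each `(double)v8` cast in the output stands for a whole run of `and` / `fild` / `fchs` / `faddp` instructions in the listing. Since `fild` only loads signed integers, the compiler appears to split the unsigned 64-bit value into its low 63 bits and its top bit, load both as signed values, and recombine them on the FPU stack. The C sketch below mirrors that sequence. The function name `u64_to_double_split` is made up for illustration, and the `long double` intermediate is an assumption standing in for the x87 register.

```c
#include <stdint.h>

/* Sketch of the unsigned 64-bit to floating point idiom from the listing.
   Hypothetical helper; not decompiler output. */
double u64_to_double_split(uint64_t x)
{
    /* and dword ptr [ebp+var_8+4], 7FFFFFFFh ; fild [ebp+var_8]
       The low 63 bits always fit in a non-negative signed value. */
    int64_t low63 = (int64_t)(x & UINT64_C(0x7FFFFFFFFFFFFFFF));

    /* and dword ptr [ebp+var_10+4], 80000000h ; mov dword ptr [ebp+var_10], 0
       fild [ebp+var_10]
       Only the top bit survives, so as a signed value this is 0 or -2^63. */
    int64_t top = (x >> 63) ? INT64_MIN : 0;

    /* fchs ; faddp st(1), st
       Negating turns -2^63 into +2^63, which is then added back. */
    long double sum = (long double)low63 + -(long double)top;

    return (double)sum;
}
```

For example, `u64_to_double_split(UINT64_C(0x8000000000000000))` yields 2^63, which a plain signed load would have turned into -2^63. On compilers where `long double` is no wider than `double`, the intermediate sum may round slightly differently from the x87 version for some large inputs.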

I strongly prefer the second listing to the first. In fact, the more I use the decompiler, the less I want to return to the assembly level (this means that you may expect source level debugging and other similar improvements in the future 😉).

In order to handle floating point, we also had to improve many other aspects of the decompiler. Here are the things I remember offhand:

  • We changed the stack variable allocation mechanism to use data flow information. In practice this means that reused stack frame slots are recognized and multiple variables are created for them. No more funny casts because of a stack slot reuse!
  • Stack variables are now treated as first-class citizens by the propagation and other algorithms. Previous versions of the decompiler optimized registers, but stack variables were not optimized much. In practice: shorter and cleaner output. This improvement, combined with the previous one, allows us to handle reused function stack arguments very smoothly. It goes without saying that aliased stack variables are still not optimized (unfortunately, this cannot be done automatically).
  • Made the optimization rules more robust and more efficient
  • Added more rules to remove unnecessary casts
  • Added a new algorithm to recognize call arguments
  • Better user interface (as usual, improving the UI is always a good idea 😉)
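To make the first two items more concrete, here is a small hand-written C illustration of what stack slot reuse looks like. It is not actual decompiler output: the function names and the arithmetic are hypothetical, and it assumes `float` and `int32_t` are both 4 bytes. The first function models one stack slot as a single integer variable, so its later use as a float needs a reinterpreting copy (the "funny cast"). The second shows the shape we get once data flow tells us the slot holds two unrelated values.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical "one variable per slot" shape: the slot is typed as an
   integer, so the float stored into it later has to be read back through
   a bit-level reinterpretation. */
float scaled_one_slot(int32_t n)
{
    int32_t slot = n * 2;              /* slot first holds an integer */
    float f = (float)slot * 0.5f;
    float out;

    memcpy(&slot, &f, sizeof slot);    /* same slot now holds float bits */
    memcpy(&out, &slot, sizeof out);   /* stands in for *(float *)&slot */
    return out;
}

/* Hypothetical "one variable per live range" shape: each value gets its
   own correctly typed variable and the casts disappear. */
float scaled_split(int32_t n)
{
    int32_t v1 = n * 2;
    float v2 = (float)v1 * 0.5f;
    return v2;
}
```

Both functions compute the same result; only the second reads like source code.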
This list could go on with more details, but let's stop here. Since there are some substantial changes, we will run a beta test for the next release. It is not that far away now, probably even this month!