Hex-Rays v1.1 vs. v1.0 Decompiler Comparison Page

Below you will find side-by-side comparisons of v1.0 and v1.1 decompilations. Please maximize the window to see both columns simultaneously.

The following examples are displayed on this page:

More aggressive variable elimination
More for-loops
Better recognition of 64-bit idioms
Better recognition of 64-bit idioms, example 2
Floating point instructions
Better recognition of built-in functions
More aggressive value propagation
New algorithm to detect function arguments
Implicit value propagation

NOTE: The new decompiler can use comparison instructions (and other clues) to determine possible variable values. In the "Implicit value propagation" example at the end of this page, it is clear that the result variable is equal to zero within the if-block. This knowledge allows for more optimizations and makes the code more readable.

More aggressive variable elimination

We improved the decompiler engine to eliminate more variables. This means fewer useless assignments and shorter, more readable code.

Pseudocode v1.0

HRESULT __stdcall AtlMarshalPtrInProc(LPUNKNOWN pUnk, const IID *riid, LPSTREAM *ppstm)
{
  HRESULT v3; // edi@1
  LPSTREAM *v4; // esi@1
  HRESULT v6; // eax@1
  HRESULT v7; // eax@2

  v4 = ppstm;
  v6 = CreateStreamOnHGlobal(0, 1, ppstm);
  v3 = v6;
  if ( v6 >= 0 )
  {
    v7 = CoMarshalInterface(*v4, riid, pUnk, 3u, 0, 1u);
    v3 = v7;
    if ( v7 < 0 )
    {
      (*v4)->lpVtbl->Release(*v4);
      *v4 = 0;
    }
  }
  return v3;
}

Pseudocode v1.1

HRESULT __stdcall AtlMarshalPtrInProc(LPUNKNOWN pUnk, const IID *riid, LPSTREAM *ppstm)
{
  HRESULT v3; // edi@1

  v3 = CreateStreamOnHGlobal(0, 1, ppstm);
  if ( v3 >= 0 )
  {
    v3 = CoMarshalInterface(*ppstm, riid, pUnk, 3u, 0, 1u);
    if ( v3 < 0 )
    {
      (*ppstm)->lpVtbl->Release(*ppstm);
      *ppstm = 0;
    }
  }
  return v3;
}

More for-loops

For-loops are easier to read than while- or do-loops, so the decompiler now prefers to generate them. Again, note the difference in size and readability!

Pseudocode v1.0

j = 0;
while ( j < 5 )
{
  v3 = 1 << j;
  tbl2[j + 5 * i] = ((1 << j) & v1) == 1 << j;
  ++j;
}

Pseudocode v1.1

for ( j = 0; j < 5; ++j )
{
  v3 = 1 << j;
  tbl2[j + 5 * i] = ((1 << j) & v1) == 1 << j;
}
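The two listings describe the same loop; only its shape differs. As a minimal sketch, the following C code fills one row of a five-column table both ways. The function names, parameter names, and the use of bool for the table are hypothetical and are not taken from the decompiled binary.

```c
#include <stdbool.h>

/* Hypothetical helper: fill row i of a 5-column table, while-loop form (v1.0 style).
   Entry j is true when bit j of v1 is set. */
void fill_row_while(bool *tbl, int i, unsigned v1)
{
    int j = 0;
    while (j < 5) {
        tbl[j + 5 * i] = ((1u << j) & v1) == (1u << j);
        ++j;
    }
}

/* The same loop in for-loop form (v1.1 style): initialization, condition,
   and increment are all visible on one line. */
void fill_row_for(bool *tbl, int i, unsigned v1)
{
    for (int j = 0; j < 5; ++j)
        tbl[j + 5 * i] = ((1u << j) & v1) == (1u << j);
}
```

Both functions write exactly the same values, which is why the decompiler is free to pick the more readable form.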

Better recognition of 64-bit idioms

While we still do not recognize all 64-bit idioms (perfection is out of the question in binary analysis), we do much better now. Below is just one of many possible examples (v10 in the v1.0 listing and v21 in the v1.1 listing are 64-bit variables).

Pseudocode v1.0

v9 = v21 + v23;
v23 += v21;
*((_DWORD *)&v10 + 1) = *((_DWORD *)&v23 + 1);
*(_DWORD *)&v10 = v9;
*(_DWORD *)&v11 = GetShortField(v10, (int)"IMAGE_DEBUG_DIRECTORY", 1);
if ( v11 )
  break;

Pseudocode v1.1

v21 += v22;
*(_DWORD *)&v8 = GetShortField(v21, (int)"IMAGE_DEBUG_DIRECTORY", 1);
if ( v8 )
  break;

Better recognition of 64-bit idioms, example 2

One more illustration of 64-bit arithmetic.

Pseudocode v1.0

v21 = PageSize * (_DWORD)a6 - a2;
v22 = __MKCADD__(v21, qword_6992C7E0);
*(_DWORD *)&qword_6992C7E0 = v21 + qword_6992C7E0;
*((_DWORD *)&qword_6992C7E0 + 1) += ((unsigned __int64)PageSize * a6 >> 32)
                                  - ((unsigned int)(PageSize * (_DWORD)a6 < (_DWORD)a2) + *((_DWORD *)&a2 + 1))
                                  + v22;

Pseudocode v1.1

qword_6992C7E0 += PageSize * a6 - a2;
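The v1.0 listing is long because 32-bit code performs 64-bit arithmetic on two 32-bit halves and carries between them by hand; __MKCADD__ in that listing appears to stand for the carry out of the low addition. The following is a minimal sketch of the idiom for plain addition, not the decompiler's recognition algorithm. The function names are hypothetical.

```c
#include <stdint.h>

/* 64-bit addition the way 32-bit code does it: add the low halves,
   then add the high halves plus the carry out of the low addition. */
uint64_t add64_by_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t lo = a_lo + b_lo;      /* wraps modulo 2^32 */
    uint32_t carry = lo < a_lo;     /* 1 if the low addition overflowed */
    uint32_t hi = a_hi + b_hi + carry;

    return ((uint64_t)hi << 32) | lo;
}

/* The same computation as one expression, the form v1.1 recovers. */
uint64_t add64_direct(uint64_t a, uint64_t b)
{
    return a + b;
}
```

Recognizing that the low add, the carry, and the high add belong together is what lets the whole sequence collapse into a single += statement.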

Floating point instructions

One could say that this comparison is not fair, and we would agree. Previous versions of the decompiler could not handle floating point instructions at all, while the new version knows all about them, including conversion intricacies and other subtle details.

Pseudocode v1.0

bool __cdecl ld_ull_cmpge(long double a1, __int64 a2)
{
  __int64 v10; // [sp+0h] [bp-8h]@1

  *(_DWORD *)&v10 = a2;
  *((_DWORD *)&v10 + 1) = *((_DWORD *)&a2 + 1) + -2147483648;
  __asm
  {
    fild    [ebp+var_8]
    fadd    ds:dbl_10B74
    fld     [ebp+arg_0]
    fcompp
    fnstsw  ax
    sahf
  }
  return !_CF;
}

Pseudocode v1.1

bool __cdecl ld_ull_cmpge(long double a1, unsigned __int64 a2)
{
  return a1 >= (long double)a2;
}
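One of the "conversion intricacies" is visible in the v1.0 listing: the x87 fild instruction loads only signed integers, so an unsigned 64-bit value is converted by flipping its top bit, loading it as signed, and adding a constant back. The sketch below reproduces that trick in C. It assumes that the constant dbl_10B74 is 2^63 (its value is not shown in the listing), that integers are two's complement, and that long double can represent the operands exactly. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert an unsigned 64-bit value using only a signed conversion:
   flipping the top bit subtracts 2^63 (modulo 2^64), so the value fits
   in a signed 64-bit integer; adding 2^63 afterwards restores it. */
long double u64_to_ld_via_signed(uint64_t x)
{
    int64_t biased = (int64_t)(x ^ UINT64_C(0x8000000000000000));
    return (long double)biased + 9223372036854775808.0L; /* assumed constant: 2^63 */
}

/* The comparison from the listings, built on the conversion above. */
bool ld_u64_cmpge(long double a, uint64_t b)
{
    return a >= u64_to_ld_via_signed(b);
}
```

Once the decompiler understands this sequence, the whole function reduces to a cast and a comparison, as in the v1.1 listing.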

Better recognition of built-in functions

While there is still a lot to do, the new version copes better with inline memcpy(), strlen(), and similar functions.

Pseudocode v1.0

v112 = v70 - 1;
memcpy(v112, v71, 4 * (v68 >> 2));
v113 = v118 + 4;
memcpy((char *)v112 + 4 * (v68 >> 2), (char *)v71 + 4 * (v68 >> 2), v68 & 3);
v114 = *(_DWORD *)v113;
v118 = v113;
argc = v113;

Pseudocode v1.1

memcpy(v99 - 1, v100, v96);
v109 += 4;
v111 = (const char **)v109;
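The two memcpy() calls in the v1.0 listing follow a common pattern for an inlined copy: whole 4-byte words first, then the remaining 0 to 3 bytes. The following is a minimal sketch showing that the pair is equivalent to a single memcpy() of n bytes. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy n bytes in two steps, as an inlined memcpy often does:
   first the part that is a multiple of 4 bytes, then the tail. */
void copy_words_then_tail(void *dst, const void *src, size_t n)
{
    size_t word_bytes = 4 * (n >> 2);   /* n rounded down to a multiple of 4 */
    size_t tail_bytes = n & 3;          /* n modulo 4 */

    memcpy(dst, src, word_bytes);
    memcpy((char *)dst + word_bytes, (const char *)src + word_bytes, tail_bytes);
}
```

Since word_bytes + tail_bytes always equals n and the second copy continues where the first one stopped, the pair can be merged into memcpy(dst, src, n).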

More aggressive value propagation

Thanks to the more aggressive propagation algorithm, the output becomes simpler.

Pseudocode v1.0

signed __int64 __cdecl smodushort(signed __int64 a1, unsigned __int16 a2)
{
  int v3; // [sp+8h] [bp-10h]@1
  int v4; // [sp+Ch] [bp-Ch]@1

  v3 = a2;
  v4 = 0;
  return a1 % *(_QWORD *)&v3;
}

Pseudocode v1.1

signed __int64 __cdecl smodushort(signed __int64 a1, unsigned __int16 a2)
{
  return a1 % a2;
}
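In the v1.0 listing, v3 and v4 are adjacent 32-bit stack slots that together form one 64-bit value: the low half holds a2 and the high half is zero, which is simply a2 zero-extended to 64 bits. The sketch below spells out that equivalence. The function names are hypothetical, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* v1.0 style: build the 64-bit divisor from two 32-bit halves.
   b must be nonzero. */
int64_t smod_by_halves(int64_t a, uint16_t b)
{
    uint32_t lo = b;    /* low half: the 16-bit value, zero-extended */
    uint32_t hi = 0;    /* high half: always zero */
    int64_t divisor = (int64_t)(((uint64_t)hi << 32) | lo);
    return a % divisor;
}

/* v1.1 style: the known values of both halves are propagated into the
   expression, leaving a plain modulo. b must be nonzero. */
int64_t smod_direct(int64_t a, uint16_t b)
{
    return a % b;
}
```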

New algorithm to detect function arguments

We switched to a new heuristic algorithm to detect arguments of unknown function calls. While it is not perfect, it loses fewer arguments and tends to produce more reliable output. Please note that this algorithm is needed only for calls to functions whose prototypes are unknown: once the user specifies a function prototype, the decompiler uses it.

Pseudocode v1.0

result = sub_100270F(0);
if ( result )
{
  v1 = sub_15F0000();

Pseudocode v1.1

result = sub_100270F(0);
if ( result )
{
  dword_100A480 = sub_15F0000(byte_1009628, -2147483648, 3, 0, 3, 128, 0);

Implicit value propagation

Pseudocode v1.0

result = sub_402EE0();
if ( !result )
{
  v10 &= result;
  v6 = v3->pchar4;
  if ( v3->dword0 <= (unsigned int)result )

Pseudocode v1.1

result = sub_402EE0(v3);
if ( !result )
{
  v10 = 0;
  v7 = v3->pchar4;
  if ( v3->dword0 <= 0u )
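Inside the if-block the condition !result has already held, so result can only be zero there and every use of it can be replaced by the constant. The following is a minimal sketch of that rewrite on a reduced example; the function and parameter names are hypothetical and only mirror the variables in the listings.

```c
/* Before propagation: the branch still refers to `result`. */
unsigned mask_before(unsigned v10, unsigned result)
{
    if (!result)
        v10 &= result;      /* result is known to be 0 on this path */
    return v10;
}

/* After propagation: `result` is replaced by 0, and x & 0 folds to 0. */
unsigned mask_after(unsigned v10, unsigned result)
{
    if (!result)
        v10 = 0;
    return v10;
}
```

The same substitution turns the comparison against (unsigned int)result into a comparison against 0u in the v1.1 listing.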