First of all, read the troubleshooting page. It explains how to deal with most decompilation problems. Below is a mix of other useful information that did not fit into any other page:
Sometimes the decompiler can be overly aggressive and optimize references to volatile memory away completely. A typical situation such as the following:

    device_ready    DCD ? ; VOLATILE!

                    MOV R0, =device_ready
                    LDR R1, [R0]
    LOOP:
                    LDR R2, [R0]
                    SUB R2, R1
                    BEQ LOOP

can be decompiled into

    while ( 1 ) ;

because the decompiler assumes that a variable cannot change its value by itself, and it can prove that R0 continues to point to the same location during the loop.

To prevent such optimization, we need to mark the variable as volatile. Currently the decompiler considers memory to be volatile if it belongs to a segment with one of the following names: IO, IOPORTS, PORTS, VOLATILE. The names are not case-sensitive.
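The same reasoning can be seen at the C source level. The following is only a minimal sketch, not decompiler output: `poll_until_changed` and its `max_polls` bound are hypothetical names introduced for illustration. Because the register is read through a `volatile`-qualified pointer, each iteration must reload it; without the qualifier, an optimizer is free to conclude that the comparison can never fail, which is the assumption the decompiler makes for ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any decompiler API): polls a device
 * register until its value differs from the first value read, giving up
 * after max_polls reads. Returns the number of polls that saw no change. */
static unsigned poll_until_changed(volatile const uint32_t *reg,
                                   unsigned max_polls)
{
    assert(reg != NULL);

    uint32_t initial = *reg;   /* corresponds to: LDR R1, [R0] */
    unsigned polls = 0;

    /* The volatile qualifier forces a fresh load on every iteration,
     * corresponding to: LDR R2, [R0] / SUB R2, R1 / BEQ LOOP */
    while (polls < max_polls && *reg == initial)
        polls++;

    return polls;
}
```

The bound on the number of polls is an addition made here so that the sketch terminates when nothing changes the register; the assembly loop above has no such bound.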
Sometimes the decompiler does not optimize the code enough because it assumes that variables may change their values. For example, the following code:

    LDR R1, =off_45934
    MOV R2, #0
    ADD R3, SP, #0x14+var_C
    LDR R1, [R1]
    LDR R1, [R1] ; int
    BL _IOServiceOpen

can be decompiled into

    IOServiceOpen(r0_1, *off_45934, 0)

but this code is much better:

    IOServiceOpen(r0_1, mach_task_self, 0)

because

    off_45934 DCD _mach_task_self

is a pointer that resides in constant memory and will never change its value. The decompiler considers memory to be constant if one of the following conditions holds:
- the segment has access permissions defined but the write permission is not in the list
- the segment type is CODE
- the segment name is one of the following (the list may change in the future): .text, .rdata, .got, .got.plt, __text, __const, __const_coal, __cstring, __literal4, __literal8, __pointers, __nl_symbol_ptr, __la_symbol_ptr, __objc_protorefs, __objc_selrefs, __objc_classrefs, __objc_superrefs, __objc_const, __message_refs, __cls_refs, __inst_meth, __cat_inst_meth, __cat_cls_meth.
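The three conditions above can be read as a simple predicate. The sketch below is an illustration only: `struct seg_info`, its fields, and `is_constant_memory` are hypothetical and do not correspond to any decompiler or IDA API. It also assumes that segment names are compared exactly, which the text above does not specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a segment, for illustration only. */
struct seg_info {
    const char *name;     /* segment name, e.g. ".rdata" */
    bool is_code;         /* segment type is CODE */
    bool perms_defined;   /* access permissions are defined */
    bool writable;        /* write permission is in the list */
};

/* Names taken from the list above (the list may change in the future). */
static const char *const const_seg_names[] = {
    ".text", ".rdata", ".got", ".got.plt", "__text", "__const",
    "__const_coal", "__cstring", "__literal4", "__literal8", "__pointers",
    "__nl_symbol_ptr", "__la_symbol_ptr", "__objc_protorefs",
    "__objc_selrefs", "__objc_classrefs", "__objc_superrefs", "__objc_const",
    "__message_refs", "__cls_refs", "__inst_meth", "__cat_inst_meth",
    "__cat_cls_meth",
};

/* Returns true if any one of the three documented conditions holds. */
static bool is_constant_memory(const struct seg_info *seg)
{
    assert(seg != NULL && seg->name != NULL);

    if (seg->perms_defined && !seg->writable)
        return true;                       /* permissions set, no write */
    if (seg->is_code)
        return true;                       /* segment type is CODE */

    size_t count = sizeof const_seg_names / sizeof const_seg_names[0];
    for (size_t i = 0; i < count; i++) {
        /* Assumption: exact, case-sensitive name comparison. */
        if (strcmp(seg->name, const_seg_names[i]) == 0)
            return true;                   /* well-known constant segment */
    }
    return false;
}
```

Note that the conditions are alternatives: a segment named `.rdata` counts as constant even when no permissions are defined, and a segment with any name counts as constant when its permissions are defined and exclude writing.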
The decompiler knows about the CONTAINING_RECORD macro and tries to use it in the output. However, in most cases it is impossible to create this macro automatically, because the information about the containing record is not available. The decompiler uses three sources of information to determine if CONTAINING_RECORD should be used:
- If there is an assignment like this:

      v1 = (structype *)((char *)v2 - num);

  it can be converted into

      v1 = CONTAINING_RECORD(v2, structype, fieldname);

  by simply confirming the types of v1 and v2.
  NOTE: the variable types must be specified explicitly. Even if the types are already displayed correctly, the user should press Y followed by Enter to confirm the variable type.
- Struct offsets applied to numbers in the disassembly listing are used as a hint to create CONTAINING_RECORD. For example, applying a structure offset to 0x41C in

      sub eax, 41Ch

  will have the same effect as in the previous point. Please note that it makes sense to confirm the variable types as explained earlier.
- Struct offsets applied to numbers in the decompiler output. For example, applying _DEVICE_INFO structure offset to -131 in the following code:

      deviceInfo = (_DEVICE_INFO *)((char *)&thisEntry[-131] - 4);

  will convert it to:

      deviceInfo = CONTAINING_RECORD(thisEntry, _DEVICE_INFO, ListEntry);

  Please note that it makes sense to confirm the variable types as explained earlier.
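To make the equivalence between the raw pointer arithmetic and the macro concrete, here is a minimal sketch. It defines CONTAINING_RECORD in terms of `offsetof`, which has the same effect as the usual Windows definition but is not copied from it. The types `struct list_entry` and `struct device_info` are hypothetical stand-ins: the real _DEVICE_INFO layout is not shown here, so the sketch simply assumes an 8-byte list entry (as on a 32-bit target) placed at offset 0x41C, which is the offset that the expression `&thisEntry[-131] - 4` works out to under that assumption (131 * 8 + 4 = 0x41C).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the macro using offsetof; guarded in case a platform header
 * already provides it. */
#ifndef CONTAINING_RECORD
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))
#endif

/* Hypothetical stand-in for a 32-bit LIST_ENTRY: two 4-byte links. */
struct list_entry {
    uint32_t flink;
    uint32_t blink;
};
static_assert(sizeof(struct list_entry) == 8, "sketch assumes 8-byte entries");

/* Hypothetical stand-in for _DEVICE_INFO: only the field offset matters. */
struct device_info {
    unsigned char other_fields[0x41C];  /* placeholder for earlier fields */
    struct list_entry ListEntry;        /* assumed to sit at offset 0x41C */
};

/* The form the decompiler can produce once the types are confirmed. */
static struct device_info *device_from_entry_macro(struct list_entry *entry)
{
    return CONTAINING_RECORD(entry, struct device_info, ListEntry);
}

/* The raw form, written as a byte subtraction:
 * 131 entries of 8 bytes each, plus 4 more bytes, is 0x41C bytes. */
static struct device_info *device_from_entry_raw(struct list_entry *entry)
{
    return (struct device_info *)
        ((char *)entry - 131 * sizeof(struct list_entry) - 4);
}
```

Both functions return the address of the enclosing structure when given a pointer to its `ListEntry` field. The macro form names the structure and the field, so the reader does not have to redo the offset arithmetic.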