Hex Rays
Hex Rays Blog —  State of the art code analysis

Trunk, Branches, and Leaves

IDA Pro being an old and time-proven platform for binary analysis, many plugins have grown around it. There are custom-made plugins for new processors and file formats. There are deobfuscators, exporters, data visualizers, object reconstructors, and more.

No one can foresee and implement everything. Some “innovations” are themselves the result of improvements in software analysis: malware authors come up with something new when the old obfuscation methods no longer work. New platforms and compilers require different analysis: for example, the latest GNU compilers generate quite complex code that requires a much deeper approach.

An open architecture gives users the opportunity to extend the core engine and build on it. Be it a small one-day script or plugin, or something fundamental and serious, it benefits everyone.

That’s why the decompiler will have an API. While it is itself built on top of IDA, you will be able to build on top of the decompiler. This is a pretty natural growth pattern:

Below are the descriptions in no particular order:

  • Typist
    This plugin reconstructs the object types used in the program. Object boundaries can be approximately determined as a side effect.
  • Ranger
    This plugin uses data flow analysis to find out possible value ranges of local variables and global data.
  • Classifier
    The output of the Typist is leveraged into class (object) definitions. Class hierarchy emerges as a result. The notion of virtual functions comes into existence.
  • Inliner
    Finds code sequences that can be converted into inline functions. The output becomes more readable.
  • Code Slicer
    This plugin optimizes functions by ‘slicing’ them down to the input argument values that can actually occur. For example, if a function with two arguments is known to be always called with the second argument equal to zero, the plugin can remove all code that handles non-zero cases. A more generic form of this plugin performs slicing on other data values, not only on function arguments.
  • FlowVisor
    Data flow visualizer. It uses information provided by the decompiler engine and other plugins, and may offer several display methods. The least intrusive display is in the form of mouse hints (locations where the current variable is used or defined, its possible values, whether it is tainted). It can also display graphs and plain text. Other plugins will have their own visualization methods, but this plugin will provide services for other plugins to use.
  • TaintStopper
    Performs taint analysis and displays potential uses of untrusted data.
  • VeriHeap
    Memory allocation verifier. Typical problems such as failure to check the result of a memory allocation, double frees, and frees of non-allocated memory can be detected.
  • CleanBounds
    Verifies that object boundaries are respected and that there are no overflows.
  • JunkCollector
    Detects unreachable functions and removes them from further analysis.
  • Idiomizer
    This is a generic name for plugins which verify consistent use of programming idioms. For example, if a lock is acquired before modifying a variable in all program locations but one, we have an idiom violation. There are many programming idioms and there can be many different idiom verifiers.
  • Exporter
    Generic name for plugins which export information into other systems. The output can be the ubiquitous XML or good old SQL databases.
  • Transformer
    Generic name for plugins which modify the decompiler output. The goal can vary tremendously, from making the output more human readable to optimizing or instrumenting it. Code Slicer and Inliner are examples of such plugins.
  • Microgen
    Generic name for plugins which translate assembly text into microcode. Microgens are also responsible for mapping CPU registers into microcode registers and resolving memory references. Microgens ‘port’ the decompiler to new processors and platforms. Ideally, we need to divide them into two parts: processor specific and operating system (environment) specific parts.
  • Procrustes
    Generic name for plugins which modify the assembly text to conform to the decompiler’s assumptions. An example: low level assembly instructions which are not used by compilers, and therefore cannot be decompiled, are replaced by equivalent function calls. These plugins are add-ons to microgens.
  • Vizier
    A plugin which modifies the core decompiler engine by adding a new transformation rule. For example, if some data is known to be read-only but the decompiler has no means of knowing it, a plugin could replace “load memory” instructions with “load constant” instructions for this data.
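
To make the Vizier example a little more concrete, here is a minimal sketch of such a transformation rule in Python. It does not use any real decompiler API: the `LoadMem` and `LoadConst` instruction classes, the `fold_readonly_loads` function, and the `readonly` address-to-value map are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


# Hypothetical toy instructions; they only stand in for "load memory"
# and "load constant" and do not reflect any real microcode format.
@dataclass(frozen=True)
class LoadMem:
    dst: str   # destination register name
    addr: int  # memory address being read


@dataclass(frozen=True)
class LoadConst:
    dst: str    # destination register name
    value: int  # constant placed into the register


Insn = Union[LoadMem, LoadConst]


def fold_readonly_loads(insns: List[Insn], readonly: Dict[int, int]) -> List[Insn]:
    """Replace loads from known read-only addresses with constant loads.

    `readonly` maps an address to the value stored there. This is the
    knowledge the plugin has and the engine lacks (an assumption here).
    """
    result: List[Insn] = []
    for insn in insns:
        if isinstance(insn, LoadMem) and insn.addr in readonly:
            result.append(LoadConst(insn.dst, readonly[insn.addr]))
        else:
            result.append(insn)  # leave everything else untouched
    return result


code = [LoadMem("r0", 0x1000), LoadMem("r1", 0x2000)]
folded = fold_readonly_loads(code, {0x1000: 42})
```

Only the load from address 0x1000 is rewritten, since that is the only address declared read-only; the load from 0x2000 stays a memory load.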
Plugins like CleanBounds, VeriHeap, and Idiomizer can be used to solve today’s practical problems. Other plugins can be used to facilitate binary analysis and render it less time consuming.
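
As a rough illustration of the kind of check a VeriHeap-like plugin might perform, the sketch below scans a linear trace of heap events. Everything in it is an assumption made for the example: the event names (`alloc`, `check`, `use`, `free`), the `check_heap_events` function, and the idea that the decompiler would hand over such a flat trace at all. A real verifier would have to follow every path through the control flow graph.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # (operation, pointer name); both hypothetical


def check_heap_events(events: List[Event]) -> List[Tuple[int, str]]:
    """Report (event index, message) pairs for suspicious heap usage."""
    allocated = set()  # pointers that are currently live
    freed = set()      # pointers that have been freed
    unchecked = set()  # pointers whose allocation result was never tested
    problems = []
    for index, (op, ptr) in enumerate(events):
        if op == "alloc":
            allocated.add(ptr)
            freed.discard(ptr)
            unchecked.add(ptr)
        elif op == "check":
            unchecked.discard(ptr)
        elif op == "use":
            if ptr in unchecked:
                problems.append((index, f"{ptr} used without checking the allocation result"))
        elif op == "free":
            if ptr in freed:
                problems.append((index, f"double free of {ptr}"))
            elif ptr not in allocated:
                problems.append((index, f"free of non-allocated {ptr}"))
            else:
                allocated.remove(ptr)
                freed.add(ptr)
    return problems


trace = [("alloc", "p"), ("use", "p"), ("free", "p"), ("free", "p"), ("free", "q")]
report = check_heap_events(trace)
```

For this trace the report flags three events: the unchecked use of `p`, the second free of `p`, and the free of `q`, which was never allocated.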

I tried to come up with a list of plugins I’d personally like to have. The list is far from exhaustive. Feel free to add to it 😉

Plugin names and descriptions are completely fictional.