A Non-Professional Comparison of Various Open Source Disassembly Engines

Because of personal interest and work requirements, I have studied and used most of the popular open source x86/64 assembly and disassembly engines. If you want to analyze and manipulate assembly instructions, you either have to study the Intel instruction set and write your own engine, or use an existing open source one. Writing your own is time-consuming, labor-intensive, and error-prone, so it is better to build on something that already exists.

Here is a comparison of the disassembly engines I have used:

1. OllyDbg's ODDisassm

ODDisassm from OllyDbg was the first open source disassembly engine I used. In 2007, I wrote a simple virtual machine with this library for the “Encryption and Decryption (Part 3)” course at Kanxue Academy, as there were not many options available at that time. Thanks to this foundational library, the entire virtual machine was designed and developed in just two weeks (my requirements for the disassembly library were not high; it only needed to use text strings as the intermediate representation for encoding and decoding).

The main advantage of this library is that it has an assembler interface, that is, text parsing: it parses a text string and encodes it into binary. This made it quite unique at the time, and to this day very few people in the open source community do this kind of work.

In recent years, however, the new debugger x64dbg has appeared. It comes with an open source assembly library called XEDParse, which offers functionality similar to OD's text parsing, supports a more complete instruction set, has fewer bugs, supports x64, and is actively maintained.

ODDisassm also has many shortcomings:

1. Incomplete instruction set support. OllyDbg is so old that it does not even fully support the MMX instruction set. The Intel/AMD extended instruction set standards have been updated many times since then, and newer sets such as SSE5, AVX, AES, and XOP are not handled at all, so those instructions cannot be decoded.

2. The decoded structure is not detailed enough. For example, instruction prefix support is not user-friendly, which can be seen in OllyDbg's disassembly window: apart from string instructions such as movs/cmps, a repcc prefix combined with any other instruction is displayed as a separate instruction. In addition, the register representation cannot express the high 8-bit registers ah/bh/ch/dh.

3. The author has not maintained the open source version since releasing it, so disassembly bugs are hard to get fixed promptly.

This is understandable, because the author's goal at the time was text assembly and disassembly, so no structures or interfaces were designed for the decoded information. Overall, this disassembly engine is outdated today.

2. BeaEngine

BeaEngine was the second library I used, once the OD library could no longer meet my needs. Developing a decompiler requires a library that can decode more information, so I found BeaEngine. I remember that earlier versions did not recognize the high 8-bit registers, but the current version does.

During use, I did not find any obvious shortcomings; many less commonly used new extended instruction sets have also been implemented.

The currently implemented extended instruction sets include:

FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, VMX, CLMUL, AES, MPX

It also groups instructions into categories, which makes it convenient to tell different kinds of instructions apart. Another feature is that it can decode the registers each instruction uses and affects, including the flags register, down to each individual flag bit. This feature is excellent for optimizers and obfuscators.

However, I personally think BeaEngine's coding style is not great; the assorted variable type conversions and naming styles feel chaotic. For someone as obsessive about clean code as I am, it is simply unbearable, so I later switched to other libraries. If you don't mind these issues, BeaEngine's performance is still quite good.

One more point: BeaEngine frequently has bugs, so be prepared when using it.

3. udis86

udis86 is probably my favorite disassembly engine. The x86 extended instruction sets supported by udis86 include:

MMX, FPU (x87), AMD 3DNow, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, AMD-V, INTEL-VMX, SMX
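udis86 keeps decoding and text generation as separate steps (ud_decode, ud_translate_intel, and the combined ud_disassemble), and a toy model makes that split easy to picture. The sketch below is not udis86 code: every `toy_` name is hypothetical, it understands only four one-byte opcodes in 32-bit mode, and it exists only to show why the split lets an analyzer skip text formatting altogether.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy mnemonics: a tiny subset chosen only for illustration. */
typedef enum { TOY_INVALID, TOY_NOP, TOY_RET, TOY_PUSH, TOY_MOV_IMM32 } toy_mnemonic;

/* Decoded form: what an analyzer, emulator, or obfuscator would consume. */
typedef struct {
    toy_mnemonic mnemonic;
    uint8_t      reg;     /* register number 0..7 for PUSH and MOV */
    uint32_t     imm;     /* immediate of MOV r32, imm32 */
    size_t       length;  /* bytes consumed, 0 if decoding failed */
} toy_insn;

static const char *const toy_reg32[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Step 1: bytes -> structure. No text is produced here. */
size_t toy_decode(const uint8_t *code, size_t size, toy_insn *out)
{
    assert(code != NULL && out != NULL);
    memset(out, 0, sizeof *out);          /* mnemonic = TOY_INVALID, length = 0 */
    if (size == 0)
        return 0;

    uint8_t op = code[0];
    if (op == 0x90) {                     /* NOP */
        out->mnemonic = TOY_NOP;
        out->length = 1;
    } else if (op == 0xC3) {              /* RET */
        out->mnemonic = TOY_RET;
        out->length = 1;
    } else if (op >= 0x50 && op <= 0x57) {  /* PUSH r32 */
        out->mnemonic = TOY_PUSH;
        out->reg = (uint8_t)(op - 0x50);
        out->length = 1;
    } else if (op >= 0xB8 && op <= 0xBF && size >= 5) {  /* MOV r32, imm32 */
        out->mnemonic = TOY_MOV_IMM32;
        out->reg = (uint8_t)(op - 0xB8);
        out->imm = (uint32_t)code[1] | ((uint32_t)code[2] << 8) |
                   ((uint32_t)code[3] << 16) | ((uint32_t)code[4] << 24);
        out->length = 5;
    }
    return out->length;
}

/* Step 2: structure -> Intel-syntax text. Only callers that need text pay for it. */
void toy_translate_intel(const toy_insn *insn, char *buf, size_t buf_size)
{
    switch (insn->mnemonic) {
    case TOY_NOP:  snprintf(buf, buf_size, "nop"); break;
    case TOY_RET:  snprintf(buf, buf_size, "ret"); break;
    case TOY_PUSH: snprintf(buf, buf_size, "push %s", toy_reg32[insn->reg]); break;
    case TOY_MOV_IMM32:
        snprintf(buf, buf_size, "mov %s, 0x%x", toy_reg32[insn->reg], (unsigned)insn->imm);
        break;
    default:       snprintf(buf, buf_size, "invalid"); break;
    }
}

/* Convenience wrapper: both steps in one call. */
size_t toy_disassemble(const uint8_t *code, size_t size, char *buf, size_t buf_size)
{
    toy_insn insn;
    size_t n = toy_decode(code, size, &insn);
    toy_translate_intel(&insn, buf, buf_size);
    return n;
}
```

An emulator or optimizer would call only `toy_decode` in its hot loop, while a disassembly listing would call `toy_disassemble`; neither pays for work it does not need.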

udis86's code style is very concise: the functions are short, variable naming is clear and simple, the interface is clean and straightforward, and it is flexible to work with. If you need to maintain your own branch, it won't take more than a few minutes to get familiar with the entire code structure.

So what are its advantages? The main one is that the interface is very flexible. You can use ud_decode to decode a single instruction and then use ud_translate_intel to convert the decoded structure into text, or you can use ud_disassemble to do the whole thing in one go. Each of these takes just one line of code.

Because udis86 is designed around composing small pieces, it adapts to many scenarios. If you want to develop a disassembler like IDA, it can do that; if you want to develop an instruction emulator, analyzer, optimizer, or obfuscator, it can do that too.

This philosophy gives udis86 great adaptability without sacrificing performance. I have run performance tests, and among engines with a comparable level of decoding detail, udis86 is the fastest decoder I have used.

As for shortcomings, I haven't found any yet. However, udis86 does not support register analysis the way BeaEngine does, which is a small regret.

4. Capstone

Capstone can be called the culmination of all disassembly engines, and I have to spend some words on it because I both love and hate it. Capstone is based on the MC component of the LLVM framework, so it supports all CPU architectures supported by LLVM.

The CPU architectures it supports include:

Arm, Arm64 (Armv8), M68K, Mips, PowerPC, Sparc, SystemZ, XCore & X86 (including X86_64)

It also has the most complete support for the x86 architecture instruction set, which is unmatched by other engines. The extended instruction sets supported by it include:

3dnow, 3dnowa, x86_64, adx, aes, atom, avx, avx2, avx512cd, avx512er, avx512f, avx512pf, bmi, bmi2, fma, fma4, fsgsbase, lzcnt, mmx, sha, slm, sse, sse2, sse3, sse4.1, sse4.2, sse4a, ssse3, tbm, xop.

Pretty strong, right? With the mobile market as hot as it is now, very few disassembly libraries support ARM. If you need to develop a compiler that works on both x86 and ARM, having a unified interface is naturally better.

In addition, Capstone's next branch supports the cool feature of reporting the registers used and affected by each instruction during decoding (the master branch does not have this interface). Having such a foundational library can save a lot of effort.

On the x86/64 platform, Capstone completely surpasses BeaEngine in both decoding capability and instruction set support.

Now, after all the praise, it is time to mention the drawbacks.

Capstone is ported from LLVM, and it is a C project while LLVM is a C++ project, so a lot of adaptation work was done during the port, which makes it look bloated. For example, in LLVM, MCInst is a class that describes a single low-level machine instruction. Because Capstone is written in C, such classes were converted into structures during porting, and their member functions became standalone C functions such as MCInst_Init and MCInst_setOpcode.

Because the LLVM framework is complex and highly portable, all of its concepts are highly abstracted. Capstone adds adapter interfaces on top to convert them into its own architecture, which leads to too many intermediate layers during decoding and hurts performance. The important intermediate structures used in the decoding process are, in order:

MCInst => InternalInstruction => cs_insn
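The last hop of this chain can be pictured with a toy model. The structures and the function below are hypothetical stand-ins, not Capstone's real definitions; they only illustrate the pattern of decoding into a detailed internal record and then copying a subset of it into the record the caller sees.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical internal record: everything the decoder learned. */
typedef struct {
    uint8_t length;
    uint8_t opcode;
    uint8_t imm_offset;   /* byte offset of the immediate, 0 if none */
    uint8_t disp_offset;  /* byte offset of the displacement, 0 if none */
    int64_t imm;
    int64_t disp;
} toy_internal_insn;

/* Hypothetical public record: only the fields the library chooses to expose. */
typedef struct {
    uint8_t length;
    uint8_t opcode;
    int64_t imm;
    int64_t disp;
} toy_public_insn;

/* Copy step run once per decoded instruction. Anything not copied here
 * (the two offsets in this toy) is invisible to the caller. */
void toy_update_public(toy_public_insn *pub, const toy_internal_insn *in)
{
    assert(pub != NULL && in != NULL);
    memset(pub, 0, sizeof *pub);
    pub->length = in->length;
    pub->opcode = in->opcode;
    pub->imm    = in->imm;
    pub->disp   = in->disp;
}
```

Each extra layer of this kind costs one more copy per instruction, and the public record can never contain more than the copy function decides to carry over.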

The basic decoding work relies on the LLVM architecture to decode into Capstone's InternalInstruction, an internal structure containing all the details of the decoding process. After decoding is complete, update_pub_insn is called to copy the contents that need to be exposed into cs_insn. Other disassembly engines decode directly into the target structure.

A decoding process this complex naturally affects performance. I ran a not-so-strict performance test, and Capstone takes about 5 to 6 times as long as udis86 (by the way, I submitted a small Pull Request to Capstone, with a benchmark attached, that showed nearly a 20% performance improvement). Tested another way, with udis86 calling only ud_decode and Capstone hacked so that it does not generate assembly text (it has no standalone decoding interface), Capstone takes about twice as long as udis86. This shows that Capstone is much slower than udis86 at text operations.

Moreover, Capstone's memory consumption is significant. When decoding an instruction, the cs_insn structure you pass in must be allocated by dynamic allocation functions, and it takes two allocations: one for cs_insn and one for cs_detail. This can cause heavy memory fragmentation. Each instruction's structure is also quite large; I don't remember the exact size, but sizeof(cs_insn)+sizeof(cs_detail) is at least over 2K.

The need for dynamic memory is one way Capstone differs from other disassembly engines.
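The usual remedy for this kind of fragmentation is a fixed-size object pool. What follows is a minimal, generic sketch in which every `pool_` name is hypothetical: it takes one large block up front and hands out equal-sized slots from a free list. Capstone does offer a CS_OPT_MEM option for installing custom allocation callbacks, but wiring a pool into it is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Rounding unit that keeps every slot suitably aligned for common types. */
typedef union { void *p; long double ld; long long ll; } pool_align_t;

/* A free slot stores the link to the next free slot inside itself. */
typedef struct pool_node { struct pool_node *next; } pool_node;

typedef struct {
    unsigned char *storage;    /* single backing block for all slots */
    pool_node     *free_list;  /* singly linked list of free slots */
    size_t         slot_size;  /* obj_size rounded up to the alignment unit */
    size_t         slot_count;
} object_pool;

/* Returns 0 on success, -1 if the backing block cannot be allocated. */
int pool_init(object_pool *p, size_t obj_size, size_t count)
{
    size_t unit = sizeof(pool_align_t);
    assert(p != NULL && obj_size > 0 && count > 0);

    p->slot_size  = (obj_size + unit - 1) / unit * unit;
    p->slot_count = count;
    p->free_list  = NULL;
    p->storage    = (unsigned char *)malloc(p->slot_size * count);
    if (p->storage == NULL)
        return -1;

    /* Push slots in reverse so that slot 0 is handed out first. */
    for (size_t i = count; i-- > 0; ) {
        pool_node *n = (pool_node *)(p->storage + i * p->slot_size);
        n->next = p->free_list;
        p->free_list = n;
    }
    return 0;
}

/* O(1): pop a slot from the free list, or NULL when the pool is exhausted. */
void *pool_alloc(object_pool *p)
{
    pool_node *n = p->free_list;
    if (n == NULL)
        return NULL;
    p->free_list = n->next;
    return n;
}

/* O(1): push the slot back; obj must have come from pool_alloc on this pool. */
void pool_free(object_pool *p, void *obj)
{
    pool_node *n = (pool_node *)obj;
    n->next = p->free_list;
    p->free_list = n;
}

void pool_destroy(object_pool *p)
{
    free(p->storage);
    p->storage = NULL;
    p->free_list = NULL;
}
```

Because all slots live in one block and are recycled in place, decoding millions of instructions touches the general-purpose heap only once, at `pool_init`.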
If you need to do a lot of instruction analysis with Capstone, you should provide a fixed-size object memory allocator to reduce memory fragmentation and improve performance.

Perhaps for the reasons above, the x64dbg community initially built on BeaEngine and, because of BeaEngine's frequent bugs, later replaced it with Capstone, but only for text disassembly in the GUI. Capstone's decoding speed is not good, but it has very few bugs (thanks to the backing of a large company like Apple). Flow graphs and instruction analysis (which is still not perfect) continue to use BeaEngine, which is unavoidable since performance also matters a great deal.

Another point: if you need a disassembly engine with strong decoding capabilities, compare the decoded structures of each engine before choosing, to check whether the fields you need or must have are there.

This matters because Capstone has a frustrating side. Although its decoding capability is very strong, it wraps the intermediate layers and exposes only the fields it deems necessary. Its main maintainer is somewhat stubborn (or, put another way, rigorous), insisting that less commonly used fields need not be exposed and that interfaces should be as simple as possible.

For example, the offset of the immediate value within the instruction and the offset of the displacement in a memory operand are both present in the internal InternalInstruction structure, but they are discarded when the data is copied to the public cs_insn structure. Likewise, the REP and REPE prefixes are represented by the same constant, yet they behave differently depending on the instruction they are paired with. Capstone has an internal valid_repe function to tell them apart, but the result is not exposed in the public structure, where the prefix is simply reported as REP.
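The REP/REPE ambiguity is something a caller can resolve on its own from the opcode. The sketch below is an assumption-laden illustration, not Capstone's valid_repe: the enum and function names are hypothetical, and it looks only at the classic one-byte string opcodes. It relies on the architectural rule that 0xF3 means REPE and 0xF2 means REPNE only in front of CMPS and SCAS, while 0xF3 means plain REP in front of the other string instructions.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PFX_NONE, PFX_REP, PFX_REPE, PFX_REPNE } rep_kind;

/* Classify a 0xF2/0xF3 prefix byte by the one-byte string opcode it precedes. */
rep_kind classify_rep_prefix(uint8_t prefix, uint8_t opcode)
{
    int is_string = 0;  /* opcode is a string instruction */
    int compares  = 0;  /* string instruction that sets ZF (CMPS, SCAS) */

    switch (opcode) {
    case 0xA6: case 0xA7:   /* CMPS */
    case 0xAE: case 0xAF:   /* SCAS */
        is_string = 1;
        compares = 1;
        break;
    case 0x6C: case 0x6D:   /* INS  */
    case 0x6E: case 0x6F:   /* OUTS */
    case 0xA4: case 0xA5:   /* MOVS */
    case 0xAA: case 0xAB:   /* STOS */
    case 0xAC: case 0xAD:   /* LODS */
        is_string = 1;
        break;
    default:
        break;
    }

    if (!is_string)
        return PFX_NONE;    /* not a repeat prefix in this toy model */
    if (prefix == 0xF3)
        return compares ? PFX_REPE : PFX_REP;
    if (prefix == 0xF2)     /* REPNE is only defined for CMPS and SCAS */
        return compares ? PFX_REPNE : PFX_NONE;
    return PFX_NONE;
}
```

For instance, the byte pair F3 A4 (movsb) classifies as REP, while F3 A6 (cmpsb) classifies as REPE, even though the prefix byte is identical.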
Niche as details like operand offsets and the REP/REPE distinction are, they are still very useful for instruction analysis and transformation.

So personally, I find Capstone's interface quite frustrating, but its functionality is incredibly strong. If you study the internals of its source code, you will find plenty of good stuff that is not exposed through any interface, so I maintain my own branch and use it with mixed feelings.

Others

There is also XDE, but since I haven't used it, I won't comment on it. Also worth mentioning is ldasm, a length disassembly engine included in the Blackbone project. It isn't really an engine, because it has only one function, calculating the length of an instruction, but that is very useful for relocating jump instructions during hooking.

Summary

These disassembly engines each have their strengths (except OD...), but they also have their little flaws. Nothing in this world is perfect; they are open source, and having them available is good enough. You have to do something yourself, right? Choose a good library, report the bugs you find while using it, submit issues to the community, or work out a fix and send a Pull Request; that counts as paying your usage fee.

Below is a comparison of these three disassembly engines based on my personal experience:

Performance: udis86 > BeaEngine > Capstone
Decoding capability: Capstone > BeaEngine > udis86 (udis86 does not support register analysis; otherwise the decoding capabilities are similar)
Platform support: Capstone > (udis86 = BeaEngine)
x86 extended instruction sets: Capstone > (udis86 ≈ BeaEngine)

If you need a disassembly engine that performs well on x86/64, has strong decoding capabilities, and you do not need features like register analysis, then udis86 is suitable for you. If you also need register analysis, then BeaEngine and Capstone are suitable. If you also need ARM support, Capstone is likely the better fit.

Each engine has its advantages and disadvantages; how well one suits you can only be determined by using it. Only your own feet know whether the shoes fit.

If you want to see my other related technical articles, feel free to visit bughoho.me.
