A few days ago, I started writing a series of articles about tools and methods for unpacking malware. Each piece of malware or packer is different, and sometimes a universal method cannot be used for unpacking. However, common characteristics can sometimes be found. For example, packers often use weak encryption algorithms, and breaking these algorithms is possible.
In this article, I will introduce some methods based on very simple cryptographic principles that can break the weak encryption algorithms used by packers. Although these cracking methods are simple, they are still effective against a large number of malware families and packers, allowing for automatic unpacking of some malware.
First, I must declare that these methods are not new and have been used for a long time. Here are some related articles and tools:
-
Principles and Practice of X-Raying, the best and earliest related article, by Peter Ferrie.
-
XorSearch, by Didier Stevens.
-
Decoding XOR Shellcode Without a Key, by Chris Jordan.
-
UnXor, by Tomchop.
-
Deobfuscating Embedded Malware Using Probable-Plaintext Attacks (KANDI tool), by Christian Wressnegger, Frank Boldewin, and Konrad Rieck.
I implemented a tool (revealpe.py) in Python for cracking. I have integrated these cracking methods into one tool and made some improvements, especially in the unpacking of malware:
-
RevealPE
This tool can crack algorithms based on XOR, ADD, ROL with 8 or 32-bit keys (which I believe are the most common in malware), with or without key increments. Additionally, it can also crack the Vigenère cipher, and whether it can crack other more complex encryption algorithms remains to be discovered.
Results:
The script attempts all cracking methods on the target file, matching different algorithms and keys. It creates a file for each result, named in the following format:
<original_file_name>.<algorithm>_<offset_match>_<param1>_<param2>.<dec
The .dec suffix indicates a file decrypted using the algorithm and key. The .decpe suffix indicates those files that have been decrypted and extracted as valid PE files.
For example, for a malware f658526e1227c45415544063997b49c8, the following results can be obtained after cracking:
f658526e1227c45415544063997b49c8.XOR1_1f60_88_ff.dec
f658526e1227c45415544063997b49c8.XOR1_1f60_88_ff.decpe
f658526e1227c45415544063997b49c8.XOR4_1f60_85868788_fbfbfbfc.dec
The matching position is 0x1f60. The first two results indicate that they can be decrypted using xor_byte, with an initial key of 0x88 and an increment of 0xff. The third result indicates decryption using xor_dword, with an initial key of 0x85868788 and an increment of 0xfbfbfbfc, but no .decpe file was generated, so this algorithm and the found key only apply to the given plaintext and not to the complete PE file (perhaps this is an alignment issue; if the option is set to not align, the correct key for the xor_dword algorithm will be obtained).
Here is a short video demonstrating the use of this tool:
https://youtu.be/pRuyN64ZkAg
Here are some examples of weak encryption algorithms that may be attacked.

XOR or ADD (fixed key)
P xor K = C -> K = C xor P
P add K = C -> K = C sub P
For example:
Initial key: 0x85, no increment.
XOR or ADD (self-incrementing key, constant increment)
Pi xor Ki = Ci -> Ki = Ci xor Pi -> K(i) – K(i-1) = Kinc
Pi add Ki = Ci -> Ki = Ci sub Pi -> K(i) – K(i-1) = Kinc
Initial key: 0x54, increment: 0x92.

Algorithms like Vigenère
Ci = SUST_TABLE[ Pi ]
The indices of positions with the same value in the plaintext must match the indices of positions with the same value in the ciphertext. We can create signatures based on the distances between the same values to determine the position of a given plaintext in the ciphertext.
For example:
Once a ciphertext corresponding to a certain plaintext is found, it is possible to reconstruct a partial substitution table: (T -> 73), (i -> A6), (s -> FC), (space -> C5), etc. (besides the PE header, the revealPE.py tool also looks for other known plaintexts to complete the substitution table, such as common API names).
In addition to constructing a partial substitution table, the revealPE.py tool will use brute-force methods to check if common algorithms match this reconstructed substitution table (the revealPE.py tool brute-forces these algorithms: xor-add-rol, xor-rol-add, add-xor-rol, add-rol-xor, rol-xor-add, rol-add-xor using all possible keys).
I conducted some tests with several exe files and pdf files:
Exe files:
93 exe files resulted in embedded PE.
4453 exe files did not result in embedded PE.
Pdf files:
586 pdf files resulted in encrypted PE.
44536 pdf files did not result in encrypted PE.
The test files were randomly selected. Most of them may not be packed files or may not contain any encrypted PE. For example, the revealPE.py tool did not find embedded PE in 4453 exe files, but this does not mean that the cracking failed for these files, as they may not contain embedded PE at all.
For more accurate testing, I need some files that definitely contain encrypted PE to precisely test how many PE files revealPE can decrypt.
Test file links: exe files pdf files

This article was compiled by the Green team of the Kx community, source @vallejocc
Please indicate that it is from the Kx community when reprinting
Popular Recommendations | Kx Academy
-
Detailed Analysis of a Ransomware
-
BankBot Banking Trojan Variant Found on Google Play Store
-
Exploiting a CVE-2016-7193 Exploit with a Wild Sample
For more exciting articles, please visit the Kx forum