A few days ago, I began writing a series of articles about tools and methods for unpacking malware. Each malware or packer is different, and sometimes general methods do not work for unpacking. However, common characteristics can sometimes be found among them. For example, packers often use weak encryption algorithms, which can be cracked.
In this article, I will introduce some methods based on very simple cryptographic principles that can crack the weak encryption algorithms used by packers. Although these cracking methods are straightforward, they remain effective for a large number of malware families and packers, allowing for the automatic unpacking of certain malware.
First, I must declare that these methods are not new and have been used for a long time. Here are some related articles and tools:
- "Principles and Practice of X-Raying", the best and earliest related article, by Peter Ferrie.
- "XorSearch", by Didier Stevens.
- "Decoding XOR Shellcode Without a Key", by Chris Jordan.
- "UnXor", by Tomchop.
- "Deobfuscating Embedded Malware Using Probable-Plaintext Attacks" (KANDI Tool), by Christian Wressnegger, Frank Boldewin, and Konrad Rieck.
I implemented a tool (revealpe.py) in Python for cracking. I integrated these cracking methods into a single tool and made some improvements, particularly in the area of unpacking malware:
- RevealPE
This tool can crack algorithms based on XOR, ADD, and ROL, with 8-bit or 32-bit keys (which I believe are the most common in malware), with or without a key increment. It can also crack Vigenère-like ciphers. Whether other, more complex encryption algorithms can be cracked in the same way remains to be seen.
Results:
The script attempts all cracking methods on the target file, matching different algorithms and keys. It creates a file for each result, named in the following format:
<original_file_name>.<algorithm>_<offset_match>_<param1>_<param2>.<dec | decpe>
The .dec extension indicates a file decrypted using an algorithm and key. The .decpe extension indicates those files that have been decrypted and extracted as valid PE files.
For example, for a malware sample f658526e1227c45415544063997b49c8, the following results can be obtained after cracking:
f658526e1227c45415544063997b49c8.XOR1_1f60_88_ff.dec
f658526e1227c45415544063997b49c8.XOR1_1f60_88_ff.decpe
f658526e1227c45415544063997b49c8.XOR4_1f60_85868788_fbfbfbfc.dec
The match position is 0x1f60. The first two results indicate that the file can be decrypted with xor_byte, using an initial key value of 0x88 and an increment of 0xff. The third result indicates that xor_dword also matches, with an initial key value of 0x85868788 and an increment of 0xfbfbfbfc. No .decpe file was generated for it, however, so this algorithm and key only decrypt the given plaintext, not the complete PE file. Perhaps this is an alignment issue; setting the misalignment option could yield the correct key for the xor_dword algorithm.
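To make the two result types concrete, here is a minimal Python sketch of byte-wise and dword-wise XOR decryption with an incrementing key. It is not the code of the tool itself: the function bodies, the modular addition of the increment after each unit, and the little-endian dword layout are all assumptions, although they are consistent with the keys in the example (0x88, 0x87, 0x86, 0x85 as bytes is 0x85868788 as a little-endian dword).

```python
import struct


def xor_byte(data: bytes, key: int, inc: int = 0) -> bytes:
    """XOR every byte with an 8-bit key; the key grows by `inc` (mod 256) per byte.

    Assumed behaviour, not taken from the tool's source.
    """
    out = bytearray(len(data))
    for i, c in enumerate(data):
        out[i] = c ^ key
        key = (key + inc) & 0xFF
    return bytes(out)


def xor_dword(data: bytes, key: int, inc: int = 0) -> bytes:
    """XOR every little-endian dword with a 32-bit key; key grows by `inc` (mod 2**32).

    Trailing bytes that do not fill a whole dword are left unchanged.
    """
    out = bytearray(data)
    for off in range(0, len(data) - len(data) % 4, 4):
        (c,) = struct.unpack_from("<I", data, off)
        struct.pack_into("<I", out, off, c ^ key)
        key = (key + inc) & 0xFFFFFFFF
    return bytes(out)
```

Because XOR is its own inverse, the same functions encrypt and decrypt. Under these assumptions the two keystreams from the example coincide only while no key byte wraps around, so a dword key recovered from a short match need not decrypt a whole file. Whether that is what happened with the sample above is not confirmed by the results.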
Here is a short video demonstrating the use of this tool:
https://youtu.be/pRuyN64ZkAg
Here are some examples of weak encryption algorithms that may be attacked.

XOR or ADD (Static Key)
P xor K = C -> K = C xor P
P add K = C -> K = C sub P
For example:
Initial key value: 0x85, no increment.
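The two relations above can be sketched as a known-plaintext search. This is only an illustration, not the implementation in the tool: the function name is hypothetical, and the DOS stub message is assumed as the known plaintext because it appears in most PE headers.

```python
# Assumed known plaintext: the DOS stub message found in most PE headers.
KNOWN = b"This program cannot be run in DOS mode"


def find_static_key(cipher: bytes, plain: bytes = KNOWN):
    """Return (algorithm, offset, key) for every place where a single
    8-bit XOR or ADD key maps `plain` onto the ciphertext."""
    hits = []
    n = len(plain)
    for off in range(len(cipher) - n + 1):
        window = cipher[off:off + n]
        # K = C xor P, derived from the first byte and checked on the rest.
        kx = window[0] ^ plain[0]
        if all(c ^ kx == p for c, p in zip(window, plain)):
            hits.append(("XOR", off, kx))
        # K = C sub P (mod 256), derived and checked the same way.
        ka = (window[0] - plain[0]) & 0xFF
        if all((c - ka) & 0xFF == p for c, p in zip(window, plain)):
            hits.append(("ADD", off, ka))
    return hits
```

For instance, if the plaintext is encrypted with XOR key 0x85 and placed at offset 7 of a buffer, the function reports ("XOR", 7, 0x85). A key of 0 is also reported when the plaintext appears unencrypted.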
XOR or ADD (Incrementing Key, Constant Increment)
Pi xor Ki = Ci -> Ki = Ci xor Pi -> Ki - K(i-1) = Kinc
Pi add Ki = Ci -> Ki = Ci sub Pi -> Ki - K(i-1) = Kinc
For example:
Initial key value: 0x54, increment: 0x92.
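The incrementing-key case can be sketched in the same way: derive the keystream Ki from the known plaintext, then check that consecutive keys differ by a constant. The function name and the 8-bit, modulo-256 arithmetic are assumptions made for illustration, not details taken from the tool.

```python
def find_incrementing_key(cipher: bytes, plain: bytes, op: str = "xor"):
    """Return (offset, key_at_offset, increment) for the first window where
    the recovered keystream has a constant step, or None if there is none.

    `op` is "xor" or "add". `plain` must be at least 2 bytes long.
    """
    n = len(plain)
    for off in range(len(cipher) - n + 1):
        window = cipher[off:off + n]
        if op == "xor":
            ks = [c ^ p for c, p in zip(window, plain)]          # Ki = Ci xor Pi
        else:
            ks = [(c - p) & 0xFF for c, p in zip(window, plain)]  # Ki = Ci sub Pi
        inc = (ks[1] - ks[0]) & 0xFF
        # Ki - K(i-1) must equal Kinc (mod 256) for every remaining position.
        if all((ks[i] - ks[i - 1]) & 0xFF == inc for i in range(2, n)):
            return off, ks[0], inc
    return None
```

The returned key is the key in use at the match offset, not necessarily the key at the start of the file. A static key is the special case where the increment is 0.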

Algorithms like Vigenère
Ci = SUST_TABLE[ Pi ]
Equal plaintext bytes always map to equal ciphertext bytes, so the positions at which a value repeats in the plaintext are exactly the positions at which a value repeats in the ciphertext. We can therefore build a signature from the distances between repeated values and use it to locate a given plaintext in the ciphertext without knowing the table.
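A minimal sketch of such a signature search follows. The exact signature format used by the tool is not described here, so this version is an assumption: it records, for each position, the index at which the same value first occurred, which captures the same repetition structure as distances between equal values.

```python
def signature(buf: bytes) -> tuple:
    """For each position, the index where that byte value first appeared.

    Two buffers related by a one-to-one byte substitution share a signature.
    """
    first = {}
    return tuple(first.setdefault(b, i) for i, b in enumerate(buf))


def find_by_signature(cipher: bytes, plain: bytes) -> list:
    """Offsets in `cipher` whose repetition pattern matches that of `plain`."""
    target = signature(plain)
    n = len(plain)
    return [
        off
        for off in range(len(cipher) - n + 1)
        if signature(cipher[off:off + n]) == target
    ]
```

For example, signature(b"abca") is (0, 1, 2, 0), and any substituted form such as b"xyzx" produces the same tuple. The search recomputes the signature for every window, which is simple but slow on large files; short plaintexts with few repeated bytes will also produce false matches.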
For example:
Once the ciphertext corresponding to a known plaintext has been found, a local substitution table can be reconstructed: (T -> 73), (i -> A6), (s -> FC), (space -> C5), etc. In addition to the PE header, the revealPE.py tool also looks for other well-known plaintexts, such as common API names, to extend the substitution table.
Besides constructing a local substitution table, the revealPE.py tool uses brute force to check whether a common algorithm reproduces the reconstructed table. It tries the following combinations with all possible keys: xor-add-rol, xor-rol-add, add-xor-rol, add-rol-xor, rol-xor-add, and rol-add-xor.
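The brute-force step can be sketched as follows. This is an illustration under stated assumptions rather than the code of the tool: the helper names are hypothetical, all operations are taken to be 8-bit, and the rotation count is assumed to range over 0 to 7.

```python
from itertools import permutations, product


def rol8(v: int, r: int) -> int:
    """Rotate an 8-bit value left by r bits."""
    r &= 7
    return ((v << r) | (v >> (8 - r))) & 0xFF


OPS = {
    "xor": lambda v, k: v ^ k,
    "add": lambda v, k: (v + k) & 0xFF,
    "rol": rol8,
}
# Assumed key ranges: any byte for xor/add, 0..7 bits for rol.
KEYS = {"xor": range(256), "add": range(256), "rol": range(8)}


def apply_ops(order, keys, v: int) -> int:
    """Apply the operations named in `order`, each with its own key, to one byte."""
    for name, k in zip(order, keys):
        v = OPS[name](v, k)
    return v


def local_table(plain: bytes, cipher: bytes) -> dict:
    """Map each known plaintext byte to the ciphertext byte at the same index."""
    return dict(zip(plain, cipher))


def brute_force(table: dict):
    """Return the first (order, keys) that reproduces every table entry, or None."""
    pairs = list(table.items())
    for order in permutations(("xor", "add", "rol")):
        for keys in product(*(KEYS[name] for name in order)):
            if all(apply_ops(order, keys, p) == c for p, c in pairs):
                return order, keys
    return None
```

The search space is 6 orders times 256 x 256 x 8 keys, roughly 3 million candidates, which is still practical in plain Python. Different key combinations can produce the same mapping on a small table, so the first hit is not necessarily the one the packer used; it only needs to reproduce the observed pairs.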
I conducted some tests using various exe and pdf files:
Exe files:
An embedded PE was found in 93 exe files.
No embedded PE was found in 4453 exe files.
Pdf files:
An encrypted PE was found in 586 pdf files.
No encrypted PE was found in 44536 pdf files.
The test files were randomly selected. Most of them may not be packed files or may not contain any encrypted PE. For example, the revealPE.py tool did not find embedded PE in 4453 exe files, but this does not mean that cracking failed on these files, as they may not contain embedded PE.
To conduct more accurate tests, I need some files that are known to contain encrypted PE to accurately assess how many PE files revealPE can decrypt.
Test file links: exe files pdf files

This article was translated by the Green team of the KX community, source @vallejocc
Please indicate that it is from the KX community when reprinting