Analysis of Encryption Algorithms After Light Obfuscation with OLLVM

This article is an excellent piece from the Kanxue Forum.

Author (Kanxue Forum ID): Avacci

This challenge is the third exercise from the September session of the Kanxue Advanced Research Class 3W.

The target app has a very simple interface. Clicking the “CHECK” button will continuously print the encrypted string below. The goal is to analyze the entire encryption process.

Thanks to Yang's leniency, the encryption process itself is not complicated, but OLLVM's string obfuscation and control flow flattening make static analysis considerably harder. In such cases it pays to bring in other tools and techniques to speed up the analysis.

Often we do not need to work out exactly how a variable's value is decrypted at runtime; we can simply hook it and read the decrypted value. Nor is it always necessary to analyze the role of every sub-function inside a key encryption routine. If we spot the hallmarks of a commonly used algorithm (not necessarily a perfect match; it could be a variant), we should guess boldly and test the guess right away (CyberChef is really convenient for this). After all, a wrong guess costs nothing, while a right one might save years of your life.

First, drag the APK into GDA and locate the handler function for the button click:

uUIDCheckS0 is the value generated after clicking the CHECK button. It is produced by the function UUIDCheckSum, whose input is a 36-character random string generated by RandomStringUtils.randomAlphanumeric(36).

We can use Log to determine the input parameters and return values for each call to UUIDCheckSum.

Since UUIDCheckSum is a native function, its implementation lives in the .so library. After unzipping the APK, drag the only .so file into IDA.

Search in the function window to locate the specific implementation of UUIDCheckSum.

I originally wanted to locate the return value first and work backward, but IDA's F5 decompilation did not show the function as returning a value. I suspected the function had not been recognized correctly, perhaps because its tail was truncated.

Return to the assembly code and view the function’s tail.

It turns out that the function's tail is immediately followed by a piece of code whose name starts with .datadiv_decode, which previous videos identified as a decryption routine.

Checking where it is called from shows that the call sits in .init_array:

Looking at the decompiled output of this code segment, it appears to decrypt several strings, such as stru_37010, byte_37090, byte_370A0, and stru_370B0. We can use Frida hooks to obtain the decrypted values and see which ones are useful.

I’m particularly interested in the decrypted string of stru_37010:

0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

Seeing this neatly ordered 64-character sequence, my first thought was that it might be the alphabet for some Base64 variant. Let's locate where this string is used.

The decompiled output at the referencing locations looks quite strange, possibly because the references to the encrypted values were optimized.

But looking at the assembly code, it is indeed called here.

Tracing back through the call stack, I found that it is used in sub_F9B8, which is called from UUIDCheckSum.

So let's speculate that sub_F9B8 is a Base64 encoding function that uses 0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ as its alphabet. This needs to be verified.

I first wrote two hooks with Frida: one on sub_F9B8, and another on sub_FCB4, which is called before sub_F9B8. Since sub_FCB4 takes the previously generated random string as input, it is likely that sub_F9B8 then consumes sub_FCB4's processed result.

Implementation of the hook function:

After hooking, click the button to get the result.

Generated random string:

Intermediate result after processing by sub_FCB4:

Second parameter passed to sub_F9B8:

Final result printed (Logcat):

Verify the guess in CyberChef. Since I suspect sub_F9B8 is Base64, I input the intermediate value 2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3 and set the custom alphabet. The output does match the final result.

Lucky enough, verification succeeded!
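The same check can be reproduced offline. The following is a minimal Python sketch, not the code of sub_F9B8 itself: it assumes standard Base64 bit packing (which is what the CyberChef match suggests) and simply remaps the standard alphabet onto the decrypted one. The names STANDARD, CUSTOM, and encode_custom are made up for this illustration.

```python
import base64

# Standard Base64 alphabet and the alphabet recovered from stru_37010.
STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
CUSTOM = "0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Character-for-character mapping from the standard alphabet to the custom one.
TO_CUSTOM = str.maketrans(STANDARD, CUSTOM)


def encode_custom(data: bytes) -> str:
    """Base64-encode data, then swap each symbol into the custom alphabet.

    Any '=' padding is left untouched; a 36-byte input never produces padding.
    """
    return base64.b64encode(data).decode("ascii").translate(TO_CUSTOM)


# Intermediate value captured from sub_FCB4 in the run described above.
intermediate = b"2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3"
print(encode_custom(intermediate))
```

Because the intermediate value is 36 bytes long, a multiple of 3, the encoded result is always 48 characters with no padding, so the question of whether the native code emits padding does not arise here.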

Next, I just need to study sub_FCB4. A quick look at the decompiled code shows that it processes the input random string byte by byte.

First, capture the input and output values from one run and view them in 010 Editor for reference.

Input:

Output:

Since the function is not very complex, I started with static analysis. First:

The 23rd byte of the output is the 24th byte of the input. Next, locate the code that handles the other bytes.

If the index is not 8, 13, 14, 18, or 24, the output byte is the original byte XORed with 1 (the last two bytes are handled separately).

If the index is 14, the output byte is 52 (ASCII '4').

The bytes at the other special positions, 8, 13, 18, and 24, all have the value 45 (ASCII '-').
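The per-byte rules described so far can be collected into a short Python model. This is a sketch under stated assumptions rather than a decompilation of sub_FCB4: it treats all indices as zero-based (consistent with the dashes at indices 8, 13, 18, and 24 and the '4' at index 14 in the sample intermediate value), reads the "23rd byte" rule as output index 23 taking input index 24, and covers only the first 34 bytes. The name transform_body is invented for this example.

```python
DASH_POSITIONS = {8, 13, 18, 24}  # output bytes fixed to 45, ASCII '-'
FOUR_POSITION = 14                # output byte fixed to 52, ASCII '4'
COPY_POSITION = 23                # assumption: output[23] is a copy of input[24]


def transform_body(data: bytes) -> bytes:
    """Model the first 34 output bytes for a 36-byte input.

    The final two bytes are computed separately and are not produced here.
    """
    if len(data) != 36:
        raise ValueError("expected a 36-byte input")
    out = bytearray()
    for i in range(34):
        if i in DASH_POSITIONS:
            out.append(45)
        elif i == FOUR_POSITION:
            out.append(52)
        elif i == COPY_POSITION:
            out.append(data[24])
        else:
            # Ordinary position: flip the lowest bit of the input byte.
            out.append(data[i] ^ 1)
    return bytes(out)


# Hypothetical input, used only to show the shape of the result.
sample = b"abcdefghijklmnopqrstuvwxyz0123456789"
print(transform_body(sample).decode("ascii"))
```

XOR with 1 explains characters such as '{' and '`' in the captured intermediate value: they are 'z' and 'a' with the lowest bit flipped.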

There are still the last two bytes, which are processed at the end of the function.

Here, the values of v14 and v16 are not very clear, so I decided to use IDA trace to check.

In the trace log, locate the position where the last two bytes are processed.

First, look at the last byte:

This byte is read from a string at a given offset, and the offset is held in the X8 register.

Step back to trace the source of the X8 register value, and locate:

So the value comes from W20 ^ W21.

W20 is the last byte of the previously processed input string. W21 comes from [SP,#0x50+var_44]. This position is also where the XORed result is stored. So I suspect that during the previous processing, the input bytes were also XORed one by one.

I searched the trace log and indeed found many such XOR operations.

The first byte of the input string is XORed with 0xFF, and the result is XORed with the second byte, and so on.

The bytes at the special positions are skipped. The low four bits of the final XOR result are then taken as the offset.

Similarly, the second-to-last byte turns out to come from adding up the input bytes one by one.

Again, only the low four bits of the final sum are taken.
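The two trailing bytes can be sketched in Python as a pair of 4-bit checksums. Several details here are assumptions, because the text does not pin them down: the lookup string is taken to be the lowercase hex digits (the captured intermediate value ends in 'a' and '3', which is at least consistent with that), the set of skipped positions is passed in by the caller, and the accumulators are fed the bytes exactly as given. The names nibble_checksums, checksum_chars, and HEX_TABLE are hypothetical.

```python
HEX_TABLE = "0123456789abcdef"  # assumption: the actual lookup string is not shown


def nibble_checksums(data: bytes, skip=frozenset()):
    """Return (sum_nibble, xor_nibble) over the bytes whose index is not in skip.

    The sum starts at 0 and the XOR accumulator starts at 0xFF, as described
    for the trace; only the low four bits of each result are kept.
    """
    total = 0
    acc = 0xFF
    for index, byte in enumerate(data):
        if index in skip:
            continue
        total += byte
        acc ^= byte
    return total & 0xF, acc & 0xF


def checksum_chars(data: bytes, skip=frozenset(), table: str = HEX_TABLE) -> str:
    """Map the two nibbles to characters: sum first, then XOR."""
    sum_nibble, xor_nibble = nibble_checksums(data, skip)
    return table[sum_nibble] + table[xor_nibble]


# Hypothetical input and skip set, used only to exercise the functions.
sample = b"abcdefghijklmnopqrstuvwxyz0123456789"
print(checksum_chars(sample, skip={8, 13, 14, 18, 24}))
```

The order matches the description above: the second-to-last byte derives from the sum and the last byte from the XOR chain. Comparing this model against hooked inputs and outputs would show which skip set and lookup string the native code actually uses.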

Thus, the analysis is concluded.

– End –

Kanxue ID: Avacci

https://bbs.pediy.com/user-home-879855.htm

* This article is an original work by Avacci on the Kanxue Forum. When reprinting, please credit the Kanxue community as the source.
