
This article is a featured post from the Kanxue Forum.
Author (Kanxue Forum ID): Avacci
This write-up solves the third exercise of the September Kanxue Advanced Research Class 3W.
The target app has a very simple interface. Clicking the “CHECK” button will continuously print the encrypted string below. The goal is to analyze the entire encryption process.
Because Yang went easy on us, the encryption process itself is not complicated, but OLLVM's string obfuscation and control flow flattening make static analysis considerably harder. In such cases it pays to bring in other tools and techniques to speed up the analysis.
Often we do not need to work out exactly how a variable's value is decrypted at runtime; we can simply hook it and read the decrypted value. Likewise, it is usually unnecessary to analyze the role of every sub-function inside a key encryption routine. If the code shows the characteristics of a commonly used algorithm (not necessarily a perfect match, it could be a variant), we should guess boldly and replay the computation right away to test the guess (CyberChef is really convenient for this). A wrong guess costs nothing, while a right one can save an enormous amount of time.
First, drag the APK into GDA and locate the handler function for the button click:
uUIDCheckS0 is the value generated after clicking the CHECK button, and the generation algorithm is included in the function UUIDCheckSum, with the input parameter being a 36-character random string generated by RandomStringUtils.randomAlphanumeric(36).
We can use Log to determine the input parameters and return values for each call to UUIDCheckSum.
Since UUIDCheckSum is a native function, its implementation lives in a .so library. After unzipping the APK, load its only .so file into IDA.
Search in the function window to locate the specific implementation of UUIDCheckSum.
I originally wanted to find the return value first and trace backward from it, but IDA's F5 decompiler did not recognize the function as returning a value. I suspected the function had not been identified correctly, perhaps because its tail was truncated.
Return to the assembly code and view the function’s tail.
It turns out that the function’s tail is followed by a piece of code starting with .datadiv_decode, which was mentioned in previous videos as a decryption code segment.
Looking at the calling position, in .init_array:
Looking at the decompiled output of this code segment,
it appears to decrypt several strings, such as stru_37010, byte_37090, byte_370A0, and stru_370B0. We can use Frida hooks to read the decrypted values and see which ones are useful.
I’m particularly interested in the decrypted string of stru_37010:
0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Seeing this neatly arranged 64-character sequence, my first thought was that it might be the table for some kind of base64 encoding. Let's locate where this string is used.
The results of the decompiled locations appear quite strange, possibly due to the encrypted values being optimized.
But looking at the assembly code, it is indeed called here.
According to the call stack, I traced back and found that it was used in the sub_F9B8 method within UUIDCheckSum.
So let's speculate that sub_F9B8 is a base64 encoding function that uses 0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ as its encoding table. This needs to be verified.
I first wrote two hooks with Frida: one on sub_F9B8, and another on sub_FCB4, which is called before sub_F9B8. Since sub_FCB4 takes the previously generated random string as input, it is likely that sub_F9B8 then consumes the result processed by sub_FCB4.
Implementation of the hook function:
After hooking, click the button to get the result.
Generated random number:
Intermediate result after processing by sub_FCB4:
Second parameter passed to sub_F9B8:
Final result printed (Logcat)
Verify the guess in CyberChef. Since I suspect sub_F9B8 is base64, I entered the intermediate value 2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3 and set the custom encoding table. The output matches the final result exactly.
Lucky enough, verification succeeded!
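The same replay can be done offline. The following is a minimal sketch, not the code inside sub_F9B8: it assumes the function is ordinary base64 with only the 64-character table swapped, and the helper names custom_b64encode and custom_b64decode are made up for illustration. Because the 36-byte intermediate value is a multiple of 3 bytes long, padding behaviour does not matter for this input.

```python
import base64

# Standard base64 alphabet (RFC 4648).
STANDARD_TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Table recovered from the decrypted stru_37010 string.
CUSTOM_TABLE = "0123456789-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

_TO_CUSTOM = str.maketrans(STANDARD_TABLE, CUSTOM_TABLE)
_TO_STANDARD = str.maketrans(CUSTOM_TABLE, STANDARD_TABLE)


def custom_b64encode(data: bytes) -> str:
    """Encode with standard base64, then remap each symbol to the custom table."""
    return base64.b64encode(data).decode("ascii").translate(_TO_CUSTOM)


def custom_b64decode(text: str) -> bytes:
    """Remap custom-table symbols back to the standard table, then decode."""
    return base64.b64decode(text.translate(_TO_STANDARD))


if __name__ == "__main__":
    # Intermediate value captured from the sub_FCB4 hook in this article.
    intermediate = b"2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3"
    print(custom_b64encode(intermediate))
```

Comparing the printed string with the value shown in Logcat is the same check that was done in CyberChef.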
Next, I just need to study sub_FCB4. A rough observation of the disassembled code shows that it processes the input random string byte by byte.
First, take the input and output values from one run and use 010editor as a reference.
Input:
Output:
Since the function is not very complex, I decided to start with static analysis. First:
The 23rd byte of the output is the 24th byte of the input, then locate the code.
If the index is not 8, 13, 14, 18, or 24, the output byte is the input byte XORed with 1 (the last two bytes are handled separately).
If the index is 14:
the output byte is 52 (ASCII '4').
The bytes at positions 8, 13, 18, and 24 are all set to 45 (ASCII '-').
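The per-position rule for the first 34 bytes can be sketched as follows. This is a reconstruction from the observations above, not decompiled code. The name transform_body is hypothetical, and the remark about the 23rd and 24th bytes is not modelled here because the sample input is not reproduced in the text; index 23 is treated like any other ordinary position, which is an assumption.

```python
DASH_POSITIONS = {8, 13, 18, 24}  # output forced to 45, ASCII '-'
FOUR_POSITION = 14                # output forced to 52, ASCII '4'
BODY_LENGTH = 34                  # the last two of the 36 bytes are computed separately


def transform_body(data: bytes) -> bytes:
    """Apply the per-position rule to the first 34 bytes of a 36-byte input."""
    if len(data) != 36:
        raise ValueError("expected a 36-byte input")
    out = bytearray()
    for i in range(BODY_LENGTH):
        if i in DASH_POSITIONS:
            out.append(45)           # '-'
        elif i == FOUR_POSITION:
            out.append(52)           # '4'
        else:
            out.append(data[i] ^ 1)  # ordinary position: flip the lowest bit
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical input, used only to show the output layout.
    print(transform_body(b"a" * 36).decode("ascii"))
```

The fixed '-' and '4' bytes give the result a UUID-like look, which fits the function name UUIDCheckSum.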
There are still the last two bytes, which are processed at the end of the function.
Here, the values of v14 and v16 are not very clear, so I decided to use IDA trace to check.
In the trace log, locate the position where the last two bytes are processed.
First, look at the last byte:
This byte is read from a string at a given offset, and the offset value is held in the X8 register.
Step back to trace the source of the X8 register value, and locate:
So it is W20^W21.
W20 is the last byte of the previously processed input string. W21 comes from [SP,#0x50+var_44]. This position is also where the XORed result is stored. So I suspect that during the previous processing, the input bytes were also XORed one by one.
I searched the trace and indeed found many such XOR operations.
The first byte of the input string is XORed with 0xFF, and the result is XORed with the second byte, and so on.
Bytes at the special positions are skipped. The low four bits of the final result are then taken.
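A minimal sketch of this XOR chain follows, under several assumptions that the text does not spell out: the chain runs over the original input bytes at indices 0 to 33, the skipped positions are 8, 13, 14, 18, and 24, and the string indexed through X8 is the hex digit table "0123456789abcdef". The names xor_nibble and recover_candidate_input are hypothetical. Since the sample input is not reproduced here, a candidate input is recovered from the intermediate value by undoing the XOR with 1.

```python
HEX_DIGITS = "0123456789abcdef"  # ASSUMPTION: the lookup string is not shown in the article
SKIPPED = {8, 13, 14, 18, 24}    # special positions, left out of the chain
BODY_LENGTH = 34                 # the last two bytes are outputs, not inputs


def xor_nibble(original: bytes) -> int:
    """Fold the ordinary input bytes into 0xFF with XOR and keep the low 4 bits."""
    acc = 0xFF
    for i in range(BODY_LENGTH):
        if i not in SKIPPED:
            acc ^= original[i]
    return acc & 0x0F


def recover_candidate_input(intermediate: bytes) -> bytes:
    """Undo the XOR-with-1 step. Bytes at skipped positions are not
    recoverable this way, but xor_nibble never reads them."""
    return bytes(b ^ 1 for b in intermediate)


intermediate = b"2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3"
candidate = recover_candidate_input(intermediate)
last_char = HEX_DIGITS[xor_nibble(candidate)]
print(last_char, chr(intermediate[35]))
```

Under these assumptions the sketch yields the nibble 3 for the sample, which matches the final character of the intermediate value.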
Similarly, the second-to-last byte comes from adding up the input bytes one by one.
The low four bits of the final sum are then taken.
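The additive half can be sketched the same way. The assumptions are: the sum starts at 0, it covers the original input bytes at indices 0 to 33 with positions 8, 13, 14, 18, and 24 skipped, and the low four bits index the assumed table "0123456789abcdef". The name sum_nibble is hypothetical, and the candidate input is again derived from the intermediate value rather than taken from the original run.

```python
HEX_DIGITS = "0123456789abcdef"  # ASSUMPTION: the lookup string is not shown in the article
SKIPPED = {8, 13, 14, 18, 24}    # special positions, assumed to be left out of the sum
BODY_LENGTH = 34                 # the last two bytes are outputs, not inputs


def sum_nibble(original: bytes) -> int:
    """Add up the ordinary input bytes and keep the low 4 bits of the total."""
    total = 0
    for i in range(BODY_LENGTH):
        if i not in SKIPPED:
            total += original[i]
    return total & 0x0F


intermediate = b"2n5ixhQ{-oLqe-4nEi-Xr3dv-u6Rq4uo`ga3"
# Undo the XOR-with-1 step to get candidate input bytes at ordinary positions.
candidate = bytes(b ^ 1 for b in intermediate)
second_to_last = HEX_DIGITS[sum_nibble(candidate)]
print(second_to_last, chr(intermediate[34]))
```

For the sample this gives the nibble 10, that is 'a', which agrees with the second-to-last character of the intermediate value and supports the reading that both trailing bytes are hex digits of simple checksums.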
Thus, the analysis is concluded.
Kanxue ID: Avacci
https://bbs.pediy.com/user-home-879855.htm
* This article is original by Avacci from the Kanxue Forum, please indicate the source when reprinting from the Kanxue community.