RCTF 2021 – Complete Reproduction and In-Depth Analysis of ASM Crypto Encryption Algorithm
Introduction
This article is a complete technical reproduction of the ASM Crypto problem from RCTF 2021. The problem provides an encryption program written in x86 assembly and the encrypted ciphertext, requiring reverse analysis to restore the plaintext.
This article will adopt the forward analysis + reverse solving + code verification approach, ensuring that each step has theoretical basis and practical verification, allowing beginners to fully understand the entire process.
Problem Materials
After extracting the attachment, we obtain:
asm/
├── crypto.asm # MASM32 assembly source code (encryption program)
└── flag.enc # Encrypted ciphertext (256 hexadecimal characters)
Quick observation of <span>flag.enc</span>:
- Length is 256 characters
- Contains only
<span>[0-9a-f]</span>(lowercase hexadecimal) - Assumption: 128 bytes of binary data encoded in hexadecimal
Step 1: Static Analysis of Assembly Source Code
Main Program Flow
start:
; 1. Read input
invoke ReadConsole,hStdin,offset buffer,bufferSize+2,offset cLen,NULL
sub cLen,1 ; Only subtract 1 character (\n), keep \r
; 2. Four-step encryption process
invoke mInit,addr buffer,cLen
invoke Encode,addr buffer,bufferSize
invoke c2w,addr buffer,addr buffer2,bufferSize
mov tt,0
invoke dfs,addr buffer2,0
; 3. Output ciphertext
invoke WriteConsole,hStdout,offset buffer3,bufferSize*2+1,offset dwNum,NULL
Key Points:
<span>sub cLen,1</span>only subtracts the newline character<span>\n</span>- In the Windows console, input usually ends with
<span>\r\n</span> - Therefore, the plaintext will retain the
<span>\r</span>character (this detail is very important!)
Encryption process: Input → mInit → Encode → c2w → dfs → Output
Function 1: mInit – Expansion and XOR
mInit proc s,l
; First stage: loop to fill to 128 bytes
mov @temp,1 ; Increment value starts from 1
.while ecx<bufferSize
mov dl,[eax+ebx] ; Read buffer[eax]
add edx,@temp ; Add increment value
mov [ecx+ebx],dl ; Write to buffer[ecx]
inc ecx
inc eax
.if eax==l ; Finished one round of original input
xor eax,eax ; Reset read position
add @temp,1 ; Increment value +1
.endif
.endw
; Second stage: XOR all bytes with length
mov eax,l
.while ecx<bufferSize
mov dl,[ebx+ecx]
xor dl,al ; buffer[ecx] ^= length
mov [ebx+ecx],dl
inc ecx
.endw
ret
mInit endp
Mathematical Description
Let the input plaintext be <span>P[0..L-1]</span>, output <span>S[0..127]</span>:
Stage 1 – Loop Filling:
Init[i] = (P[i mod L] + floor(i / L)) mod 256
Explanation:
- Value at position
<span>i</span>= Original character<span>P[i mod L]</span>+ Current round of filling<span>floor(i / L)</span> - This keeps the first L bytes unchanged, while the subsequent bytes are cyclically copied with an increment offset
Example (L=3, plaintext “ABC”):
| Position i | i mod L | i // L | Calculation | Result |
|---|---|---|---|---|
| 0-2 | 0-2 | 0 | A,B,C | A,B,C |
| 3-5 | 0-2 | 1 | A+1,B+1,C+1 | A+1,B+1,C+1 |
| 6-8 | 0-2 | 2 | A+2,B+2,C+2 | A+2,B+2,C+2 |
| … | … | … | … | … |
Stage 2 – XOR with Length:
S[i] = Init[i] XOR L
All 128 bytes are XORed with the input length.
Function 2: Encode – In-Place Rolling XOR
Encode proc s,l
mov @k,8
.while ecx<l
xor edx,edx
mov eax,ecx
div @k ; eax = ecx/8, edx = ecx%8
.if edx!=0 ; If i % 8 != 0
mov dl,[ebx+ecx-1] ; Read left neighbor buffer[i-1]
.endif
.if eax!=0 ; If i / 8 != 0
mov al,[ebx+ecx-8] ; Read upper neighbor buffer[i-8]
.endif
mov ah,[ebx+ecx] ; Read current value
xor ah,al ; XOR upper neighbor
xor ah,dl ; XOR left neighbor
mov [ebx+ecx],ah ; ⚠️ Write back in place!
inc ecx
.endw
ret
Encode endp
Key Understanding: In-Place Update
This is the easiest place to make mistakes! The code is in-place modification, meaning:
E[i] = S[i] ⊕ (i≥8 ? E[i-8] : 0) ⊕ (i%8≠0 ? E[i-1] : 0)
Note:
<span>E[i-1]</span>and<span>E[i-8]</span>use the already updated values, not the original<span>S[i-1]</span>!- Processing order is from 0 to 127 (forward)
- Subsequent bytes depend on previously processed bytes
Dependency Graph:
Position: 0 1 2 3 4 5 6 7 8 9 10 ...
Dependency: None 0 1 2 3 4 5 6 0 1+8 2+9 ...
This design creates a strong avalanche effect: a change in one byte will affect all subsequent bytes.
Function 3: c2w – Binary to Hexadecimal Conversion
func1 proc var1
mov eax,var1
mov ah,al
and al,00001111b ; Low 4 bits
and ah,11110000b ; High 4 bits
shr ah,4
lea ebx,ca ; ca = '0123456789abcdef'
movzx ecx,al
mov al,[ebx+ecx] ; Low 4 bits → character
movzx ecx,ah
mov ah,[ebx+ecx] ; High 4 bits → character
ret
func1 endp
c2w proc s1,s2,l
.while dword ptr ecx<l
mov dl,[ebx+ecx]
invoke func1,edx
mov edx,s2
mov byte ptr [ecx*2+edx],ah
mov byte ptr [ecx*2+edx+1],al
inc ecx
.endw
ret
c2w endp
Function: Convert 128 bytes to 256 hexadecimal characters
Example:
- Input byte:
<span>0x4A</span> - Output characters:
<span>'4'</span>(0x34) and<span>'a'</span>(0x61) - Result:
<span>"4a"</span>
Function 4: dfs – In-Order Traversal of Complete Binary Tree
dfs proc s,t
mov ecx,t
.if ecx>=bufferSize*2 ; If node>=256, return
jmp dfs_jmp
.endif
mov eax,t
add eax,eax
inc eax ; eax = 2*t+1 (left child)
invoke dfs,s,eax ; Recursively visit left subtree
mov dl,[ebx+ecx] ; Visit current node
mov ecx,tt
lea ebx,buffer3
mov [ebx+ecx],dl ; Output to buffer3
inc ecx
mov tt,ecx
inc eax ; eax = 2*t+2 (right child)
invoke dfs,s,eax ; Recursively visit right subtree
dfs_jmp:
ret
dfs endp
Complete Binary Tree and In-Order Traversal
Consider the 256 characters as nodes of a complete binary tree:
0
/ \
1 2
/ \ / \
3 4 5 6
/\ /\ /\ /\
7 8 9 10 11 12 13 14
...
Node Relationships:
- Parent node:
<span>(i-1) // 2</span> - Left child:
<span>2*i + 1</span> - Right child:
<span>2*i + 2</span>
In-Order Traversal (left→root→right):
Access order: [255, 127, 63, 128, 31, 129, 64, 130, ...]
Effect: Completely disrupts the data order, with the data at position 0 being that of node 255!
Step 2: Mathematical Modeling (Forward Process)
Let the plaintext be <span>P[0..L-1]</span>, the complete encryption process:
Step 1: mInit Expansion
Init[i] = (P[i mod L] + floor(i/L)) mod 256
S[i] = Init[i] XOR L
Step 2: Encode Dependency Encoding
E[i] = S[i] ⊕ (i≥8 ? E[i-8] : 0) ⊕ (i%8≠0 ? E[i-1] : 0)
Step 3: c2w Hexadecimal
H[2*i] = hex_high(E[i])
H[2*i+1] = hex_low(E[i])
Step 4: dfs Tree Traversal
C = inorder_traverse(H) # Rearranged by in-order traversal
Step 3: Reverse Decryption Scheme
Overview of Reverse Steps
Decrypt: Ciphertext C → Reverse dfs → Reverse c2w → Reverse Encode → Reverse mInit → Plaintext P
Step A: Restore In-Order Traversal Permutation
Principle: Generate the index sequence of in-order traversal to establish a reverse mapping
# Generate in-order traversal sequence
order = []
def dfs(t):
if t >= 256: return
dfs(2*t + 1) # Left subtree
order.append(t) # Record node
dfs(2*t + 2) # Right subtree
dfs(0)
# order[i] indicates: the node number accessed at the i-th visit
# Reverse: Ciphertext C[i] should be placed back at H[order[i]] position
H = [''] * 256
for i in range(256):
H[order[i]] = C[i]
Correctness: In-order traversal is deterministic, and the reverse mapping exists uniquely.
Step B: Hexadecimal to Binary
E = bytes.fromhex(''.join(H)) # 256 characters → 128 bytes
Direct conversion, no ambiguity.
Step C: Reverse Encode – Key Insight!
Review the formula for Encode:
E[i] = S[i] ⊕ E[i-8] ⊕ E[i-1]
Due to the self-inverse property of XOR (<span>A ⊕ B ⊕ B = A</span>), we have:
S[i] = E[i] ⊕ E[i-8] ⊕ E[i-1]
Key Points:
<span>E[i]</span>,<span>E[i-1]</span>,<span>E[i-8]</span>are all known (in array E)- No need to know
<span>S[i-1]</span>or<span>S[i-8]</span> - Can be calculated directly from front to back!
def decode_encode(E):
S = bytearray(128)
for i in range(128): # Process from front to back
v = E[i]
if i >= 8:
v ^= E[i-8]
if i % 8 != 0:
v ^= E[i-1]
S[i] = v
return bytes(S)
Why can it be processed forward? Because Encode is an “in-place update”, meaning that when updating <span>E[i]</span><span>, the already updated values of </span><code><span>E[i-1]</span> and <span>E[i-8]</span> are used. During the reverse process, we know the final <span>E</span> array and can directly use the same index relationship to backtrack <span>S</span>.
This is different from the ordinary “calculate first and then update”:
# If it were like this (non-in-place):
E_new[i] = S[i] ^ S[i-1] ^ S[i-8] # Using old values of S
# Then the reverse would need to be processed from back to front:
for i in range(127, -1, -1):
S[i] = E[i] ^ S[i-1] ^ S[i-8]
But the assembly code is:
mov ah,[ebx+ecx] # Read current
xor ah,al # XOR (using updated value)
mov [ebx+ecx],ah # Write back to the same position
This is in-place operation, so the reverse can also use the same formula.
Step D: Reverse mInit – Brute Force Length
Restoring <span>Init</span> from <span>S</span> is simple:
Init[i] = S[i] XOR L
But we do not know <span>L</span>! We need to utilize the regularity of the filling.
Constraints:
Init[i] = P[i mod L] + floor(i / L) (mod 256)
For the same original character <span>P[j]</span>, it will present an arithmetic sequence at different positions:
Init[j] = P[j] + 0
Init[j+L] = P[j] + 1
Init[j+2L] = P[j] + 2
...
Brute Force Method:
for L in range(1, 129):
# Validate whether each position satisfies the arithmetic increment
valid = True
P = bytearray(L)
for j in range(L):
x0 = (S[j] ^ L) & 0xFF # Init[j]
P[j] = x0
# Check subsequent positions
m = 1
while j + m*L < 128:
expected = (x0 + m) & 0xFF
actual = (S[j + m*L] ^ L) & 0xFF
if expected != actual:
valid = False
break
m += 1
if not valid:
break
if valid:
return L, bytes(P)
Uniqueness: Under the global constraint of 128 bytes, only the true <span>L</span> can satisfy the arithmetic relationship at all positions.
Step 4: Complete Decryption Script
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
RCTF 2021 ASM Crypto - Complete Decryption Script
"""
import binascii
# Read ciphertext
with open('flag.enc', 'r', encoding='ascii') as f:
enc_hex = f.read().strip()
print(f"[*] Ciphertext length: {len(enc_hex)}")
# ========== Step A: Restore In-Order Traversal Permutation ==========
print("[*] Step 1: Restore In-Order Traversal...")
N = 256
order = []
def dfs(t):
if t >= N:
return
dfs(2*t + 1)
order.append(t)
dfs(2*t + 2)
dfs(0)
# Reverse permutation
buf3 = list(enc_hex) # Ciphertext sequence
buf2 = [''] * N # Original order
for k, idx in enumerate(order):
buf2[idx] = buf3[k]
hex_str = ''.join(buf2)
print(f" Restored: {hex_str[:60]}...")
# ========== Step B: Hexadecimal to Binary ==========
print("[*] Step 2: Hexadecimal to Binary...")
E = bytearray(binascii.unhexlify(hex_str))
print(f" Obtained {len(E)} bytes")
# ========== Step C: Reverse Encode ==========
print("[*] Step 3: Reverse Encode (Dependency Encoding)...")
S = bytearray(128)
for i in range(128):
v = E[i]
if i >= 8:
v ^= E[i-8]
if i % 8 != 0:
v ^= E[i-1]
S[i] = v
print(f" Completed")
# ========== Step D: Reverse mInit (Brute Force Original Length) ==========
print("[*] Step 4: Brute Force Original Length...")
def recover_plain(M):
for L in range(1, 129):
P = bytearray(L)
ok = True
for j in range(L):
# Get Init[j]
x0 = (M[j] ^ L) & 0xFF
P[j] = x0
# Validate arithmetic increment
m = 1
while j + m*L < 128:
expected = (x0 + m) & 0xFF
actual = (M[j + m*L] ^ L) & 0xFF
if expected != actual:
ok = False
break
m += 1
if not ok:
break
if ok:
return L, bytes(P)
return None, None
L, P = recover_plain(S)
if L is None:
print(" [!] No valid length found")
else:
print(f" [+] Found length: L = {L}")
print(f"\n{'='*70}")
print(f"[FLAG] Original bytes: {P}")
print(f"[FLAG] Visible text: {P.rstrip(b'\r').decode('latin1')}")
print(f"{'='*70}")
# ========== Step E: Forward Verification ==========
print("\n[*] Step 5: Forward Encryption Verification...")
def mInit(P):
l = len(P)
buf = bytearray(128)
buf[:l] = P
i, t, eax = l, 1, 0
while i < 128:
buf[i] = (buf[eax] + t) & 0xFF
i += 1
eax += 1
if eax == l:
eax = 0
t += 1
al = l & 0xFF
for i in range(128):
buf[i] ^= al
return bytes(buf)
def encode_forward(s):
s = bytearray(s)
for i in range(128):
dl = s[i-1] if (i % 8) != 0 else 0
al = s[i-8] if i >= 8 else 0
s[i] = s[i] ^ al ^ dl
return bytes(s)
def c2w(b):
ca = b'0123456789abcdef'
out = bytearray(256)
for i, x in enumerate(b):
out[2*i] = ca[(x >> 4) & 0xF]
out[2*i+1] = ca[x & 0xF]
return bytes(out)
def inorder_permute(buf2):
o = []
def dfs(t):
if t >= len(buf2):
return
dfs(2*t+1)
o.append(t)
dfs(2*t+2)
dfs(0)
return ''.join(buf2[i] for i in o)
# Re-encrypt
re_enc = inorder_permute(c2w(encode_forward(mInit(P))).decode())
if re_enc == enc_hex:
print(" [✓] Verification successful! Encryption result matches exactly")
else:
print(" [✗] Verification failed")
print(f" Expected: {enc_hex[:60]}...")
print(f" Actual: {re_enc[:60]}...")
print("\n[*] Decryption completed!")
Step 5: Running Results
Save the script as <span>solve.py</span> and place it in the same directory as <span>flag.enc</span>, then run:
$ python3 solve.py
Output:
[*] Ciphertext length: 256
[*] Step 1: Restore In-Order Traversal...
Restored: 473c1e38740b4b0714640c4652333461546e7c7c544657452a030c22...
[*] Step 2: Hexadecimal to Binary...
Obtained 128 bytes
[*] Step 3: Reverse Encode (Dependency Encoding)...
Completed
[*] Step 4: Brute Force Original Length...
[+] Found length: L = 19
======================================================================
[FLAG] Original bytes: b'Th15_lS_@_easy_ASM\r'
[FLAG] Visible text: Th15_lS_@_easy_ASM
======================================================================
[*] Step 5: Forward Encryption Verification...
[✓] Verification successful! Encryption result matches exactly
[*] Decryption completed!
In-Depth Analysis of Key Technical Points
1. Why is there a <span>\r</span> at the end of the plaintext?
In the Windows console:
- After user input and pressing enter, the system reads
<span>...\r\n</span> - The assembly code only executes
<span>sub cLen,1</span>, subtracting<span>\n</span> - Therefore,
<span>\r</span>is retained in the input as part of the plaintext
This is a detail, but it affects the final byte count and content.
2. How to understand the “in-place update” of Encode?
Key Difference:
Method A (Non-In-Place):
temp = S[i] ^ S[i-1] ^ S[i-8] # Using original values of S
E[i] = temp # Write to new array
This situation requires reverse traversal for decryption.
Method B (In-Place, Assembly Implementation):
mov ah,[ebx+ecx] # ah = buffer[i] (may have been modified)
xor ah,al # Use buffer[i-8] (may have been modified)
xor ah,dl # Use buffer[i-1] (has been modified)
mov [ebx+ecx],ah # Write back to buffer[i]
This is a typical in-place algorithm, where subsequent values depend on previously updated values.
During the reverse: Due to the properties of XOR <span>A ⊕ B ⊕ B = A</span>, use the same formula:
S[i] = E[i] ^ E[i-8] ^ E[i-1]
Where <span>E[i-8]</span> and <span>E[i-1]</span> are known values in the ciphertext, allowing direct calculation.
3. Mathematical Basis for Length Brute Force
Filling Regularity:
Init[j] = P[j] + 0
Init[j+L] = P[j] + 1
Init[j+2L] = P[j] + 2
...
Init[j+kL] = P[j] + k (mod 256)
For incorrect <span>L'</span><span>:</span>
- Different
<span>j</span>positions will have different “base values” - It is difficult for all 128 positions to form a coherent arithmetic sequence
- Especially under modulo 256 operations, incorrect
<span>L'</span><span> is almost impossible to satisfy global constraints</span>
Experimental Verification:
# Try L=18, 19, 20
for L in [18, 19, 20]:
valid_positions = 0
for j in range(L):
x0 = (S[j] ^ L) & 0xFF
m, ok = 1, True
while j + m*L < 128:
if (S[j+m*L] ^ L) != ((x0 + m) & 0xFF):
ok = False
break
m += 1
if ok:
valid_positions += 1
print(f"L={L}: {valid_positions}/{L} positions satisfy constraints")
Only when <span>L=19</span> do all positions satisfy.
4. Application of In-Order Traversal of Complete Binary Tree
Traversal Order Characteristics:
- The leftmost leaf node (255) is accessed first
- The root node (0) is accessed in the middle position
- Completely disrupts the linear order
Role in Cryptography:
- Data permutation
- Increases analysis difficulty
- Combined with linear transformations to form SP network structure
Common Errors and Debugging Tips
Error 1: Incorrect Reverse Direction of Encode
Symptoms: Decrypted data is completely garbled
Cause: Misunderstanding that it needs to be processed from back to front
Correct Approach: Understand the meaning of “in-place update”, process from front to back using the same formula
Error 2: Ignoring <span>\r</span> Character
Symptoms: Length never matches, finding 18 instead of 19
Cause: Did not consider the input characteristics of the Windows console
Solution: Check the assembly code’s <span>sub cLen,1</span><span>, which only subtracts one character</span>
Error 3: Insufficient Verification of Length Brute Force
Symptoms: Multiple “possible” lengths found
Cause: Only checked partial constraint conditions
Correct Approach: Check the arithmetic relationship of all 128 bytes
Debugging Tips
1. Step-by-Step Verification:
# Print intermediate results at each step
print(f"Step 1: {data[:20]}")
print(f"Step 2: {data[:20]}")
...
2. Forward Comparison:
# Encrypt known plaintext and compare intermediate steps
test_plain = b"test"
step1 = mInit(test_plain)
# Compare with the same step of the decryption process
3. Boundary Conditions:
# Check special positions
# i=0: Does not depend on previous
# i=7: Only depends on left neighbor
# i=8: Only depends on upper neighbor
# i=9: Depends on both left and upper neighbors
Security Analysis
Advantages
- Multi-layer Transformation: Combines padding, XOR, encoding, and permutation operations
- Dependency Relationships: Encode establishes dependencies between bytes, creating an avalanche effect
- Non-linear Permutation: Tree traversal disrupts order
Disadvantages
- Algorithm Disclosure: Once the source code is provided, it becomes a pure mathematical problem
- No Key: All operations are deterministic, with no key involved
- Fixed Length: 128-byte fixed length may leak information
- Strong Reversibility: Each step is a simple reversible operation
Improvement Suggestions
If it is to be transformed into a practical encryption algorithm:
-
Introduce a Key:
- Use the key to control the increment value of the padding
- Derive XOR values from the key instead of using the length directly
Increase Rounds:
- Multiple rounds of Encode, each using different dependency patterns
- Similar to AES’s multi-round structure
Dynamic Permutation:
- Select different permutation methods based on the key
- Do not fix the use of in-order traversal
Add S-Boxes:
- Add non-linear S-box substitution after Encode
- Enhance resistance to differential and linear analysis
Conclusion
This problem is an excellent CTF cryptography challenge, covering:
Technical Points:
- Reading and reversing x86 assembly
- Mathematical modeling and algorithm analysis
- Data structures (complete binary tree)
- Basic concepts of cryptography
Problem-Solving Approach:
- Static analysis: Understand encryption logic function by function
- Mathematical modeling: Describe each step with formulas
- Reverse design: Design verifiable reverse algorithms for each step
- Code implementation: Python implementation of the complete process
- Forward verification: Use plaintext encryption to verify decryption correctness
Key Insights:
- The “in-place update” feature of Encode determines the reverse method
- The arithmetic regularity of the padding provides uniqueness constraints
- Details of console I/O affect plaintext content
Final Answer:
Plaintext (including carriage return): b'Th15_lS_@_easy_ASM\r'
Plaintext (visible): Th15_lS_@_easy_ASM
Length: 19 bytes
This article ensures that each step has theoretical support and practical verification through the complete process of analysis → modeling → reverse → implementation → verification. I hope it helps readers gain a deeper understanding of the intricacies of this problem!
References:
- MASM32 Programming
- Complete Binary Trees and Tree Traversal
- XOR Cryptography Applications
- CTF Cryptography Problem Type Analysis
Author’s Note: All code and conclusions in this article can be reproduced and verified locally, and readers are welcome to practice!