Explaining Assembly Language Technology: Principles of Integrity Verification Based on Hash Algorithms

Application of Hash Algorithms in Security

Hash algorithms are a core technique for verifying file integrity. The principle is to compute a fixed-length digest from the file content (MD5, for example, produces a 128-bit digest) that serves as a “digital fingerprint” of the file. This section details how to implement a hash-based integrity verification mechanism in assembly language.

Basic Principles of MD5 Verification

Algorithm Characteristics

  1. Deterministic: The same input always produces the same output
  2. Avalanche Effect: Changing a single bit of the input changes roughly half of the output bits
  3. Irreversibility: Recovering the original data from the hash value is computationally infeasible

Verification Process

; Pseudocode representation of the verification process
verify_integrity:
    mov esi, [file_path]    ; Path of the file to be verified
    call calculate_md5      ; Calculate MD5 value
    mov edi, known_hashes   ; List of known valid hashes
    call search_hash_table  ; Search the table; eax is nonzero on a match
    test eax, eax
    jnz allow_execution     ; Allow execution if found
    jmp deny_execution      ; Otherwise deny (jmp, so control cannot fall through)
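
For comparison, the allow/deny decision above can be modelled in C. This is a minimal sketch under stated assumptions, not the article's implementation: it assumes the known-hash list is stored as 32-character hexadecimal strings, and the helper names `hex_value` and `parse_digest` are hypothetical. Only the name `verify_integrity` comes from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_LEN 16  /* MD5 digests are 128 bits */

/* Convert one hex character to its value, or -1 if it is not a hex digit. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse a 32-character hex string into 16 bytes. Returns 0 on success. */
static int parse_digest(const char *hex, uint8_t out[DIGEST_LEN]) {
    if (strlen(hex) != 2 * DIGEST_LEN) return -1;
    for (size_t i = 0; i < DIGEST_LEN; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[i] = (uint8_t)(hi << 4 | lo);
    }
    return 0;
}

/* Returns 1 (allow) if digest matches any entry in known_hex, else 0 (deny).
   Malformed list entries are skipped, so they can never match. */
static int verify_integrity(const uint8_t digest[DIGEST_LEN],
                            const char *const *known_hex, size_t count) {
    assert(digest != NULL);
    for (size_t i = 0; i < count; i++) {
        uint8_t entry[DIGEST_LEN];
        if (parse_digest(known_hex[i], entry) == 0 &&
            memcmp(entry, digest, DIGEST_LEN) == 0)
            return 1;
    }
    return 0;
}
```

The function denies by default: anything that is not an exact 16-byte match, including a malformed list entry, results in 0.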

Assembly Implementation of MD5 Calculation

Initialization Phase

; MD5 initialization constants
md5_init:
    mov dword [state], 0x67452301
    mov dword [state+4], 0xEFCDAB89
    mov dword [state+8], 0x98BADCFE
    mov dword [state+12], 0x10325476
    ret

Main Transformation Loop

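; MD5 main transformation loop (simplified; rounds 2-4 are elided)
; Register use: eax = A, ebx = B, edx = C, ebp = D (assumed loaded from state),
; esi = step index i, edi = round-function result, ecx = scratch
md5_transform:
    xor esi, esi            ; i = 0; 64 steps in 4 rounds of 16
transform_loop:
    ; Round function selection (based on step index)
    cmp esi, 16
    jae .round2
    ; First round F function: (B and C) or ((not B) and D)
    mov edi, ebx
    and edi, edx
    mov ecx, ebx
    not ecx
    and ecx, ebp
    or edi, ecx
    jmp .apply
.round2:
    ; Second round G function
    ; ... (similar processing for other rounds)
.apply:
    ; Apply transformation: B + rol(A + f + K[i] + M[g], s[i])
    add edi, eax
    add edi, [k_table+esi*4]
    add edi, [input_buffer+...]     ; message word M[g]; g depends on the round
    mov ecx, [rotate_table+esi*4]   ; assumes a table of dword rotate counts
    rol edi, cl             ; a variable rotate count must be in cl
    add edi, ebx
    ; Update state variables: (A, B, C, D) <- (D, edi, B, C)
    ; ...
    inc esi
    cmp esi, 64
    jb transform_loop
    ret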
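
Because the assembly listing above is simplified, a complete reference model in C may help when checking an assembly implementation step by step. The following is a sketch that follows the RFC 1321 description of MD5 (initial state, four round functions, per-step constants and rotations, little-endian padding). The function names `rotl32`, `md5_transform`, and `md5` are chosen here for illustration, and the code hashes a whole in-memory buffer rather than streaming a file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-step left-rotation amounts (RFC 1321). */
static const uint8_t S[64] = {
    7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
    5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
    4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
    6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21 };

/* Per-step additive constants, K[i] = floor(2^32 * |sin(i + 1)|). */
static const uint32_t K[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
    0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
    0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
    0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
    0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
    0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
    0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
    0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
    0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391 };

static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));   /* n is always 4..23 here */
}

/* Process one 64-byte block, updating the four-word state in place. */
static void md5_transform(uint32_t state[4], const uint8_t block[64]) {
    uint32_t m[16], a = state[0], b = state[1], c = state[2], d = state[3];
    for (int i = 0; i < 16; i++)         /* load words little-endian */
        m[i] = (uint32_t)block[4 * i] | (uint32_t)block[4 * i + 1] << 8 |
               (uint32_t)block[4 * i + 2] << 16 | (uint32_t)block[4 * i + 3] << 24;
    for (unsigned i = 0; i < 64; i++) {
        uint32_t f;
        unsigned g;                      /* index of the message word used */
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        uint32_t old_d = d;
        d = c;
        c = b;
        b = b + rotl32(a + f + K[i] + m[g], S[i]);
        a = old_d;
    }
    state[0] += a; state[1] += b; state[2] += c; state[3] += d;
}

/* Hash a whole in-memory message and write the 16-byte digest to out. */
static void md5(const uint8_t *msg, size_t len, uint8_t out[16]) {
    uint32_t state[4] = { 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476 };
    uint8_t tail[128] = {0};
    size_t full = len / 64 * 64, rem = len - full;
    assert(out != NULL);
    for (size_t off = 0; off < full; off += 64)
        md5_transform(state, msg + off);
    /* Padding: 0x80, zeros, then the bit length as a 64-bit little-endian value. */
    if (rem) memcpy(tail, msg + full, rem);
    tail[rem] = 0x80;
    size_t tail_len = (rem < 56) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8;
    for (int i = 0; i < 8; i++)
        tail[tail_len - 8 + i] = (uint8_t)(bits >> (8 * i));
    for (size_t off = 0; off < tail_len; off += 64)
        md5_transform(state, tail + off);
    for (int i = 0; i < 16; i++)         /* store the state little-endian */
        out[i] = (uint8_t)(state[i / 4] >> (8 * (i % 4)));
}
```

The RFC 1321 test suite gives convenient check values: the empty message hashes to d41d8cd98f00b204e9800998ecf8427e and "abc" hashes to 900150983cd24fb0d6963f7d28e17f72.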

Integrity Verification System Implementation

Hash Table Construction

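; Build the hash table for valid modules
; Assumes module_list is a zero-terminated array of pointers to path strings,
; and that calculate_md5 writes the 16-byte digest to [edi], preserving esi and edi
build_hash_table:
    mov esi, module_list    ; List of valid modules
    mov edi, hash_table     ; Hash storage area
.next_module:
    lodsd                   ; eax = pointer to the next module path
    test eax, eax
    jz .done
    call load_file          ; Load file
    call calculate_md5      ; Calculate MD5
    add edi, 16             ; Advance by one full 128-bit digest
    jmp .next_module
.done:
    ret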

Runtime Verification

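; Verification before module loading
; Assumes calculate_md5 leaves the 16-byte digest in current_hash
; (a buffer name introduced here for illustration)
check_module:
    pusha
    call calculate_md5      ; Calculate MD5 of the module to be loaded
    cld                     ; String instructions must move forward
    mov ecx, NUM_KNOWN_HASHES
    mov ebx, hash_table
.search_loop:
    mov esi, ebx            ; Candidate table entry
    mov edi, current_hash   ; Digest just computed
    push ecx
    mov ecx, 4              ; 4 dwords = 16 bytes
    repe cmpsd              ; Compare the full digest
    pop ecx                 ; pop leaves the flags from cmpsd intact
    je .hash_found
    add ebx, 16             ; Next table entry
    loop .search_loop
    ; No matching hash found
    popa
    mov eax, -1            ; Return error
    ret
.hash_found:
    popa
    xor eax, eax           ; Return success
    ret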
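
The table construction and lookup above can also be modelled in C. The sketch below assumes the table is a flat array of 16-byte digests and, as a design variation that is not in the assembly version, sorts the table once so that each check is a binary search instead of a linear scan. The names `build_hash_table` and `check_module` mirror the labels above, and `digest_cmp` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 16  /* one MD5 digest */

/* Order digests bytewise so the table can be sorted and binary-searched. */
static int digest_cmp(const void *a, const void *b) {
    return memcmp(a, b, DIGEST_LEN);
}

/* Sort a flat table of count digests (count * DIGEST_LEN bytes) in place. */
static void build_hash_table(uint8_t *table, size_t count) {
    qsort(table, count, DIGEST_LEN, digest_cmp);
}

/* Returns 0 if digest is present in the sorted table, -1 otherwise,
   matching the return convention of the assembly check_module. */
static int check_module(const uint8_t digest[DIGEST_LEN],
                        const uint8_t *table, size_t count) {
    assert(digest != NULL);
    return bsearch(digest, table, count, DIGEST_LEN, digest_cmp) ? 0 : -1;
}
```

With a sorted table the lookup takes O(log n) comparisons, which matters mainly when the list of valid modules is large. For a handful of entries the linear scan in the assembly version is just as reasonable.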

Security Enhancement Measures

Anti-Tampering Protection

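; Verify the integrity of the hash table
; Assumes calculate_crc32 takes esi = buffer, ecx = length and returns eax.
; Note: CRC32 only detects accidental corruption; an attacker who can rewrite
; the table can also recompute stored_crc, so a keyed check (HMAC) is stronger
protect_hash_table:
    mov esi, hash_table
    mov ecx, HASH_TABLE_SIZE
    call calculate_crc32
    cmp eax, [stored_crc]
    je .ok
    call alert_tampering    ; Tampering alert for hash table
.ok:
    ret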
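
The listing above calls `calculate_crc32` without showing it. One possible version is sketched here in C, assuming the common reflected CRC-32 (polynomial 0xEDB88320, as used by zlib and PNG), since the article does not say which CRC variant it intends. The wrapper `hash_table_intact` is a hypothetical name. CRC-32 is not a cryptographic check and does not stop deliberate tampering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
static uint32_t calculate_crc32(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;              /* standard initial value */
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)    /* one shift per input bit */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                             /* standard final inversion */
}

/* Returns 1 if the table still matches the stored checksum, else 0. */
static int hash_table_intact(const uint8_t *table, size_t size,
                             uint32_t stored_crc) {
    return calculate_crc32(table, size) == stored_crc;
}
```

The standard check value for this variant is 0xCBF43926 for the nine ASCII bytes "123456789", which is a quick way to validate an assembly port.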

Performance Optimization Techniques

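; Use SSE2 to accelerate MD5 calculation
; The 64 MD5 steps depend on each other, so SIMD helps mainly by hashing
; several independent buffers in parallel; movdqa needs 16-byte-aligned operands
md5_sse:
    movdqa xmm0, [state]    ; Load initial state
    ; ... (use SSE instructions for parallel processing)
    movdqa [state], xmm0    ; Store result
    ret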

Modern Alternatives

MD5 remains useful for teaching, but practical collision attacks against it are known, so real applications have replaced it with more secure mechanisms:

  1. SHA-256: Provides stronger collision resistance
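  2. HMAC: A keyed message authentication code built on a hash function
  3. Digital Signatures: Integrity verification based on asymmetric cryptography

; SHA-256 initialization example
; Assumes sha256_const holds the eight 32-bit initial hash values (32 bytes)
; and that sha256_const and sha256_state (a name introduced here) are 16-byte aligned
sha256_init:
    movdqa xmm0, [sha256_const]
    movdqa xmm1, [sha256_const+16]
    movdqa [sha256_state], xmm0
    movdqa [sha256_state+16], xmm1
    ; ... (similar principles but more complex processing)
    ret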

Practical Application Considerations

  1. Hash Storage Security: The list of valid hashes must be tamper-proof
  2. Performance Trade-offs: Full file hash verification may impact performance
  3. Whitelist Maintenance: The list of valid modules needs regular updates
  4. Hybrid Verification: Can be combined with other mechanisms such as digital signatures

A hash-based integrity verification mechanism of this kind can detect when a legitimate module has been modified, for example by a virus infection, which makes it one of the foundational techniques of antivirus and system protection software. Understanding how it works at the assembly level helps in building more advanced protection systems.
