How to Ensure a Flawless Embedded OTA?

Over-The-Air (OTA) firmware upgrades are a standard feature of modern embedded systems. How can we ensure the security of the upgrade process? How do we handle abnormal situations such as network interruptions and power outages? How can we implement an efficient upgrade mechanism on resource-constrained embedded devices?

1. OTA Upgrade Architecture

System Architecture Design Principles

When designing an OTA upgrade system, the following core principles must be followed:

  1. Security Principle: Ensure that the firmware source is trustworthy to prevent malicious firmware from being installed.
  2. Reliability Principle: Guarantee the atomicity of the upgrade process to avoid bricking the device.
  3. Efficiency Principle: Minimize data transfer volume to reduce upgrade time and costs.
  4. Fault Tolerance Principle: Be able to recover from various abnormal situations, including network interruptions and power outages.
  5. Maintainability Principle: The system design should be clear and easy to test and debug.

Overall System Architecture

The architecture has four layers, from the cloud down to on-device storage:

  • Cloud Service Layer: OTA Management Server, Firmware Repository, Differential Engine, Signature Service, Device Management Platform.
  • Network Transmission Layer: HTTPS/TLS Encrypted Channel, Resume from Breakpoint, Differential Transmission.
  • Device Architecture: Bootloader, Application (App Slot A), Application (App Slot B), Updater, Shared Memory Area.
  • Storage Partition: Bootloader Partition, App Slot A Partition, App Slot B Partition, Updater Partition, Configuration Storage Area.

2. Differential Upgrade Technology

Principle of Differential Upgrade

Differential Upgrade (Delta Update) is an efficient firmware update method that only transmits the changed parts by comparing the differences between the old and new firmware versions, rather than the entire firmware file. This method can significantly reduce data transfer volume, saving bandwidth and time.

Basic Principle:

  1. Server Side: Use differential algorithms (such as bsdiff, xdelta, etc.) to generate a differential package (patch file) from the old version to the new version.
  2. Device Side: After receiving the differential package, merge it with the current firmware (apply patch) to generate the new version of the firmware.
  3. Verification: Perform integrity checks on the newly generated firmware to ensure the upgrade is correct.
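The generate/apply/verify cycle above can be illustrated with a toy delta format. This is a minimal sketch, not bsdiff: it assumes a hypothetical patch made of "copy from old firmware" and "insert literal bytes" operations, and it uses Python's `difflib.SequenceMatcher` only to find the matching regions. The names `make_delta` and `apply_delta` are illustrative.

```python
import hashlib
from difflib import SequenceMatcher

def make_delta(old: bytes, new: bytes) -> list:
    """Server side: describe `new` as copies from `old` plus literal inserts."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))   # bytes the device already has
        elif j2 > j1:
            ops.append(("insert", new[j1:j2]))  # bytes that must be transmitted
        # pure deletions need no operation: those old bytes are never copied
    return ops

def apply_delta(old: bytes, ops: list) -> bytes:
    """Device side: rebuild the new firmware from the old one and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

old_fw = b"HEADER" + bytes(range(256)) * 4 + b"v1.0"
new_fw = b"HEADER" + bytes(range(256)) * 4 + b"v2.0-patched"

delta = make_delta(old_fw, new_fw)
rebuilt = apply_delta(old_fw, delta)

# Verification step: the rebuilt image must hash to the same value as the target.
assert hashlib.sha256(rebuilt).digest() == hashlib.sha256(new_fw).digest()

# Only the literal inserts travel over the network.
payload = sum(len(op[1]) for op in delta if op[0] == "insert")
print(f"full image: {len(new_fw)} bytes, literal payload: {payload} bytes")
```

Production tools such as bsdiff add suffix sorting and compression on top of this basic idea, so their patches are considerably smaller on real machine code.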

Selection of Differential Algorithms

Commonly used differential algorithms include:

  • bsdiff/bspatch: A binary differential algorithm suitable for any binary file, with high compression rates.
  • xdelta3: An open-source differential algorithm with excellent performance, suitable for large files.
  • Custom Algorithms: Specialized algorithms optimized for specific firmware formats.

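For resource-constrained embedded devices, bsdiff/bspatch is a common choice because it produces small patches. Note that the reference bspatch implementation holds the complete old and new images in RAM, so embedded ports usually restructure it to process the firmware in chunks.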

Differential Upgrade Process

The process involves four participants (Server, Device Side, Bootloader, Flash Storage) and runs in four phases:

  1. Generate Differential Package: read the old version firmware, read the new version firmware, execute the differential algorithm (bsdiff), generate the differential package (.patch), and calculate the differential package signature.
  2. Request Upgrade: the device reports its current firmware version; the server returns the differential package information (size/checksum).
  3. Download Differential Package: the device requests the download; the server transfers the package data (with support for resuming from a breakpoint); the device stores it in a temporary area and verifies the differential package integrity.
  4. Apply Differential: the device requests upgrade mode; the current (old version) firmware is read; the differential algorithm (bspatch) is applied; the new firmware is written to the backup partition (Slot B); the new firmware integrity is verified and its digital signature is checked.

The outcome depends on the verification result:

  • Verification Passed: switch the active partition to Slot B, update the partition flag, and reboot the system.
  • Verification Failed: clear the backup partition data and return "Upgrade Failed".

Advantages and Challenges of Differential Upgrade

Advantages:

  1. Bandwidth Savings: Typically reduces data transfer volume by 50%-90%, which is especially important for large-scale deployments.
  2. Reduced Upgrade Time: Significantly shortens data transfer time, enhancing user experience.
  3. Lower Upgrade Costs: In scenarios where charges are based on data usage, costs can be significantly reduced.
  4. Increased Success Rate: Smaller data transfer volume reduces the probability of transmission errors.

Challenges:

  1. Memory Requirements: A naive implementation loads the old firmware, the differential package, and the new firmware at the same time, leading to high RAM requirements.
  2. Computational Overhead: Differential merging requires a certain level of CPU processing power.
  3. Version Management: Maintaining the differential relationships between multiple versions increases management complexity.
  4. Error Recovery: Recovery strategies after differential merging failures need to be specially designed.

Key Points for Implementing Differential Upgrade

Server-Side Implementation:

Package generation (driven by Firmware Version Management):

  1. Check Version Path: a Direct Path uses a single-step differential (V1->V2); a Jump Path uses a multi-step differential (V1->V1.1->V2).
  2. Generate Differential Package.
  3. Calculate Differential Package Checksum.
  4. Digital Signature.
  5. Store in Repository.

Handling a device request:

  1. The Device Request reports the Current Version.
  2. Find Optimal Differential Path.
  3. Return Differential Package Information.
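The "Find Optimal Differential Path" step can be treated as a shortest-path search. The sketch below assumes the repository is modeled as a dictionary mapping `(from_version, to_version)` to patch size in bytes, and that "optimal" means the smallest total download; both are assumptions, and the function name `find_patch_path` and the sizes are made up for illustration.

```python
import heapq

def find_patch_path(patches, current, target):
    """Return (total_bytes, [versions...]) for the cheapest patch chain, or None."""
    graph = {}
    for (src, dst), size in patches.items():
        graph.setdefault(src, []).append((dst, size))

    queue = [(0, current, [current])]   # (bytes so far, version, path taken)
    settled = {}
    while queue:
        cost, version, path = heapq.heappop(queue)
        if version == target:
            return cost, path
        if settled.get(version, float("inf")) <= cost:
            continue                    # already reached this version more cheaply
        settled[version] = cost
        for nxt, size in graph.get(version, []):
            heapq.heappush(queue, (cost + size, nxt, path + [nxt]))
    return None                         # no chain of patches leads to the target

# Hypothetical repository contents: patch sizes in bytes.
patches = {
    ("V1", "V2"): 400_000,     # direct path
    ("V1", "V1.1"): 60_000,    # jump path, first hop
    ("V1.1", "V2"): 90_000,    # jump path, second hop
}

cost, path = find_patch_path(patches, "V1", "V2")
print(cost, path)
```

With these example sizes the two-hop jump path is cheaper than the direct patch. A real server would likely also weigh the extra flash writes and verification time that each additional hop costs the device.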

Device-Side Implementation:

  1. Memory Management: Use streaming processing, reading old firmware and applying differentials in chunks to avoid loading the entire firmware at once.
  2. Resume from Breakpoint: Support resuming downloads of differential packages to improve transmission reliability.
  3. Double Verification: Verify the differential package itself and then verify the new firmware after merging.
  4. Rollback Mechanism: Ability to roll back in case of merge failure, ensuring system recoverability.
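Streaming, double verification, and rollback can be combined in one routine. The following is a minimal sketch under stated assumptions: the patch uses a hypothetical byte format (`C` + offset + length to copy from the old slot, `I` + length + data for literal bytes), flash slots are simulated with `io.BytesIO`, and `CHUNK` stands in for the RAM budget. None of these names come from a real library.

```python
import hashlib
import io
import struct

CHUNK = 256  # assumed RAM budget per copy step, in bytes

def apply_patch_streaming(old_slot, patch, new_slot, patch_sha256, new_sha256):
    """Apply `patch` to `old_slot`, writing into `new_slot`. Return True on success."""
    # Verification 1: the patch must match the digest announced by the server.
    digest = hashlib.sha256()
    patch.seek(0)
    while chunk := patch.read(CHUNK):
        digest.update(chunk)
    if digest.digest() != patch_sha256:
        return False

    patch.seek(0)
    new_slot.seek(0)
    new_slot.truncate()
    out_digest = hashlib.sha256()

    def copy(src, length):
        # Move `length` bytes from `src` to the new slot, at most CHUNK at a time.
        while length:
            piece = src.read(min(CHUNK, length))
            if not piece:
                return False            # source ended early
            new_slot.write(piece)
            out_digest.update(piece)
            length -= len(piece)
        return True

    ok = True
    while ok and (tag := patch.read(1)):
        if tag == b"C":                 # copy a range out of the old firmware
            offset, length = struct.unpack("<II", patch.read(8))
            old_slot.seek(offset)
            ok = copy(old_slot, length)
        elif tag == b"I":               # literal bytes carried inside the patch
            (length,) = struct.unpack("<I", patch.read(4))
            ok = copy(patch, length)
        else:
            ok = False                  # unknown operation

    # Verification 2: the merged image must match the expected firmware digest.
    if ok and out_digest.digest() == new_sha256:
        return True
    new_slot.seek(0)
    new_slot.truncate()                 # rollback: discard the partial image
    return False

old_fw = bytes(range(200)) * 5
new_fw = old_fw[:600] + b"NEW-CODE" + old_fw[700:]
patch_blob = (b"C" + struct.pack("<II", 0, 600)
              + b"I" + struct.pack("<I", 8) + b"NEW-CODE"
              + b"C" + struct.pack("<II", 700, 300))

slot_b = io.BytesIO()
ok = apply_patch_streaming(io.BytesIO(old_fw), io.BytesIO(patch_blob), slot_b,
                           hashlib.sha256(patch_blob).digest(),
                           hashlib.sha256(new_fw).digest())
print(ok, slot_b.getvalue() == new_fw)
```

Because the old firmware is only read and the result goes to a separate slot, a failed merge costs nothing more than erasing the target slot.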

3. A/B Partition Mechanism

Principle of A/B Partition

The A/B partition mechanism (also known as Dual-Bank or Redundant Boot) is a redundant firmware storage solution that maintains two independent firmware partitions (Partition A and Partition B) within the device, ensuring that at least one usable firmware version is always available during the upgrade process. The core idea of this mechanism is: when upgrading to new firmware, do not disturb the currently running firmware.

Working Principle:

  1. Dual Partition Storage: The device has two independent firmware partitions that can store two different versions of firmware.
  2. Active Partition Switching: The Bootloader maintains a flag indicating which partition should be booted from.
  3. Atomic Upgrade: New firmware is written to the inactive partition, and the active partition flag is only switched after successful verification.
  4. Automatic Rollback: If the new firmware fails to boot, the Bootloader automatically switches to the old partition.

A/B Partition Architecture Design

Flash Storage Partition:

  • Bootloader Partition: not updatable; responsible for booting and partition management.
  • App Slot A: Version V1.0, Status: Stable.
  • App Slot B: Version V2.0, Status: Pending Verification.

Shared Memory Area:

  • Active Partition Flag: active_slot = A
  • Boot Counter: boot_counter = 0
  • Upgrade Status Flag: update_in_progress = false

Bootloader Logic:

  1. Read Active Partition Flag.
  2. If the active partition is A: Verify Slot A Firmware. If verification passes, Boot Slot A. Otherwise, Try Slot B: if Slot B is valid, Boot Slot B.
  3. If the active partition is B: Verify Slot B Firmware. If verification passes, Boot Slot B.
  4. If no slot can be verified, Enter Safe Mode.
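The slot selection logic can be modeled in a few lines. This sketch assumes a symmetric fallback (whichever slot is active, the other one is tried next) and uses a CRC32 comparison as a stand-in for firmware verification; a real bootloader would also check the digital signature. `Slot` and `select_boot_slot` are illustrative names.

```python
import zlib
from dataclasses import dataclass

@dataclass
class Slot:
    image: bytes        # firmware bytes stored in the slot
    expected_crc: int   # CRC32 recorded when the image was installed

    def is_valid(self) -> bool:
        return zlib.crc32(self.image) == self.expected_crc

def select_boot_slot(active_slot: str, slots: dict) -> str:
    """Return "A", "B", or "SAFE_MODE" following the active-then-fallback order."""
    fallback = "B" if active_slot == "A" else "A"
    if slots[active_slot].is_valid():
        return active_slot
    if slots[fallback].is_valid():
        return fallback
    return "SAFE_MODE"

good = b"firmware v1.0"
bad = b"firmware v2.0 (torn write)"
slots = {
    "A": Slot(good, zlib.crc32(good)),
    "B": Slot(bad, zlib.crc32(bad) ^ 1),   # stored CRC deliberately wrong
}

print(select_boot_slot("A", slots))   # active slot is intact
print(select_boot_slot("B", slots))   # active slot is corrupt, fallback is used
```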

A/B Partition Upgrade Process

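The upgrade alternates between the two slots. Each line reads "State -> (event) -> Next State":

  • (System Boot) -> AppSlotA_Running
  • AppSlotA_Running -> (Receive Upgrade Request) -> DownloadToSlotB
  • DownloadToSlotB -> (Download Complete) -> VerifyingSlotB
  • VerifyingSlotB -> (Verification Passed) -> InstallingSlotB
  • VerifyingSlotB -> (Verification Failed, Redownload) -> DownloadToSlotB
  • InstallingSlotB -> (Installation Complete, Atomic Switch) -> SwitchToSlotB
  • SwitchToSlotB -> (Reboot System) -> AppSlotB_Running
  • AppSlotB_Running -> (Boot Check) -> VerifySlotB
  • VerifySlotB -> (Running Normally, Mark Stable) -> AppSlotB_Stable
  • VerifySlotB -> (Boot Failed) -> RollbackToSlotA
  • RollbackToSlotA -> (Automatic Rollback) -> AppSlotA_Running
  • AppSlotB_Stable -> (Next Upgrade Write to Slot A) -> DownloadToSlotA
  • DownloadToSlotA -> (Download Complete) -> VerifyingSlotA
  • VerifyingSlotA -> (Verification Passed) -> InstallingSlotA
  • InstallingSlotA -> (Installation Complete) -> SwitchToSlotA
  • SwitchToSlotA -> (Reboot System) -> AppSlotA_Running

Key Points:

  • The atomic switch ensures that the partition flags and the firmware state are updated consistently.
  • The automatic rollback mechanism prevents device bricking.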
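The boot check and automatic rollback are commonly driven by the boot counter. The sketch below is one possible scheme, not necessarily the one the original design uses: the bootloader counts boots of an unconfirmed slot, and the application calls `mark_stable` once it is running normally. `MAX_BOOT_ATTEMPTS`, `previous_slot`, and `stable` are assumed names.

```python
MAX_BOOT_ATTEMPTS = 3  # assumed limit before the bootloader gives up on a new slot

def bootloader_pick(state: dict) -> str:
    """Run once per boot: return the slot to start and update the counters."""
    if not state["stable"]:
        if state["boot_counter"] >= MAX_BOOT_ATTEMPTS:
            # The new firmware never confirmed itself: roll back to the old slot.
            state["active_slot"], state["previous_slot"] = (
                state["previous_slot"], state["active_slot"])
            state["stable"] = True
            state["boot_counter"] = 0
        else:
            state["boot_counter"] += 1
    return state["active_slot"]

def mark_stable(state: dict) -> None:
    """Called by the application once it has verified that it runs normally."""
    state["stable"] = True
    state["boot_counter"] = 0

# State right after switching from Slot A to a freshly installed Slot B.
state = {"active_slot": "B", "previous_slot": "A", "boot_counter": 0, "stable": False}

# The new firmware crashes on every boot and never calls mark_stable().
boots = [bootloader_pick(state) for _ in range(4)]
print(boots)
```

In the simulated run the bootloader tries Slot B three times and then returns to Slot A on the fourth boot. On real hardware the counter must live in storage that survives resets, alongside the active partition flag.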

Guaranteeing Atomicity of Partition Switching

Partition switching is a critical operation that must ensure atomicity; otherwise, it may lead to inconsistent system states. Here are several methods to ensure atomicity:

Use Hardware-Supported Atomic Operations

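  1. Prepare Switch
  2. Disable Interrupts
  3. Write New Partition Flag
  4. Synchronize Flash Write
  5. Enable Interrupts
  6. System Reboot

The critical steps must run as an uninterruptible sequence so that the flag update cannot be preempted halfway. Disabling interrupts does not protect against a power loss in the middle of the flash write, so the flag record itself should also be protected with redundancy and a checksum.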

Advantages and Limitations of A/B Partition

Advantages:

  1. Near-Zero Downtime Upgrade: The device keeps running during download and installation; the only interruption is the reboot into the new slot.
  2. Automatic Rollback: Automatically rolls back if the new firmware fails to boot, preventing device bricking.
  3. Fast Recovery: Rollback operations are very quick, requiring only a switch of the partition flag.
  4. Reduced Risk: Upgrade failures do not affect the currently running firmware.

Limitations:

  1. Storage Space Requirements: Requires double the storage space, increasing costs.
  2. Increased Complexity: Partition management and state synchronization increase system complexity.
  3. Boot Time: Additional verification and selection logic may slightly increase boot time.
  4. Version Management: Requires maintaining version information for both partitions.

4. Digital Signature Verification

Principle of Digital Signatures

Digital signatures are a core technology for ensuring firmware security, using cryptographic methods to ensure the integrity (not tampered with) and authenticity (trustworthy source) of the firmware. They are based on asymmetric cryptography: a private key signs and the matching public key verifies.

Basic Principle:

  1. Server Side (Signing Process):
    • Calculate the hash value of the firmware (e.g., SHA-256).
    • Sign the hash value with the private key to generate the digital signature. For RSA this step is often described as "encrypting the hash with the private key".
    • Attach the signature to the firmware file.
  2. Device Side (Verification Process):
    • Calculate the hash value of the received firmware.
    • Run the signature verification algorithm with the pre-installed public key over that hash and the received signature. With RSA this recovers the signed hash from the signature and compares it with the computed one; ECDSA checks a mathematical relation instead and never recovers the hash.
    • Accept the firmware only if verification succeeds.
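The hash-then-sign flow can be demonstrated with textbook RSA using only the standard library. This is a deliberately insecure toy: it uses two publicly known Mersenne primes as the key and no padding, purely to show the private-key/public-key roles. Real firmware signing should use a vetted cryptographic library with ECDSA, Ed25519, or properly padded RSA. All names here are illustrative.

```python
import hashlib

# Toy key material: publicly known Mersenne primes, NOT secure.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (Python 3.8+)

def firmware_hash(firmware: bytes) -> int:
    """SHA-256 of the firmware, interpreted as an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big")

def sign(firmware: bytes) -> int:
    """Server side: apply the private key to the firmware hash."""
    return pow(firmware_hash(firmware), D, N)

def verify(firmware: bytes, signature: int) -> bool:
    """Device side: recover the signed hash with the public key and compare."""
    return pow(signature, E, N) == firmware_hash(firmware)

firmware = b"\x7fELF...application image..."
signature = sign(firmware)

print(verify(firmware, signature))             # untouched firmware
print(verify(firmware + b"\x00", signature))   # one extra byte appended
```

Only `N` and `E` need to be present on the device; `D` stays on the signing server.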

Selection of Digital Signature Algorithms

Commonly used digital signature algorithms include:

  • RSA: The most widely used algorithm, with high security, but larger signatures (2048-bit key produces a 256-byte signature).
  • ECDSA: Elliptic curve algorithm, smaller signatures (256-bit key produces a 64-byte signature), suitable for resource-constrained devices.
  • Ed25519: A modern elliptic curve algorithm with excellent performance and high security.

For embedded devices, it is recommended to use ECDSA P-256, as it strikes a good balance between security and resource consumption.

Digital Signature Verification Process

The process involves four participants (Server, Device Side, Bootloader, Cryptographic Module) and runs in three phases:

  1. Signature Generation: calculate the firmware SHA-256 hash, sign it with the private key, and attach the signature to the firmware header.
  2. Receive Firmware: receive the complete firmware (including the signature), parse the firmware header, and extract the digital signature.
  3. Verify Signature: request firmware verification, read the firmware header, extract the firmware data and signature, calculate the firmware data hash (SHA-256), use the public key to verify the signature, and return the verification result.

The outcome depends on the verification result:

  • Verification Passed: mark the firmware as trusted and allow installation.
  • Verification Failed: reject installation, return an error code, and log a security event.

Firmware Signature Format Design

Firmware Image Structure:

  • Firmware Header (Metadata):
    • Magic Number: 4 Bytes
    • Version Information: 16 Bytes
    • Firmware Size: 4 Bytes
    • Data CRC32: 4 Bytes
    • Data SHA256: 32 Bytes
    • Header CRC32: 4 Bytes
  • Firmware Data (Binary Code)
  • Digital Signature: RSA-2048 Signature (256 Bytes) or ECDSA Signature (64-72 Bytes)
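The header fields add up to 64 bytes (4 + 16 + 4 + 4 + 32 + 4). A minimal sketch of packing and checking such a header follows. Several details are assumptions, since the layout above does not specify them: little-endian byte order, the magic value `FWIM`, NUL-padded version strings, and a Header CRC32 that covers the preceding 60 bytes.

```python
import hashlib
import struct
import zlib

HEADER_FMT = "<4s16sII32sI"                # magic, version, size, data CRC32, data SHA256, header CRC32
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 64 bytes with no padding
MAGIC = b"FWIM"                            # hypothetical magic number

def build_header(version: str, data: bytes) -> bytes:
    """Pack the metadata for `data` and append the CRC32 of the header body."""
    body = struct.pack("<4s16sII32s", MAGIC, version.encode(), len(data),
                       zlib.crc32(data), hashlib.sha256(data).digest())
    return body + struct.pack("<I", zlib.crc32(body))

def parse_header(header: bytes, data: bytes) -> dict:
    """Validate the header against `data`; raise ValueError on any mismatch."""
    magic, version, size, data_crc, data_sha, header_crc = struct.unpack(HEADER_FMT, header)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    if zlib.crc32(header[:-4]) != header_crc:
        raise ValueError("header CRC32 mismatch")
    if size != len(data) or zlib.crc32(data) != data_crc:
        raise ValueError("data size or CRC32 mismatch")
    if hashlib.sha256(data).digest() != data_sha:
        raise ValueError("data SHA256 mismatch")
    return {"version": version.rstrip(b"\0").decode(), "size": size}

payload = b"\x00\x20\x00\x08" * 64         # stand-in for the firmware binary
header = build_header("2.0.0", payload)
info = parse_header(header, payload)
print(HEADER_SIZE, info)
```

The CRC32 fields catch accidental corruption cheaply; protection against deliberate tampering still comes from the digital signature that follows the firmware data.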
    

Public Key Management Strategies

Secure storage of public keys is crucial for the digital signature mechanism. Here are several public key management strategies:

Strategy 1: Hardware Security Module (HSM)

  • Public Key Storage: the key is held in the Hardware Security Module (HSM), which provides tamper-proof storage and hardware encryption acceleration.
  • Firmware Verification: the HSM performs hardware signature verification and returns the verification result.

Strategy 2: Software Storage + Integrity Protection

Provisioning:

  1. Calculate the hash of the public key.
  2. Store the public key in Flash.
  3. Store the hash in a secure area.

At boot time:

  1. Read the public key.
  2. Calculate its current hash.
  3. Compare it with the stored hash.
  4. If the hashes match, use the public key. If they do not match, tampering has been detected: refuse to boot.
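The boot-time check in Strategy 2 amounts to a hash comparison. In this sketch a plain dictionary stands in for the two storage areas (`flash_key` for ordinary flash, `secure_hash` for the secure area), and SHA-256 is assumed as the hash; both are illustrative choices.

```python
import hashlib
import hmac

def provision(public_key: bytes) -> dict:
    """Factory step: store the key in flash and its hash in the secure area."""
    return {
        "flash_key": public_key,
        "secure_hash": hashlib.sha256(public_key).digest(),
    }

def load_public_key(storage: dict) -> bytes:
    """Boot-time step: return the key only if it still matches the stored hash."""
    current = hashlib.sha256(storage["flash_key"]).digest()
    if not hmac.compare_digest(current, storage["secure_hash"]):
        raise RuntimeError("public key tampering detected, refusing to boot")
    return storage["flash_key"]

storage = provision(b"-----example public key bytes-----")
key = load_public_key(storage)            # untouched key is accepted

storage["flash_key"] = b"-----attacker public key bytes-----"
try:
    load_public_key(storage)
    tamper_detected = False
except RuntimeError:
    tamper_detected = True
print(tamper_detected)
```

The scheme is only as strong as the secure area: if an attacker can rewrite the stored hash as well, the check no longer helps.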
    

Strategy 3: Public Key Rotation Mechanism

To address long-term deployment scenarios, support for public key rotation is necessary. The rotation passes through the following states:

  • (Initial State, Use Key Pair 1) -> Key1_Active
  • Key1_Active -> (Provision Key Pair 2 via Secure Channel) -> Key2_Provisioned
  • Key2_Provisioned -> (Both Keys Are Usable) -> Both_Keys_Valid
  • Both_Keys_Valid -> (New Firmware Signed Only with Key Pair 2) -> Key2_Only
  • Key2_Only -> (All Devices Upgraded) -> Key2_Active

During the transition, both key sets are supported to ensure smooth switching.

5. Power Failure Protection Mechanism

Power Failure Risk Analysis

During the OTA upgrade process, devices may suddenly lose power or reset for various reasons:

  • Power Failure
  • Battery Depletion
  • User Accidental Power Off
  • Network Interruptions Leading to Watchdog Resets

Potential issues caused by a power failure during the upgrade:

  • Incomplete Flash Write -> Firmware Data Corruption -> Firmware Boot Failure
  • Inconsistent Partition Flags -> Incorrect Partition Selection

Both chains end in the same place: Risk of Bricking the Device.

Power Failure Protection Strategies

The core idea of the power failure protection mechanism is: ensure that at any time, at least one complete and bootable firmware version is available.

Three Principles of Power Failure Protection:

  1. Atomic Writes: either fully write or do not write at all. Implementation: transactional writes.
  2. State Flag Protection: use redundant flags and checksums. Implementation: multiple flag verification.
  3. Boot-Time Verification: detect and repair inconsistent states. Implementation: automatic recovery mechanism.
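Redundant flags with checksums can be sketched as two copies of a small record, each carrying a sequence number and a CRC32. The record layout (`<IBBB` plus CRC32) and the function names are assumptions made for illustration. An update always overwrites the older or invalid copy, so a power cut during the write leaves the previous record intact.

```python
import struct
import zlib

BODY_FMT = "<IBBB"   # sequence number, active_slot, boot_counter, update_in_progress
RECORD_SIZE = struct.calcsize(BODY_FMT) + 4   # body plus CRC32

def encode_record(seq, active_slot, boot_counter, update_in_progress):
    body = struct.pack(BODY_FMT, seq, ord(active_slot), boot_counter,
                       int(update_in_progress))
    return body + struct.pack("<I", zlib.crc32(body))

def decode_record(raw):
    """Return the record as a dict, or None if the size or CRC is wrong."""
    if len(raw) != RECORD_SIZE:
        return None
    body, (crc,) = raw[:-4], struct.unpack("<I", raw[-4:])
    if zlib.crc32(body) != crc:
        return None
    seq, slot, counter, in_progress = struct.unpack(BODY_FMT, body)
    return {"seq": seq, "active_slot": chr(slot), "boot_counter": counter,
            "update_in_progress": bool(in_progress)}

def load_flags(copies):
    """Pick the valid copy with the highest sequence number, if any."""
    valid = [rec for rec in map(decode_record, copies) if rec is not None]
    return max(valid, key=lambda rec: rec["seq"]) if valid else None

def store_flags(copies, active_slot, boot_counter, update_in_progress):
    """Overwrite the invalid or oldest copy; the newest valid copy is untouched."""
    decoded = [decode_record(raw) for raw in copies]
    newest = max((rec["seq"] for rec in decoded if rec), default=0)
    target = min(range(len(copies)),
                 key=lambda i: decoded[i]["seq"] if decoded[i] else -1)
    copies[target] = encode_record(newest + 1, active_slot, boot_counter,
                                   update_in_progress)

copies = [b"\xff" * RECORD_SIZE, b"\xff" * RECORD_SIZE]   # erased flash
store_flags(copies, "A", 0, False)
store_flags(copies, "B", 0, True)

# Simulate a torn write: the newest copy ends up with a damaged CRC byte.
torn = bytearray(copies[1])
torn[-1] ^= 0xFF
copies[1] = bytes(torn)

recovered = load_flags(copies)
print(recovered)
```

After the simulated torn write, `load_flags` falls back to the older record that still names Slot A, which is the behavior the three principles call for.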
    

Bootloader Power Failure Recovery at Startup

The Bootloader must check the integrity of the system and perform necessary recovery operations at each startup:

  1. Bootloader Startup: read the status flags.
  2. Are the flags valid? If not, initialize the default state. If yes, check the upgrade status flag.
  3. Is an upgrade in progress? If not, continue with the normal startup process (step 6). If yes, check the write progress.
  4. Is the write progress complete? If not, a power failure is detected: clear the incomplete writes, restore the pre-upgrade state, and boot with the original firmware. If yes, verify the written firmware.
  5. Is the written firmware valid? If yes, mark the upgrade complete, switch to the new firmware, and boot it. If not, clear the incomplete writes, restore the pre-upgrade state, and boot with the original firmware.
  6. Normal startup process: verify the active partition firmware. If verification passes, boot the active partition. Otherwise, try the backup partition.
  7. Is the backup partition valid? If yes, switch to the backup partition and boot. If not, enter Safe Mode.
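The recovery decisions can be expressed as a single pure function, which also makes them easy to unit test off-target. This sketch assumes the status flags are a dictionary with `active_slot`, `update_in_progress`, and a hypothetical `write_complete` field standing in for the write progress; `verify_slot` is any callable that reports whether a slot holds valid firmware.

```python
def recover(flags, verify_slot):
    """Return (boot_target, new_flags); boot_target is "A", "B", or "SAFE_MODE"."""
    if flags is None:
        # Flags invalid: initialize the default state.
        flags = {"active_slot": "A", "update_in_progress": False,
                 "write_complete": False}
    flags = dict(flags)                      # do not mutate the caller's copy
    active = flags["active_slot"]
    backup = "B" if active == "A" else "A"

    if flags["update_in_progress"]:
        if flags["write_complete"] and verify_slot(backup):
            # The upgrade finished before the reset: switch to the new firmware.
            flags.update(active_slot=backup, update_in_progress=False,
                         write_complete=False)
            return backup, flags
        # Power failure mid-write or invalid image: restore the pre-upgrade state.
        flags.update(update_in_progress=False, write_complete=False)
        return active, flags

    # Normal startup process.
    if verify_slot(active):
        return active, flags
    if verify_slot(backup):
        flags["active_slot"] = backup
        return backup, flags
    return "SAFE_MODE", flags

both_valid = {"A": True, "B": True}.get

interrupted = {"active_slot": "A", "update_in_progress": True, "write_complete": False}
finished = {"active_slot": "A", "update_in_progress": True, "write_complete": True}

print(recover(interrupted, both_valid)[0])   # download was cut off
print(recover(finished, both_valid)[0])      # image fully written and valid
```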
    

Watchdog Protection Mechanism

The Watchdog Timer is an important mechanism to prevent the system from deadlocking under abnormal conditions. It involves three participants (Application, Watchdog Timer, Bootloader) and two scenarios.

Normal Upgrade Process:

  1. Start the upgrade and configure the watchdog (60 seconds).
  2. Loop every 5 seconds: feed the dog (reset the timer) and continue the upgrade operation.
  3. When the upgrade is complete, turn off the watchdog and request a reboot.

Exception Handling:

  1. Start the upgrade and configure the watchdog (60 seconds).
  2. A deadlock occurs during the upgrade.
  3. After 60 seconds without feeding the dog, the watchdog signal triggers a system reset.
  4. The Bootloader detects the watchdog reset and executes the power failure recovery process.
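Both scenarios can be simulated with a software model of the timer. This sketch uses the 60-second timeout and 5-second feed interval from the description; the `Watchdog` class and the `hang_at` parameter are illustrative, and simulated time replaces real delays.

```python
class Watchdog:
    """Software model of a watchdog timer driven by simulated time."""

    def __init__(self, timeout_s: int):
        self.timeout_s = timeout_s
        self.elapsed_s = 0
        self.fired = False

    def feed(self) -> None:
        self.elapsed_s = 0                 # "feed the dog": restart the countdown

    def tick(self, seconds: int) -> bool:
        """Advance time; return True once the timeout has expired."""
        self.elapsed_s += seconds
        if self.elapsed_s >= self.timeout_s:
            self.fired = True
        return self.fired

def run_upgrade(steps: int, hang_at=None, timeout_s: int = 60, step_s: int = 5) -> str:
    """Simulate an upgrade of `steps` work units, optionally hanging at one of them."""
    watchdog = Watchdog(timeout_s)
    for step in range(steps):
        if step == hang_at:
            # Deadlock: time keeps passing but nobody feeds the watchdog.
            while not watchdog.tick(step_s):
                pass
            return "watchdog_reset"
        watchdog.tick(step_s)              # one unit of upgrade work
        watchdog.feed()
    return "upgrade_complete"

print(run_upgrade(20))                # every step feeds the watchdog in time
print(run_upgrade(20, hang_at=7))     # step 7 hangs until the watchdog fires
```

After a `watchdog_reset`, the bootloader would run the power failure recovery process described earlier in this section.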

6. Conclusion

  1. Differential Upgrade: Significantly reduces data transfer volume by only transmitting the differences in firmware, saving bandwidth and time.
  2. A/B Partition Mechanism: Maintains two independent firmware partitions to ensure automatic rollback in case of upgrade failure, preventing device bricking.
  3. Digital Signature Verification: Uses cryptographic methods to ensure firmware integrity and authenticity, preventing malicious firmware from being installed.
  4. Power Failure Protection Mechanism: Ensures that the device can recover normally under any circumstances through atomic writes, state flag protection, and boot-time verification.

In practical applications, these technologies need to be appropriately adjusted and optimized based on specific hardware platforms, resource constraints, and application scenarios.


Author: Psyducking

Source: Embedded Software Guesthouse