Over-The-Air (OTA) firmware upgrades are a standard feature of modern embedded systems. How can we ensure the security of the upgrade process? How do we handle abnormal situations such as network interruptions and power outages? How can we implement an efficient upgrade mechanism on resource-constrained embedded devices?
1. OTA Upgrade Architecture
System Architecture Design Principles
When designing an OTA upgrade system, the following core principles must be followed:
- 1. Security Principle: Ensure that the firmware source is trustworthy to prevent malicious firmware from being installed.
- 2. Reliability Principle: Guarantee the atomicity of the upgrade process to avoid bricking the device.
- 3. Efficiency Principle: Minimize data transfer volume to reduce upgrade time and costs.
- 4. Fault Tolerance Principle: Be able to recover from various abnormal situations, including network interruptions and power outages.
- 5. Maintainability Principle: The system design should be clear and easy to test and debug.
Overall System Architecture
The architecture diagram comprises four layers:
- • Cloud Service Layer: OTA Management Server, Firmware Repository, Differential Engine, Signature Service, Device Management Platform.
- • Network Transmission Layer: HTTPS/TLS Encrypted Channel, Resume from Breakpoint, Differential Transmission.
- • Device Architecture: Bootloader, Application (App Slot A), Application (App Slot B), Updater, Shared Memory Area.
- • Storage Partition: Bootloader Partition, App Slot A Partition, App Slot B Partition, Updater Partition, Configuration Storage Area.
2. Differential Upgrade Technology
Principle of Differential Upgrade
Differential Upgrade (Delta Update) is an efficient firmware update method that only transmits the changed parts by comparing the differences between the old and new firmware versions, rather than the entire firmware file. This method can significantly reduce data transfer volume, saving bandwidth and time.
Basic Principle:
- 1. Server Side: Use differential algorithms (such as bsdiff, xdelta, etc.) to generate a differential package (patch file) from the old version to the new version.
- 2. Device Side: After receiving the differential package, merge it with the current firmware (apply patch) to generate the new version of the firmware.
- 3. Verification: Perform integrity checks on the newly generated firmware to ensure the upgrade is correct.
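The three steps above can be illustrated with a toy delta format. This is a minimal sketch, not bsdiff: it uses Python's standard `difflib` to describe the new image as "copy from old" and "literal data" operations, and the names `make_delta`, `apply_delta`, and `delta_payload_size` are hypothetical helpers introduced only for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Server side: describe `new` as copies from `old` plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))   # bytes the device already has
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))    # bytes that must be transmitted
        # "delete": the old bytes are simply never referenced
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    """Device side: rebuild the new image from the old image and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def delta_payload_size(ops: list) -> int:
    """Number of literal bytes carried by the delta."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


# Two made-up firmware images that differ only in a version string.
old_fw = b"BOOT" + bytes(range(256)) * 4 + b"v1.0"
new_fw = b"BOOT" + bytes(range(256)) * 4 + b"v2.0"

delta = make_delta(old_fw, new_fw)
rebuilt = apply_delta(old_fw, delta)

# Step 3: integrity check of the rebuilt image against the expected digest.
expected_digest = hashlib.sha256(new_fw).hexdigest()
rebuilt_ok = hashlib.sha256(rebuilt).hexdigest() == expected_digest
```

Real differential tools produce far more compact patches and compress them, but the division of work between server and device is the same.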
Selection of Differential Algorithms
Commonly used differential algorithms include:
- • bsdiff/bspatch: A binary differential algorithm suitable for any binary file, with high compression rates.
- • xdelta3: An open-source differential algorithm with excellent performance, suitable for large files.
- • Custom Algorithms: Specialized algorithms optimized for specific firmware formats.
For resource-constrained embedded devices, bsdiff/bspatch can be used as it strikes a good balance between compression rate and computational complexity.
Differential Upgrade Process
The sequence involves four participants: Server, Device Side, Bootloader, and Flash Storage.
- 1. Generate Differential Package: The server reads the old version firmware and the new version firmware, executes the differential algorithm (bsdiff), generates the differential package (.patch), and calculates the differential package signature.
- 2. Request Upgrade: The device reports its current firmware version, and the server returns the differential package information (size/checksum).
- 3. Download Differential Package: The device requests the download, the server transfers the differential package data (with support for resuming from a breakpoint), and the device stores it in a temporary area and verifies the differential package integrity.
- 4. Apply Differential: The device requests to enter upgrade mode. The current (old version) firmware is read, the differential algorithm (bspatch) is applied, and the new firmware is written to the backup partition (Slot B). The new firmware is then verified by an integrity check and a digital signature check.
- 5. Verification Passed: Switch the active partition to Slot B, update the partition flag, and reboot the system.
- 6. Verification Failed: Clear the backup partition data and return an upgrade failure.
Advantages and Challenges of Differential Upgrade
Advantages:
- 1. Bandwidth Savings: Typically reduces data transfer volume by 50%-90%, which is especially important for large-scale deployments.
- 2. Reduced Upgrade Time: Significantly shortens data transfer time, enhancing user experience.
- 3. Lower Upgrade Costs: In scenarios where charges are based on data usage, costs can be significantly reduced.
- 4. Increased Success Rate: Smaller data transfer volume reduces the probability of transmission errors.
Challenges:
- 1. Memory Requirements: A naive implementation loads the old firmware, the differential package, and the new firmware at the same time, which leads to high RAM requirements.
- 2. Computational Overhead: Differential merging requires a certain level of CPU processing power.
- 3. Version Management: Maintaining the differential relationships between multiple versions increases management complexity.
- 4. Error Recovery: Recovery strategies after differential merging failures need to be specially designed.
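The memory challenge can be reduced by processing the patch in small chunks instead of holding whole images in RAM. The sketch below assumes a hypothetical patch format made of `("copy", offset, length)` and `("data", payload)` operations, and hypothetical `read_old` / `write_new` callbacks standing in for flash access. It also updates a SHA-256 digest incrementally, so the final integrity check needs no second pass over the output.

```python
import hashlib

CHUNK = 256  # hypothetical size of the RAM working buffer, in bytes


def apply_delta_streaming(read_old, ops, write_new, chunk=CHUNK):
    """Apply copy/data operations while handling at most `chunk` bytes at once.

    read_old(offset, length) -> bytes : reads from the currently active slot
    write_new(data)                   : appends to the inactive slot
    Returns the SHA-256 hex digest of everything that was written.
    """
    digest = hashlib.sha256()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            done = 0
            while done < length:
                n = min(chunk, length - done)
                block = read_old(offset + done, n)
                write_new(block)
                digest.update(block)
                done += n
        else:
            payload = op[1]
            for start in range(0, len(payload), chunk):
                block = payload[start:start + chunk]
                write_new(block)
                digest.update(block)
    return digest.hexdigest()


# Simulated flash contents: a 1000-byte body followed by a version tag.
old_slot = bytes(1000) + b"v1"
ops = [("copy", 0, 1000), ("data", b"v2")]

new_slot = bytearray()
write_sizes = []


def write_new(data):
    write_sizes.append(len(data))   # record how large each write was
    new_slot.extend(data)


digest = apply_delta_streaming(lambda off, n: old_slot[off:off + n], ops, write_new)
```

In this sketch the largest single write never exceeds `CHUNK`, which is the property a RAM-limited device cares about.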
Key Points for Implementing Differential Upgrade
Server-Side Implementation:
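- 1. Firmware Version Management: Check the version path between the old and new versions.
- • Direct Path: Single-step differential (V1->V2).
- • Jump Path: Multi-step differential (V1->V1.1->V2).
- 2. Package Generation: Generate the differential package, calculate the differential package checksum, apply the digital signature, and store the result in the repository.
- 3. Device Request: The device reports its current version, the server finds the optimal differential path, and returns the differential package information.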
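Finding the optimal differential path can be treated as a shortest-path problem in which versions are nodes and each stored patch is an edge weighted by its size. The following is a sketch under that assumption; the function name `cheapest_patch_path` and the patch sizes are made up for illustration.

```python
import heapq


def cheapest_patch_path(patches, current, target):
    """Return (total_bytes, [versions...]) for the cheapest upgrade route.

    patches: {(from_version, to_version): patch_size_in_bytes}
    Returns None when no chain of patches leads from current to target.
    """
    graph = {}
    for (src, dst), size in patches.items():
        graph.setdefault(src, []).append((dst, size))

    queue = [(0, current, [current])]   # (cost so far, version, path taken)
    settled = {}
    while queue:
        cost, version, path = heapq.heappop(queue)
        if version == target:
            return cost, path
        if version in settled and settled[version] <= cost:
            continue
        settled[version] = cost
        for nxt, size in graph.get(version, []):
            heapq.heappush(queue, (cost + size, nxt, path + [nxt]))
    return None


# Hypothetical repository: the direct patch is larger than the two-step route.
patches = {
    ("V1", "V2"): 400_000,
    ("V1", "V1.1"): 60_000,
    ("V1.1", "V2"): 90_000,
}
best = cheapest_patch_path(patches, "V1", "V2")
```

With these sizes the jump path V1->V1.1->V2 is chosen. A real server would probably also weigh the extra flash writes and time that a multi-step upgrade costs the device, not only the download size.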
Device-Side Implementation:
- 1. Memory Management: Use streaming processing, reading old firmware and applying differentials in chunks to avoid loading the entire firmware at once.
- 2. Resume from Breakpoint: Support resuming downloads of differential packages to improve transmission reliability.
- 3. Double Verification: Verify the differential package itself and then verify the new firmware after merging.
- 4. Rollback Mechanism: Ability to roll back in case of merge failure, ensuring system recoverability.
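Resuming from a breakpoint can be sketched as follows: the number of bytes already in the staging area is the resume offset, so a dropped connection only costs the chunk that was in flight. `FlakyServer` is a stand-in for an HTTP server that supports range requests, and all names here are hypothetical. The first half of the double verification (checking the patch itself) is done with SHA-256 at the end.

```python
import hashlib


class FlakyServer:
    """Simulated server with range support; every third request fails."""

    def __init__(self, blob):
        self.blob = blob
        self.requests = 0

    def fetch_range(self, offset, length):
        self.requests += 1
        if self.requests % 3 == 0:
            raise ConnectionError("simulated network drop")
        return self.blob[offset:offset + length]


def download_with_resume(fetch_range, total_size, staging, chunk=128, max_retries=10):
    """Fill `staging` up to total_size, resuming from len(staging) after errors."""
    retries = 0
    while len(staging) < total_size:
        offset = len(staging)            # the breakpoint: bytes already stored
        try:
            data = fetch_range(offset, min(chunk, total_size - offset))
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            continue                     # retry from the same offset
        if not data:
            raise IOError("server returned no data")
        staging.extend(data)
    return retries


patch = bytes(range(256)) * 4                      # made-up 1024-byte patch
expected_sha256 = hashlib.sha256(patch).hexdigest()

server = FlakyServer(patch)
staging = bytearray(patch[:300])                   # 300 bytes kept from an earlier session
retries = download_with_resume(server.fetch_range, len(patch), staging)

patch_ok = hashlib.sha256(staging).hexdigest() == expected_sha256
```

On a real device the staging area would live in flash so that the offset also survives a reboot, not only a network drop.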
3. A/B Partition Mechanism
Principle of A/B Partition
The A/B partition mechanism (also known as Dual-Bank or Redundant Boot) is a redundant firmware storage solution that maintains two independent firmware partitions (Partition A and Partition B) within the device, ensuring that at least one usable firmware version is always available during the upgrade process. The core idea of this mechanism is: when upgrading to new firmware, do not disrupt the currently running firmware.
Working Principle:
- 1. Dual Partition Storage: The device has two independent firmware partitions that can store two different versions of firmware.
- 2. Active Partition Switching: The Bootloader maintains a flag indicating which partition should be booted from.
- 3. Atomic Upgrade: New firmware is written to the inactive partition, and the active partition flag is only switched after successful verification.
- 4. Automatic Rollback: If the new firmware fails to boot, the Bootloader automatically switches to the old partition.
A/B Partition Architecture Design
The design consists of three parts:
- 1. Flash Storage Partition:
- • Bootloader Partition: Not updatable; responsible for booting and partition management.
- • App Slot A: Version V1.0, status Stable.
- • App Slot B: Version V2.0, status Pending Verification.
- 2. Shared Memory Area:
- • Active Partition Flag: active_slot = A
- • Boot Counter: boot_counter = 0
- • Upgrade Status Flag: update_in_progress = false
- 3. Bootloader Logic:
- • Read the active partition flag.
- • If the active partition is A, verify the Slot A firmware. If verification passes, boot Slot A; otherwise try Slot B.
- • If the active partition is B, verify the Slot B firmware. If verification passes, boot Slot B.
- • When falling back to Slot B, check whether Slot B is valid. If it is, boot Slot B; if not, enter Safe Mode.
A/B Partition Upgrade Process
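The upgrade process can be described as a state machine with the following transitions:
- • (start) -> AppSlotA_Running: System Boot
- • AppSlotA_Running -> DownloadToSlotB: Receive Upgrade Request
- • DownloadToSlotB -> VerifyingSlotB: Download Complete
- • VerifyingSlotB -> InstallingSlotB: Verification Passed
- • VerifyingSlotB -> DownloadToSlotB: Verification Failed, Redownload
- • InstallingSlotB -> SwitchToSlotB: Installation Complete, Atomic Switch
- • SwitchToSlotB -> AppSlotB_Running: Reboot System
- • AppSlotB_Running -> VerifySlotB: Boot Check
- • VerifySlotB -> AppSlotB_Stable: Running Normally, Mark Stable
- • VerifySlotB -> RollbackToSlotA: Boot Failed
- • RollbackToSlotA -> AppSlotA_Running: Automatic Rollback
- • AppSlotB_Stable -> DownloadToSlotA: Next Upgrade Writes to Slot A
- • DownloadToSlotA -> VerifyingSlotA: Download Complete
- • VerifyingSlotA -> InstallingSlotA: Verification Passed
- • InstallingSlotA -> SwitchToSlotA: Installation Complete
- • SwitchToSlotA -> AppSlotA_Running: Reboot System
Key Points:
- • The atomic switch ensures that the partition flag and the firmware state are updated consistently.
- • The automatic rollback mechanism prevents device bricking.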
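The trial boot and automatic rollback transitions can be modeled with a boot counter. This is a sketch under assumptions: `active_slot` and `boot_counter` follow the flags named in the article, while the `pending` flag, the limit `MAX_BOOT_ATTEMPTS`, and the function names are hypothetical.

```python
MAX_BOOT_ATTEMPTS = 3   # hypothetical number of trial boots before rollback


def other(slot):
    return "B" if slot == "A" else "A"


def switch_after_install(state):
    """Updater: new firmware is installed and verified in the inactive slot."""
    state["active_slot"] = other(state["active_slot"])
    state["pending"] = True          # new slot has not proven itself yet
    state["boot_counter"] = 0


def bootloader_select(state):
    """Bootloader: runs on every reset and returns the slot to boot."""
    if state["pending"]:
        if state["boot_counter"] >= MAX_BOOT_ATTEMPTS:
            # The new firmware never confirmed itself: roll back.
            state["active_slot"] = other(state["active_slot"])
            state["pending"] = False
            state["boot_counter"] = 0
        else:
            state["boot_counter"] += 1
    return state["active_slot"]


def mark_stable(state):
    """Application: called once the new firmware has passed its self-checks."""
    state["pending"] = False
    state["boot_counter"] = 0


# Failure case: Slot B crashes on every boot and never calls mark_stable().
bad = {"active_slot": "A", "pending": False, "boot_counter": 0}
switch_after_install(bad)
bad_boots = [bootloader_select(bad) for _ in range(4)]

# Success case: Slot B boots and confirms itself.
good = {"active_slot": "A", "pending": False, "boot_counter": 0}
switch_after_install(good)
first_boot = bootloader_select(good)
mark_stable(good)
later_boot = bootloader_select(good)
```

In the failure case the bootloader tries Slot B three times and then returns to Slot A without any help from the application.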
Guaranteeing Atomicity of Partition Switching
Partition switching is a critical operation that must ensure atomicity; otherwise, it may lead to inconsistent system states. Here are several methods to ensure atomicity:
Use Hardware-Supported Atomic Operations
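- 1. Prepare the switch.
- 2. Disable interrupts.
- 3. Write the new partition flag.
- 4. Synchronize the Flash write.
- 5. Enable interrupts.
- 6. Reboot the system.
The critical steps must run as one uninterruptible sequence so that no other code can execute between the flag update and the reboot. Disabling interrupts only prevents software preemption; it does not protect against power loss, so the flag record itself should also be verifiable (for example, with a checksum).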
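One common way to make the flag update itself tolerant of an interrupted write is to keep two copies of the record, each with a sequence number and a CRC32, and always overwrite the older copy. The sketch below illustrates that idea; the record layout and all names are assumptions, and the torn write is simulated by leaving the tail of the record in the erased state (0xFF).

```python
import struct
import zlib

RECORD = struct.Struct("<IB3xI")   # sequence number, slot (0 = A, 1 = B), padding, CRC32


def encode(seq, slot):
    body = struct.pack("<IB3x", seq, slot)
    return body + struct.pack("<I", zlib.crc32(body))


def decode(raw):
    """Return (seq, slot), or None if the record is torn or corrupted."""
    if len(raw) != RECORD.size:
        return None
    seq, slot, crc = RECORD.unpack(raw)
    if zlib.crc32(raw[:8]) != crc:
        return None
    return seq, slot


def read_active(copies):
    """The valid copy with the highest sequence number wins."""
    valid = [rec for rec in map(decode, copies) if rec is not None]
    return max(valid) if valid else None


def write_active(copies, slot, fail_after=None):
    """Write a new record over the older copy; fail_after simulates power loss."""
    decoded = [decode(c) for c in copies]
    newest = max((d for d in decoded if d is not None), default=None)
    seq = newest[0] + 1 if newest else 1
    target = min(range(2), key=lambda i: decoded[i][0] if decoded[i] else -1)
    record = encode(seq, slot)
    if fail_after is not None:
        record = record[:fail_after] + b"\xff" * (len(record) - fail_after)
    copies[target] = record


copies = [encode(1, 0), b"\xff" * RECORD.size]    # Slot A active, second copy erased
write_active(copies, slot=1, fail_after=5)        # power fails mid-write
after_power_loss = read_active(copies)            # old record is still the valid one

write_active(copies, slot=1)                      # retry completes
after_retry = read_active(copies)
```

Because the newest valid record is never the one being overwritten, the bootloader sees either the old selection or the new one, but never a half-written value.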
Advantages and Limitations of A/B Partition
Advantages:
- 1. Minimal Downtime Upgrade: The device keeps running while the new firmware is downloaded and installed; the only interruption is the reboot into the new partition.
- 2. Automatic Rollback: Automatically rolls back if the new firmware fails to boot, preventing device bricking.
- 3. Fast Recovery: Rollback operations are very quick, requiring only a switch of the partition flag.
- 4. Reduced Risk: Upgrade failures do not affect the currently running firmware.
Limitations:
- 1. Storage Space Requirements: Requires double the storage space, increasing costs.
- 2. Increased Complexity: Partition management and state synchronization increase system complexity.
- 3. Boot Time: Additional verification and selection logic may slightly increase boot time.
- 4. Version Management: Requires maintaining version information for both partitions.
4. Digital Signature Verification
Principle of Digital Signatures
Digital signatures are a core technology for ensuring firmware security, using cryptographic methods to ensure the integrity (not tampered with) and authenticity (trustworthy source) of the firmware. Digital signatures are based on asymmetric (public-key) cryptography: a private key is used for signing and the matching public key for verification.
Basic Principle:
- 1. Server Side (Signing Process):
- • Calculate the hash value of the firmware (e.g., SHA-256).
- • Sign the hash value with the private key to generate the digital signature (for RSA, this is often described as encrypting the hash with the private key).
- • Attach the signature to the firmware file.
- 2. Device Side (Verification Process):
- • Use the pre-installed public key to process the signature (for RSA, this recovers the original hash value).
- • Calculate the hash value of the received firmware.
- • Compare the two hash values; if they match, verification is successful.
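The hash, sign, and compare flow can be demonstrated with textbook RSA using only Python's built-in `pow`. This is purely an illustration of the principle: the key is tiny (built from two well-known Mersenne primes), there is no padding scheme, and it must not be used for real firmware. A production system would use a vetted cryptographic library and a standard scheme. All names below are hypothetical.

```python
import hashlib

# Toy key material, far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)


def digest_as_int(firmware: bytes) -> int:
    """SHA-256 of the firmware, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big") % N


def sign(firmware: bytes) -> int:
    """Server side: transform the hash with the private key."""
    return pow(digest_as_int(firmware), D, N)


def verify(firmware: bytes, signature: int) -> bool:
    """Device side: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(firmware)


firmware = b"\x7fFIRMWARE-IMAGE-v2.0"
signature = sign(firmware)

genuine_ok = verify(firmware, signature)
tampered_ok = verify(firmware + b"\x00", signature)
```

The device only needs `N` and `E`; the private exponent `D` never leaves the signing server, which is why a leaked device image does not allow an attacker to sign new firmware.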
Selection of Digital Signature Algorithms
Commonly used digital signature algorithms include:
- • RSA: The most widely used algorithm, with high security, but larger signatures (2048-bit key produces a 256-byte signature).
- • ECDSA: Elliptic curve algorithm, smaller signatures (256-bit key produces a 64-byte signature), suitable for resource-constrained devices.
- • Ed25519: A modern elliptic curve algorithm with excellent performance and high security.
For embedded devices, it is recommended to use ECDSA P-256, as it strikes a good balance between security and resource consumption.
Digital Signature Verification Process
The sequence involves four participants: Server, Device Side, Bootloader, and Cryptographic Module.
- 1. Signature Generation: The server calculates the firmware SHA-256 hash, encrypts it with the private key, and attaches the signature to the firmware header.
- 2. Receive Firmware: The device receives the complete firmware (including the signature), parses the firmware header, and extracts the digital signature.
- 3. Verify Signature: The device requests firmware verification. The Bootloader reads the firmware header and extracts the firmware data and signature. The Cryptographic Module calculates the firmware data hash (SHA-256), uses the public key to verify the signature, and returns the verification result.
- 4. Verification Passed: Mark the firmware as trusted and allow installation.
- 5. Verification Failed: Reject installation, return an error code, and log a security event.
Firmware Signature Format Design
The firmware image consists of three parts:
- 1. Firmware Header (metadata):
- • Magic Number: 4 bytes
- • Version Information: 16 bytes
- • Firmware Size: 4 bytes
- • Data CRC32: 4 bytes
- • Data SHA256: 32 bytes
- • Header CRC32: 4 bytes
- 2. Firmware Data (binary code)
- 3. Digital Signature: 256 bytes for an RSA-2048 signature, or 64-72 bytes for an ECDSA signature
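The header fields listed above add up to 64 bytes and can be packed and checked with Python's `struct` module. This is a sketch under assumptions not stated in the article: little-endian byte order, a hypothetical magic value `OTAF`, and a header CRC32 computed over the first 60 bytes. The signature that would follow the data is left out here.

```python
import hashlib
import struct
import zlib

# magic(4) version(16) size(4) data_crc32(4) data_sha256(32) header_crc32(4)
HEADER = struct.Struct("<4s16sII32sI")
MAGIC = b"OTAF"   # hypothetical magic number


def build_image(version: str, data: bytes) -> bytes:
    """Server side: prepend a header describing `data`."""
    body = struct.pack(
        "<4s16sII32s",
        MAGIC,
        version.encode("ascii"),      # padded with zero bytes up to 16
        len(data),
        zlib.crc32(data),
        hashlib.sha256(data).digest(),
    )
    return body + struct.pack("<I", zlib.crc32(body)) + data


def parse_and_check(image: bytes):
    """Device side: validate the header, then the data. Returns (version, data)."""
    if len(image) < HEADER.size:
        raise ValueError("image too short")
    magic, version, size, data_crc, data_sha, header_crc = HEADER.unpack_from(image)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    if zlib.crc32(image[:HEADER.size - 4]) != header_crc:
        raise ValueError("header CRC mismatch")
    data = image[HEADER.size:HEADER.size + size]
    if len(data) != size or zlib.crc32(data) != data_crc:
        raise ValueError("data CRC mismatch")
    if hashlib.sha256(data).digest() != data_sha:
        raise ValueError("data SHA-256 mismatch")
    return version.rstrip(b"\x00").decode("ascii"), data


payload = b"\x01\x02\x03" * 100
image = build_image("2.0.0", payload)
version, data = parse_and_check(image)
```

Checking the cheap header CRC first lets the device reject a truncated or garbled download before it spends time hashing the whole image.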
Public Key Management Strategies
Secure storage of public keys is crucial for the digital signature mechanism. Here are several public key management strategies:
Strategy 1: Hardware Security Module (HSM)
- • Public Key Storage: The public key is kept in a Hardware Security Module (HSM), which provides tamper-proof storage and hardware encryption acceleration.
- • Firmware Verification: Signature verification is performed in hardware, and the HSM returns the verification result.
Strategy 2: Software Storage + Integrity Protection
Provisioning:
- 1. Calculate the hash of the public key.
- 2. Store the public key to Flash.
- 3. Store the hash to a secure area.
At boot time:
- 1. Read the public key.
- 2. Calculate its current hash.
- 3. Compare it with the stored hash.
- 4. If the hash matches, use the public key. If not, tampering is detected and the device refuses to boot.
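The boot-time check in Strategy 2 amounts to recomputing the hash of the stored key and comparing it with the protected copy. A minimal sketch, assuming SHA-256 as the hash and using hypothetical function names and a made-up key blob:

```python
import hashlib
import hmac


def provision(public_key: bytes):
    """Factory step: returns (key stored in Flash, hash stored in the secure area)."""
    return public_key, hashlib.sha256(public_key).digest()


def load_public_key(flash_key: bytes, secure_hash: bytes) -> bytes:
    """Boot-time step: only return the key if it still matches the protected hash."""
    current = hashlib.sha256(flash_key).digest()
    if not hmac.compare_digest(current, secure_hash):
        raise RuntimeError("public key tampering detected, refusing to boot")
    return flash_key


flash_key, secure_hash = provision(b"\x04" + bytes(range(64)))   # made-up key bytes
loaded = load_public_key(flash_key, secure_hash)

# An attacker swaps the key in Flash but cannot change the protected hash.
swapped = b"\x04" + bytes(64)
try:
    load_public_key(swapped, secure_hash)
    tamper_detected = False
except RuntimeError:
    tamper_detected = True
```

The scheme is only as strong as the secure area: the hash has to live somewhere an attacker with Flash write access cannot modify.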
Strategy 3: Public Key Rotation Mechanism
To address long-term deployment scenarios, support for public key rotation is necessary:
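The rotation proceeds through the following states:
- • (start) -> Key1_Active: Initial state, use Key Pair 1.
- • Key1_Active -> Key2_Provisioned: Provision Key Pair 2 via a secure channel.
- • Key2_Provisioned -> Both_Keys_Valid: Both keys are usable.
- • Both_Keys_Valid -> Key2_Only: New firmware is signed only with Key Pair 2.
- • Key2_Only -> Key2_Active: All devices have been upgraded.
During the transition, both key sets are supported to ensure smooth switching.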
5. Power Failure Protection Mechanism
Power Failure Risk Analysis
During the OTA upgrade process, devices may suddenly lose power for various reasons:
- • Power Failure
- • Battery Depletion
- • User Accidental Power Off
- • Network Interruptions Leading to Watchdog Resets
Potential issues caused by power failure:
- • A power failure during the upgrade can cause an incomplete Flash write, inconsistent partition flags, or firmware data corruption.
- • These in turn lead to firmware boot failure or incorrect partition selection.
- • The end result is a risk of bricking the device.
Power Failure Protection Strategies
The core idea of the power failure protection mechanism is: ensure that at any time, at least one complete and bootable firmware version is available.
Three Principles of Power Failure Protection
Three Principles of Power Failure Protection:
- • Principle 1: Atomic Writes. Either write fully or do not write at all. Implementation: transactional writes.
- • Principle 2: State Flag Protection. Use redundant flags and checksums. Implementation: multiple flag verification.
- • Principle 3: Boot-Time Verification. Detect and repair inconsistent states. Implementation: automatic recovery mechanism.
Together, these three implementations form the power failure protection system.
Bootloader Power Failure Recovery at Startup
The Bootloader must check the integrity of the system and perform necessary recovery operations at each startup:
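- 1. Bootloader startup: Read the status flags. If the flags are not valid, initialize the default state.
- 2. Check the upgrade status flag. If no upgrade is in progress, continue with the normal startup process (step 5).
- 3. If an upgrade is in progress, check the write progress. If the progress is not complete, a power failure is detected: clear the incomplete writes, restore the pre-upgrade state, and boot with the original firmware.
- 4. If the progress is complete, verify the written firmware. If the firmware is valid, mark the upgrade complete, switch to the new firmware, and boot it. If it is not valid, clear the incomplete writes, restore the pre-upgrade state, and boot with the original firmware.
- 5. Normal startup process: Verify the active partition firmware. If verification passes, boot the active partition. Otherwise, try the backup partition: if it is valid, switch to the backup partition and boot; if not, enter Safe Mode.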
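The recovery decisions can be condensed into a single function that the Bootloader would run at startup. This is a sketch: `update_in_progress` and `active_slot` follow the flags named earlier in the article, while `write_complete`, the `slot_valid` map (the result of verifying each slot), and the function name are hypothetical.

```python
DEFAULT_FLAGS = {"active_slot": "A", "update_in_progress": False, "write_complete": False}


def recover(flags, slot_valid):
    """Decide what to boot after any reset.

    flags: dict of status flags, or None if the stored flags failed validation
    slot_valid: {"A": bool, "B": bool}, result of verifying each slot's firmware
    Returns (slot name or "SAFE_MODE", updated flags).
    """
    flags = dict(flags) if flags is not None else dict(DEFAULT_FLAGS)
    active = flags["active_slot"]
    backup = "B" if active == "A" else "A"

    if flags["update_in_progress"]:
        if flags["write_complete"] and slot_valid[backup]:
            # The write finished before the reset: complete the switch.
            active, backup = backup, active
            flags["active_slot"] = active
        # Otherwise the write was interrupted or invalid: keep the original slot.
        flags["update_in_progress"] = False
        flags["write_complete"] = False

    if slot_valid[active]:
        return active, flags
    if slot_valid[backup]:
        flags["active_slot"] = backup
        return backup, flags
    return "SAFE_MODE", flags


interrupted = {"active_slot": "A", "update_in_progress": True, "write_complete": False}
finished = {"active_slot": "A", "update_in_progress": True, "write_complete": True}
idle = {"active_slot": "A", "update_in_progress": False, "write_complete": False}

case_power_loss = recover(interrupted, {"A": True, "B": False})
case_completed = recover(finished, {"A": True, "B": True})
case_fallback = recover(idle, {"A": False, "B": True})
case_nothing_valid = recover(idle, {"A": False, "B": False})
case_bad_flags = recover(None, {"A": True, "B": True})
```

Every branch ends in either a verified slot or Safe Mode, so the function never returns a slot whose firmware failed verification.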
Watchdog Protection Mechanism
The Watchdog Timer is an important mechanism to prevent the system from deadlocking under abnormal conditions:
The sequence involves three participants: Application, Watchdog Timer, and Bootloader.
Normal Upgrade Process:
- 1. The Application starts the upgrade and configures the Watchdog (60 seconds).
- 2. In a loop, every 5 seconds: feed the dog (reset the timer) and continue the upgrade operation.
- 3. When the upgrade is complete, turn off the Watchdog and request a reboot.
Exception Handling:
- 1. The Application starts the upgrade and configures the Watchdog (60 seconds).
- 2. A deadlock occurs during the upgrade, and the dog is not fed for 60 seconds.
- 3. The Watchdog signal triggers a system reset.
- 4. The Bootloader detects the Watchdog reset and executes the power failure recovery process.
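The interaction between the upgrade loop and the Watchdog can be simulated with a software timer and a simulated clock. This is a sketch only: real watchdogs are hardware peripherals, and `SimulatedWatchdog` and `run_upgrade` are hypothetical names. The 60-second timeout and 5-second feed interval are the values used in the sequence above.

```python
class SimulatedWatchdog:
    """Software model of a watchdog timer driven by an explicit clock value."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_fed = 0
        self.enabled = False

    def start(self, now):
        self.enabled = True
        self.last_fed = now

    def feed(self, now):
        self.last_fed = now

    def stop(self):
        self.enabled = False

    def expired(self, now):
        return self.enabled and (now - self.last_fed) >= self.timeout_s


def run_upgrade(step_durations, timeout_s=60):
    """Run upgrade steps; each value is how long a step blocks before feeding.

    Returns "complete", or "watchdog_reset" if a step blocks past the timeout.
    """
    watchdog = SimulatedWatchdog(timeout_s)
    now = 0
    watchdog.start(now)
    for duration in step_durations:
        now += duration                 # the step runs without feeding the dog
        if watchdog.expired(now):
            return "watchdog_reset"     # hardware would reset the system here
        watchdog.feed(now)
    watchdog.stop()
    return "complete"


normal = run_upgrade([5] * 20)              # feeds every 5 seconds
deadlocked = run_upgrade([5, 5, 120, 5])    # third step hangs for 120 seconds
```

After a reset of the second kind, the Bootloader would find the upgrade flag still set and run the power failure recovery process.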
6. Conclusion
- 1. Differential Upgrade: Significantly reduces data transfer volume by only transmitting the differences in firmware, saving bandwidth and time.
- 2. A/B Partition Mechanism: Maintains two independent firmware partitions to ensure automatic rollback in case of upgrade failure, preventing device bricking.
- 3. Digital Signature Verification: Uses cryptographic methods to ensure firmware integrity and authenticity, preventing malicious firmware from being installed.
- 4. Power Failure Protection Mechanism: Ensures that the device can recover normally under any circumstances through atomic writes, state flag protection, and boot-time verification.
In practical applications, these technologies need to be appropriately adjusted and optimized based on specific hardware platforms, resource constraints, and application scenarios.

Author: Psyducking
Source: Embedded Software Guesthouse