(Image from Wikipedia entry Caerlaverock Castle: https://en.wikipedia.org/wiki/Caerlaverock_Castle )
In the previous article, we introduced the difference between two concepts that Chinese expresses with a single word for “security”: Safety and Security. In embedded systems, Safety is generally understood as “functional safety,” while Security refers to “information security,” as distinct from the more familiar “cyber information security.” In the following discussions, unless otherwise specified, “information security” refers specifically to “embedded information security.”
So, what is the essence of embedded information security?
The essence of embedded information security (Security) is Isolation.
It is worth emphasizing that this principle makes no distinction between software (Software Security), hardware (Hardware Security), and team-related processes (Team / Design Flow Security). In other words, using “isolation” to achieve “information security” holds in any context within embedded systems; it can be treated as a universally applicable axiom.
When we mention isolation, we immediately face the following questions:
- Who is being isolated from whom? (Who to defend against)
- What is to be isolated? (What to protect)
- What method will be used for isolation? (How to isolate)
Don’t underestimate these three questions; they are crucial for designing information security systems. Answering them correctly is the most effective way to keep a design from wandering far off-topic, no matter how learned the designer.
“Wait, wait…,” someone couldn’t sit still: “Isn’t information security supposed to be about encryption and decryption, handshake authentication, and similar things?”
“Right, right! Like DES, 3DES, and AES128 being insecure, so we need AES256?” another person chimed in. “What do algorithms like MD5 and RSA have to do with Isolation? Are we encrypting all content to achieve isolation?”
So some of you, with a simplistic understanding, know only about encryption algorithms and think that information security is just a matter of encrypting things and protecting keys so that nobody can read them. This is very one-sided. Put simply, encryption algorithms, key management, and authentication are to information security what bricks are to a house. Building a house starts from requirements (What is the house for? Who will live in it? What is the budget? What is the expected lifespan? What natural and geological disasters might the site face, and how much should the house withstand?), continues with planning (How much of the budget to commit? How long to build? Who to hire? How to construct? What are the acceptance criteria?), and also needs theoretical guidance. Only then does engineering practice turn building materials into a house that meets the requirements. You see, your interruption has cost me all these words without getting to anything useful. I will explain these relationships in detail in subsequent articles; here we start from a more fundamental understanding of information security.
- Who to defend against and what to protect
These two questions are usually considered together, and there are many specifics that can’t be summarized in just a few words, so let me tell a story from our everyday life:
Little Li is a hardware engineer who taught himself software development with some success, and he often takes on freelance work to earn a little barbecue money. This time he took a development job from a small business owner he knew well. The specifics of the hardware and its functions are not important; what matters is that Little Li was very proud of one software algorithm in it, which greatly improved the product’s performance figures and let cheaper hardware achieve what “expensive high-end” products do. To protect this algorithm, Little Li put a lot of effort into encrypting the product: encrypted communication for firmware upgrades, state machine obfuscation, firmware integrity checks, multi-key protection… In short, he implemented every technique he could find online. It took a lot of time and the business owner paid him nothing extra, but Little Li was still very satisfied with the result.
However… Less than two weeks after the product launched, an identical clone appeared on the market. The other party had copied the firmware by brute force and mass-produced their own units with that same firmware. The small business owner was very unhappy and confronted Little Li, asking whether he felt underpaid or had sold the design to someone else. Little Li felt deeply wronged and insisted again and again that he had encrypted the product and that no one could obtain his algorithm. Hearing this, the business owner laughed coldly: “No one is interested in your algorithm; what they care about is cloning the entire product and mass-producing it! You wasted so much time and effort, yet you skipped the crucial step of binding the firmware to the chip’s UID. I really don’t know what to say to you.”
As the old saying goes: no matter how impressive it looks, a single brick can knock it down. No matter how good the encryption is, if it can be cloned, it’s useless.
This story tells us that, many times, your isolation method amounts to locking the door against thieves while the attacker simply digs up the whole house and carries it away. Therefore, different attack methods call for different isolation methods. Nor should one think only of spatial isolation; temporal isolation must not be overlooked either.
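The UID binding that the business owner mentions can be sketched roughly as follows. This is only an illustration under stated assumptions: the 12-byte UID length, the names `toy_tag` and `firmware_bound_to_chip`, and the FNV-1a based tag are all hypothetical, and the hash is a placeholder that is not cryptographically secure. A real design would use a proper MAC and protect the secret.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UID_LEN 12u  /* assumed UID length in bytes; varies by chip */

/* Placeholder keyed hash: FNV-1a over the secret, then the UID.
 * NOT cryptographically secure; stands in for a real MAC. */
uint32_t toy_tag(const uint8_t *secret, size_t secret_len,
                 const uint8_t uid[UID_LEN])
{
    assert(secret != NULL && uid != NULL);
    uint32_t h = 2166136261u;                  /* FNV offset basis */
    for (size_t i = 0; i < secret_len; ++i) {
        h ^= secret[i];
        h *= 16777619u;                        /* FNV prime */
    }
    for (size_t i = 0; i < UID_LEN; ++i) {
        h ^= uid[i];
        h *= 16777619u;
    }
    return h;
}

/* Boot-time check: the tag stored at provisioning must equal the tag
 * recomputed from the UID of the chip the firmware is running on. */
bool firmware_bound_to_chip(uint32_t stored_tag,
                            const uint8_t *secret, size_t secret_len,
                            const uint8_t chip_uid[UID_LEN])
{
    return toy_tag(secret, secret_len, chip_uid) == stored_tag;
}
```

Firmware copied byte for byte onto a chip with a different UID would fail this check. An attacker who can also patch the firmware could remove the check, so the sketch only raises the cost of cloning; it does not make cloning impossible.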
- How to isolate
A wise elder in a movie once said: the world’s disputes come down to nothing more than the words “fame” and “wealth”; once you see through that… you will understand that a true winner in life grasps both with both hands. An elder in the ivory tower once said: the flow of time comes down to nothing more than the words “time” and “space”; once you see through that, you will understand that in information security, temporal isolation and spatial isolation must both be grasped with both hands!
Spatial isolation
Spatial isolation is easy to understand. For example, the 4 GB address space of a 32-bit system can be divided by hardware into small segments (of any size), each with its own access permission (No-Access / Read-Only / Full Access). This is spatial isolation.
But how should we understand temporal isolation? Is it a pair of time-traveling lovers who arrive at the same place but always miss each other? That reading is not bad, but are you sure you are a programmer and not a romance writer? To understand temporal isolation correctly, we first need to discuss the spatial isolation of two different types of resources.
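Before moving on, the segment-and-permission scheme of spatial isolation described above can be sketched in C. This is a minimal software model, not real protection hardware: the memory map in `g_regions` and the function `access_allowed` are hypothetical, and denying addresses outside every region is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three permission levels named in the text. */
typedef enum { PERM_NO_ACCESS, PERM_READ_ONLY, PERM_FULL_ACCESS } perm_t;

/* One segment of the address space. */
typedef struct {
    uint32_t base;
    uint32_t size;   /* in bytes, must be non-zero */
    perm_t   perm;
} region_t;

/* Hypothetical memory map, not taken from any real chip. */
static const region_t g_regions[] = {
    { 0x00000000u, 0x00010000u, PERM_READ_ONLY   }, /* code            */
    { 0x20000000u, 0x00004000u, PERM_FULL_ACCESS }, /* this task's RAM */
    { 0x20004000u, 0x00001000u, PERM_NO_ACCESS   }, /* someone else's  */
};

/* Returns true if a read (is_write == false) or write to addr is
 * permitted. Addresses covered by no region are denied. */
bool access_allowed(uint32_t addr, bool is_write)
{
    for (size_t i = 0; i < sizeof g_regions / sizeof g_regions[0]; ++i) {
        const region_t *r = &g_regions[i];
        assert(r->size != 0u);
        if (addr >= r->base && (addr - r->base) < r->size) {
            if (r->perm == PERM_FULL_ACCESS) return true;
            if (r->perm == PERM_READ_ONLY)   return !is_write;
            return false;
        }
    }
    return false;
}
```

In real hardware this check happens on every bus access and a violation raises a fault; the function above only mirrors the decision.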
Isolation of non-shared resources
Non-shared resources are resources “exclusively owned” by one task in a multitasking system. These are simple to handle: at every task switch, write the incoming task’s resource configuration into a dedicated memory isolation peripheral (such as the Memory Protection Unit, MPU, in Cortex-M). In this way, each task’s exclusively owned resources are isolated from all other tasks.
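Reloading the protection configuration at task switch might look like the sketch below. It uses a simulated register bank rather than the real Cortex-M MPU registers; `sim_mpu_t`, `task_t`, `mpu_load_task`, `mpu_allows`, and the four-slot limit are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MPU_SLOTS 4u   /* assumed number of region slots */

typedef struct {
    uint32_t base;
    uint32_t size;
    bool     enabled;
} mpu_slot_t;

/* Stand-in for the MPU register bank; real hardware differs. */
typedef struct { mpu_slot_t slot[MPU_SLOTS]; } sim_mpu_t;

/* Per-task list of exclusively owned regions (hypothetical layout). */
typedef struct {
    const char *name;
    mpu_slot_t  owned[MPU_SLOTS];
    size_t      owned_count;
} task_t;

/* Called on every task switch: the incoming task's regions replace
 * whatever the outgoing task had programmed; unused slots are disabled. */
void mpu_load_task(sim_mpu_t *mpu, const task_t *next)
{
    assert(mpu != NULL && next != NULL);
    assert(next->owned_count <= MPU_SLOTS);
    for (size_t i = 0; i < MPU_SLOTS; ++i) {
        if (i < next->owned_count) {
            mpu->slot[i] = next->owned[i];
            mpu->slot[i].enabled = true;
        } else {
            mpu->slot[i] = (mpu_slot_t){ 0u, 0u, false };
        }
    }
}

/* True if the currently programmed regions cover addr. */
bool mpu_allows(const sim_mpu_t *mpu, uint32_t addr)
{
    assert(mpu != NULL);
    for (size_t i = 0; i < MPU_SLOTS; ++i) {
        const mpu_slot_t *s = &mpu->slot[i];
        if (s->enabled && addr >= s->base && (addr - s->base) < s->size)
            return true;
    }
    return false;
}
```

After `mpu_load_task(&mpu, &task_b)`, addresses owned only by task A are no longer covered, which is exactly the isolation the paragraph describes.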
Isolation of shared resources
Shared resources, as the name suggests, are shared among multiple tasks, such as shared peripherals (UART, SPI, and so on). For shared resources, isolation cannot be achieved from a purely static, spatial perspective, because isolation is essentially “exclusivity”: my things cannot be used by others. So how can a resource be “shared” among multiple tasks? Naturally, the concept of “time-sharing” comes in. Simply put, time is divided into segments along the time axis, just as address space is divided into regions, and different segments are allocated to different tasks. Within its own time slice, each task owns the resource exclusively.
Time-sharing is nothing new, so what does it have to do with isolation? Strictly speaking, nothing at all. Pure time-sharing achieves no isolation. Under time-sharing, isolation means that when the right to use a resource passes from one task to the next, the later task cannot access any information the earlier task left behind; otherwise information would leak between tasks. To achieve this, we need to introduce the concept of context:
- When a task gains the right to use a resource, it must restore its context to continue its previous work.
- When a task is forced to relinquish that right, it must not only save its context for the next use but also destroy the live context in the resource, so that no information leaks to the resource’s other users.
This process is not difficult to understand, but it is worth emphasizing that, for shared resources, each task has its own context for that resource. Where is this context saved? Isn’t it in a segment of memory belonging to the task? Thus, we can easily deduce:
Isolation of shared resources means each task has a spatial isolation of its exclusive context for the resource.
Furthermore, the method of equipping each task with a “context” so that multiple tasks can share a common resource is called “virtualization”: using one physical resource and, through time-sharing, virtualizing a resource for each task. And virtualization is the core method to achieve “temporal isolation.”
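The save, wipe, and restore steps above can be sketched for a shared peripheral. The UART here is a plain struct standing in for hardware registers, and `uart_state_t`, `task_t`, `uart_release`, and `uart_acquire` are hypothetical names; on real hardware the registers would be accessed through `volatile` pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulated state of a shared peripheral; fields are illustrative. */
typedef struct {
    uint32_t baud;
    uint8_t  fifo[8];
    uint32_t fifo_len;
} uart_state_t;

/* Each task keeps its own copy of the peripheral context in memory
 * that belongs to the task and can itself be spatially isolated. */
typedef struct {
    uart_state_t uart_ctx;
} task_t;

/* The task is losing the resource: save its context for next time,
 * then wipe the peripheral so the next user finds nothing left over. */
void uart_release(uart_state_t *hw, task_t *owner)
{
    assert(hw != NULL && owner != NULL);
    owner->uart_ctx = *hw;
    memset(hw, 0, sizeof *hw);
}

/* The task is gaining the resource: restore its own saved context. */
void uart_acquire(uart_state_t *hw, const task_t *next)
{
    assert(hw != NULL && next != NULL);
    *hw = next->uart_ctx;
}
```

From each task’s point of view it owns a private UART whose state survives between time slices, while the physical UART never carries one task’s data into another task’s slice.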
Alright, after all this, we summarize simply:
- Non-shared resources: we use memory management peripherals, and simple spatial isolation is sufficient.
- Shared resources: we use virtualization technology to achieve temporal isolation through spatial isolation of the “context” (preventing information leakage during task switching).
After saying so much, let’s explain an interesting thing:
The processor pipeline is itself a shared resource: multiple tasks share the same pipeline and execute their code by time-sharing. We habitually call the pipeline’s “context” the task context. In this sense, the OS simply virtualizes the pipeline so that each task temporarily monopolizes it while executing. Now the question arises. If we regard a foreground/background system (a main loop plus interrupt handlers) as a simple form of multitasking, have you noticed that when an ordinary MCU handles an interrupt, registers are pushed and popped, but nothing “wipes” the “residual context” left by the interrupted task? In other words, task information can leak during interrupt processing! In principle, ordinary MCUs cannot reliably achieve “temporal isolation”!
What does the ARMv8-M TrustZone architecture do? When a program is running in the Secure state and a Non-Secure interrupt arrives, the hardware performs the normal context save and, in addition, wipes the context of the Secure execution! This achieves reliable “temporal isolation” in hardware, something the older ARMv7-M and ARMv6-M architectures cannot do in hardware. That is why, in principle, ARMv8-M with TrustZone is more secure than the older architectures.
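The contrast between the two kinds of interrupt entry can be modeled loosely in C. This is a simplified simulation, not the exact stacking sequence of any architecture: it tracks only r0 to r12, ignores LR, PC, and the status register, and the names `enter_plain` and `enter_secure_to_nonsecure` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 13u  /* r0..r12 in this simplified model */

typedef struct { uint32_t r[NUM_REGS]; } cpu_regs_t;
typedef struct { uint32_t saved[NUM_REGS]; size_t count; } stack_frame_t;

/* Ordinary entry: r0-r3 and r12 are stacked, but the live registers
 * keep their values, so the handler can still read the interrupted
 * task's data. */
void enter_plain(const cpu_regs_t *cpu, stack_frame_t *frame)
{
    static const size_t caller_saved[] = { 0, 1, 2, 3, 12 };
    assert(cpu != NULL && frame != NULL);
    frame->count = 0;
    for (size_t i = 0; i < sizeof caller_saved / sizeof caller_saved[0]; ++i)
        frame->saved[frame->count++] = cpu->r[caller_saved[i]];
}

/* Secure to Non-Secure entry as described in the text: everything is
 * stacked, then the live registers are wiped before the handler runs. */
void enter_secure_to_nonsecure(cpu_regs_t *cpu, stack_frame_t *frame)
{
    assert(cpu != NULL && frame != NULL);
    memcpy(frame->saved, cpu->r, sizeof cpu->r);
    frame->count = NUM_REGS;
    memset(cpu->r, 0, sizeof cpu->r);
}
```

After `enter_plain`, a handler can still read the interrupted code’s values from `cpu->r[4]` and the other registers; after `enter_secure_to_nonsecure`, every register reads as zero and the saved values exist only in the frame, which in the real architecture lives on the Secure stack.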
Now that we understand the basic concept of isolation, and since ARM has released PSA (Platform Security Architecture), the next article will introduce how PSA puts isolation into practice.
—————End of Main Text—————
Copyright belongs to Bare Metal Thinking (a public account under Silly Kid Publishing Studio).
All content is original. Reproduction in any form is strictly prohibited; sharing and forwarding are welcome.
(The difference between reproduction and sharing is: reproduction extracts article content and publishes it on other media; sharing still uses Bare Metal Thinking public account as the main body to spread the article.)