Essential Insights for Embedded Engineers: Protocol Design is More Than Just Connectivity

As embedded engineers, we deal with “protocols” almost every day. From reading sensor data to coordinating multiple devices, any communication between devices relies on protocols.

Many people think that protocol design is simple, merely involving the creation of a message format. However, in real projects, even simple scenarios often lead to issues like “devices cannot connect,” “data is lost,” or “incompatibility between old and new devices.” The problem is not that protocols are difficult, but rather that the core design principles are not well understood. Today, I will discuss some key aspects of protocol design in straightforward terms.

1. Don’t Just Focus on “Current Usability” and Forget About “Future Modifications”

The most common pitfall is designing with only the “current requirements” in mind, leaving no room for future upgrades.

For example, an engineer designed a temperature collection protocol that only included “device address + temperature value.” It worked fine at the time. However, six months later, when the need arose to add a “humidity collection” feature, the new device’s messages included a humidity field that the old devices did not recognize, resulting in errors. Conversely, the old device’s messages lacked the humidity field, leaving the new device unsure of how to process them.

Even more troublesome, if the protocol does not include a “version number,” it becomes impossible to distinguish between “old version” and “new version,” ultimately requiring hardware upgrades that are both costly and time-consuming.

Tips to Avoid Pitfalls:

  1. When designing a protocol, include a “version number” field (for example, 1 byte, where 0x01 represents V1 and 0x02 represents V2). Confirm the version during the handshake so that a new device can fall back to the old version’s logic when it talks to an old device.
  2. Leave 1-2 “reserved fields” in the message format for future functionality additions without needing to change the overall format.
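
As a rough illustration of the two tips above, here is a minimal sketch of a fixed 8-byte frame that carries a version byte and reserved bytes. The layout (version, address, temperature in 0.1 °C units, a humidity slot that V1 leaves at zero, and 2 bytes still reserved) and the names pack_v1, pack_v2, and parse are assumptions made up for this example, not part of any real protocol.

```python
import struct

# Hypothetical 8-byte big-endian frame:
# version (1) | address (1) | temperature int16 (0.1 degC) |
# humidity uint16 (0.1 %RH, zero in V1) | reserved uint16
FRAME = struct.Struct(">BBhHH")

V1, V2 = 0x01, 0x02


def pack_v1(addr, temp_c):
    # V1 devices leave the humidity slot and the reserved bytes at zero.
    return FRAME.pack(V1, addr, round(temp_c * 10), 0, 0)


def pack_v2(addr, temp_c, humidity_pct):
    # V2 reuses a slot that V1 reserved, so the frame length is unchanged.
    return FRAME.pack(V2, addr, round(temp_c * 10), round(humidity_pct * 10), 0)


def parse(frame):
    version, addr, temp_raw, hum_raw, _reserved = FRAME.unpack(frame)
    if version not in (V1, V2):
        raise ValueError(f"unsupported protocol version {version}")
    reading = {"version": version, "addr": addr, "temp_c": temp_raw / 10}
    # Only trust the humidity slot when the sender declares V2.
    reading["humidity_pct"] = hum_raw / 10 if version == V2 else None
    return reading
```

Because both versions share one frame length, a receiver can read 8 bytes, check the first byte, and then decide how to interpret the rest. An old receiver that ignores the reserved bytes would still read the address and temperature of a V2 frame correctly.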

2. Don’t Blindly Trust “Theoretical Models”; Align with “Physical Layer Characteristics”

Engineers familiar with the OSI seven-layer model often design protocols at a high level, overlooking the characteristics of the underlying physical layer—such as whether they are using RS485 or Ethernet. These hardware characteristics directly determine whether the protocol can function correctly.

For instance, in a project using RS485 half-duplex communication (where devices cannot send and receive data simultaneously, only “you speak, I listen; I speak, you listen”), the master sends a “save data” command to the slave, requiring the slave to save before responding. If the slave takes 5 seconds to save, all other slaves must wait, slowing down the entire system.

Later, the logic was changed: upon receiving the command, the slave immediately replies “received,” allowing the master to handle other devices; once the data is saved, it responds with “save successful/failed” when the master queries it. This significantly improved efficiency.
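
The “acknowledge first, report later” logic can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from that project: the Slave class, the SaveStatus states, and the method names are hypothetical, and the slow save is represented by a separate background_step call rather than a real flash write.

```python
from enum import Enum


class SaveStatus(Enum):
    IDLE = 0
    BUSY = 1
    OK = 2
    FAILED = 3


class Slave:
    """Hypothetical slave that acknowledges at once and saves later."""

    def __init__(self):
        self.status = SaveStatus.IDLE
        self._pending = None
        self.stored = []

    def on_save_command(self, data):
        # Reply immediately so the half-duplex bus is free for other slaves.
        self._pending = data
        self.status = SaveStatus.BUSY
        return "ACK"

    def background_step(self, succeed=True):
        # Stands in for the slow save, done outside any bus transaction.
        if self.status is not SaveStatus.BUSY:
            return
        if succeed:
            self.stored.append(self._pending)
            self.status = SaveStatus.OK
        else:
            self.status = SaveStatus.FAILED
        self._pending = None

    def on_status_query(self):
        # The master polls later; the reply is short whatever the state.
        return self.status.name
```

The master's side then becomes: send the save command, expect “ACK” within a short timeout, move on to the other slaves, and query the status on a later polling round.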

Different physical layers have different “personalities”; design must accommodate these:

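  • RS485/RS232: Two-wire RS485 is half-duplex, so avoid allowing devices to “speak simultaneously”; RS232 is point-to-point and full-duplex, with separate transmit and receive lines. On either, it’s best to acknowledge important data transmissions.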
  • I2C: Mostly used for board-level communication (e.g., chip communication on a motherboard); over such short distances signals are stable, so complex additional checks are usually unnecessary.
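  • Ethernet/CAN: Any node may start transmitting without being polled (the lower layer resolves contention automatically: CAN through bitwise arbitration on the message ID, switched Ethernet through buffering in the switch), so devices can actively report in emergencies without waiting for the master to query.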

3. Balance Fault Tolerance and Efficiency; Don’t Sacrifice One for the Other

Protocol design has two core demands: first, “data must be reliable” (fault tolerance), and second, “transmission must be fast” (efficiency). These two aspects often conflict, requiring a balance.

For example, consider the choice of error-checking methods:

  • Parity checking is the simplest and fastest to compute but only detects an odd number of flipped bits; two flipped bits in the same byte cancel out and go unnoticed;
  • CRC checking can detect almost all errors but requires more computation, potentially slowing down performance on low-end CPUs.

If transmitting short messages like “device status,” parity checking is often sufficient; however, for long data like “firmware upgrade packages,” CRC is necessary, or else errors may go undetected.
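
To make the difference concrete, here is a small sketch that flips two bits in one byte and checks whether each method notices. CRC-16/MODBUS is used only as a familiar example of a CRC; the function names and the sample bytes are assumptions for illustration.

```python
def parity_bits(data: bytes) -> list:
    """Even-parity bit for each byte, as a UART would compute per character."""
    return [bin(b).count("1") & 1 for b in data]


def crc16_modbus(data: bytes) -> int:
    """Bitwise CRC-16/MODBUS: reflected polynomial 0xA001, initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


original = bytes([0x12, 0x34, 0x56, 0x78])
# Flip two bits in the first byte: 0x12 -> 0x11 (bits 0 and 1 change).
corrupted = bytes([0x11, 0x34, 0x56, 0x78])

# Parity sees the same number of 1 bits in 0x12 and 0x11, so it misses the error.
parity_detects = parity_bits(original) != parity_bits(corrupted)
# The CRC changes, because a 16-bit CRC catches every error burst of 16 bits or fewer.
crc_detects = crc16_modbus(original) != crc16_modbus(corrupted)
```

The bit-by-bit loop is the slow but compact form. On a low-end CPU a 256-entry lookup table is the usual way to trade some flash for speed.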

Regarding efficiency: RS485 at a baud rate of 9600 with the common 8N1 framing (10 bits on the wire per byte) can carry at most 960 bytes per second. If most of each message consists of “frame headers, frame tails, and addresses,” all of which are overhead rather than payload, the actual effective data is minimal and efficiency is low.
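
The arithmetic behind this can be written out as a short calculation. It assumes 8N1 framing (1 start bit, 8 data bits, no parity, 1 stop bit) and a hypothetical frame with 6 bytes of overhead (2-byte header, 1-byte address, 2-byte CRC, 1-byte tail); both function names are invented for the example.

```python
def bytes_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    # Each byte on a UART also costs one start bit plus any parity and stop bits.
    bits_per_byte = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_byte


def payload_efficiency(payload_len, overhead_len):
    # Fraction of the bytes on the wire that are actual payload.
    return payload_len / (payload_len + overhead_len)


raw_rate = bytes_per_second(9600)  # 8N1: 10 bits per byte -> 960.0

# Hypothetical 6 bytes of overhead per frame.
eff_small = payload_efficiency(payload_len=2, overhead_len=6)    # 0.25
eff_large = payload_efficiency(payload_len=58, overhead_len=6)   # 0.90625

useful_small = raw_rate * eff_small  # 240.0 payload bytes per second
useful_large = raw_rate * eff_large  # 870.0 payload bytes per second
```

With a 2-byte payload only a quarter of the link carries useful data; packing more payload into each frame raises that share, at the price of retransmitting more when a frame is corrupted.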

Optimization Tips:

  1. Design commonly used messages to be “fixed length”; for example, use 16 bytes for control commands, allowing for direct reading of 16 bytes without needing to determine “when the message is complete,” which speeds up processing.
  2. For long data (like firmware packages), break them into fixed-length small messages; if one segment fails, only that segment needs to be retransmitted, not the entire message.
  3. Avoid excessive use of broadcasting in Ethernet: while broadcasting is convenient, overuse can lead to “broadcast storms,” slowing down the entire network. Prefer using “unicast” (point-to-point transmission).
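
Tip 2 can be sketched as follows: split the image into fixed-length segments, give each a sequence number and its own checksum, and retransmit only the segments that fail the check. The 64-byte segment size, the 0xFF padding, and the function names are assumptions for this example; zlib.crc32 from the Python standard library stands in for whatever CRC the real link would use.

```python
import zlib

CHUNK_SIZE = 64  # hypothetical fixed payload size per segment


def split_firmware(image: bytes, chunk_size: int = CHUNK_SIZE):
    """Return a list of (sequence_number, payload, crc32) segments."""
    segments = []
    for seq, offset in enumerate(range(0, len(image), chunk_size)):
        payload = image[offset:offset + chunk_size]
        # Pad the last segment so every segment has the same length.
        payload = payload.ljust(chunk_size, b"\xff")
        segments.append((seq, payload, zlib.crc32(payload)))
    return segments


def find_bad_segments(received):
    """Sequence numbers whose payload no longer matches its CRC."""
    return [seq for seq, payload, crc in received if zlib.crc32(payload) != crc]
```

The sender only needs to resend the sequence numbers returned by find_bad_segments. Since the last segment is padded, the true image length has to be sent separately, for example in a start-of-transfer message.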

Final Summary: Protocol Design is Not Just About “Writing Messages”; It’s About “Designing Systems”

Many people think that protocols are merely about “message formats”; this is not the case. A good protocol must consider “current usability, future modifications, alignment with the physical layer, and high fault tolerance and efficiency.” Essentially, it is about designing a set of “communication rules between devices.”

Remember these three core principles:

  1. Include a version number to allow for upgrades;
  2. Align with the physical layer; do not work against the hardware;
  3. Balance fault tolerance and efficiency; avoid extremes.
