Custom Embedded Protocols Should Follow TLV Format

In embedded software development, it is common to see friends defining custom protocols, mainly due to resource constraints and the relatively specialized functionality. Subsequent maintenance is also much simpler. However, at a certain stage or scale of the project, one must consider issues such as scalability and compatibility, leading to a variety of protocol designs. Today, I would like to discuss the TLV (Type-Length-Value) format in communication protocols, as custom protocols ultimately tend to converge towards this method or similar approaches, making it beneficial to adopt it from the start.

TLV Data Format

The TLV (Type-Length-Value) format in embedded IoT is an efficient and flexible data encapsulation method widely used in communication for resource-constrained IoT devices.Structure Analysis:TypeIdentifies the type of data (e.g., voltage, temperature, humidity, etc.), usually represented by 1-2 bytes. For example,0x01represents voltage,0x02represents temperature. The type essentially determines the byte length of this TLV data unit and the parsing method for the Value.LengthIndicates the byte count of the Value field, which can be fixed or variable length. For example, 1 byte (0-255) or variable-length integers (to save space).ValueContains the actual data content, with the length defined by the Length field, such as a temperature value3.11V, amplified 1000 times encoded as an integer of two bytes0x0C 0x1C.

Having introduced the TLV data encapsulation format, let’s now detail its three main characteristics:

Self-describing Types

The first part of each TLV structure is the Type field (usually 1-4 bytes), which clearly identifies the semantics and encoding of the data, as mentioned earlier.

However, this field needs to be agreed upon by both the sender and receiver, leading to the creation of a Type Mapping Table.

The Type Mapping Table: The Type value is determined through predefined protocol specifications or dynamic negotiation (such as during device handshake), allowing the receiver to select the corresponding parsing logic based on the Type value. If the Type value does not match the actual data format (for example, if the Type declares an integer but the Value is a string), the receiver can directly report an error.

This way, external metadata is unnecessary, as the data itself carries type information, and the parser does not need to rely on additional protocol documents or configuration files. Many advanced serialization tools currently require such tools, which can be cumbersome.

In terms of compatibility, different devices or protocol versions can achieve compatibility by extending Type definitions (e.g., reserving private Type ranges), allowing older devices to ignore unknown Types. Similarly, when upgrading the protocol, you only need to define a new Type value (e.g., 0x04 for PM2.5), allowing older devices to skip unknown fields while new devices parse the new data, provided you define a sufficient number of Type fields.

Dynamic Variable Length

The Length field following the Type is not fixed; it is usually sufficient to use 1-4 bytes, primarily to declare the byte length of the Value portion.

For example:

If Value is a 4-byte integer, Length=0x04

If Value is a variable-length string "Hello", Length=0x05

Of course, more flexible encoding can also be applied, where the byte count of the Length field can be dynamically adjusted based on protocol design.

Short data (≤255 bytes): Length is represented by 1 byte;
Long data (e.g., firmware packages): Length is represented by 4 bytes

This not only saves space by avoiding padding waste from fixed-length fields but also supports variable data to accommodate the varying data lengths in IoT scenarios (e.g., sensor data, configuration parameters, log text).

Have you ever used streaming transmission? By knowing the data boundary in advance through Length, there is no need to wait for a complete data packet, allowing the receiver to parse data while receiving it.

Efficient Binary Encoding

The TLV format is transmitted as a binary byte stream rather than a text format (like JSON or XML). Moreover, a compact encoding rule is adopted in the protocol specification, such as:

Integers use the minimum number of bytes (e.g., 1 byte for values within 0x7F, 2 bytes for larger values);

Floating-point numbers are compressed using the IEEE 754 standard;

Related posts

Leave a Comment Cancel reply