Concept of union
A union (the Chinese terms for it translate roughly as "joint" or "shared" body) is declared with the same syntax as a struct, but its meaning is entirely different. The general form of a union definition is:
union union_name
{
member_list
} union_variable_name;
Since the syntax is the same as that of a struct, what is the difference? The following struct, which contains a nested union, illustrates it.
struct my_struct
{
    int type;
    union my_union
    {
        char *str;
        int number;
    } value;
} Elem_t;
The access method is the same as that of a struct. For example, to access the number variable, you can do so as follows:
Elem_t.value.number = 10;
What is the difference between union and struct? In a nutshell, all members of a union share the same address, while each member of a struct has its own address. The following image shows the storage of Elem_t in memory.
Once the memory layout is clear, so is the defining property of a union: the program can work with members of different types while occupying the storage of only one of them, which saves memory. In the program above, both union members (a char * and an int) are four bytes each on a typical 32-bit target, so the union occupies four bytes in total. When the members differ in size, the union is at least as large as its largest member; the compiler may add padding to satisfy alignment.
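The shared-address property and the size rule can be checked directly. The following is a minimal sketch; the union sample and both helper functions are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union used only for this illustration. */
union sample
{
    uint8_t  small;   /* 1 byte  */
    uint32_t large;   /* 4 bytes */
};

/* Returns 1 when both members start at the same address. */
int members_share_address(void)
{
    union sample s;
    return (void *)&s.small == (void *)&s.large;
}

/* Size of the whole union: at least the size of its largest member. */
size_t sample_size(void)
{
    return sizeof(union sample);
}
```

Calling members_share_address() should return 1 on any conforming compiler, and sample_size() should report the size of the uint32_t member rather than the sum of both members.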
Applications of union
Using union to pack data
When using a union to pack data, it is essential to understand whether the current processor is big-endian or little-endian.
- Big-endian: The low-order data is stored in the high memory address, while the high-order data is stored in the low memory address.
- Little-endian: The low-order data is stored in the low memory address, while the high-order data is stored in the high memory address.
Below is an example illustrating the storage format in both big-endian and little-endian.
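The byte order of the machine at hand can also be probed with a union. This is a minimal sketch rather than a definitive test; the function name is_little_endian is an assumption made for this example.

```c
#include <stdint.h>

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    union
    {
        uint32_t word;
        uint8_t  bytes[4];
    } probe;

    probe.word = 0x01020304u;
    /* Little-endian stores the low-order byte 0x04 at the lowest address. */
    return probe.bytes[0] == 0x04;
}
```

On a little-endian machine bytes[0] holds 0x04; on a big-endian machine it holds 0x01, so the function returns 0.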
With an understanding of big-endian and little-endian, let’s see how union can pack data. Below is a piece of code:
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    union
    {
        /* 16 bits wide, so the two bytes below cover the whole word */
        uint16_t word;
        struct
        {
            uint8_t byte1;
            uint8_t byte2;
        } byte;
    } u1;
    u1.byte.byte1 = 0x21;
    u1.byte.byte2 = 0x43;
    printf("The Value of word is:0x%x\n", (unsigned int)u1.word);
    return 0;
}
The output of the above depends on the byte order (endianness) of the processor. If it is in little-endian mode:
The Value of word is:0x4321
If it is in big-endian mode:
The Value of word is:0x2143
Of course, packing data this way has a drawback: the result depends on the processor’s byte order. Therefore, we often use shifts to achieve the same thing:
uint8_t byte3 = 0x21;
uint8_t byte4 = 0x43;
uint16_t word;
word = (((uint16_t)byte4) << 8) | ((uint16_t)byte3);
The above method is not affected by the processor’s byte order and is therefore more portable.
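The shift-based approach can be wrapped in a pair of helpers. This is a minimal sketch; pack_u16 and unpack_u16 are hypothetical names, not functions from the original.

```c
#include <stdint.h>

/* Combine two bytes into a 16-bit word: 'low' ends up in bits 0-7. */
uint16_t pack_u16(uint8_t low, uint8_t high)
{
    return (uint16_t)(((uint16_t)high << 8) | (uint16_t)low);
}

/* Split a 16-bit word back into its two bytes. */
void unpack_u16(uint16_t word, uint8_t *low, uint8_t *high)
{
    *low  = (uint8_t)(word & 0xFFu);
    *high = (uint8_t)(word >> 8);
}
```

With the values used above, pack_u16(0x21, 0x43) yields 0x4321 on any processor, because shifts operate on the value and not on its layout in memory.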
Application of union in data transmission
Background: Two small vehicles, A and B, need to communicate. At different times, vehicle A sends vehicle B its current speed, its current position, or its current status.
Analysis: Speed, status, and position never need to be sent in the same message; each is sent on its own. This makes a union a natural fit for the data structure, because it reduces the memory the variables occupy. Without a union, we would typically use a struct like this:
struct buffer
{
    uint8_t power;   /* Current battery capacity */
    uint8_t op_mode; /* Operating mode */
    uint8_t temp;    /* Current temperature */
    uint16_t x_pos;
    uint16_t y_pos;
    uint16_t vel;    /* Current speed of the vehicle */
} my_buff;
Ignoring memory alignment (alignment makes the compiler pad the struct, which I plan to cover in a separate article), this struct occupies 9 bytes of storage. To optimize the code, we can structure the data to be transmitted as follows:
union
{
    struct
    {
        uint8_t power;
        uint8_t op_mode;
        uint8_t temp;
    } status;
    struct
    {
        uint16_t x_pos;
        uint16_t y_pos;
    } position;
    uint16_t vel;
} msg_union;
In this way, the union occupies only 4 bytes of storage, the size of its largest member, position. However, if we want to wrap the data in a frame for sending, this union has a problem: the receiver cannot tell which parameter the sender put in it. We therefore add a variable that identifies the message type, giving the following code:
struct
{
    uint8_t msg_type;
    union
    {
        struct
        {
            uint8_t power;
            uint8_t op_mode;
            uint8_t temp;
        } status;
        struct
        {
            uint16_t x_pos;
            uint16_t y_pos;
        } position;
        uint16_t vel;
    } msg_union;
} message;
With the addition of msg_type, we can parse the data on the receiving end.
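How the receiver might use msg_type can be sketched as follows. The type codes MSG_STATUS, MSG_POSITION, and MSG_VELOCITY, the typedef message_t, and both functions are assumptions made for this example; the original does not define concrete msg_type values.

```c
#include <stdint.h>

/* Hypothetical type codes; the original does not define msg_type values. */
enum { MSG_STATUS = 0, MSG_POSITION = 1, MSG_VELOCITY = 2 };

typedef struct
{
    uint8_t msg_type;
    union
    {
        struct { uint8_t power; uint8_t op_mode; uint8_t temp; } status;
        struct { uint16_t x_pos; uint16_t y_pos; } position;
        uint16_t vel;
    } msg_union;
} message_t;

/* Sender side: fill in a velocity message. */
message_t make_velocity_msg(uint16_t vel)
{
    message_t m = {0};
    m.msg_type = MSG_VELOCITY;
    m.msg_union.vel = vel;
    return m;
}

/* Receiver side: read only the member selected by msg_type.
 * Returns the velocity, or -1 if the message carries something else. */
int32_t read_velocity(const message_t *m)
{
    if (m->msg_type != MSG_VELOCITY)
    {
        return -1;
    }
    return (int32_t)m->msg_union.vel;
}
```

The receiver checks msg_type first and then touches only the matching union member; reading a member that was not the one written would just reinterpret the same bytes.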
Summary
To review the example above: without a union, sending the struct directly produces a noticeably larger packet (9 bytes versus 5 with the type field and union, ignoring alignment padding). Using a union therefore saves not only storage space but also bandwidth during communication.
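The size difference can be measured with sizeof instead of counted by hand. This is a minimal sketch; the tag name message and the two helper functions are hypothetical, and the exact numbers depend on the compiler's padding rules.

```c
#include <stddef.h>
#include <stdint.h>

/* All fields in one struct, as in the first version. */
struct buffer
{
    uint8_t power; uint8_t op_mode; uint8_t temp;
    uint16_t x_pos; uint16_t y_pos; uint16_t vel;
};

/* Type tag plus union, as in the final version. */
struct message
{
    uint8_t msg_type;
    union
    {
        struct { uint8_t power; uint8_t op_mode; uint8_t temp; } status;
        struct { uint16_t x_pos; uint16_t y_pos; } position;
        uint16_t vel;
    } msg_union;
};

size_t plain_struct_size(void) { return sizeof(struct buffer); }
size_t tagged_union_size(void) { return sizeof(struct message); }
```

On compilers that align uint16_t to 2 bytes, padding often makes these 10 and 6 bytes instead of the unpadded 9 and 5, but the tagged union version remains the smaller of the two.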
Application of union in data parsing
In the previous example, we optimized the code using union in data transmission. What role does union play in data parsing? Consider the following piece of code:
#include <stdint.h>

/* Illustrative values only; the original does not define these constants. */
#define PAYLOAD_SIZE 8
#define PACKET_SIZE  (PAYLOAD_SIZE + 3) /* size + cmd + payload + crc */
#define TOO_BIG      PAYLOAD_SIZE
#define CMD          0x01

typedef union
{
    uint8_t buffer[PACKET_SIZE];
    struct
    {
        uint8_t size;
        uint8_t cmd;
        uint8_t payload[PAYLOAD_SIZE];
        uint8_t crc;
    } fields;
} PACKET_t;

// Function call method: packet_builder(packet.buffer, new_data)
// When storing new data in the buffer, some additional operations are needed
// For example, size should be stored in buffer[0]
// cmd should be stored in buffer[1], and so on
void packet_builder(uint8_t *buffer, uint8_t data)
{
    static uint8_t received_bytes = 0;
    // Guard against writing past the end of the packet
    if (received_bytes < PACKET_SIZE)
    {
        buffer[received_bytes++] = data;
    }
}

void packet_handler(PACKET_t *packet)
{
    if (packet->fields.size > TOO_BIG)
    {
        // Error
    }
    if (packet->fields.cmd == CMD)
    {
        // Process corresponding data
    }
}
To understand this parsing process, recall that the members of a union share the same address. Because every member of fields is a uint8_t, the compiler has no reason to insert padding, so the elements of buffer[PACKET_SIZE] correspond one-to-one with the members of fields, as illustrated in the following image:
After viewing this image, it becomes clear that after writing data into the buffer, we can directly read it from fields.
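The write-to-buffer, read-from-fields pattern can be sketched in a few lines. The names packet_view_t and load_packet, and the PAYLOAD_SIZE value of 4, are assumptions chosen for this example.

```c
#include <stdint.h>
#include <string.h>

#define PAYLOAD_SIZE 4                   /* assumed for this sketch */
#define PACKET_SIZE  (PAYLOAD_SIZE + 3)  /* size + cmd + payload + crc */

typedef union
{
    uint8_t buffer[PACKET_SIZE];
    struct
    {
        uint8_t size;
        uint8_t cmd;
        uint8_t payload[PAYLOAD_SIZE];
        uint8_t crc;
    } fields;
} packet_view_t;

/* Copy PACKET_SIZE raw bytes into the union; the caller then reads fields. */
void load_packet(packet_view_t *p, const uint8_t *raw)
{
    memcpy(p->buffer, raw, PACKET_SIZE);
}
```

After load_packet() is called with the bytes {4, 0x10, 0xAA, 0xBB, 0xCC, 0xDD, 0x5A}, fields.size reads 4, fields.cmd reads 0x10, the payload holds the next four bytes, and fields.crc reads 0x5A, with no explicit parsing code.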
Conclusion
Using unions effectively not only saves storage space; the shared-address property also enables some neat techniques. I had not used unions much before, and studying them recently has reminded me how much there is still to learn. Still, to quote Hu Shi: “What is there to fear about the infinite truth? Every inch gained brings an inch of joy.”
References:
[1] https://www.allaboutcircuits.com/technical-articles/union-in-c-language-for-packing-and-unpacking-data/
[2] https://www.allaboutcircuits.com/technical-articles/learn-embedded-c-programming-language-understanding-union-data-object/
[3] https://stackoverflow.com/questions/252552/why-do-we-need-c-unions
Author: wenzid
Source: Wenzi Embedded Software