My knowledge is limited, so if you spot any errors in this article, I would greatly appreciate it if you could point them out. Thank you!
Concept of Union
A union is also known as a union type. It is declared with the same syntax as a struct, but its meaning is completely different. The general form of a union definition is:
union union_name
{
    member_list
} union_variable_name;
Since its definition is the same as that of a structure, what is the difference? The following example illustrates the difference between a struct and a union through nesting.
struct my_struct
{
    int type;
    union my_union
    {
        char *str;
        int number;
    } value;
} Elem_t;
The access method is the same as that of a structure. For example, to access the number variable, you can do so as follows:
Elem_t.value.number = 10;
What is the difference between a union and a struct? In summary, the addresses of the members in a union are the same, while the members in a struct each have their own addresses. The following image shows the storage of Elem_t in memory.
After seeing how the variable is stored in memory, the characteristics of a union become clear. The benefit of this layout is that the program can hold values of different types, one at a time, while occupying the storage of only a single variable, which saves memory. In the above program, assuming a typical 32-bit platform where both char * and int are four bytes, the two members of the union occupy the same amount of storage, so the union itself occupies four bytes. If the members have different sizes, the union is at least as large as its largest member.
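The two properties described above (shared address, size driven by the largest member) can be checked with a small sketch. The type demo_union and both function names are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical demo type: two members of different sizes. */
union demo_union
{
    uint8_t  small;   /* 1 byte  */
    uint32_t large;   /* 4 bytes */
};

/* Returns 1 if both members start at the same address as the union itself. */
int members_share_address(void)
{
    union demo_union u;
    return ((void *)&u == (void *)&u.small) &&
           ((void *)&u == (void *)&u.large);
}

/* Returns the size of the union, which can never be smaller than its largest member. */
size_t demo_union_size(void)
{
    assert(sizeof(union demo_union) >= sizeof(uint32_t));
    return sizeof(union demo_union);
}
```

On common platforms demo_union_size() is expected to return 4, the size of the larger member, although the compiler is allowed to add padding.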
Applications of Union
Using Union to Pack Data
When using a union to pack data, it is essential to know whether the current processor is big-endian or little-endian.
- Big-endian: The least significant byte is stored at the highest memory address, and the most significant byte at the lowest memory address.
- Little-endian: The least significant byte is stored at the lowest memory address, and the most significant byte at the highest memory address.
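If you are unsure which byte order your target uses, a union can also be used to find out at run time. The following is a minimal sketch; the function name is_little_endian is an assumption made for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 on a little-endian machine and 0 on a big-endian one. */
int is_little_endian(void)
{
    union
    {
        uint16_t word;
        uint8_t  bytes[2];
    } probe;

    /* Sanity check: the byte view must cover the whole word. */
    assert(sizeof probe.bytes == sizeof probe.word);

    probe.word = 0x0001;
    /* On little-endian, the least significant byte (0x01) sits at the lowest address. */
    return probe.bytes[0] == 0x01;
}
```

Reading a member other than the one last written is permitted in C, but the bytes you see depend on the platform, which is exactly what this check relies on.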
Below is an example illustrating the storage format in both big-endian and little-endian systems.
With an understanding of big-endian and little-endian, let’s see how a union can pack data. Below is a code snippet:
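#include <stdio.h>
#include <stdint.h>
int main(void)
{
    union
    {
        uint16_t word;          /* Exactly as wide as the two bytes below */
        struct
        {
            uint8_t byte1;      /* Stored at the lower address */
            uint8_t byte2;      /* Stored at the higher address */
        } byte;
    } u1;
    u1.byte.byte1 = 0x21;
    u1.byte.byte2 = 0x43;
    printf("The Value of word is:0x%x\n", (unsigned int)u1.word);
    return 0;
}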
The output of the above code depends on the byte order (endianness) of the processor. If it is in little-endian mode:
The Value of word is:0x4321
If it is in big-endian mode:
The Value of word is:0x2143
Of course, packing data this way has a drawback: the result depends on the processor's byte order. Therefore, we often use shifts instead:
uint8_t byte3 = 0x21;
uint8_t byte4 = 0x43;
uint16_t word;
word = (((uint16_t)byte4) << 8) | ((uint16_t)byte3);
The above method is not affected by the processor's byte order and is therefore more portable.
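The shift-based approach can be wrapped in a pair of helpers for packing and unpacking. This is only a sketch; the names pack_u16 and unpack_u16 are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combines two bytes into a 16-bit word: 'high' becomes bits 15..8, 'low' bits 7..0. */
uint16_t pack_u16(uint8_t low, uint8_t high)
{
    return (uint16_t)(((uint16_t)high << 8) | (uint16_t)low);
}

/* Splits a 16-bit word back into its low and high bytes. */
void unpack_u16(uint16_t word, uint8_t *low, uint8_t *high)
{
    assert(low != NULL && high != NULL);
    *low  = (uint8_t)(word & 0xFFu);
    *high = (uint8_t)(word >> 8);
}
```

For example, pack_u16(0x21, 0x43) yields 0x4321 on both big-endian and little-endian machines, because shifts operate on values rather than on memory layout.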
Union in Data Transmission
Background: There are two small cars that need to communicate, namely car A and car B. Sometimes, car A needs to send its current speed to car B, and sometimes it needs to send its current position, and at other times, it needs to send its current status.
Analysis: In the above background, we know that the messages sent do not require sending speed, status, and position simultaneously; these three parameters are sent separately. At this point, we can use the characteristics of a union to construct a data structure. The benefit of this approach is that it reduces the memory occupied by variables. For instance, if we do not use a union, we would typically use a struct like this:
struct buffer
{
    uint8_t power;   /* Current battery capacity */
    uint8_t op_mode; /* Operating mode */
    uint8_t temp;    /* Current temperature */
    uint16_t x_pos;
    uint16_t y_pos;
    uint16_t vel;    /* Current speed of the car */
} my_buff;
Ignoring memory alignment (alignment would add padding to the structure, which I plan to cover in a separate article), the members of this structure add up to 9 bytes of storage. To optimize our code, we can construct the data we want to transmit as follows:
union
{
    struct
    {
        uint8_t power;
        uint8_t op_mode;
        uint8_t temp;
    } status;
    struct
    {
        uint16_t x_pos;
        uint16_t y_pos;
    } position;
    uint16_t vel;
} msg_union;
In this way, the storage space occupied by this union is only 4 bytes. If we want to encapsulate the data to be sent into a data frame, the above-defined union presents a problem because the receiver does not know which parameter was sent by the sender. Therefore, we need to add a parameter type variable, resulting in the following code:
struct
{
    uint8_t msg_type;
    union
    {
        struct
        {
            uint8_t power;
            uint8_t op_mode;
            uint8_t temp;
        } status;
        struct
        {
            uint16_t x_pos;
            uint16_t y_pos;
        } position;
        uint16_t vel;
    } msg_union;
} message;
With the addition of msg_type, we can parse the data on the receiving end.
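How the receiver might use msg_type can be sketched as follows. The MSG_* constants, the car_state struct and the function names are all hypothetical and chosen only for this illustration; the message layout is the one defined above, given a struct tag so it can be passed around:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message type codes. */
enum { MSG_STATUS = 0, MSG_POSITION = 1, MSG_VELOCITY = 2 };

struct message
{
    uint8_t msg_type;
    union
    {
        struct { uint8_t power; uint8_t op_mode; uint8_t temp; } status;
        struct { uint16_t x_pos; uint16_t y_pos; } position;
        uint16_t vel;
    } msg_union;
};

/* Hypothetical receiver-side record of everything known about the other car. */
struct car_state
{
    uint8_t  power, op_mode, temp;
    uint16_t x_pos, y_pos, vel;
};

/* Sender side: build a velocity message. */
struct message make_velocity_message(uint16_t vel)
{
    struct message m = {0};
    m.msg_type = MSG_VELOCITY;
    m.msg_union.vel = vel;
    return m;
}

/* Sender side: build a position message. */
struct message make_position_message(uint16_t x, uint16_t y)
{
    struct message m = {0};
    m.msg_type = MSG_POSITION;
    m.msg_union.position.x_pos = x;
    m.msg_union.position.y_pos = y;
    return m;
}

/* Receiver side: read only the union member selected by msg_type.
   Returns 0 on success, -1 for an unknown message type. */
int apply_message(struct car_state *state, const struct message *m)
{
    assert(state != NULL && m != NULL);
    switch (m->msg_type)
    {
    case MSG_STATUS:
        state->power   = m->msg_union.status.power;
        state->op_mode = m->msg_union.status.op_mode;
        state->temp    = m->msg_union.status.temp;
        return 0;
    case MSG_POSITION:
        state->x_pos = m->msg_union.position.x_pos;
        state->y_pos = m->msg_union.position.y_pos;
        return 0;
    case MSG_VELOCITY:
        state->vel = m->msg_union.vel;
        return 0;
    default:
        return -1;
    }
}
```

Note that if the two cars use different processors, sending the raw bytes of this struct is again subject to byte order and padding differences, so the multi-byte fields may need the shift-based packing shown earlier.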
Summary
The example above shows that if we transmit the plain struct directly, the data packet is noticeably larger than when a union is used. A union therefore saves not only storage space but also bandwidth during communication.
Union in Data Parsing
In the previous example, we optimized the code using a union in data transmission. What role does a union play in data parsing? Consider the following code:
// PACKET_SIZE, PAYLOAD_SIZE, TOO_BIG and CMD are assumed to be defined elsewhere
typedef union
{
    uint8_t buffer[PACKET_SIZE];
    struct
    {
        uint8_t size;
        uint8_t cmd;
        uint8_t payload[PAYLOAD_SIZE];
        uint8_t crc;
    } fields;
} PACKET_t;
// Function call method: packet_builder(packet.buffer, new_data)
// When storing new data in the buffer, some additional operations are needed
// For example, size should be stored in buffer[0]
// cmd should be stored in buffer[1], and so on
void packet_builder(uint8_t *buffer, uint8_t data)
{
    static uint8_t received_bytes = 0;
    if (received_bytes < PACKET_SIZE) // Never write past the end of the buffer
    {
        buffer[received_bytes++] = data;
    }
}
void packet_handler(PACKET_t *packet)
{
    if (packet->fields.size > TOO_BIG)
    {
        // Error
    }
    if (packet->fields.cmd == CMD)
    {
        // Process corresponding data
    }
}
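To understand this data parsing process, we rely on the fact that the members of a union share the same address. Provided buffer is the same size as fields (that is, PACKET_SIZE equals PAYLOAD_SIZE + 3), the bytes in buffer correspond one-to-one with the members of fields: buffer[0] is size, buffer[1] is cmd, the next PAYLOAD_SIZE bytes are payload, and the last byte is crc. Because every member is a single byte, no padding is expected between them. A diagram can illustrate this clearly, as shown below: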
After looking at this diagram, it becomes clear that when data is written to the buffer, it can be directly read from fields.
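The same idea in a compact, self-contained form: raw bytes go in through buffer and come out through fields. The values chosen for PAYLOAD_SIZE and PACKET_SIZE, the lowercase field name cmd, and the helper packet_load are assumptions made for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_SIZE 4u                    /* assumed payload length */
#define PACKET_SIZE  (PAYLOAD_SIZE + 3u)   /* size + cmd + payload + crc */

typedef union
{
    uint8_t buffer[PACKET_SIZE];
    struct
    {
        uint8_t size;
        uint8_t cmd;
        uint8_t payload[PAYLOAD_SIZE];
        uint8_t crc;
    } fields;
} PACKET_t;

/* Copies a complete raw frame into the packet through the byte view.
   Returns 0 on success, -1 if the frame length does not match. */
int packet_load(PACKET_t *packet, const uint8_t *raw, size_t len)
{
    assert(packet != NULL && raw != NULL);
    if (len != PACKET_SIZE)
    {
        return -1;
    }
    for (size_t i = 0; i < len; i++)
    {
        packet->buffer[i] = raw[i];
    }
    return 0;
}
```

After a successful packet_load, packet->fields.size holds the first raw byte, packet->fields.cmd the second, and so on, with no explicit copying into individual fields.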
Conclusion
Using a union well not only saves storage space but also lets us exploit the shared address of its members to write compact, elegant code. I had not used unions much before, and my recent study of them has made me realize how much there still is to learn. Still, as Hu Shi said: “What is there to fear about the infinite truth? A little progress brings a little joy.”