How to excel in Embedded Systems?
When you ask this question, you will often hear the same advice: master the C language! Today, I recommend a comprehensive summary of Embedded C knowledge points written by an expert, which is well worth reading.
This article draws on embedded practice and related materials to explain the C language knowledge and key points needed for embedded systems. I hope everyone who reads it gains something.
1). Data Types (commonly used: char, short, int, long, unsigned, float, double)
2). Operators and Control Statements (=, +, -, *, while, do-while, if, goto, switch-case)
3). Data Storage (auto, static, extern, const, register, volatile, restrict)
4). Structures (struct, enum, union, typedef)
5). Bit Operations and Logical Operations (<<, >>, &, |, ~, ^, &&)
6). Preprocessing (#define, #include, #error, #if…#elif…#else…#endif, etc.)
7). Platform Extension Keywords (__asm, __inline, __syscall)
These keywords together form the C syntax for embedded platforms.
Embedded applications can logically be abstracted into three parts:
1). Data Input (such as sensors, signals, interface input)
2). Data Processing (such as protocol decoding and packaging, conversion of AD sampling values, etc.)
3). Data Output (GUI display, output pin status, DA output control voltage, PWM duty cycle, etc.)
Data management runs through the entire development of embedded applications, which includes data types, memory management, bit and logical operations, as well as data structures. C language supports the implementation of these functions grammatically and provides corresponding optimization mechanisms to cope with the more constrained resource environment in embedded systems.
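The input, processing, and output split described above can be sketched roughly as follows. This is only an illustration: the 12-bit ADC resolution, the 3300 mV reference, and both function names are assumptions, not taken from any particular chip or library.

```c
#include <stdint.h>

#define ADC_MAX_COUNT 4095u   /* assumption: 12-bit converter */
#define VREF_MV       3300u   /* assumption: 3.3 V reference */

/* Processing step: convert a raw ADC count (the input) to millivolts. */
uint32_t adc_to_millivolts(uint16_t raw)
{
    if (raw > ADC_MAX_COUNT) {
        raw = ADC_MAX_COUNT;              /* clamp out-of-range samples */
    }
    return ((uint32_t)raw * VREF_MV) / ADC_MAX_COUNT;
}

/* Output step: map a voltage to a PWM duty cycle in percent (0..100). */
uint8_t millivolts_to_duty(uint32_t mv)
{
    if (mv > VREF_MV) {
        mv = VREF_MV;
    }
    return (uint8_t)((mv * 100u) / VREF_MV);
}
```

On real hardware the raw value would come from an ADC data register and the duty cycle would be written to a timer compare register; both are left out here because they are chip-specific.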
C language supports commonly used character, integer, and floating-point variables. Some compilers, such as Keil, also extend support for bit and sfr (register) data types to meet special address operations.
The C standard specifies only the minimum value range of each basic data type, so the same type may occupy a different amount of storage on different chip platforms. Code therefore needs to be written with later portability in mind.
The typedef keyword is used to handle this situation and is commonly adopted in cross-platform software projects, as shown below (on C99 and later compilers, the standard header <stdint.h> already provides these definitions):
typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
......
typedef signed int int32_t;
printf("int size:%d, short size:%d, char size:%d\n", (int)sizeof(int), (int)sizeof(short), (int)sizeof(char));
char *p;
printf("pointer p size:%d\n", (int)sizeof(p));
The size of a pointer is related to the addressable width of the chip: on a 32-bit MCU a pointer typically occupies 4 bytes, while on a 64-bit processor it occupies 8 bytes. Checking the size of a pointer is therefore sometimes used as a simple way to estimate the bit width of the target.
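As a minimal sketch of the points above, type sizes can be checked at compile time instead of being printed at runtime. The example assumes a C11 compiler (for _Static_assert) and uses <stdint.h>; the function name pointer_width_bits is made up for illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time checks: the build fails if the platform breaks an assumption. */
_Static_assert(sizeof(uint8_t) == 1, "uint8_t must be 1 byte");
_Static_assert(sizeof(uint16_t) * CHAR_BIT == 16, "uint16_t must be 16 bits");
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");

/* Width of a data pointer in bits on the current target. */
size_t pointer_width_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}
```

Note that the pointer width is only a hint: some 8-bit MCUs use 16-bit pointers, so the result describes the address width rather than the data bus width.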
1). Allocate from static storage area.
Memory is allocated during program compilation and remains throughout the program’s runtime, such as global variables and static variables.
2). Create on the stack.
When a function executes, the storage units for local variables in the function can be created on the stack, and these storage units are automatically released when the function execution ends. Stack memory allocation operations are built into the processor’s instruction set, making them highly efficient, but the allocated memory capacity is limited.
3). Allocate from the heap, also known as dynamic memory allocation.
Programs can request memory at runtime using malloc (or new in C++), and the programmer is responsible for releasing it with free (or delete). The lifespan of dynamic memory is determined by the programmer, which offers great flexibility but is also the source of the most problems.
Let's first look at a simple C language example.
//main.c
#include <stdio.h>
#include <stdlib.h>

static int st_val;   // Static global variable -- static storage area
int ex_val;          // Global variable -- static storage area

int main(void)
{
    int a = 0;                    // Local variable -- allocated on stack
    int *ptr = NULL;              // Pointer variable -- the pointer itself is on the stack
    static int local_st_val = 0;  // Static local variable -- static storage area

    local_st_val += 1;
    a = local_st_val;
    ptr = (int *)malloc(sizeof(int));  // Allocate space from heap
    if (ptr != NULL)
    {
        *ptr = a;                 // Initialize the heap memory before reading it
        printf("*p value:%d\n", *ptr);
        free(ptr);
        ptr = NULL;               // Set to NULL after free so that later NULL checks remain valid
    }
    return 0;
}
The scope of an identifier in C not only describes where it can be accessed but also determines where the variable is stored. The file-scope variables st_val and ex_val are allocated in the static storage area; there, the static keyword only controls whether the variable can be accessed from other files. The block-scope variables a, ptr, and local_st_val are placed in different areas depending on how they are declared: a is a local variable on the stack; ptr is also a local variable on the stack, but the memory it points to is obtained with malloc and therefore lives on the heap; local_st_val is marked static and is therefore allocated in the static storage area.
This involves an important knowledge point: the meaning of static differs between file scope and block scope. In file scope, it restricts the external linkage of functions and variables (whether they can be accessed from other files), while in block scope, it places the variable in the static storage area.
For C language, understanding the above knowledge is generally sufficient for memory management, but for embedded C, defining a variable does not necessarily mean it resides in memory (SRAM); it may also be in FLASH space or directly stored in registers (register defined variables or some local variables under high optimization levels).
For example, a global variable defined as const is typically placed in FLASH, while a local variable defined as register may be kept directly in a general-purpose register. Understanding this is significant for code maintenance, especially when optimizing runtime speed or working with constrained storage.
Additionally, embedded C compilers often extend the memory management mechanisms, for example by supporting scatter loading and __attribute__((section("user-defined area"))). These allow variables to be placed in special areas (such as SDRAM or SQI FLASH), adapting memory management to complex application scenarios and requirements.
LD_ROM 0x00800000 0x10000 {          ; load region size_region
    EX_ROM 0x00800000 0x10000 {      ; load address = execution address
        *.o (RESET, +First)
        *(InRoot$$Sections)
        .ANY (+RO)
    }
    EX_RAM 0x20000000 0xC000 {       ; rw Data
        .ANY (+RW +ZI)
    }
    EX_RAM1 0x2000C000 0x2000 {
        .ANY (MySection)
    }
    EX_RAM2 0x40000000 0x20000 {
        .ANY (Sdram)
    }
}

int a[10] __attribute__((section("MySection")));
int b[100] __attribute__((section("Sdram")));
With this method, variables can be placed in the required areas, which is sometimes necessary. For instance, GUI or web applications need to store a large number of images and documents, and the internal FLASH may not be large enough; such data can then be declared in an external storage area.
Furthermore, certain parts of data in memory are critical, and to prevent them from being overwritten by other content, it may be necessary to allocate separate SRAM areas to avoid fatal errors caused by unintended modifications. These experiences are commonly used and important in actual product development, but due to space limitations, only brief examples are provided here. If you encounter such needs in your work, it is recommended to understand them in detail.
In embedded Linux, heap usage is the same as in standard C: check the return value of malloc, and set the pointer to NULL after freeing it to avoid “wild pointers”.
However, on resource-constrained microcontrollers, scenarios that require frequent allocation of memory blocks are generally rare. If frequent allocation is needed, a memory management mechanism based on static storage and fixed-size block partitioning is often built. This improves efficiency (the blocks are partitioned in advance and can be looked up directly) and keeps the use of memory blocks under control, effectively avoiding memory fragmentation.
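A fixed-size block pool of the kind described above might look like the following minimal sketch. The block size, block count, and the names pool_alloc and pool_free are all assumptions chosen for illustration; the sketch is not interrupt-safe or thread-safe as written.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32u   /* assumption: bytes per block */
#define POOL_BLOCK_COUNT 8u    /* assumption: number of blocks */

/* Backing storage lives in the static area, so no heap is involved.
   Using uint32_t keeps every block 4-byte aligned. */
static uint32_t pool_blocks[POOL_BLOCK_COUNT][POOL_BLOCK_SIZE / sizeof(uint32_t)];
static uint8_t  pool_used[POOL_BLOCK_COUNT];

/* Return a free block, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            return pool_blocks[i];
        }
    }
    return NULL;
}

/* Release a block; returns 0 on success, -1 for an unknown or already free block. */
int pool_free(void *ptr)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (ptr == (void *)pool_blocks[i]) {
            if (!pool_used[i]) {
                return -1;                /* double free detected */
            }
            pool_used[i] = 0;
            return 0;
        }
    }
    return -1;                            /* pointer does not belong to the pool */
}
```

Because every block has the same size, freeing and reallocating in any order cannot fragment the pool, and a double free or a foreign pointer is detected rather than silently corrupting memory.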
An array consists of elements of the same type. When it is declared, the compiler allocates a contiguous block of memory sized according to the element type and count. C also provides multi-dimensional arrays for special scenarios. Pointers provide a symbolic way to work with addresses, and a pointer is only meaningful when it points to a valid address. C pointers are extremely flexible: before being accessed, a pointer can be made to point to any address, which greatly facilitates hardware operations but also places higher demands on developers.
Refer to the code below.
#include <stdio.h>

int main(void)
{
    char cval[] = "hello";
    int i;
    int ival[] = {1, 2, 3, 4};
    int arr_val[][2] = {{1, 2}, {3, 4}};
    const char *pconst = "hello";
    char *p;
    int *pi;
    int *pa;
    int (*par)[2];            // Pointer to a row of two ints

    p = cval;
    p++;                      // addr increases by 1
    pi = ival;
    pi += 1;                  // addr increases by 4 (assuming 4-byte int)
    pa = arr_val[0];
    pa += 1;                  // addr increases by 4
    par = arr_val;
    par++;                    // addr increases by 8 (one row of two ints)

    for (i = 0; i < (int)sizeof(cval); i++)
    {
        printf("%d ", cval[i]);
    }
    printf("\n");
    printf("pconst:%s\n", pconst);
    printf("addr:%p, %p\n", (void *)cval, (void *)p);
    printf("addr:%p, %p\n", (void *)ival, (void *)pi);
    printf("addr:%p, %p\n", (void *)arr_val, (void *)pa);
    printf("addr:%p, %p\n", (void *)arr_val, (void *)par);
    return 0;
}
Array elements are generally accessed starting from index 0 and ending at length-1, that is, over the half-open interval [0, length). This normally causes no problems, but when an array has to be read backward there is a risk of mistakenly using length as the starting index, which leads to out-of-bounds access. In addition, to save space, the index variable i is sometimes defined as unsigned char.
In C, the range of unsigned char is 0~255. If the array has more than 255 elements, a loop condition such as i < length can never become false, because i wraps back to 0 after 255, and the loop never terminates. This is easy to avoid when the code is first written, but if requirements change later and the array grows, every place that indexes the array carries this hidden risk, which deserves particular attention.
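Both pitfalls can be illustrated with a short sketch. The function names are hypothetical, and the second function assumes an 8-bit char, which holds on common targets.

```c
#include <stddef.h>

/* Sum an array from the last element to the first without going out of bounds. */
long sum_reverse(const int *data, size_t len)
{
    long total = 0;
    /* Start at len and index with i - 1, so the indices visited are len-1 .. 0. */
    for (size_t i = len; i > 0; i--) {
        total += data[i - 1];
    }
    return total;
}

/* Count how many increments an unsigned char index takes to wrap back to 0.
   With an 8-bit char this is 256, so such an index can never reach 256 or more. */
unsigned int u8_index_wrap_period(void)
{
    unsigned char i = 0;
    unsigned int steps = 0;
    do {
        i++;
        steps++;
    } while (i != 0);
    return steps;
}
```

Using size_t for the index, as in sum_reverse, avoids the wrap-around problem for any array that fits in memory.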
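#include <stdio.h>

typedef struct
{
    int b;
    int a;
} STRUCT_VAL;

// __align(4) is a Keil/ARM compiler extension that forces 4-byte alignment
static __align(4) char arr[8] = {0x12, 0x23, 0x34, 0x45, 0x56, 0x12, 0x24, 0x53};

int main(void)
{
    STRUCT_VAL *pval;
    int *ptr;

    pval = (STRUCT_VAL *)arr;
    ptr = (int *)&arr[4];
    printf("val:0x%x, 0x%x\n", (unsigned int)pval->b, (unsigned int)pval->a);
    printf("val:0x%x\n", (unsigned int)*ptr);
    return 0;
}
// Output on a little-endian target:
// val:0x45342312, 0x53241256
// val:0x53241256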
Pointer type casting offers an efficient and quick way to parse data in protocol analysis and data storage management. However, data alignment and endianness are common and error-prone issues in this kind of processing.
For example, the above character array arr is defined as 4-byte aligned through __align(4), ensuring that subsequent access after converting to int pointer does not trigger unaligned access exceptions. Without this definition, char defaults to 1-byte alignment.
Of course, this does not necessarily trigger exceptions (it depends on the overall memory layout of arr’s address and whether the actual space supports unaligned access, as some SDRAM may trigger exceptions during unaligned access), but adding or removing other variables may trigger such exceptions, and the place where exceptions occur often has no relation to the variables added. Moreover, code may run normally on some platforms but trigger exceptions when switched to others. This hidden phenomenon is challenging to trace and resolve in embedded systems.
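When alignment cannot be guaranteed, a common alternative to pointer casting is to assemble the value byte by byte or to copy it with memcpy. The sketch below is one way to do this; the function names read_le32 and read_native32 are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from any address, aligned or not.
   The result is the same on little-endian and big-endian CPUs. */
uint32_t read_le32(const uint8_t *src)
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}

/* Copy raw bytes into a properly aligned object instead of casting the pointer.
   The value is interpreted in the byte order of the CPU running the code. */
uint32_t read_native32(const uint8_t *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof(value));
    return value;
}
```

Compilers usually turn these small fixed-size operations into a single load on platforms that allow unaligned access, so the safety typically costs little.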
Additionally, C language pointers have a special usage for accessing specific physical addresses through type casting and implementing callbacks via function pointers, as shown below:
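#include <stdio.h>
#include <stdint.h>

typedef int (*pfunc)(int, int);

int func_add(int a, int b)
{
    return a + b;
}

int main(void)
{
    pfunc func_ptr;   // pfunc is already a pointer type, so no extra * is needed

    // Write to a fixed physical address (only valid on a target where 0x20001000 is writable)
    *(volatile uint32_t *)0x20001000 = 0x01a23131;
    func_ptr = func_add;
    printf("%d\n", func_ptr(1, 2));
    return 0;
}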
Here, it is important to note that volatile marks a variable whose value may change outside the compiler's knowledge, so every access must really read from or write to its address. It is typically used in the following situations:
1) Hardware registers of peripheral devices (such as status registers)
2) Non-automatic variables accessed by an interrupt service routine
3) Variables shared among multiple tasks in multi-threaded applications
Using volatile ensures that code running in normal mode and code running in exception or interrupt handlers see the latest value of a shared variable, because the compiler may not cache the value in a register or optimize the access away. Note that volatile does not make an access atomic; multi-byte or read-modify-write operations still need protection such as disabling interrupts or using a lock.
Mastering the use of volatile is crucial in embedded systems and is a basic requirement for practitioners in embedded C.
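The interrupt case can be sketched as follows. The UART naming is purely illustrative, and uart_rx_isr is an ordinary function here so that the example can run on a host; on real hardware it would be registered as the receive interrupt handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the interrupt handler and the main loop, hence volatile. */
static volatile bool    g_rx_ready;
static volatile uint8_t g_rx_byte;

/* Would be called from the receive interrupt on real hardware. */
void uart_rx_isr(uint8_t byte)
{
    g_rx_byte  = byte;
    g_rx_ready = true;
}

/* Main-loop side: returns true and stores the byte if one has arrived. */
bool uart_poll(uint8_t *out)
{
    if (!g_rx_ready) {        /* re-read from memory on every call */
        return false;
    }
    *out = g_rx_byte;
    g_rx_ready = false;
    return true;
}
```

Without volatile, an optimizing compiler could hoist the read of g_rx_ready out of a polling loop and spin forever. The sketch still has a small window between reading the byte and clearing the flag, which real code would close by briefly disabling the interrupt.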
Function pointers are not commonly seen in general embedded software development, but they can implement many applications such as asynchronous callbacks and driver modules in a simple way. However, this can only be considered a starting point; many details are worth understanding and mastering in depth.
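A callback mechanism built on a function pointer might look like this minimal sketch. All names (event_cb_t, event_register, event_dispatch, count_events) are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Signature every callback must have; ctx carries user data. */
typedef void (*event_cb_t)(int event, void *ctx);

static event_cb_t g_cb;
static void      *g_cb_ctx;

/* Store the callback and its context for later use. */
void event_register(event_cb_t cb, void *ctx)
{
    g_cb     = cb;
    g_cb_ctx = ctx;
}

/* Invoke the registered callback; returns 1 if one was called, 0 otherwise. */
int event_dispatch(int event)
{
    if (g_cb == NULL) {
        return 0;
    }
    g_cb(event, g_cb_ctx);
    return 1;
}

/* Example callback: adds the event value to an int supplied through ctx. */
void count_events(int event, void *ctx)
{
    *(int *)ctx += event;
}
```

The module that calls event_dispatch never needs to know which function handles the event, which is what lets a driver notify application code without depending on it.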
C provides user-defined data types to describe a class of things with common characteristics, mainly structures, enumerations, and unions.
Among them, an enumeration gives names to a fixed set of integer values, making the data more intuitive and readable. It is implemented as follows:
typedef enum {spring = 1, summer, autumn, winter} season;
season s1 = summer;
Unions can store different types of data in the same storage space, and the size of the union is determined by the largest variable it contains, as shown below:
#include <stdio.h>

typedef union
{
    char c;
    short s;
    int i;
} UNION_VAL;

UNION_VAL val;

int main(void)
{
    printf("addr:%p, %p, %p\n", (void *)&val.c, (void *)&val.s, (void *)&val.i);
    val.i = 0x12345678;
    if (val.s == 0x5678)
        printf("Little-endian\n");
    else
        printf("Big-endian\n");
    return 0;
}
/*
Example output:
addr:0x407970, 0x407970, 0x407970
Little-endian
*/
// The same endianness check using a pointer cast instead of a union
int data = 0x12345678;
short *pdata = (short *)&data;
if (*pdata == 0x5678)
    printf("%s\n", "Little-endian");
else
    printf("%s\n", "Big-endian");
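Once the byte order is known, data exchanged with other devices usually has to be converted. The following sketch shows one possible approach; the function names are assumptions, and it probes through unsigned char, which C always allows for inspecting an object's bytes.

```c
#include <stdint.h>

/* Returns 1 on a little-endian target, 0 on a big-endian one. */
int is_little_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x02;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Convert a value from CPU byte order to big-endian (network) byte order. */
uint32_t to_big_endian32(uint32_t v)
{
    return is_little_endian() ? swap32(v) : v;
}
```

After to_big_endian32, the most significant byte is stored first in memory regardless of the CPU, which is the layout most network protocols expect.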
Structures are collections of variables with common characteristics. Compared to C++ classes, they have no access control and do not support member functions directly, but with custom data types and function pointers, many class-like operations can still be achieved.
For most embedded projects, structured data handling greatly facilitates the optimization of the overall architecture and subsequent maintenance. Here is an example:
#include <stdio.h>

typedef int (*pfunc)(int, int);

typedef struct
{
    int num;
    int profit;
    pfunc get_total;
} STRUCT_VAL;

int GetTotalProfit(int a, int b)
{
    return a * b;
}

int main(void)
{
    STRUCT_VAL Val;
    STRUCT_VAL *pVal;

    Val.get_total = GetTotalProfit;
    Val.num = 1;
    Val.profit = 10;
    printf("Total:%d\n", Val.get_total(Val.num, Val.profit));       //Variable access

    pVal = &Val;
    printf("Total:%d\n", pVal->get_total(pVal->num, pVal->profit)); //Pointer access
    return 0;
}
/*
Total:10
Total:10
*/
Structures in C can be accessed both through variables and through pointers, which allows any data in memory to be parsed (for example, the protocol parsing through pointer type casting mentioned earlier). Moreover, packaging data together with function pointers and passing the package around by pointer is an important foundation for switching driver-layer interfaces, which is of great practical value.
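Driver-layer interface switching with a structure of function pointers can be sketched as below. The driver names and the console_select and console_write functions are hypothetical, and the two "drivers" are stubs that only report how many bytes they would have written.

```c
#include <stddef.h>

/* The interface every driver has to provide. */
typedef struct {
    const char *name;
    int (*init)(void);
    int (*write)(const char *buf, size_t len);
} driver_ops_t;

/* Stub UART driver: pretends to send all bytes. */
static int uart_init(void) { return 0; }
static int uart_write(const char *buf, size_t len) { (void)buf; return (int)len; }

/* Stub null driver: discards everything. */
static int null_init(void) { return 0; }
static int null_write(const char *buf, size_t len) { (void)buf; (void)len; return 0; }

const driver_ops_t uart_driver = { "uart", uart_init, uart_write };
const driver_ops_t null_driver = { "null", null_init, null_write };

static const driver_ops_t *active = &null_driver;

/* Switch the active driver; a NULL argument is ignored. */
void console_select(const driver_ops_t *drv)
{
    if (drv != NULL) {
        active = drv;
        (void)active->init();
    }
}

/* Upper layers call this without knowing which driver is active. */
int console_write(const char *buf, size_t len)
{
    return active->write(buf, len);
}
```

Adding support for another output device then only requires a new driver_ops_t instance; the code that calls console_write stays unchanged.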
Additionally, using bit fields, unions, and structures can achieve another type of bit manipulation, which is crucial for encapsulating low-level hardware registers, as demonstrated in practice:
#include <stdio.h>

typedef unsigned char uint8_t;

union reg
{
    struct
    {
        uint8_t bit0 : 1;
        uint8_t bit1 : 1;
        uint8_t bit2_6 : 5;
        uint8_t bit7 : 1;
    } bit;
    uint8_t all;
};

int main(void)
{
    union reg RegData;

    RegData.all = 0;
    RegData.bit.bit0 = 1;
    RegData.bit.bit7 = 1;
    printf("0x%x\n", RegData.all);
    RegData.bit.bit2_6 = 0x3;
    printf("0x%x\n", RegData.all);
    return 0;
}
/*
0x81
0x8d
*/
Through unions and bit field operations, we can access the bits of data, providing a simple and intuitive processing method for registers and memory-constrained platforms.
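The order in which bit fields are laid out is implementation-defined, so the result of the union approach can differ between compilers. When portability matters, the same register layout can be handled with masks and shifts, as in this sketch (the macro and function names are assumptions for illustration):

```c
#include <stdint.h>

#define REG_BIT0_MASK      0x01u
#define REG_FIELD2_6_POS   2u
#define REG_FIELD2_6_MASK  (0x1Fu << REG_FIELD2_6_POS)   /* bits 2..6 */
#define REG_BIT7_MASK      0x80u

/* Set every bit that is 1 in mask. */
uint8_t reg_set_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg | mask);
}

/* Clear every bit that is 1 in mask. */
uint8_t reg_clear_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg & ~mask);
}

/* Replace the 5-bit field in bits 2..6, leaving the other bits untouched. */
uint8_t reg_write_field2_6(uint8_t reg, uint8_t value)
{
    reg = (uint8_t)(reg & ~REG_FIELD2_6_MASK);
    return (uint8_t)(reg | (((unsigned int)value << REG_FIELD2_6_POS) & REG_FIELD2_6_MASK));
}
```

Setting bit 0 and bit 7 and then writing 0x3 to the field yields 0x81 followed by 0x8d, regardless of how a given compiler orders bit fields.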
Furthermore, another important knowledge point regarding structures is alignment. Accessing through alignment can significantly improve runtime efficiency, but the storage length issue introduced by alignment can also easily lead to errors. Understanding alignment can be categorized as follows:
- Basic Data Types: Aligned by default length, such as char aligned by 1 byte, short by 2 bytes, etc.
- Arrays: Aligned according to the basic data type; if the first element is aligned, the subsequent ones will naturally align as well.
- Unions: Aligned by the length of the largest data type it contains.
- Structures: Each data type in a structure must be aligned, and the structure itself is aligned by the length of the largest internal data type.
#include <stdio.h>

union DATA
{
    int a;
    char b;
};

struct BUFFER0
{
    union DATA data;
    char a;
    //reserved[3]
    int b;
    short s;
    //reserved[2]
}; //16 bytes

struct BUFFER1
{
    char a;
    //reserved[1]
    short s;
    union DATA data;
    int b;
}; //12 bytes

int main(void)
{
    struct BUFFER0 buf0;
    struct BUFFER1 buf1;

    printf("size:%d, %d\n", (int)sizeof(buf0), (int)sizeof(buf1));
    printf("addr:%p, %p, %p, %p\n", (void *)&buf0.data, (void *)&buf0.a, (void *)&buf0.b, (void *)&buf0.s);
    printf("addr:%p, %p, %p, %p\n", (void *)&buf1.a, (void *)&buf1.s, (void *)&buf1.data, (void *)&buf1.b);
    return 0;
}
/*
Example output:
size:16, 12
addr:0x61fe10, 0x61fe14, 0x61fe18, 0x61fe1c
addr:0x61fe04, 0x61fe06, 0x61fe08, 0x61fe0c
*/
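The effect of member order on padding can also be measured directly with sizeof and offsetof. The two structures and the helper names below are invented for this sketch, which assumes a C11 compiler for _Static_assert and _Alignof.

```c
#include <stddef.h>

/* Same three members, different order. */
struct loose { char a; int b; short s; };   /* padding after a and after s */
struct tight { int b; short s; char a; };   /* members sorted largest first */

/* Every int member must start at a multiple of the alignment of int. */
_Static_assert(offsetof(struct loose, b) % _Alignof(int) == 0,
               "int member must be aligned");

/* Bytes of padding = total size minus the sum of the member sizes. */
size_t loose_padding(void)
{
    return sizeof(struct loose) - (sizeof(char) + sizeof(int) + sizeof(short));
}

size_t tight_padding(void)
{
    return sizeof(struct tight) - (sizeof(char) + sizeof(int) + sizeof(short));
}
```

On a typical target with 4-byte int and 2-byte short, struct loose occupies 12 bytes and struct tight 8 bytes, so simply ordering members from largest to smallest saves memory without any packing directive.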
#include is the file inclusion directive. It inserts the entire contents of the named file at the current position; the file need not be a header, so parameter files and configuration files can also be inserted at a specified position in the code. The <> form searches the standard library (system) include paths, while the "" form searches the user-defined path first, usually starting from the directory of the current file.
#define is commonly used to define constants or aliases for code fragments. Combined with the ## token-pasting operator, it can also generate identifiers and give interfaces a uniform form, as illustrated below:
#define MAX_SIZE 10
#define MODULE_ON 1
#define ERROR_LOOP() do { \
        printf("error loop\n"); \
    } while (0)

#define global(val) g_##val
int global(v) = 10;
int global(add)(int a, int b)
{
    return a + b;
}
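The two macro techniques used here, token pasting with ## and the do { ... } while (0) wrapper, can be sketched a little further. The macro names HANDLER_NAME, DEFINE_HANDLER, and SWAP_INT are made up for this example.

```c
/* Build an identifier by pasting a prefix onto the argument. */
#define HANDLER_NAME(cmd) handle_##cmd

/* Generate a whole function whose name depends on the argument. */
#define DEFINE_HANDLER(cmd, expr) \
    int HANDLER_NAME(cmd)(int x) { return (expr); }

DEFINE_HANDLER(inc, x + 1)   /* expands to: int handle_inc(int x) { return (x + 1); } */
DEFINE_HANDLER(dbl, x * 2)   /* expands to: int handle_dbl(int x) { return (x * 2); } */

/* A multi-statement macro wrapped in do { } while (0) with no trailing
   semicolon, so that it behaves like one statement inside if/else. */
#define SWAP_INT(a, b) do { \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)
```

If the macro definition itself ended in a semicolon, writing SWAP_INT(p, q); before an else would produce two statements and the else would no longer attach to the if, which is why the semicolon is left to the caller.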
#undef cancels a previously defined macro so that it can be redefined without conflict.
#error and #warning emit user-defined diagnostic messages (#error also stops compilation). Combined with #if or #ifdef, they can be used to check predefined configuration at build time.
#pragma is a compiler-specific directive that takes parameters. A common example is #pragma pack(1); however, once set, it applies to every definition that follows in the file. Using push and pop confines the setting, as shown in the code below:
#pragma pack(push)
#pragma pack(1)
struct TestA
{
    char i;
    int b;
} A;
#pragma pack(pop)  // Remember to call pop; otherwise the rest of the file is aligned according to the pack setting, leading to unexpected results

// Equivalent to
struct _TestB
{
    char i;
    int b;
} __attribute__((packed)) B;
If you have read this far, you should have a clearer understanding of the C language. Embedded C gives developers ample freedom in handling hardware physical addresses, bit operations, and memory access. Using arrays, pointers, and type casts, data can be processed with fewer copies, which is necessary for low-level development and also benefits the overall architecture.
However, this freedom also brings illegal access, overflow, and out-of-bounds errors, as well as alignment, data width, and endianness issues across hardware platforms. The original designers can usually manage these, but if the initial design does not account for them clearly, whoever takes over the project later often inherits problems and trouble. Therefore, a clear grasp of these fundamentals is essential for any practitioner in embedded C.
———— END ————