Essential Knowledge Points for Embedded C Language: A Comprehensive Summary

How to excel in Embedded Systems?

When you ask this question, you will often hear the same advice: master the C language! Today I recommend a comprehensive summary of embedded C knowledge points written by an expert. It is well worth reading.

Syntactically, C is not a complex language, but writing high-quality, reliable embedded C programs is not easy. It requires familiarity with hardware characteristics and defects, as well as some understanding of compilation principles and computer architecture.

This article draws on embedded practice and related materials to lay out the C language knowledge and key points needed for embedded systems. I hope everyone who reads it gains something.

1. Keywords
Keywords are reserved identifiers with special meaning in C. Together with the operators and preprocessor directives, these building blocks of the language can be grouped by function as follows:

1). Data Types (commonly used: char, short, int, long, unsigned, float, double)

2). Operations and Expressions (=, +, -, *, while, do-while, if, goto, switch-case)

3). Data Storage (auto, static, extern, const, register, volatile, restrict)

4). Structures (struct, enum, union, typedef)

5). Bit Operations and Logical Operations (<<, >>, &, |, ~, ^, &&)

6). Preprocessing (#define, #include, #error, #if…#elif…#else…#endif, etc.)

7). Platform Extension Keywords (__asm, __inline, __syscall)

These keywords together form the C syntax for embedded platforms.

Embedded applications can logically be abstracted into three parts:

1). Data Input (such as sensors, signals, interface input)

2). Data Processing (such as protocol decoding and packaging, conversion of AD sampling values)

3). Data Output (such as GUI display, output pin status, DA output control voltage, PWM duty cycle)

Data management runs through the entire development of embedded applications, which includes data types, memory management, bit and logical operations, as well as data structures. C language supports the implementation of these functions grammatically and provides corresponding optimization mechanisms to cope with the more constrained resource environment in embedded systems.

2. Data Types

C language supports commonly used character, integer, and floating-point variables. Some compilers, such as Keil, also extend support for bit and sfr (register) data types to meet special address operations.

C only specifies a minimum value range for each basic data type, so the same type may occupy different amounts of storage on different chip platforms. Code that may later be ported has to take this into account from the start.

The typedef keyword is the usual way to handle this situation and is adopted by most cross-platform software projects (C99 and later also provide ready-made definitions in <stdint.h>), as shown below:

typedef unsigned char  uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int   uint32_t;
......
typedef signed int     int32_t;

Since the basic data widths differ across platforms, determining the width of a basic data type such as int on the current platform requires the sizeof operator. Note that sizeof yields a size_t, which is printed with %zu:

printf("int size:%zu, short size:%zu, char size:%zu\n", sizeof(int), sizeof(short), sizeof(char));

Another important knowledge point is the width of pointers, as shown below:

char *p;
printf("pointer p size:%zu\n", sizeof(p));

The result is related to the addressable width of the chip: a pointer typically occupies 4 bytes on a 32-bit MCU and 8 bytes on a 64-bit processor. Sometimes this is also a simple way to check the bit width of the platform.
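The width assumptions above can also be checked at compile time rather than printed at run time. The following is a minimal sketch, assuming a C11 compiler; the helper name pointer_width_bits is made up for illustration and is not part of the original article.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint16_t, uint32_t */

/* Fail the build, not the device, if a width assumption is wrong. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(uint16_t) == 2, "uint16_t must be 2 bytes");
static_assert(sizeof(uint32_t) == 4, "uint32_t must be 4 bytes");

/* Hypothetical helper: pointer width in bits on the current target. */
size_t pointer_width_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}
```

If one of the assertions does not hold on a new platform, compilation stops with the given message, which is usually easier to diagnose than a runtime fault.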

3. Memory Management and Storage Architecture
In C, where a variable is stored is determined when it is defined, and scope together with the keywords extern and static gives fine-grained control over it. Depending on the storage area, memory allocation can be divided into three methods (excerpted from C++ High-Quality Programming):

1). Allocate from static storage area.

The memory is laid out when the program is compiled and linked, and it exists for the program's entire runtime. Global variables and static variables are stored this way.

2). Create on the stack.

When a function executes, the storage units for local variables in the function can be created on the stack, and these storage units are automatically released when the function execution ends. Stack memory allocation operations are built into the processor’s instruction set, making them highly efficient, but the allocated memory capacity is limited.

3). Allocate from the heap, also known as dynamic memory allocation.

Programs can request any amount of memory using malloc or new during runtime, and the programmer is responsible for when to free or delete the memory. The lifespan of dynamic memory is determined by the programmer, offering great flexibility but also encountering the most issues.

Let’s first look at a simple C language example.

//main.c
#include <stdio.h>
#include <stdlib.h>

static int st_val;                    // Static global variable -- static storage area
int ex_val;                           // Global variable -- static storage area

int main(void)
{
    int a = 0;                        // Local variable -- allocated on stack
    int *ptr = NULL;                  // Pointer variable -- the pointer itself is on the stack
    static int local_st_val = 0;      // Static local variable -- static storage area

    local_st_val += 1;
    a = local_st_val;

    ptr = (int *)malloc(sizeof(int)); // Allocate space from heap
    if (ptr != NULL)
    {
        *ptr = a;                     // malloc does not initialize memory, so write before reading
        printf("*p value:%d\n", *ptr);
        free(ptr);
        ptr = NULL;                   // After free, set ptr to NULL so later NULL checks stay valid
    }

    return 0;
}

The scope of an identifier in C describes not only where it can be accessed; together with its storage class it also determines where the variable is stored. The file-scope variables st_val and ex_val are allocated in the static storage area, and here the static keyword only controls whether the variable can be accessed from other files. The block-scope variables a, ptr, and local_st_val are placed in different areas depending on how they are declared: a is a local variable allocated on the stack; ptr is also a local variable on the stack, but the memory it points to is obtained with malloc and therefore lives on the heap; local_st_val is declared static, so it is allocated in the static storage area.

This involves an important knowledge point: the meaning of static differs between file scope and block scope. In file scope, it limits the linkage of functions and variables (whether they can be accessed from other files), while in block scope it places the variable in the static storage area.
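The two meanings of static can be seen side by side in a small sketch. The function and variable names below are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* File scope: static gives internal linkage, so other .c files cannot see it. */
static uint32_t s_total_events;

/* Block scope: static keeps the counter alive between calls. */
uint32_t next_sequence_number(void)
{
    static uint32_t counter = 0;   /* initialized once, not on every call */

    counter += 1;
    s_total_events += 1;
    return counter;
}

/* Accessor, because s_total_events is invisible outside this file. */
uint32_t total_events(void)
{
    return s_total_events;
}
```

Calling next_sequence_number() repeatedly returns 1, 2, 3 and so on, because counter lives in the static storage area rather than on the stack.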

For C language, understanding the above knowledge is generally sufficient for memory management, but for embedded C, defining a variable does not necessarily mean it resides in memory (SRAM); it may also be in FLASH space or directly stored in registers (register defined variables or some local variables under high optimization levels).

For example, a global variable defined as const is placed in FLASH, while a local variable defined as register may be optimized to reside directly in general-purpose registers. Understanding this knowledge is significant for code maintenance, especially when optimizing runtime speed or dealing with constrained storage.

Additionally, embedded C compilers extend the memory management mechanisms, for example by supporting scatter loading and __attribute__((section("user-defined area"))). These allow variables to be placed in special areas (such as SDRAM or SQI FLASH), which strengthens memory management for complex application scenarios and requirements.

LD_ROM 0x00800000 0x10000 {           ; load region size_region
    EX_ROM 0x00800000 0x10000 {       ; load address = execution address
        *.o (RESET, +First)
        *(InRoot$$Sections)
        .ANY (+RO)
    }
    EX_RAM 0x20000000 0xC000 {        ; rw Data
        .ANY (+RW +ZI)
    }
    EX_RAM1 0x2000C000 0x2000 {
        .ANY (MySection)
    }
    EX_RAM2 0x40000000 0x20000 {
        .ANY (Sdram)
    }
}

int a[10] __attribute__((section("MySection")));
int b[100] __attribute__((section("Sdram")));

With this method, we can place variables in the areas we need, which is sometimes necessary. For instance, GUI or web applications must store a large number of images and documents, and the internal FLASH may not be large enough, so the variables are declared in external areas.

Furthermore, certain parts of data in memory are critical, and to prevent them from being overwritten by other content, it may be necessary to allocate separate SRAM areas to avoid fatal errors caused by unintended modifications. These experiences are commonly used and important in actual product development, but due to space limitations, only brief examples are provided here. If you encounter such needs in your work, it is recommended to understand them in detail.

As for heap usage, embedded Linux behaves the same as standard C: check the result of malloc, and set the pointer to NULL after freeing it to avoid "wild pointers".

However, on resource-constrained microcontrollers, scenarios that require frequent memory block allocation are generally rare. If frequent allocation is needed, a memory management mechanism based on static storage and memory block partitioning is often built. It improves efficiency (fixed-size blocks are partitioned in advance and looked up directly at use) and keeps the use of memory blocks under control, which effectively avoids memory fragmentation.
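A fixed-size block pool of the kind described above might look like the following minimal sketch. The block size, block count, and the pool_* function names are assumptions made for illustration; the sketch is not interrupt-safe, so a real project would protect the free list with a critical section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE   32u   /* bytes per block (assumed value) */
#define POOL_BLOCK_COUNT  8u    /* number of blocks (assumed value) */

/* A free block stores the link to the next free block in its own storage. */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block_t;

static pool_block_t s_pool[POOL_BLOCK_COUNT];   /* lives in static storage */
static pool_block_t *s_free_list;

/* Chain every block into the free list. */
void pool_init(void)
{
    s_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        s_pool[i].next = s_free_list;
        s_free_list = &s_pool[i];
    }
}

/* Take one block in constant time; returns NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    pool_block_t *block = s_free_list;

    if (block != NULL) {
        s_free_list = block->next;
    }
    return block;
}

/* Return a block to the pool; NULL is ignored, like free(). */
void pool_free(void *ptr)
{
    pool_block_t *block = (pool_block_t *)ptr;

    if (block == NULL) {
        return;
    }
    /* The pointer must have come from this pool. */
    assert(block >= &s_pool[0] && block < &s_pool[POOL_BLOCK_COUNT]);
    block->next = s_free_list;
    s_free_list = block;
}
```

Because every block has the same size, allocation and release are constant-time pointer swaps and the pool cannot fragment.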

4. Pointers and Arrays
Arrays and pointers are often the main causes of program bugs: out-of-bounds array access, out-of-bounds pointers, illegal address access, and unaligned access usually have a pointer or an array lurking behind them. Understanding and mastering pointers and arrays is therefore a necessary step toward becoming a qualified C developer.

An array consists of elements of the same type. When it is declared, the compiler allocates a block of memory based on the element type and count. C also provides multi-dimensional arrays for special scenarios. Pointers provide a symbolic way to work with addresses, and a pointer is only meaningful when it points to a valid address. C pointers are extremely flexible: before being dereferenced they can be made to point to any address, which greatly facilitates hardware operations but also places higher demands on developers.

Refer to the code below.

#include <stdio.h>

int main(void)
{
    char cval[] = "hello";
    size_t i;
    int ival[] = {1, 2, 3, 4};
    int arr_val[][2] = {{1, 2}, {3, 4}};
    const char *pconst = "hello";
    char *p;
    int *pi;
    int *pa;
    int **par;

    p = cval;
    p++;                    //addr increases by 1
    pi = ival;
    pi += 1;                //addr increases by 4
    pa = arr_val[0];
    pa += 1;                //addr increases by 4
    par = (int **)arr_val;  //cast for demonstration only; never dereference par
    par++;                  //addr increases by sizeof(int *): 8 on 64-bit, 4 on 32-bit

    for (i = 0; i < sizeof(cval); i++)
    {
        printf("%d ", cval[i]);
    }
    printf("\n");
    printf("pconst:%s\n", pconst);
    printf("addr:%p, %p\n", (void *)cval, (void *)p);       //pointers are printed with %p
    printf("addr:%p, %p\n", (void *)ival, (void *)pi);
    printf("addr:%p, %p\n", (void *)arr_val, (void *)pa);
    printf("addr:%p, %p\n", (void *)arr_val, (void *)par);
    return 0;
}

Array elements are generally accessed from index 0 up to length-1, that is, over the half-open interval [0, length). This rarely causes problems, but when an array has to be read backward there is a risk of mistakenly starting at length instead of length-1, which is an out-of-bounds access. Additionally, to save space, the index variable i is sometimes defined as unsigned char.

In C, the range of unsigned char is 0~255. If the array holds more than 255 elements, the index wraps around to 0 before it can reach the array length, so the loop condition never becomes false and the loop never ends. This is easy to avoid when the code is first written, but if requirements change later and the array grows, every other place that indexes the array carries a hidden risk, which deserves particular attention.
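Both pitfalls can be avoided by indexing with size_t and by decrementing before use when walking backward. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy src into dst in reverse order; both arrays hold len elements. */
void reverse_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    /* i starts at len but is decremented before use, so indices run len-1 .. 0. */
    for (size_t i = len; i-- > 0; ) {
        dst[len - 1u - i] = src[i];
    }
}

/* Sum an array that may hold more than 255 elements. */
uint32_t sum_bytes(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {   /* an unsigned char index would wrap at 255 */
        sum += data[i];
    }
    return sum;
}
```

The idiom i-- > 0 also works for len equal to 0, where a loop written as i = len - 1; i >= 0 would never terminate with an unsigned index.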

As mentioned earlier, the space occupied by a pointer depends on the chip's addressing width: 4 bytes on 32-bit platforms and 8 bytes on 64-bit platforms. The step size in pointer arithmetic depends on the pointed-to type: 1 for char and, with a 4-byte int, 4 for int. If you look closely at the code above, you will find that par increases by 8. That is because par is a pointer to a pointer, so its step is the size of a pointer: 8 on a 64-bit platform and 4 on a 32-bit platform. (Strictly speaking, arr_val converts to int (*)[2], a pointer to a row of two ints, and a pointer of that type would step by one whole row.) Understanding these characteristics is not difficult, but a slight oversight in engineering applications can lead to hard-to-detect issues. Pointers also support type casting, which can be quite useful in certain situations, as shown in the following code:

#include <stdio.h>

typedef struct
{
    int b;
    int a;
} STRUCT_VAL;

// __align(4) is a Keil/ARMCC extension
static __align(4) char arr[8] = {0x12, 0x23, 0x34, 0x45, 0x56, 0x12, 0x24, 0x53};

int main(void)
{
    STRUCT_VAL *pval;
    int *ptr;

    pval = (STRUCT_VAL *)arr;
    ptr = (int *)&arr[4];
    printf("val:0x%x, 0x%x\n", (unsigned int)pval->b, (unsigned int)pval->a);
    printf("val:0x%x\n", (unsigned int)*ptr);
    return 0;
}
// Output on a little-endian target with 4-byte int:
// val:0x45342312, 0x53241256
// val:0x53241256

Pointer type casting gives an efficient and quick way to parse data in protocol analysis and data storage management. However, data alignment and endianness are common and error-prone issues in this kind of processing.

For example, the above character array arr is defined as 4-byte aligned through __align(4), ensuring that subsequent access after converting to int pointer does not trigger unaligned access exceptions. Without this definition, char defaults to 1-byte alignment.

Of course, omitting it does not necessarily trigger an exception. That depends on where arr ends up in the overall memory layout and on whether the memory region supports unaligned access (some SDRAM regions fault on unaligned access). But adding or removing an unrelated variable can shift the layout and trigger the exception, and the place where it occurs often has nothing to do with the variable that was added. Code may also run normally on one platform and fault on another. Such hidden problems are hard to trace and resolve in embedded systems.
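When the alignment of a buffer cannot be guaranteed, or the byte order of the data differs from that of the CPU, a common alternative to pointer casting is to assemble the value byte by byte or to copy it with memcpy. The sketch below uses hypothetical helper names; it is one possible approach, not the method used in the original article.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from any address, aligned or not. */
uint32_t read_u32_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit value in the CPU's native byte order without a pointer cast. */
uint32_t read_u32_native(const uint8_t *p)
{
    uint32_t value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value);   /* safe for any alignment */
    return value;
}
```

With the array from the example above, read_u32_le(arr) yields 0x45342312 on every platform, regardless of the CPU's endianness or the alignment of arr.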

C pointers have two further special uses: accessing a specific physical address through type casting, and implementing callbacks through function pointers, as shown below:

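#include <stdio.h>
#include <stdint.h>

typedef int (*pfunc)(int, int);

int func_add(int a, int b)
{
    return a + b;
}

int main(void)
{
    pfunc func_ptr;                                  // pfunc is already a pointer type

    *(volatile uint32_t *)0x20001000 = 0x01a23131;   // write to a fixed physical address (only valid on the target MCU)
    func_ptr = func_add;
    printf("%d\n", func_ptr(1, 2));
    return 0;
}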

Here, it is important to note that volatile marks a variable whose value can change outside the control of the code the compiler sees. It is typically used in the following situations:

1) Hardware registers of peripherals (such as status registers)

2) Non-automatic variables accessed by an interrupt service routine

3) Variables shared among multiple tasks in multi-threaded applications

Using volatile ensures that a variable shared between normal code and an interrupt handler is really read from and written to memory on every access, because it prevents the compiler from caching the value in a register or optimizing the access away. Note that volatile alone does not make an access atomic, so data wider than the bus, or read-modify-write sequences, still need protection such as briefly disabling interrupts.

Mastering the use of volatile is crucial in embedded systems and is a basic requirement for practitioners in embedded C.
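The interrupt case can be sketched as follows. The UART names are hypothetical, and on a PC the "ISR" is just an ordinary function called by hand; on real hardware it would be invoked by the interrupt controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Shared between the (simulated) interrupt handler and the main loop. */
static volatile bool s_rx_ready = false;
static volatile uint8_t s_rx_byte = 0;

/* Hypothetical ISR: stores the received byte, then raises the flag. */
void uart_rx_isr(uint8_t received)
{
    s_rx_byte = received;
    s_rx_ready = true;      /* set the flag last, after the data is stored */
}

/* Polled from the main loop; returns true and stores the byte if one arrived. */
bool uart_poll(uint8_t *out)
{
    assert(out != NULL);
    if (!s_rx_ready) {      /* volatile forces a real memory read on every call */
        return false;
    }
    *out = s_rx_byte;
    s_rx_ready = false;
    return true;
}
```

Without volatile, an optimizing compiler may read s_rx_ready once and keep it in a register, so a loop such as while (!s_rx_ready) {} could spin forever even after the interrupt has fired.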

Function pointers are not commonly seen in general embedded software development, but they can implement many applications such as asynchronous callbacks and driver modules in a simple way. However, this can only be considered a starting point; many details are worth understanding and mastering in depth.
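As a starting point, a callback can be registered and invoked as in the sketch below. All names (event_register, event_raise, sum_events) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Signature every event handler must have. */
typedef void (*event_cb_t)(int event_id, void *context);

static event_cb_t s_callback;   /* NULL until a handler is registered */
static void *s_context;

/* The consumer registers a handler plus a pointer to its own data. */
void event_register(event_cb_t cb, void *context)
{
    s_callback = cb;
    s_context = context;
}

/* The producer (for example a driver) calls this when something happens. */
void event_raise(int event_id)
{
    if (s_callback != NULL) {   /* always check before calling through a pointer */
        s_callback(event_id, s_context);
    }
}

/* Example handler: adds each event id to an int supplied as context. */
void sum_events(int event_id, void *context)
{
    assert(context != NULL);
    *(int *)context += event_id;
}
```

The producer never needs to know who handles the event, which is what makes the same driver code reusable across applications.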

5. Structure Types and Alignment

C provides user-defined data types to describe a class of things that share common characteristics, mainly structures, enumerations, and unions.

Among them, an enumeration gives names to a set of integer constants, which makes the data more intuitive and readable, implemented as follows:

typedef enum
{
    spring = 1,
    summer,
    autumn,
    winter
} season;

season s1 = summer;

Unions can store different types of data in the same storage space, and the size of the union is determined by the largest variable it contains, as shown below:

#include <stdio.h>

typedef union
{
    char c;
    short s;
    int i;
} UNION_VAL;

UNION_VAL val;

int main(void)
{
    printf("addr:%p, %p, %p\n", (void *)&val.c, (void *)&val.s, (void *)&val.i);

    val.i = 0x12345678;
    if (val.s == 0x5678)        // assumes a 16-bit short and a 32-bit int
        printf("Little-endian\n");
    else
        printf("Big-endian\n");
    return 0;
}
/*
Sample output (the address and its formatting vary by platform):
addr:0x407970, 0x407970, 0x407970
Little-endian
*/

Unions are primarily used to access internal segments of data by sharing memory addresses, providing a more convenient way to parse certain variables. Testing the endianness of a chip is also a common application of unions. Pointer type casting can achieve the same purpose, implemented as follows:

int data = 0x12345678;
short *pdata = (short *)&data;
if (*pdata == 0x5678)
    printf("%s\n", "Little-endian");
else
    printf("%s\n", "Big-endian");

It can be seen that using unions can sometimes avoid the misuse of pointers. Reading an int through a short pointer, as in the second version, also breaks C's aliasing rules, so the union form is the safer of the two.
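Once the byte order is known, it is usually needed for converting data to a fixed wire order. A minimal sketch of a runtime endianness check and a byte swap follows; the function names are made up, and many toolchains offer built-in equivalents that are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True if the least significant byte is stored at the lowest address. */
bool is_little_endian(void)
{
    const uint16_t probe = 0x0001u;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 0x01u;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Convert a host-order value to big-endian (network) order. */
uint32_t host_to_be32(uint32_t v)
{
    return is_little_endian() ? swap_u32(v) : v;
}
```

After host_to_be32(0x12345678), the first byte in memory is 0x12 on both little-endian and big-endian hosts.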

Structures are collections of variables with common characteristics. Compared to C++ classes, they have no access control and do not support member functions directly, but with user-defined data types and function pointers, many class-like operations can still be achieved.

For most embedded projects, structured data handling greatly facilitates the optimization of the overall architecture and subsequent maintenance. Here is an example:

#include <stdio.h>

typedef int (*pfunc)(int, int);

typedef struct
{
    int num;
    int profit;
    pfunc get_total;
} STRUCT_VAL;

int GetTotalProfit(int a, int b)
{
    return a * b;
}

int main(void)
{
    STRUCT_VAL Val;
    STRUCT_VAL *pVal;

    Val.get_total = GetTotalProfit;
    Val.num = 1;
    Val.profit = 10;
    printf("Total:%d\n", Val.get_total(Val.num, Val.profit));       //Variable access

    pVal = &Val;
    printf("Total:%d\n", pVal->get_total(pVal->num, pVal->profit)); //Pointer access
    return 0;
}
/*
Total:10
Total:10
*/

Structure members can be accessed both through a variable and through a pointer, which makes it possible to interpret any data in memory (for example, the protocol parsing through pointer type casting mentioned earlier). Moreover, packaging data together with function pointers and passing the package around by pointer is an important foundation for switching driver-layer interfaces, and it is of great practical value.
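The driver-switching idea can be sketched with a table of function pointers. Everything below (storage_ops_t, the RAM-backed driver, the storage_* helpers) is hypothetical and stands in for real flash or EEPROM drivers.

```c
#include <stddef.h>
#include <stdint.h>

/* Interface that every storage driver implements. */
typedef struct {
    const char *name;
    int (*write)(uint32_t addr, const uint8_t *data, size_t len);
    int (*read)(uint32_t addr, uint8_t *data, size_t len);
} storage_ops_t;

/* RAM-backed implementation, standing in for a real device. */
static uint8_t s_ram_disk[64];

static int ram_in_range(uint32_t addr, size_t len)
{
    return addr <= sizeof s_ram_disk && len <= sizeof s_ram_disk - addr;
}

static int ram_write(uint32_t addr, const uint8_t *data, size_t len)
{
    if (!ram_in_range(addr, len)) {
        return -1;                      /* out of range */
    }
    for (size_t i = 0; i < len; i++) {
        s_ram_disk[addr + i] = data[i];
    }
    return 0;
}

static int ram_read(uint32_t addr, uint8_t *data, size_t len)
{
    if (!ram_in_range(addr, len)) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        data[i] = s_ram_disk[addr + i];
    }
    return 0;
}

const storage_ops_t ram_storage = { "ram", ram_write, ram_read };

/* Upper layers only see the interface, so the driver can be swapped freely. */
int storage_save_u8(const storage_ops_t *dev, uint32_t addr, uint8_t value)
{
    return dev->write(addr, &value, 1);
}

int storage_load_u8(const storage_ops_t *dev, uint32_t addr, uint8_t *value)
{
    return dev->read(addr, value, 1);
}
```

Supporting another device then means filling in one more storage_ops_t instance; the code that calls storage_save_u8 and storage_load_u8 stays unchanged.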

Additionally, using bit fields, unions, and structures can achieve another type of bit manipulation, which is crucial for encapsulating low-level hardware registers, as demonstrated in practice:

#include <stdio.h>

typedef unsigned char uint8_t;

union reg
{
    struct
    {
        uint8_t bit0:1;
        uint8_t bit1:1;
        uint8_t bit2_6:5;
        uint8_t bit7:1;
    } bit;
    uint8_t all;
};

int main(void)
{
    union reg RegData;
    RegData.all = 0;

    RegData.bit.bit0 = 1;
    RegData.bit.bit7 = 1;
    printf("0x%x\n", RegData.all);

    RegData.bit.bit2_6 = 0x3;
    printf("0x%x\n", RegData.all);
    return 0;
}
/*
Output when the compiler allocates bit-fields starting from the least significant bit
(the order is implementation-defined):
0x81
0x8d
*/

Through unions and bit field operations, we can access the bits of data, providing a simple and intuitive processing method for registers and memory-constrained platforms.
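Because the layout of bit-fields is implementation-defined, code that must be portable across compilers often uses masks and shifts instead. The sketch below reproduces the register layout of the example above with hypothetical macro and function names.

```c
#include <stdint.h>

#define REG_BIT0_MASK    (1u << 0)
#define REG_BIT7_MASK    (1u << 7)
#define REG_FIELD_SHIFT  2u
#define REG_FIELD_MASK   (0x1Fu << REG_FIELD_SHIFT)   /* bits 2..6 */

/* Set every bit that is 1 in mask. */
uint8_t reg_set_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg | mask);
}

/* Clear every bit that is 1 in mask. */
uint8_t reg_clear_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg & (uint8_t)~mask);
}

/* Replace the 5-bit field in bits 2..6, leaving the other bits untouched. */
uint8_t reg_write_field(uint8_t reg, uint8_t value)
{
    uint8_t cleared = (uint8_t)(reg & (uint8_t)~REG_FIELD_MASK);
    uint8_t shifted = (uint8_t)(((unsigned)value << REG_FIELD_SHIFT) & REG_FIELD_MASK);

    return (uint8_t)(cleared | shifted);
}

/* Extract the 5-bit field from bits 2..6. */
uint8_t reg_read_field(uint8_t reg)
{
    return (uint8_t)((reg & REG_FIELD_MASK) >> REG_FIELD_SHIFT);
}
```

Setting bit 0 and bit 7 on a zeroed register gives 0x81, and writing 0x3 into the field then gives 0x8d, matching the bit-field version on compilers that allocate from the least significant bit.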

Another important knowledge point regarding structures is alignment. Aligned access can significantly improve runtime efficiency, but the extra storage that alignment introduces also easily leads to errors. The alignment rules can be summarized as follows:

  • Basic Data Types: Aligned by default length, such as char aligned by 1 byte, short by 2 bytes, etc.

  • Arrays: Aligned according to the basic data type; if the first is aligned, the subsequent ones will naturally align as well.

  • Unions: Aligned by the maximum data type length it contains.

  • Structures: Each data type in a structure must be aligned, and the structure itself is aligned by the length of the largest internal data type.

#include <stdio.h>

union DATA
{
    int a;
    char b;
};

struct BUFFER0
{
    union DATA data;
    char a;
    //reserved[3]
    int b;
    short s;
    //reserved[2]
}; //16 bytes

struct BUFFER1
{
    char a;
    //reserved[1]
    short s;
    union DATA data;
    int b;
}; //12 bytes

int main(void)
{
    struct BUFFER0 buf0;
    struct BUFFER1 buf1;

    printf("size:%zu, %zu\n", sizeof(buf0), sizeof(buf1));
    printf("addr:%p, %p, %p, %p\n",
           (void *)&buf0.data, (void *)&buf0.a, (void *)&buf0.b, (void *)&buf0.s);
    printf("addr:%p, %p, %p, %p\n",
           (void *)&buf1.a, (void *)&buf1.s, (void *)&buf1.data, (void *)&buf1.b);
    return 0;
}
/*
Sample output (addresses and their formatting vary by platform):
size:16, 12
addr:0x61fe10, 0x61fe14, 0x61fe18, 0x61fe1c
addr:0x61fe04, 0x61fe06, 0x61fe08, 0x61fe0c
*/

Here the size of the union equals that of its largest member, int, which is 4 bytes. The addresses that were read out match the expected memory layout and padding positions: in BUFFER1, for example, s sits 2 bytes after a, which shows the single padding byte between them. Working out where the padding goes is an effective and quick way to understand C's alignment mechanism.
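Padding can also be inspected without printing addresses, using offsetof and sizeof. The sketch below assumes a C11 compiler and uses two made-up structures that hold the same 8 bytes of data in a different member order.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

struct loose {        /* members in a careless order */
    uint8_t  flag;
    uint32_t value;
    uint8_t  mode;
    uint16_t count;
};

struct tight {        /* same members, largest first */
    uint32_t value;
    uint16_t count;
    uint8_t  flag;
    uint8_t  mode;
};

/* Every member starts at a multiple of its own alignment. */
static_assert(offsetof(struct loose, value) % _Alignof(uint32_t) == 0, "value is aligned");
static_assert(offsetof(struct tight, count) % _Alignof(uint16_t) == 0, "count is aligned");

/* Padding = total size minus the 8 bytes of actual data. */
size_t loose_padding(void)
{
    return sizeof(struct loose) - 8u;
}

size_t tight_padding(void)
{
    return sizeof(struct tight) - 8u;
}
```

On typical 32-bit and 64-bit targets struct loose occupies 12 bytes and struct tight 8 bytes, so ordering members from largest to smallest is a cheap way to save RAM.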
6. Preprocessing Mechanism
C language provides a rich preprocessing mechanism that facilitates cross-platform code implementation. Additionally, C language’s macro mechanism allows for data and code block replacements, string formatting, and code segment switching, which hold significant importance for engineering applications. Below, I will describe commonly used preprocessing mechanisms in C language according to functional requirements.

#include is the file inclusion directive. Its effect is to insert the entire contents of the named file at the current position, so besides header files it can also pull parameter files and configuration files into a specific place in the code. The <> form searches the standard (system) include paths, while the "" form searches the user's paths first, typically starting from the directory of the current file.

#define creates macro definitions and is commonly used to define constants or aliases for code segments. Combined with the ## token-pasting operator, it can also give interfaces a unified naming scheme, as illustrated below:

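#define MAX_SIZE   10
#define MODULE_ON  1

// A multi-line macro needs a backslash at the end of every continued line.
// There is no semicolon after while(0): the caller supplies it.
#define ERROR_LOOP() do{ \
                       printf("error loop\n"); \
                     }while(0)

#define global(val) g_##val

int global(v) = 10;             // expands to: int g_v = 10;
int global(add)(int a, int b)   // expands to: int g_add(int a, int b)
{
    return a + b;
}
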
#if…#elif…#else…#endif, #ifdef…#endif, and #ifndef…#endif perform conditional compilation. They are often used to switch code blocks in and out, especially in large and cross-platform projects that must meet varying requirements.

#undef cancels a previous macro definition to avoid redefinition issues.

#error and #warning emit user-defined diagnostic messages: #error stops compilation, while #warning only reports a message (it is a widely supported extension that became standard in C23). Used together with #if and #ifdef, they can enforce constraints on pre-defined configuration values.
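A typical use of #if together with #error is to reject an invalid configuration at build time. In the sketch below, RING_SIZE and ring_next are hypothetical names for a ring buffer whose size must be a power of two.

```c
#include <stdint.h>

#define RING_SIZE 64u   /* hypothetical configuration value */

/* Stop the build if the configuration breaks the assumption used below. */
#if (RING_SIZE == 0u) || ((RING_SIZE & (RING_SIZE - 1u)) != 0u)
#error "RING_SIZE must be a non-zero power of two"
#endif

/* Because RING_SIZE is a power of two, wrapping is a cheap bit mask. */
uint32_t ring_next(uint32_t index)
{
    return (index + 1u) & (RING_SIZE - 1u);
}
```

Changing RING_SIZE to a value such as 48 makes compilation fail with the message above instead of producing a buffer that silently wraps at the wrong index.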

#pragma passes compiler-specific instructions with parameters. A common one is #pragma pack(1), but on its own it makes everything that follows in the file use the given byte alignment. Using push and pop limits its effect, as shown in the code below:

#pragma pack(push)
#pragma pack(1)
struct TestA
{
    char i;
    int b;
} A;
#pragma pack(pop)   // Remember to call pop; otherwise the rest of the file is also packed to 1 byte, leading to unexpected results

// With GCC-style compilers, this is equivalent to:
struct TestB
{
    char i;
    int b;
} __attribute__((packed)) B;

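Packed structures are mostly used to mirror a wire or file format, and a compile-time size check helps to catch a missing pragma. The frame layout below is invented for illustration, and the sketch assumes a compiler that supports #pragma pack(push, 1), which the mainstream embedded and desktop compilers do.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct {
    uint8_t  type;
    uint32_t length;
    uint16_t crc;
} frame_header_t;      /* hypothetical wire format: 7 bytes, no padding */
#pragma pack(pop)

static_assert(sizeof(frame_header_t) == 7, "header must match the 7-byte wire format");

/* Copy raw bytes into the packed struct; fields keep the CPU's native byte order. */
frame_header_t parse_header(const uint8_t raw[7])
{
    frame_header_t header;

    memcpy(&header, raw, sizeof header);
    return header;
}
```

Without the pragma the same structure would typically occupy 12 bytes, and the static_assert would stop the build.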
7. Summary

If you have read this far, you should have a clearer understanding of the C language. Embedded C gives developers ample freedom in handling hardware physical addresses, bit operations, and memory access. With arrays, pointers, and type casting, data processing can avoid much unnecessary copying, which low-level development needs and which benefits the overall architecture.

However, this freedom also brings illegal access, overflow, and out-of-bounds errors, as well as alignment, data width, and endianness issues across hardware platforms. The original designer can usually keep these under control, but if the initial design did not think them through, they mean problems and trouble for whoever takes over the project later. Therefore, for any practitioner in embedded C, a clear grasp of these foundational knowledge points is essential.
