Seeing embedded issues from the perspective of PC programming is the first step; learning to use embedded programming thinking is the second step; combining PC and embedded thinking together and applying it to practical projects is the third step.
Some engineers move from PC programming to embedded programming. In China, few embedded programmers graduated from a computer science program; most come from disciplines such as automatic control or electronics. They have strong practical experience but lack theoretical knowledge. Meanwhile, a significant portion of computer science graduates go into online games or web applications, work that is largely independent of the operating system, and are not keen on entering the embedded industry because it is a hard road. They have solid theoretical knowledge but lack knowledge of circuits and related fields, so learning embedded systems takes additional study.
Although I haven't conducted an industry survey, from what I've seen and the people I've hired, engineers in the embedded industry tend to lack either theoretical knowledge or practical experience; rarely do they have both. The root cause lies in university education in China. I won't go into that topic here, to avoid a debate. Instead, I would like to give a few examples from my own practice to draw attention to certain issues that come up in embedded projects.
The First Example:
A colleague was developing a serial port driver under uC/OS-II, and testing turned up problems in both the driver and its interface. An application was written for communication, and the serial port driver provided a function to query the number of characters in the driver buffer: GetRxBuffCharNum(). The upper layer has to receive a certain number of characters before it can parse a packet. My colleague's code can be represented in pseudocode as follows:
bExit = FALSE;
do {
    if (GetRxBuffCharNum() >= 30)
        bExit = ReadRxBuff(buff, GetRxBuffCharNum());
} while (!bExit);
This code checks whether there are at least 30 characters in the buffer and, if so, reads all of them, repeating until the read succeeds. The logic is clear, and the idea is straightforward. However, this code does not work correctly. On a PC there would be no problem; it would run normally. In an embedded system, the outcome is uncertain. My colleague was frustrated and did not understand why. When he came to me for help, I asked him how GetRxBuffCharNum() was implemented.
Upon inspection:
unsigned GetRxBuffCharNum(void)
{
    cpu_register reg;
    unsigned num;
    reg = interrupt_disable();
    num = gRxBuffCharNum;
    interrupt_enable(reg);
    return (num);
}
Clearly, interrupt_disable() and interrupt_enable() form a global critical section that protects the integrity of gRxBuffCharNum. However, because the outer do { } while() loop calls this function continuously, interrupts are disabled and re-enabled at a very high rate, and the window in which they are enabled is very short. In practice, the CPU may not be able to service the UART interrupt in time. Whether it can depends on the UART baud rate, the size of the hardware buffer, and the speed of the CPU. The baud rate we were using is quite high, around 3Mbps. The UART start and stop bits take one bit each, so one byte takes about 10 bit times to transmit. At 3Mbps, that is about 3.3us per byte. How many CPU instructions can be executed in 3.3us? On a 100MHz ARM, approximately 150. How long does it take to disable interrupts? Generally, disabling interrupts on ARM takes more than 4 instructions, and enabling them takes the same. The UART receive interrupt handler itself is more than 20 instructions. So a bug that loses communication data is possible, and at the system level it shows up as unstable communication.
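The timing budget above can be checked with a few lines of integer arithmetic. This is only a sketch: the helper names UartFrameTimeNs() and CyclesInNs() are made up for illustration, and the 10-bit frame assumes 8 data bits plus one start and one stop bit. Note that the result is in clock cycles, not instructions; how many instructions fit depends on the core and the memory system.

```c
#include <stdint.h>

/* Time to transmit one UART frame, in nanoseconds (rounded down).
 * bits_per_frame = start + data + stop bits, e.g. 10 for 8N1. */
static uint32_t UartFrameTimeNs(uint32_t baud, uint32_t bits_per_frame)
{
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000000u) / baud);
}

/* Number of CPU clock cycles that elapse in ns nanoseconds (rounded down). */
static uint32_t CyclesInNs(uint32_t cpu_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)cpu_hz * ns) / 1000000000u);
}
```

For 3Mbps and a 100MHz clock this gives about 3333ns per byte and about 333 clock cycles in which the receive interrupt has to be serviced.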
Modifying this code is actually quite simple; the easiest way is to change it from the higher level. That is:
bExit = FALSE;
do {
    DelayUs(20); // Delay 20us, generally implemented with a busy loop
    num = GetRxBuffCharNum();
    if (num >= 30)
        bExit = ReadRxBuff(buff, num);
} while (!bExit);
This gives the CPU time to run the interrupt handler, so the frequent disabling of interrupts no longer delays the handler and causes data loss. It also queries the count only once per iteration and passes that same value to ReadRxBuff(). In embedded systems, most RTOSes do not ship with a serial port driver, and when people write their own, how the code fits with the kernel is often not fully considered, which leads to deeper problems. An RTOS is called an RTOS because it responds quickly to events, and quick response to events depends on how quickly the CPU responds to interrupts. Drivers in systems like Linux are tightly integrated with the kernel and run in kernel mode. An RTOS cannot replicate the structure of Linux, but there are lessons to be learned from it.
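A different way to attack the same problem, not the fix used in this project, is to make the count query cheap enough that it does not need a critical section at all. Below is a minimal sketch of a single-producer, single-consumer ring buffer in which the ISR only ever writes head and the task only ever writes tail. The names RxRing, RxRingPut(), RxRingCount() and RxRingRead() are hypothetical, and the sketch assumes a single-core CPU on which aligned 32-bit loads and stores are atomic.

```c
#include <stdint.h>

#define RX_RING_SIZE 64u   /* hypothetical size; must be a power of two */

typedef struct {
    volatile uint32_t head;               /* advanced only by the ISR  */
    volatile uint32_t tail;               /* advanced only by the task */
    volatile uint8_t  data[RX_RING_SIZE];
} RxRing;

/* Producer side: call from the UART receive ISR. Returns 0 if full. */
static int RxRingPut(RxRing *r, uint8_t c)
{
    uint32_t head = r->head;
    uint32_t tail = r->tail;
    if ((uint32_t)(head - tail) >= RX_RING_SIZE)
        return 0;                         /* buffer full, byte dropped */
    r->data[head & (RX_RING_SIZE - 1u)] = c;
    r->head = head + 1u;                  /* publish after the data is stored */
    return 1;
}

/* Consumer side: number of buffered bytes, without masking interrupts. */
static uint32_t RxRingCount(const RxRing *r)
{
    uint32_t head = r->head;
    uint32_t tail = r->tail;
    return (uint32_t)(head - tail);
}

/* Consumer side: copy up to n bytes out; returns the number copied. */
static uint32_t RxRingRead(RxRing *r, uint8_t *dst, uint32_t n)
{
    uint32_t tail  = r->tail;
    uint32_t avail = (uint32_t)(r->head - tail);
    uint32_t i;
    if (n > avail)
        n = avail;
    for (i = 0; i < n; i++)
        dst[i] = r->data[(tail + i) & (RX_RING_SIZE - 1u)];
    r->tail = tail + n;                   /* release the slots after copying */
    return n;
}
```

With this layout the task can poll RxRingCount() as often as it likes without delaying the UART interrupt. If the ISR fires between the two reads, the task at worst sees a count that is slightly too small and picks up the new bytes on the next poll.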
From the above example, it is clear that embedded developers need to have a thorough understanding of every aspect of the code.
The Second Example:
A colleague was writing a driver for a 14094, a chip that converts serial to parallel. With no dedicated hardware available, the serial signal was simulated on IO pins. My colleague dashed off a driver, but after 3 to 4 days of debugging there were still problems. I couldn't watch any longer, so I took a look: sometimes the parallel outputs were controlled correctly and sometimes not. I reviewed the code, which can be roughly represented in pseudocode as:
for (i = 0; i < 8; i++)
{
    SetData((data >> i) & 0x1);
    SetClockHigh();
    for (j = 0; j < 5; j++);
    SetClockLow();
}
This code sends the 8 bits of data one at a time, from bit0 to bit7, on each high clock pulse. It looked as though it should work, and I couldn't see where the problem was. After careful thought and a look at the 14094 datasheet, I understood. The 14094 requires the high clock pulse to last 10ns, and the low clock pulse must also last 10ns. This code delays only during the high pulse, not during the low pulse. If an interrupt happens to occur while the clock is low, the low pulse is stretched and the code works. If the CPU does nothing else during the low pulse, the pulse is too short and the code does not work. Hence the inconsistent behavior.
Modifying it is also quite simple:
for (i = 0; i < 8; i++)
{
    SetData((data >> i) & 0x1);
    SetClockHigh();
    for (j = 0; j < 5; j++);
    SetClockLow();
    for (j = 0; j < 5; j++);
}
Now it works perfectly. However, this code is still not very portable: an optimizing compiler may eliminate the two delay loops, and then the high and low pulses are no longer guaranteed to last 10ns, so it will not work correctly. Truly portable code should therefore replace each loop with a nanosecond-level delay such as DelayNs(10).
Much as Linux calibrates its delay loop at boot, measure at power-on how long a nop instruction takes and how many nop instructions make up 10ns, then execute that many nops. Use compiler directives or special keywords to keep the delay loop from being optimized away, such as in GCC:
__asm__ __volatile__("nop");
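A DelayNs() built on this idea might look like the sketch below. It is only an illustration under stated assumptions: gPsPerNop and NopsForNs() are hypothetical names, the default of 10000ps per nop stands in for a value that would really be measured at start-up and must not be zero, and the inline assembly requires GCC or a compatible compiler.

```c
#include <stdint.h>

/* Picoseconds per nop, to be measured once at start-up.
 * Hypothetical default: 10000 ps, i.e. one nop per cycle at 100 MHz. */
static uint32_t gPsPerNop = 10000u;

/* Number of nops needed to cover at least ns nanoseconds (rounds up). */
static uint32_t NopsForNs(uint32_t ns, uint32_t ps_per_nop)
{
    uint64_t ps = (uint64_t)ns * 1000u;
    return (uint32_t)((ps + ps_per_nop - 1u) / ps_per_nop);
}

/* Busy-wait for at least ns nanoseconds. Loop overhead only makes the
 * delay longer, which is safe for a minimum pulse-width requirement. */
static void DelayNs(uint32_t ns)
{
    uint32_t n = NopsForNs(ns, gPsPerNop);
    while (n--)
        __asm__ __volatile__("nop");  /* volatile asm is not optimized away */
}
```

In the shift loop above, each of the two empty for loops would then become a call to DelayNs(10). Rounding up means the pulse is never shorter than requested, only longer.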
This example clearly illustrates that writing good code requires a lot of knowledge support. What do you think?