Although I have not conducted a formal industry survey, from what I have seen and from the people I have recruited, engineers in the embedded industry tend to lack either theoretical knowledge or practical experience; it is rare to find someone who has both. The root cause, I believe, lies in problems with university education in China, but I will not discuss that here, to avoid unnecessary arguments. Instead, I want to list a few examples from my own practice to draw attention to some issues that come up when doing embedded-systems projects.
The first example:

A colleague developed a serial (UART) driver under uC/OS-II, and problems showed up when both the driver and its interface were tested. An application developer wrote a communication program on top of it; the driver provided a function to query the number of characters currently in the receive buffer: GetRxBuffCharNum().
The higher layer needs to receive a certain number of characters before it can parse a packet. The colleague's code can be represented in pseudocode as follows:
bExit = FALSE;
do {
    if (GetRxBuffCharNum() >= 30) {
        bExit = ReadRxBuff(buff, GetRxBuffCharNum());
    }
} while (!bExit);

unsigned GetRxBuffCharNum(void)
{
    cpu_register reg;
    unsigned num;

    reg = interrupt_disable();   /* enter critical section */
    num = gRxBuffCharNum;
    interrupt_enable(reg);       /* leave critical section */

    return (num);
}
Clearly, the region between interrupt_disable() and interrupt_enable() in the loop is a global critical section, which guarantees that gRxBuffCharNum is read consistently.
However, because the outer do { } while() loop spins continuously, interrupts are disabled and re-enabled at a very high frequency, and the windows in which they are actually enabled are very short.
As a result, the CPU may fail to respond to the UART interrupt in time. Whether this actually happens depends on the UART baud rate, the size of the hardware FIFO, and the speed of the CPU. The baud rate we used was very high, about 3 Mbps.
The UART start and stop signals occupy one bit each, so each byte takes 10 bit times on the wire; at 3 Mbps, one byte takes about 3.3 us to transmit.
How many CPU instructions can be executed in 3.3 us?
On a 100 MHz ARM, roughly 150. And how long does manipulating interrupts take? On ARM, disabling interrupts generally requires more than 4 instructions, and re-enabling them more than 4 as well.
The UART receive interrupt handler itself easily exceeds 20 instructions. There is therefore a real possibility of losing incoming data, which shows up at the system level as unstable communication.
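For context, the receive side of such a driver typically looks something like the minimal sketch below. This is an illustration, not the original code: UART0_DR, RX_BUFF_SIZE, gRxBuff, and gRxBuffWr are assumed names, and a real handler also carries compiler-generated entry/exit code, which is how it ends up well past 20 instructions.

/* Minimal sketch of a UART receive ISR (assumed names, for illustration). */
void UartRxIsr(void)
{
    unsigned char ch = UART0_DR;            /* read the data register; clears the request */

    if (gRxBuffCharNum < RX_BUFF_SIZE) {    /* drop the byte if the ring buffer is full */
        gRxBuff[gRxBuffWr] = ch;
        gRxBuffWr = (gRxBuffWr + 1) % RX_BUFF_SIZE;
        gRxBuffCharNum++;
    }
}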
Fixing this is actually quite simple; the easiest way is to change it at the higher level, so the loop no longer spins flat out. That is:
bExit = FALSE;
do {
    DelayUs(20);    /* delay 20 us, generally implemented as a busy loop */
    num = GetRxBuffCharNum();
    if (num >= 30) {
        bExit = ReadRxBuff(buff, num);
    }
} while (!bExit);
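Another option worth noting: if gRxBuffCharNum is a single aligned 32-bit word that only the ISR writes, the query does not need a critical section at all, because an aligned word read is atomic on a 32-bit ARM. A sketch under that assumption (not the fix we actually shipped):

/* Sketch: lock-free query, assuming gRxBuffCharNum is a single aligned
   word written only by the ISR, so a plain read is already atomic. */
volatile unsigned gRxBuffCharNum;

unsigned GetRxBuffCharNum(void)
{
    return gRxBuffCharNum;   /* single word read; no need to mask interrupts */
}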
From the above example, it is clear that embedded developers need to have a thorough understanding of every aspect of the code.
The second example:

A colleague wrote a driver for a 14094 serial-in/parallel-out chip. There was no dedicated hardware interface, so the serial signal was bit-banged on GPIO pins. The colleague dashed off a driver and, after three or four days of debugging, it still had problems.
I couldn't stand it any longer and went to take a look: the parallel outputs were sometimes correct and sometimes not. The code looked roughly like this:
for (i = 0; i < 8; i++) {
    SetData((data >> i) & 0x1);
    SetClockHigh();
    for (j = 0; j < 5; j++)
        ;                        /* delay while the clock is high */
    SetClockLow();
}
This code shifts out the 8 bits of data, bit 0 through bit 7, one bit per clock pulse. It looked like it should work, and at first I could not see where the problem was.
After thinking carefully and looking at the 14094 datasheet, I understood.
It turns out that the 14094 requires the clock-high level to be held for at least 10 ns, and the clock-low level for at least 10 ns as well. This code delays only while the clock is high and not at all while it is low. If an interrupt happens to fire during the low phase, the low level gets stretched long enough and the code may work;
but if the CPU runs through the low phase uninterrupted, the low pulse is too short and the transfer fails, hence the intermittent behavior.
Modifying it is also quite simple:
for (i = 0; i < 8; i++) {
    SetData((data >> i) & 0x1);
    SetClockHigh();
    for (j = 0; j < 5; j++)
        ;                        /* hold the clock high */
    SetClockLow();
    for (j = 0; j < 5; j++)
        ;                        /* hold the clock low */
}
This works perfectly. However, the code is still not portable: if the compiler optimizes aggressively, it may eliminate both empty delay loops (declaring j as volatile prevents that, though the delay length still depends on CPU speed).
With the loops gone, the high and low levels are no longer guaranteed to last 10 ns, and the chip cannot work correctly.
Therefore, truly portable code should implement this delay as a nanosecond-level DelayNs(10).
Do what Linux does at power-up: measure how long one nop instruction takes, work out how many nops are needed to cover 10 ns,
and then execute that many. Use compiler directives or special keywords to keep the compiler from optimizing the delay loop away. For instance, in GCC:
__asm__ __volatile__("nop");
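Putting the pieces together, a calibrated delay might look like the rough sketch below. The calibration variable gNopsPerUs and the conversion are assumptions for illustration, in the spirit of Linux's boot-time delay-loop calibration rather than its actual implementation; gNopsPerUs would be measured once at power-up against a hardware timer.

/* Sketch of a calibrated nanosecond delay (assumed names, for illustration). */
static unsigned gNopsPerUs;   /* nops per microsecond, measured at power-up */

void DelayNs(unsigned ns)
{
    /* round up so the delay is never shorter than requested */
    unsigned n = (gNopsPerUs * ns + 999) / 1000;

    while (n--) {
        __asm__ __volatile__("nop");   /* volatile asm: cannot be optimized away */
    }
}

The bit-bang loop above can then hold each clock level with DelayNs(10) instead of a bare counting loop, and the timing survives both compiler optimization and a change of CPU clock.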
This example clearly shows that writing good code requires extensive knowledge to support it. What do you think?
Source: https://blog.csdn.net/coolbacon/article/details/6842921
Reposted from WeChat public account: Embedded Miscellany
