01 DSP Algorithm Library
1. Introduction
For ARM microcontroller development, CMSIS provides a DSP (digital signal processing) function library. It includes basic math functions, fast math functions, complex number operations, filters, matrix operations, transforms, motor control, statistics, support functions, and interpolation functions, covering most of the algorithms used in engineering applications. This raises a question: how much faster are the CMSIS DSP functions than the equivalent functions in the standard math library? Below, I compare the speed of the sine function, the floating-point square root, and an integer bit shift. The results should help in choosing the right math library for a given situation.

2. Testing Method
The test hardware is the STM32F103 board built yesterday. The STM32F103 is an ARM microcontroller with a Cortex-M3 core. It is relatively simple and popular, which makes it a reasonable platform for checking mathematical calculation speed. A GPIO pin is driven high and low to mark the time each calculation takes.

To use the DSP library in the IAR development environment, enable CMSIS DSP in the library configuration settings of the project options, then include the ARM math header file (arm_math.h) in your program. In the main loop of the test program, the ordinary sine function is executed with the LED output pin set high just before the call and set low just after it, so the width of the high-level pulse equals the time the sine calculation takes. After a 1 ms interval, the sine function from the DSP library is executed in the same way, again using the high level of the LED pin to mark the execution time. The two pulse widths can then be compared directly on an oscilloscope.

3. Test Results
Measuring the LED pin with an oscilloscope shows two pulses in succession. The first comes from the sine function in the ordinary math library and is the longer one; the second comes from the DSP library and is clearly shorter. This suggests that on a Cortex-M3 microcontroller the DSP math library is considerably more efficient. Note that the system clock of the microcontroller is 64 MHz for these tests. To compare the two execution times more accurately, the waveform is expanded and the pulse widths are measured with the oscilloscope cursors. The ordinary sine function takes 41.7 microseconds, while the DSP sine function takes only 10 microseconds. For the sine function, the DSP library is therefore approximately four times as fast as the ordinary math function (41.7 / 10 ≈ 4.2).

● Sine Operation Comparison:
math: 41.7us
DSP: 10us
▲ Figure 1.3.1 Comparison of Sine Function Execution Speed Between Ordinary Mathematical Library and DSP Library
Next, let’s compare floating-point square root operations. The method is the same: the oscilloscope measures the difference between the ordinary math library function and the DSP library function. The ordinary math library takes 12.52 microseconds to calculate the square root of a floating-point number, while the square root in the DSP library takes only 4.9 microseconds. That is a speedup of about 2.6 times, less than threefold and smaller than the gain seen for the sine function. The DSP speedup therefore varies from one mathematical operation to another.

● Square Root Operation Comparison:
math: 12.52us
DSP: 4.9us
▲ Figure 1.3.2 Comparison of Square Root Operation Algorithm Function Speeds
Finally, let’s compare ordinary integer bit shift operations. Each integer in an array of 128 numbers is shifted left, and the time taken by a plain C shift loop is compared with the time taken by the DSP library shift function. The C shift takes about 24 microseconds, while the DSP library function takes 16.65 microseconds, which is roughly one-third less time (about 1.4 times as fast).

▲ Figure 1.3.3 Bit Shift Operation
● Bit Shift Comparison:
C: 24us
DSP: 16.65us
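The bit shift test could be set up along the lines of the sketch below. The source does not say which DSP shift function or integer width it used. arm_shift_q31 (32-bit values) is assumed here, and SHIFT_BLOCK_SIZE, shift_left_c, run_shift_comparison and the USE_CMSIS_DSP flag are hypothetical names. The two versions are not strictly equivalent: the library function saturates on overflow, while a plain C shift simply discards the high bits.

```c
#include <stdint.h>

#define SHIFT_BLOCK_SIZE 128u          /* block of 128 integers, as in the test */

#ifdef USE_CMSIS_DSP                   /* hypothetical build flag for the target */
#include "arm_math.h"                  /* provides arm_shift_q31() and q31_t */
#else
typedef int32_t q31_t;
/* Host stand-in imitating the saturating shift; NOT the real library code. */
static void arm_shift_q31(const q31_t *pSrc, int8_t shiftBits,
                          q31_t *pDst, uint32_t blockSize)
{
    for (uint32_t i = 0; i < blockSize; i++) {
        int64_t v = (shiftBits >= 0)
                  ? (int64_t)pSrc[i] * ((int64_t)1 << shiftBits)
                  : (int64_t)(pSrc[i] >> -shiftBits);
        if (v > INT32_MAX) v = INT32_MAX;
        if (v < INT32_MIN) v = INT32_MIN;
        pDst[i] = (q31_t)v;
    }
}
#endif

static q31_t src[SHIFT_BLOCK_SIZE];
static q31_t dst_c[SHIFT_BLOCK_SIZE];
static q31_t dst_dsp[SHIFT_BLOCK_SIZE];

/* Plain C version: one shift per loop iteration, no saturation. */
static void shift_left_c(const q31_t *in, q31_t *out, uint32_t n, unsigned bits)
{
    for (uint32_t i = 0; i < n; i++) {
        /* shifting as unsigned avoids undefined behaviour on overflow */
        out[i] = (q31_t)((uint32_t)in[i] << bits);
    }
}

/* Fills the input block and runs both versions on it. */
static void run_shift_comparison(unsigned bits)
{
    for (uint32_t i = 0; i < SHIFT_BLOCK_SIZE; i++) {
        src[i] = (q31_t)i;             /* small values, so nothing saturates */
    }
    /* on the target, set the pin high before and low after each call */
    shift_left_c(src, dst_c, SHIFT_BLOCK_SIZE, bits);
    arm_shift_q31(src, (int8_t)bits, dst_dsp, SHIFT_BLOCK_SIZE);
}
```

For inputs that do not overflow, both output arrays hold the same values, so the pulse widths compare like with like.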
※ Conclusion ※
This article compared the speed of several functions in the CMSIS DSP library with their plain C and standard math library counterparts on an STM32F103 running at 64 MHz. In every case tested, the DSP version was faster, but the gain depended on the operation. The sine function ran about four times as fast and the square root about 2.6 times as fast, while the bit shift took only about one-third less time.
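The ratios quoted in this article follow from simple arithmetic on the measured pulse widths. The two helper functions below are hypothetical names introduced only to make that arithmetic explicit; they also show why "times as fast" and "less time" give different numbers for the same measurement.

```c
/* How many times as fast the new routine is: reference time / new time. */
static double speedup(double t_ref_us, double t_new_us)
{
    return t_ref_us / t_new_us;
}

/* Fraction of the reference time that is saved (0.25 means 25 % less time). */
static double time_saved(double t_ref_us, double t_new_us)
{
    return 1.0 - t_new_us / t_ref_us;
}
```

With the measurements above, speedup(41.7, 10.0) is about 4.2 for the sine function, speedup(12.52, 4.9) is about 2.6 for the square root, and speedup(24.0, 16.65) is about 1.44 for the bit shift, while time_saved(24.0, 16.65) is about 0.31.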