FreeRTOS Learning Notes: Experiment on Serial Port Background Printing


EEWorld


Now let’s look at how FreeRTOS services can simplify microcontroller program design, and see how system processing efficiency changes once FreeRTOS is introduced. For the experiment I picked a simple, common requirement: “background” printing over the serial port, which is often used for debug output during microcontroller development. The serial port (asynchronous serial communication port, UART) is a slow device relative to the CPU, commonly run at no more than 115200 bit/s. At this speed, sending one 8-bit character (with 1 start bit, 1 stop bit, and no parity, so 10 bits in total) takes about 87 us. The time the CPU needs to write a character into the serial port’s transmit data register is negligible, but because the hardware FIFO of the serial port is very small, the program has to wait until the transmit data register accepts a write before writing. In a simple program design, therefore, a flag bit is polled continuously, and each time the register becomes writable one character is written, until the whole string has been sent.
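The 87 us figure follows from the frame length: one start bit, eight data bits and one stop bit make 10 bit times per character. A minimal sketch of that arithmetic, where the function name frame_time_us is hypothetical and not part of the original code:

```c
#include <stdint.h>

/* Time needed to shift out one UART character, in microseconds.
 * bits_per_frame counts start + data + parity + stop bits,
 * e.g. 10 for 8 data bits, no parity, 1 stop bit. */
static double frame_time_us(uint32_t baud, uint32_t bits_per_frame)
{
    return (double)bits_per_frame * 1e6 / (double)baud;
}
```

With these assumptions, frame_time_us(115200, 10) comes to about 86.8 us, and frame_time_us(9600, 10) to about 1041.7 us, that is, roughly 1 ms per character at 9600 baud.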

When execution efficiency matters, this approach cannot be used, especially in a multitasking environment, because the polling loop burns CPU time while other tasks are left waiting. We therefore need a method for “background” output: either the current task hands the string to the system and carries on without waiting for it to leave the serial port, or the current task does wait for the string to be sent, but other tasks can run in the meantime.

Below is my experimental code, running on the ST Nucleo-L4R5ZI board. It can easily be modified to run on other STM32 boards (bear in mind that the STM32L4R5 has a lot of SRAM, so it is easy to be wasteful with it). On this development board, the USB serial port is connected to LPUART1 of the STM32L4R5. I wrote a uart_puts(char *) function to output strings from LPUART1. The data to be sent is written to the TDR register of LPUART1, and the transmit status flags are read from the ISR (interrupt and status) register. Of course, to work in the background, interrupts are needed.

1
Simple and Direct Method – Character Queue

Instead of polling, let LPUART1 generate an interrupt when its TDR register can be written, telling the CPU that a character should be written. Since the output happens in the background, there must be some storage to hold the string passed to uart_puts() (if the whole string fits, the function can return at once), and the interrupt service routine needs to know where this storage is in order to fetch characters from it.

It seems that a FreeRTOS queue is well suited to this job: uart_puts() writes characters into the queue, and the interrupt service routine (ISR) reads them out (it could also wake a task to do the reading, but that is unnecessary). Following this idea, I wrote the uart_puts() function as follows:

    QueueHandle_t uart_tx_queue;

    void uart_puts(char *str)
    {
        for(;;str++)
        {
            char ch=*str;
            if(ch==0)
                break;
            else
            {
                xQueueSend(uart_tx_queue, &ch, portMAX_DELAY); // blocks while the queue is full
                LPUART1->CR1 |= USART_CR1_TXEIE_TXFNFIE; // enable TXE IRQ
            }
        }
    }

The matching ISR is:

    void LPUART1_IRQHandler(void)
    {
        char data;
        BaseType_t waken=pdFALSE;
        if(xQueueReceiveFromISR(uart_tx_queue, &data, &waken)==pdPASS)
        {
            LPUART1->TDR = data;
            portYIELD_FROM_ISR(waken);
        }
        else // no more data to transmit
        {
            LPUART1->CR1 &= ~USART_CR1_TXEIE_TXFNFIE; // disable IRQ
        }
    }

In other words, when a task calls uart_puts(), it writes characters into the queue, enables the interrupt, and does not need to care about anything else, while the ISR only has to fetch characters from the queue. When there are no more characters to transmit, the ISR disables the interrupt. Of course, the queue must be created in advance, for example a queue with a length of 400 characters:

    uart_tx_queue = xQueueCreate(400, 1);

I also wrote two tasks that call uart_puts() to output different text strings, with vTaskDelay() calls in between to add some delay, so that the strings written within a given period do not exceed the throughput of the serial port. To observe what the CPU is doing, GPIO operations can be inserted into the functions above to light LED indicators, which can also be captured with a digital oscilloscope. The STM32 runs at its default clock frequency of 4 MHz. Observed on the oscilloscope at 9600 baud, most of the serial port’s throughput is in use. In the figure below, the yellow trace is the UART TX output signal, and the cyan trace is generated by the “lighting” code placed around the xQueueSend() call in uart_puts().

Compared with the time the serial port needs to output the text, the time spent inside uart_puts() is much shorter, so the “background” effect is achieved. Let’s zoom in on the time axis of the waveform:


In the uart_puts() function, adding characters to the queue took about 60us each time. This time is shorter than the 104us required to output one bit of UART. Let’s probe the execution of the ISR, also inserting “lighting” code at the entry and exit of the ISR (not precise, as the time for saving and restoring the stack is not reflected). The interrupt frequency is consistent with the frequency of the characters outputted by the serial port.

From this waveform, a rough judgment can be made: xQueueSend() takes longer than xQueueReceiveFromISR(). So the key to “background printing” is loading the string into the queue provided by FreeRTOS and deferring the actual output. What happens if the queue is not that large, for example if we reduce the queue length to 100? See the test image.

When the queue is full, xQueueSend() has to wait for the ISR to remove a character from the queue before it can return, which leads to long waits (about 1 ms, matching the time the UART takes to output one character). Of course, other tasks can use this time: FreeRTOS performs task scheduling, so CPU processing power is not wasted.

Software Cost

Let’s analyze the cost of this implementation. I estimated above that adding a character to the queue takes about 60 us each time, which at 4 MHz is more than two hundred machine cycles. What happens if we change the serial port baud rate to 115200? I found that the queue no longer plays any role: the time spent calling uart_puts() is synchronized with the serial port output. Strange! Zoomed in, it looks like this (the yellow line is the UART TX signal, and the high level of the cyan line marks the call to xQueueSend()):

This looks completely different from the 9600 baud case: uart_puts() now “drags” and actually consumes more time, even pulling down the output throughput of the serial port. Apart from calling xQueueSend() it does no other heavy work, and the queue should not be filling up. So why is there a stretch of delay (the low level of the cyan line)?

After examining the execution of the ISR, I understood immediately: compared with 9600 baud, the interrupt frequency has increased by an order of magnitude and needs more CPU time. When uart_puts() writes a character to the queue, it immediately enables the interrupt, so the CPU executes the ISR. Let’s look at the image (here the yellow line marks ISR entry and exit).

Inside the ISR, a character is first taken from the queue and written into the TDR register; if rescheduling is needed, a task switch is requested, and then the ISR returns. Note that at this point the queue is already empty, because uart_puts() has only just written one character. But before uart_puts() gets to add another character to the queue, a second interrupt occurs (the character just written has not finished transmitting, but TDR already accepts a new write). The ISR finds the queue empty, disables the interrupt, and returns. Only then can uart_puts() continue, write one more character to the queue, and so on. The result is that the queue never holds more than one character.

The reason is that the time consumed by queue operations (Send, followed by two Receives) is longer than the time to send a character via UART, so the queue does not serve as a buffer, and the operational cost of itself wastes CPU resources.
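To put numbers on this, the CPU cycles available per character can be compared with the measured cost of one queue operation. A rough sketch, assuming the 4 MHz clock and the roughly 60 us per xQueueSend() quoted above; the helper names are hypothetical:

```c
#include <stdint.h>

#define CPU_HZ 4000000u  /* default clock frequency quoted in the text */

/* CPU cycles that elapse while one character is on the wire.
 * bits_per_frame is 10 for 8 data bits, no parity, 1 stop bit. */
static uint32_t cycles_per_char(uint32_t baud, uint32_t bits_per_frame)
{
    return (uint32_t)(((uint64_t)CPU_HZ * bits_per_frame) / baud);
}

/* CPU cycles spent in an operation that lasts `us` microseconds. */
static uint32_t cycles_for_us(uint32_t us)
{
    return (uint32_t)(((uint64_t)CPU_HZ * us) / 1000000u);
}
```

Under these assumptions, one xQueueSend() costs cycles_for_us(60) = 240 cycles. At 9600 baud there are cycles_per_char(9600, 10) = 4166 cycles per character, so the queue overhead is small. At 115200 baud only cycles_per_char(115200, 10) = 347 cycles are available, which a send plus two ISR invocations can easily exceed.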

2
Circular Buffer

As a replacement for the queue provided by FreeRTOS, I allocated a block of memory as a buffer and used two integer variables, tail and head (the write and read positions, respectively), to track how the buffer is used. When tail and head are equal, the buffer holds no data. Writing a byte into the buffer increases tail by 1; taking a byte out increases head by 1. The buffer is “circular” because when tail or head reaches the buffer size it wraps back to 0, so the maximum number of bytes the buffer can hold is fixed (if equal indices always mean “empty”, that maximum is one less than the buffer size).
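The idea can be sketched on a host PC as follows. This is a minimal illustration, not the author’s code: it follows the text’s rule that writing advances tail and reading advances head, the names ringbuf_t, rb_put and rb_get and the size RB_SIZE are hypothetical, and it ignores the task/ISR concurrency that the real firmware has to handle.

```c
#include <stdint.h>
#include <stdbool.h>

#define RB_SIZE 8u  /* illustrative size only */

typedef struct {
    uint8_t buf[RB_SIZE];
    volatile uint16_t head;  /* read position (consumer side)  */
    volatile uint16_t tail;  /* write position (producer side) */
} ringbuf_t;

/* The buffer is empty when both indices are equal. */
static bool rb_empty(const ringbuf_t *rb)
{
    return rb->head == rb->tail;
}

/* The buffer is full when advancing tail would make it equal to head,
 * so at most RB_SIZE - 1 bytes are ever stored. */
static bool rb_full(const ringbuf_t *rb)
{
    return (uint16_t)((rb->tail + 1u) % RB_SIZE) == rb->head;
}

/* Store one byte; returns false if the buffer is full. */
static bool rb_put(ringbuf_t *rb, uint8_t byte)
{
    if (rb_full(rb))
        return false;
    rb->buf[rb->tail] = byte;
    rb->tail = (uint16_t)((rb->tail + 1u) % RB_SIZE);  /* wrap to 0 at the end */
    return true;
}

/* Fetch one byte; returns false if the buffer is empty. */
static bool rb_get(ringbuf_t *rb, uint8_t *out)
{
    if (rb_empty(rb))
        return false;
    *out = rb->buf[rb->head];
    rb->head = (uint16_t)((rb->head + 1u) % RB_SIZE);  /* wrap to 0 at the end */
    return true;
}
```

In the serial port case, a task would play the role of rb_put() and the transmit interrupt the role of rb_get(). Keeping one slot unused is one common way to tell a full buffer from an empty one without a separate counter.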


First, consider the implementation of the ISR, for example:

    void LPUART1_IRQHandler(void)
    {
        if(uartbuf.tail!=uartbuf.head) // not empty
        {
            LPUART1->TDR = uartbuf.buf[uartbuf.head];
            uartbuf.head = (uartbuf.head+1)%BUFSIZE;
            if(uartbuf.tail!=uartbuf.head)
                return; // more data pending, keep the IRQ enabled
        }
        // buffer empty
        LPUART1->CR1 &= ~USART_CR1_TXEIE_TXFNFIE; // disable IRQ
    }

With DMA, the uart_puts() implementation has to handle more cases, because the circular buffer is no longer used as straightforwardly as in the previous example: there is no head variable any more. The read position follows the DMA transfer address, which has to be read from a DMA hardware register. I used the memcpy() function to fill the buffer, and the addresses have to be calculated according to the situation. If the buffer cannot hold the output string, the task has to block until the DMA has processed the data in the buffer.

    void uart_puts(char *str)
    {
        uint16_t len;
        char done=0, dma_stopped=0;
        xSemaphoreTake(uart_mutex, portMAX_DELAY);
        taskENTER_CRITICAL();
        len=strlen(str);
        if(dmabuf.pend_ext)
        {
            uint16_t avail, dma_pos;
            dma_pos = dmabuf.tail - UART_DMAChannel->CNDTR;
            avail = dma_pos - dmabuf.pend_ext;
            if(avail>=len)
            {
                memcpy(dmabuf.buf+dmabuf.pend_ext, str, len);
                dmabuf.pend_ext += len;
                done=1;
            }
        }
        else
        {
            if(dmabuf.pend_len)
            {
                uint16_t avail1, dma_pos;
                avail1 = BUFSIZE - dmabuf.pend_off - dmabuf.pend_len; // till buffer end
                if(avail1>=len)
                {
                    memcpy(dmabuf.buf+dmabuf.pend_off+dmabuf.pend_len, str, len);
                    dmabuf.pend_len += len;
                    done=1;
                }
                else
                {
                    dma_pos = dmabuf.tail - UART_DMAChannel->CNDTR;
                    if(avail1+dma_pos>=len)
                    {
                        memcpy(dmabuf.buf+dmabuf.pend_off+dmabuf.pend_len, str, avail1);
                        dmabuf.pend_len += avail1;
                        memcpy(dmabuf.buf, str+avail1, len-avail1);
                        dmabuf.pend_ext = len-avail1;
                        done=1;
                    }
                }
            }
            else // no pending transfer
            {
                if(UART_DMAChannel->CNDTR) // not finished
                {
                    uint16_t avail1, dma_pos;
                    avail1 = BUFSIZE - dmabuf.tail;
                    if(avail1>=len)
                    {
                        memcpy(dmabuf.buf+dmabuf.tail, str, len);
                        dmabuf.pend_off = dmabuf.tail;
                        dmabuf.pend_len = len;
                        done=1;
                    }
                    else
                    {
                        dma_pos = dmabuf.tail - UART_DMAChannel->CNDTR;
                        if(avail1+dma_pos>=len)
                        {
                            memcpy(dmabuf.buf+dmabuf.tail, str, avail1);
                            dmabuf.pend_off = dmabuf.tail;
                            dmabuf.pend_len = avail1;
                            memcpy(dmabuf.buf, str+avail1, len-avail1);
                            dmabuf.pend_ext = len-avail1;
                            done=1;
                        }
                    }
                }
                else // finished already
                {
                    dma_stopped=1;
                }
            }
        }
        taskEXIT_CRITICAL();
        if(!done) // string did not fit: wait, then send it in chunks from the buffer start
        {
            if(!dma_stopped)
                wait_dma_finish();
            while(BUFSIZE < len)
            {
                memcpy(dmabuf.buf, str, BUFSIZE);
                uart_conf_dma(0, BUFSIZE);
                wait_dma_finish();
                len -= BUFSIZE;
                str += BUFSIZE;
            }
            memcpy(dmabuf.buf, str, len);
            uart_conf_dma(0, len);
        }
        xSemaphoreGive(uart_mutex);
    }

Note that unlike before, I used taskENTER_CRITICAL() to set the critical section, prohibiting DMA interrupts from interrupting execution and prohibiting task scheduling. Because the configuration of the buffer is now more complex, if the DMA transfer status changes during the execution of uart_puts(), it is easy to cause misjudgment.

The complete program with the auxiliary functions used above is in the attachment, and I will not list all of it here.

Performance Comparison

To test how much the DMA version improves performance compared with the interrupt-only version, I designed a test: one task continuously prints a counter variable. This task is given a high priority, so the serial port outputs without pause.

    volatile int counter;

    static portTASK_FUNCTION( vPrint, pvParameters )
    {
        char str[128];
        strcpy(str, " **** 0x");
        for(;;)
        {
            char *text = str+8; // hex digits go right after the 8-character prefix
            int8_t i;
            uint32_t x = counter;
            text[8]=0;
            for(i=7;i>=0;i--) // convert to 8 hex digits, lowest digit last
            {
                unsigned char h=x%16;
                x>>=4;
                if(h<10)
                    *(text+i)='0'+h;
                else
                    *(text+i)='A'+h-10;
            }
            LED_B_on();
            uart_puts(str);
            LED_B_off();
        }
    }

The counter variable is incremented in a lower-priority task:

    static portTASK_FUNCTION( vCount, pvParameters )
    {
        for(;;)
        {
            counter++;
        }
    }

Then I fix a time span of 10 seconds and check how far the counter has advanced from 0, which indicates the effective execution time of the vCount task. This is done by another task, given the highest priority:

    static portTASK_FUNCTION( vControl, pvParameters )
    {
        vTaskDelay(10000); // 10 seconds
        vTaskSuspendAll(); // stop scheduling, freezing the counter
        for(;;)
        {
        }
    }


When comparing the two implementation methods, I removed the task mutex code from uart_puts() because only one task is calling it. Comparing the final output count results:

              BUFSIZE=256    BUFSIZE=512
DMA (Ex 3)    0x4DC6F6       0x4E5AAD
PIO (Ex 2)    0x3D4DC3       0x3DF186

To visualize:

After switching to DMA transmission, the CPU time obtained by the counting task increased by roughly 26 to 27%, going by the counts above. This can be attributed to the saved overhead of the per-character ISR, together with the efficiency change brought about by the different program design.
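The gain can be checked directly from the counts in the table. A minimal sketch, where the function name gain_percent is hypothetical:

```c
#include <stdint.h>

/* Percentage by which the DMA count exceeds the PIO (interrupt-only) count. */
static double gain_percent(uint32_t with_dma, uint32_t with_pio)
{
    return ((double)with_dma - (double)with_pio) * 100.0 / (double)with_pio;
}
```

For BUFSIZE=256, gain_percent(0x4DC6F6, 0x3D4DC3) is about 26.9; for BUFSIZE=512, gain_percent(0x4E5AAD, 0x3DF186) is about 26.5. Both are a little over a quarter.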
