Data Transfer Techniques Beyond Core and DMA in MCU

Hello everyone, I am Pi Zi Heng, a serious tech enthusiast. Today, I will share with you the limitations of random address small data write operations on CM7 TCM using PXP on i.MXRT1170.

In an MCU, the bus masters that can read and write on-chip and off-chip memory-mapped storage include not only the familiar Core and DMA but also peripherals built for high-speed data transfer (USB, uSDHC, ENET interfaces, etc.) or for other specific functions (GPU, LCD, Crypto, etc.). For moving user data, however, we generally rely only on the Core and DMA.

The i.MXRT series has a peripheral called PXP. It was originally designed for pixel data processing, but it can also handle general data transfer tasks. When we use PXP for data transfer, we find that there are some limitations when writing to CM7 TCM. Today, let’s discuss this topic:

1. Introduction to PXP Functionality

First, let’s take a look at the functional block diagram of the PXP module. Since it is aimed at image data processing, common functionalities such as image scaling, color space conversion, and image rotation support are essential (the three independent engines in the blue box in the figure below are integrated into PXP). These operations actually involve FrameBuffer pixel data processing (read and rewrite).


Looking further at the PXP features, we find that in addition to pixel processing, it is also a standard 2D DMA (2D meaning that it is designed to move two-dimensional image data), which is the data transfer feature we are looking for. When the source FrameBuffer and destination FrameBuffer are the same size and the block being transferred covers the whole FrameBuffer, the PXP copy degenerates into the ordinary one-dimensional DMA that everyone is familiar with.
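
To make the 2D DMA idea concrete, here is a minimal software model of a rectangular block copy. It is only a sketch for illustration: copy2d_t, copy2d() and copy2d_is_linear() are hypothetical names of my own, not part of the PXP driver, and the model says nothing about how the hardware actually issues bus transactions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of a 2D block copy (not the PXP driver API). */
typedef struct {
    const uint8_t *src;   /* source buffer base address */
    uint8_t *dst;         /* destination buffer base address */
    size_t src_pitch;     /* bytes per source buffer row */
    size_t dst_pitch;     /* bytes per destination buffer row */
    size_t src_x, src_y;  /* block position in the source, in pixels */
    size_t dst_x, dst_y;  /* block position in the destination, in pixels */
    size_t width, height; /* block size, in pixels */
    size_t bpp;           /* bytes per pixel (2 for RGB565) */
} copy2d_t;

/* Copy the block row by row; each row is one contiguous 1D transfer. */
static void copy2d(const copy2d_t *c)
{
    size_t row_bytes = c->width * c->bpp;
    for (size_t row = 0; row < c->height; row++) {
        const uint8_t *s = c->src + (c->src_y + row) * c->src_pitch + c->src_x * c->bpp;
        uint8_t *d = c->dst + (c->dst_y + row) * c->dst_pitch + c->dst_x * c->bpp;
        memcpy(d, s, row_bytes);
    }
}

/* Returns 1 when all rows are back to back in both buffers, i.e. the
 * 2D copy is equivalent to a single linear copy of width*height pixels. */
static int copy2d_is_linear(const copy2d_t *c)
{
    size_t row_bytes = c->width * c->bpp;
    return c->src_x == 0 && c->dst_x == 0 &&
           c->src_pitch == row_bytes && c->dst_pitch == row_bytes;
}
```

When the block is narrower than the buffer, every row starts at a different, possibly unaligned address, which is exactly the kind of access pattern that matters in the rest of this article.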


2. An Erratum for RT1170

Before testing the PXP data transfer functionality, let’s first look at an erratum unique to RT1160/1170, which is precisely what drew my attention to the 2D DMA functionality of PXP.

This erratum states that several bus masters with memory read/write capability in RT1160/1170 may corrupt data when performing sparse writes (small writes to random addresses) to CM7 TCM, and PXP is one of them. The workaround is to avoid using CM7 TCM as the destination FrameBuffer.

  • Note: Most of the affected masters are peripherals newly added in RT1170 (CAAM, ENET_1G, ENET_QOS, GC355, LCDIFv2). The exception is PXP, which also exists on RT10xx, but the RT10xx PXP does not have this limitation.

3. Testing Data Transfer with PXP

To test the PXP data transfer functionality, we can directly use the example located at \SDK_2_16_000_MIMXRT1170-EVKB\boards\evkbmimxrt1170\driver_examples\pxp\copy_pic\cm7. Its main function, APP_CopyPicture(), is excerpted below; the code is clear and straightforward. To run different tests, we only need to link s_inputBuf and s_outputBuf into different memory regions and set different copy block sizes and coordinates.

  • Note: To test the effect on one-dimensional data transfers (data length and start address alignment), adjust only the COPY_WIDTH and DEST_OFFSET_X values.

#include "fsl_pxp.h"
// Source/Destination Buffer width and height settings (for testing convenience, can be set to the same)
#define BUF_WIDTH   64
#define BUF_HEIGHT  64
// Copy block width and height and coordinate settings in the destination Buffer (fixed to [0,0] in the source Buffer)
#define COPY_WIDTH        8
#define COPY_HEIGHT       8
#define DEST_OFFSET_X     1
#define DEST_OFFSET_Y     1

uint16_t s_inputBuf[BUF_HEIGHT][BUF_WIDTH];
uint16_t s_outputBuf[BUF_HEIGHT][BUF_WIDTH];

static void APP_CopyPicture(void)
{
    pxp_pic_copy_config_t pxpCopyConfig;
    // Set copy parameters (copy data of size 8x8 starting from coordinate [0,0] in s_inputBuf to position [1,1] in s_outputBuf)
    // Source Buffer address and copy block coordinate settings
    pxpCopyConfig.srcPicBaseAddr  = (uint32_t)s_inputBuf;
    pxpCopyConfig.srcPitchBytes   = sizeof(uint16_t) * BUF_WIDTH;
    pxpCopyConfig.srcOffsetX      = 0;
    pxpCopyConfig.srcOffsetY      = 0;
    // Destination Buffer address and copy block coordinate settings
    pxpCopyConfig.destPicBaseAddr = (uint32_t)s_outputBuf;
    pxpCopyConfig.destPitchBytes  = sizeof(uint16_t) * BUF_WIDTH;
    pxpCopyConfig.destOffsetX     = DEST_OFFSET_X;
    pxpCopyConfig.destOffsetY     = DEST_OFFSET_Y;
    // Copy block size settings (pixel format is RGB565, i.e., 2bytes)
    pxpCopyConfig.width           = COPY_WIDTH;
    pxpCopyConfig.height          = COPY_HEIGHT;
    pxpCopyConfig.pixelFormat     = kPXP_AsPixelFormatRGB565;
    // Start copy (move the copy block data from source Buffer to destination Buffer)
    PXP_StartPictureCopy(PXP, &pxpCopyConfig);
    while (!(kPXP_CompleteFlag & PXP_GetStatusFlags(PXP)));
    PXP_ClearStatusFlags(PXP, kPXP_CompleteFlag);
}

Testing shows that when s_outputBuf is placed in OCRAM or external RAM, the transfer results are exactly as expected. However, when s_outputBuf is placed in CM7 ITCM or DTCM, abnormal results occur, and the abnormal behavior is the same in both ITCM and DTCM.

Different combinations of COPY_WIDTH and DEST_OFFSET_X values lead to different abnormal results. Here, we only show a case with COPY_WIDTH = 1 and DEST_OFFSET_X = 3 for reference; it can be seen that besides the target address data, some extra data is also written before and after, making such a data transfer operation evidently unreliable.
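
One way to catch this kind of extra write automatically is to pre-fill the destination buffer with a sentinel value and, after the copy, count every element outside the target rectangle that no longer holds it. The helper below is a minimal sketch of that idea; count_stray_writes() and the sentinel approach are my own illustration and are not part of the SDK example.

```c
#include <stddef.h>
#include <stdint.h>

/* Count elements outside the rectangle [x, x+w) x [y, y+h) that no
 * longer hold `sentinel`. `buf` is a row-major buf_w x buf_h array of
 * 16-bit pixels that was filled with `sentinel` before the copy. */
static size_t count_stray_writes(const uint16_t *buf, size_t buf_w, size_t buf_h,
                                 size_t x, size_t y, size_t w, size_t h,
                                 uint16_t sentinel)
{
    size_t stray = 0;
    for (size_t r = 0; r < buf_h; r++) {
        for (size_t c = 0; c < buf_w; c++) {
            int inside = (r >= y && r < y + h && c >= x && c < x + w);
            if (!inside && buf[r * buf_w + c] != sentinel) {
                stray++; /* something was written where nothing should be */
            }
        }
    }
    return stray;
}
```

On the board, this could be used by filling s_outputBuf with a value such as 0xA5A5 that does not occur in s_inputBuf, calling APP_CopyPicture(), and then passing (const uint16_t *)s_outputBuf, BUF_WIDTH, BUF_HEIGHT, DEST_OFFSET_X, DEST_OFFSET_Y, COPY_WIDTH and COPY_HEIGHT to the helper. A non-zero result means data landed outside the copy block. Note that a stray write that happens to store the sentinel value itself would go unnoticed, and a buffer in cacheable memory may need cache maintenance before it is read back.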


Of course, placing s_outputBuf in CM7 TCM does not always cause anomalies. As long as the length of the one-dimensional data being copied is a multiple of 16 bytes and the destination start address is 8-byte aligned, no errors occur. Writes that do not meet this condition are the risky sparse writes (small writes to random addresses) described in the erratum.
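
The condition above can be turned into a small pre-check that runs before a copy is started. This is only a sketch under my own assumptions: pxp_tcm_copy_is_safe() is a hypothetical helper, not an SDK function, and applying the rule to every row of a multi-row block is my extrapolation from the one-dimensional tests above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when every row written to the destination is a multiple
 * of 16 bytes long and starts at an 8-byte aligned address.
 * Assumption: the 1D rule observed in the tests holds per row. */
static bool pxp_tcm_copy_is_safe(uint32_t dest_base, uint32_t dest_pitch_bytes,
                                 uint32_t dest_x, uint32_t dest_y,
                                 uint32_t width, uint32_t height,
                                 uint32_t bytes_per_pixel)
{
    uint32_t row_bytes = width * bytes_per_pixel;
    if (row_bytes == 0u || (row_bytes % 16u) != 0u) {
        return false; /* row length is not a multiple of 16 bytes */
    }
    for (uint32_t row = 0; row < height; row++) {
        uint32_t start = dest_base + (dest_y + row) * dest_pitch_bytes
                       + dest_x * bytes_per_pixel;
        if ((start % 8u) != 0u) {
            return false; /* row start is not 8-byte aligned */
        }
    }
    return true;
}
```

With the RGB565 settings used in this article (2 bytes per pixel, 128-byte pitch), the COPY_WIDTH = 1, DEST_OFFSET_X = 3 case fails the check because each row is only 2 bytes long. The check only makes sense when the destination buffer actually sits in CM7 TCM; for OCRAM or external RAM it is not needed.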

Thus, I have introduced the limitations of random address small data write operations on CM7 TCM using PXP on i.MXRT1170. Where’s the applause~~~
