
GigaDevice (Zhaoyi Innovation), a leading Chinese supplier of 32-bit general-purpose microcontrollers, has recently launched the GD32H7 series MCU, filling the gap in China's ultra-high-performance MCU lineup.
The GD32H7 series MCU adopts a 600MHz Arm® Cortex®-M7 core based on Armv7E-M architecture. With its dual-issue 6-stage pipeline architecture and support for high-bandwidth AXI and AHB bus interfaces, it achieves higher clock frequency and processing performance, reaching an excellent result of 1552 DMIPS and 2888 CoreMarks. Compared to other core products, the performance of GD32H7 has significantly improved, supporting high-computing applications such as advanced DSP and edge AI.
The GD32H7 is equipped with up to 4MB of on-chip Flash and 1MB of SRAM, supporting large-capacity code storage. Its TCM memory and L1 cache greatly improve the access efficiency of internal and external memory. The 512KB tightly coupled memory (TCM) can be freely configured as I-TCM or D-TCM and used to hold programs and data that need acceleration, achieving zero-wait operation and improving system performance. The chip also integrates a 64KB L1 cache (I-Cache and D-Cache) whose access speed is close to the operating speed of the CPU core. This narrows the speed gap between CPU and memory and provides sufficient support for running complex operating systems and advanced algorithms.
The GD32H7 provides various security encryption functions, including DES, Triple DES, AES algorithms, and hash algorithms. The integrated RTDEC module can also protect the data security of external memory connected to the AXI or AHB bus, preventing threats during communication in factories and on-site, ensuring the data security of IoT hardware.
Compared to existing high-performance products, the GD32H7 has greatly expanded peripheral resources and much improved analog performance. It integrates two 14-bit ADCs with a sampling rate of up to 4MSPS and one 12-bit ADC with a sampling rate of up to 5.3MSPS, providing high-precision, high-speed sampling and rapid response in applications such as motor control and photovoltaic energy storage. Three CAN-FD interfaces and two Ethernet controllers also provide great advantages for industrial network cards, inverters, and servers.
This time I am setting everything else aside, such as how many ADCs are on-chip, and looking only at the capabilities of the CPU core. The vendor claims a CoreMark score of 2888, so I will measure it myself by porting the open-source CoreMark code. The chip model I have is the GD32H759I.
01
Download the code and add it to the project



CoreMark homepage: http://www.eembc.org/coremark/index.php
GitHub code address: https://github.com/eembc/coremark
First, download the CoreMark code from GitHub; the addresses are listed above.
We start from a project that already has a working serial port, so we choose the 04_USART_Printf example from the official demos, which uses UART0 to print data to the PC through a CH340. We put the CoreMark .c and .h files into the project folder, in a new folder named Coremark. The source files must also be added to the MDK project, and the header file paths must be added to the include paths. At this point the CoreMark code is part of the project, but simply adding it is not enough to run the benchmark; the code still has to be adjusted for the actual chip.
By the way, I also looked up some of the chip results listed on the official site, mainly for a few STM32 models.
02
Configure relevant code settings to run the score




First, we need a USART0 configuration function (115200 baud), and the serial port must be redirected so that printf can be used. The I-cache and D-cache also need to be enabled. We added this code to CoreMark's core_portme.c and call the relevant functions from there:
void portable_init(core_portable *p, int *argc, char *argv[])
{
    usart_config();                 // Call the serial port configuration function
    printf("Start testing....");    // Start testing prompt
}
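As a rough illustration of what "redirecting the serial port" means, the sketch below forwards each character to a blocking send routine. It is written to run on a PC: `uart_send_byte`, `retarget_putc`, `retarget_write`, and the `tx_log` buffer are hypothetical names used only here. On the board, the send routine would instead call the firmware library's USART0 transmit function and wait for the transmit buffer to empty, and the hook would take the form of the `fputc()` override that the MDK C library calls for printf output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UART: on a PC we simply log the bytes. */
static uint8_t tx_log[64];
static size_t  tx_len;

/* Hypothetical blocking "send one byte" routine. On the real board this is
   where the USART0 transmit call and the wait on the "transmit buffer empty"
   flag would go. */
static void uart_send_byte(uint8_t b)
{
    if (tx_len < sizeof tx_log) {
        tx_log[tx_len++] = b;
    }
}

/* Character-output hook: forwards one character to the UART and returns it,
   the same contract an fputc() override has to fulfil. */
int retarget_putc(int ch)
{
    uart_send_byte((uint8_t)ch);
    return ch;
}

/* Helper for this sketch: push a whole string through the hook and
   return the number of characters written. */
size_t retarget_write(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        retarget_putc((unsigned char)s[n]);
        n++;
    }
    return n;
}
```

Keeping the hook this small means the only board-specific part is the single send routine.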
In addition, we need to initialize SysTick and enable the cache. This is done in the timing functions start_time(), stop_time(), and get_time(). We also declare a counter and increment it in the SysTick interrupt handler.
void start_time(void)
{
    cache_enable();
    systick_config();
}

volatile uint32_t gTick = 0;    // Declare counter (volatile because the interrupt handler modifies it)

void SysTick_Handler(void)
{
    gTick++;
    delay_decrement();
}
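To show how a tick counter like this turns into a score, here is a minimal PC-runnable sketch of the timing path, assuming a 1 ms tick. The function names mirror CoreMark's port layer, but the types are simplified to plain `uint32_t` and `double`, and `fake_systick_handler` and `coremark_score` are hypothetical helpers that exist only in this sketch.

```c
#include <stdint.h>

#define EE_TICKS_PER_SEC 1000u   /* assumes one tick per millisecond */

static volatile uint32_t gTick;  /* incremented once per tick interrupt */
static uint32_t start_tick;
static uint32_t stop_tick;

/* Stand-in for SysTick_Handler so the sketch can run without hardware. */
void fake_systick_handler(void)
{
    gTick++;
}

/* Record the counter value at the start and end of the timed region. */
void start_time(void) { start_tick = gTick; }
void stop_time(void)  { stop_tick = gTick; }

/* Elapsed ticks; unsigned subtraction stays correct across one wrap-around. */
uint32_t get_time(void)
{
    return stop_tick - start_tick;
}

/* Convert ticks to seconds using the configured tick rate. */
double time_in_secs(uint32_t ticks)
{
    return (double)ticks / EE_TICKS_PER_SEC;
}

/* The score is simply iterations completed per second. */
double coremark_score(uint32_t iterations, uint32_t ticks)
{
    return (double)iterations / time_in_secs(ticks);
}
```

If the tick rate and EE_TICKS_PER_SEC disagree, the reported time and therefore the score are off by the same factor, which is why the two must be kept in step.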
Next, a few macros need adjusting. The stack should be made larger, and the iteration count should be raised so that the program runs for at least 10 seconds. I set it to 50000; erring on the large side is fine. As a reference, 2000 iterations is generally enough at a 72MHz clock and 3500 at 120MHz. The GD32H759I runs at 600MHz, so scale accordingly: since the CoreMark score is the number of iterations per second, at the claimed score of 2888 a 10-second run needs at least about 28,900 iterations. The number of timer ticks per second (EE_TICKS_PER_SEC) should also be set to 1000 so that it matches the SysTick rate.
volatile ee_s32 seed4_volatile = 50000; // ITERATIONS;
#define EE_TICKS_PER_SEC 1000 // (NSECS_PER_SEC / TIMER_RES_DIVIDER)
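The "calculate it yourself" step can be sketched as follows. Because the CoreMark score is iterations per second, the minimum iteration count is just the expected score multiplied by the required run time. The helper names `min_iterations` and `expected_runtime_ms` are hypothetical, and the expected score is whatever figure you anticipate for your chip (here the vendor's claimed 2888).

```c
#include <stdint.h>

/* Smallest iteration count that keeps the benchmark running for at least
   min_seconds, given an expected score in iterations per second. */
uint32_t min_iterations(uint32_t expected_score, uint32_t min_seconds)
{
    return expected_score * min_seconds;
}

/* Expected run time, in whole milliseconds, for a chosen iteration count.
   The intermediate product is widened to 64 bits to avoid overflow. */
uint32_t expected_runtime_ms(uint32_t iterations, uint32_t expected_score)
{
    return (uint32_t)(((uint64_t)iterations * 1000u) / expected_score);
}
```

For example, `min_iterations(2888, 10)` gives 28880, and `expected_runtime_ms(50000, 2888)` gives 17313, so the 50000 iterations chosen above should run for roughly 17 seconds if the chip reaches the claimed score.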
Based on the experience of many other users, we also need to make sure a few of the macros we rely on are correct. Since we use the printf function, HAS_PRINTF must be set to 1. The COMPILER_FLAGS string, which CoreMark prints in its report, should state -O3, and the IDE's actual optimization level should be set to match.
#ifndef HAS_PRINTF
#define HAS_PRINTF 1
#endif
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \