1 Introduction
The open-source RISC-V processor architecture[1] is increasingly used in embedded systems, and several processor manufacturers have released RV32 MCUs in recent years. In April 2019, SiFive[2] released the Freedom E310; in February 2020, GigaDevice[3] released the RV32 MCU GD32VF103; Qinheng Microelectronics[4] released the CH32V, CH32X, and CH32L series of RV32 MCUs; Xianji Semiconductor[5] released the HPM5XXX and HPM6XXX series of single-core and multi-core RV32 MCUs; and Renesas Electronics[6] released the RV32 automotive MCU RH850/U2B and the general-purpose MCU R9A02G021.
With the rapid development of RISC-V MCUs, their software ecosystem is steadily maturing. Mainstream commercial IDEs such as Embedded Studio [7] and IAR [8], as well as the open-source IDE Eclipse[9], support RISC-V MCUs, and some mainstream operating systems already support the RISC-V architecture[10]. The QEMU RISC-V virtualization platform[11] supports emulation of various RISC-V processors; FreeRTOS has released a version supporting RISC-V MCUs[12]; Zou Yang et al.[13] ported and optimized OpenHarmony based on QEMU RISC-V; Nicholas Gordon et al.[14] ported the Kitten Lightweight Kernel operating system to RISC-V; Luming Zhang[15] ported and optimized RISC-V UEFI Boot; and Robert Balas et al.[16] analyzed the structural characteristics of RV32IMC and proposed programming methods.
The results of the on-site survey at the second Di Shui Lake China RISC-V industry forum[18] show that RISC-V architecture processors have been successfully applied in wireless connection chips, industrial control chips, network communication chips, and edge computing chips. Currently, ARM architecture processors dominate the market in embedded application fields such as mobile communication, the Internet of Things, and industrial control[17]. In some fields and application scenarios, RISC-V will challenge ARM’s market position, and RV32 MCUs are competing for the market share of ARM Cortex-M MCUs. Porting Cortex-M MCU applications to RV32 MCUs, and thereby making full use of the mature Cortex-M ecosystem, will benefit the development and promotion of RISC-V. However, because RV32 and Cortex-M differ in structure and programming model, some issues are still encountered during porting, even though development tools can compile, assemble, and link the source program into an RV32 executable.
This article analyzes the differences between RV32 and Cortex-M in structure, programming models, and calling conventions, discusses the issues encountered when porting applications from Cortex-M to RV32, proposes solutions and suggestions, and presents a related performance analysis and comparison.
The first section introduces the development status of RISC-V and discusses the application prospects of RV32; the second section compares the RV32 and Cortex-M exception handling mechanisms and explains the issues faced in porting interrupt service routines; the third section compares the RV32 and Cortex-M instruction set architectures and analyzes the impact of RV32 instruction module combinations on program performance; the fourth section discusses the differences between the RISC-V and ARM procedure calling conventions and analyzes their impact on program porting; the fifth section summarizes the work of this paper.
2 Interrupt Handling
In the ARM architecture processor manual[19] and the RISC-V architecture processor user manual[20], the programming model describes the parts of the processor architecture that are relevant to program development. A programming model typically covers the data types supported by the processor, the general and special function registers, the exception and interrupt response mechanisms, and the instruction set. Exception and interrupt handling are basic functions of the processor.
Exceptions triggered by external signals are generally referred to as interrupts. Interrupt handling is one of the key functions of MCU application systems. It consists of two parts: the interrupt response mechanism and the interrupt service routine (ISR). The interrupt response mechanism is determined by the processor hardware structure, and the interrupt service routine must be written to match it.
2.1 Interrupt Response Mechanism
Table 1 lists the differences between Cortex-M and RV32 interrupt response mechanisms. When porting applications from Cortex-M to RV32, it is necessary to modify the related programs for interrupt handling functions.
Table 1 RV32 and Cortex-M Exception Management
Cortex-M supports only vector interrupt response, where each interrupt vector points to the entry of the corresponding interrupt service routine, and the interrupt vector table can be relocated after startup through the register VTOR. RV32 supports both vector and non-vector interrupt response modes. After a processor reset, the default is the non-vector response mode, in which the exception entry points to the starting address of the program space, usually 0x00. After the processor starts, the interrupt response mode and the exception vector entry address are set through the machine-mode exception vector base address register mtvec.
To keep functionality consistent before and after porting, the application must use the vector interrupt response mode on RV32. After reset, the RV32 MCU first executes an initialization sequence that writes the upper 30 bits of the interrupt vector table address into bits mtvec[31:2] of the control and status register (CSR) mtvec, and writes 01 into mtvec[1:0] to select the vector interrupt response mode.
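The composition of the mtvec value described above can be sketched in C as follows. This is a minimal illustration rather than code from any vendor SDK: the helper names `mtvec_value` and `mtvec_write` and the example table address are hypothetical, and a particular core may require stronger alignment of the table than the 4 bytes assumed here.

```c
#include <stdint.h>

/* mtvec[1:0] mode encodings: 00 = non-vector (direct), 01 = vector. */
#define MTVEC_MODE_DIRECT   0x0u
#define MTVEC_MODE_VECTORED 0x1u

/* Compose the 32-bit value to be written into mtvec.
 * table_addr: address of the interrupt vector table. Only its upper
 * 30 bits are kept (mtvec[31:2]); the low 2 bits carry the mode. */
static inline uint32_t mtvec_value(uint32_t table_addr, uint32_t mode)
{
    return (table_addr & ~UINT32_C(0x3)) | (mode & UINT32_C(0x3));
}

#if defined(__riscv)
/* On an RV32 target the value is written with a CSR instruction.
 * Hypothetical helper, compiled only when targeting RISC-V. */
static inline void mtvec_write(uint32_t value)
{
    __asm__ volatile("csrw mtvec, %0" : : "r"(value));
}
#endif
```

For a table placed at the hypothetical address 0x08000000, `mtvec_value(0x08000000u, MTVEC_MODE_VECTORED)` yields 0x08000001, which the startup code would then pass to `mtvec_write`.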
Currently, the types of peripherals, interfaces, and control methods of the RV32 MCUs on the market are similar to those of Cortex-M MCUs[3][19][20], and the structure of the interrupt vector table is also similar. Table 2 compares the interrupt vector table structures of the Cortex-M MCU STM32F429 and the RV32 MCU GD32VF103.
Table 2 STM32F429 and GD32VF103 Interrupt Vector Table Structure
As shown in Table 2, every 32-bit entry in the RV32 MCU interrupt vector table except vector 0 is the entry address of the corresponding interrupt service routine. Vector 0 is a jump instruction that jumps to the entry of the reset startup program. Since RV32 defaults to the non-vector interrupt response mode after reset, the startup program must first set the interrupt response mode and the vector table base address.
2.2 Context Handling
Cortex-M MCU automatically pushes xPSR, PC, LR, R12 and R3-R0 onto the stack when responding to an interrupt request, saving the context; when the interrupt service routine returns, the hardware automatically pops the data back to the corresponding registers, restoring the context.
RV32 MCU saves PC to the control and status register mepc when responding to an interrupt request, and saves the privilege mode to mstatus.MPP, but does not save the context-related general registers and other special function registers. When returning from the interrupt service routine, the hardware only automatically restores registers mstatus and PC. When porting the interrupt service routine from Cortex-M MCU to RV32 MCU, it is necessary to add statements or functions to save and restore context within the interrupt service routine.
Cortex-M MCU applications and interrupt service routines adhere to the ARM procedure calling standard AAPCS[24]; the hardware automatically saves 4 special function registers and the 4 parameter registers r0-r3, also known as a0-a3. The RISC-V procedure calling standard specifies 8 parameter registers x10-x17, also known as a0-a7. To match the context that Cortex-M MCU saves automatically, the RV32 MCU interrupt service routine needs to save and restore at least the parameter registers a0-a7 and the special function registers. Table 3 lists the minimum set of program statements for saving and restoring context in an RV32 MCU interrupt service routine.
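To make the size of the two contexts concrete, the following C sketch lays out the frame that Cortex-M hardware stacks on exception entry (without floating-point state) next to a software-saved frame for an RV32 interrupt service routine. The type names are hypothetical, and the choice of mepc and mstatus as the saved special function registers is an assumption for illustration. An interrupt service routine that calls other C functions would typically also need to preserve the remaining caller-saved registers (ra and t0-t6).

```c
#include <stdint.h>

/* Frame pushed automatically by Cortex-M hardware on exception entry,
 * listed from the lowest stack address upward (no FPU state). */
typedef struct {
    uint32_t r0, r1, r2, r3;  /* parameter registers a0-a3 */
    uint32_t r12;
    uint32_t lr;
    uint32_t pc;              /* return address */
    uint32_t xpsr;
} cortex_m_hw_frame_t;

/* Hypothetical frame saved in software by an RV32 ISR prologue.
 * Assumption: mepc and mstatus are the special function registers saved. */
typedef struct {
    uint32_t a[8];            /* a0-a7 (x10-x17) */
    uint32_t mepc;            /* interrupted PC */
    uint32_t mstatus;         /* status, including MPP */
} rv32_sw_frame_t;

/* Number of 32-bit words in a frame of the given size in bytes. */
static inline unsigned frame_words(unsigned frame_bytes)
{
    return frame_bytes / 4u;
}
```

Under these assumptions the Cortex-M frame occupies 8 words and the RV32 frame 10 words, so the ported interrupt service routine needs slightly more stack per interrupt level and spends additional instructions on the save and restore.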
3 Instruction Set Modules
Cortex-M MCU uses the Thumb instruction set. RV32 MCU uses a modular instruction set, so instruction set modules and their combinations can be selected when generating applications. When porting Cortex-M MCU applications to RV32 MCU, the instruction set module combination that corresponds to the Thumb instruction set should be chosen to keep program functionality and performance consistent before and after porting.
For ease of analysis and performance evaluation, this article selects STM32F429 and FE310 as samples of Cortex-M MCU and RV32 MCU respectively, and uses CoreMark [22] for performance analysis. The program generation tool is SEGGER Embedded Studio[7]; the assembler and compiler are gcc, version gnu4.2.1.
STM32F429 has a Cortex-M4 [23] core, and its instruction set is Thumb-2. FE310 supports the RV32imac instruction set modules, and the functionality of the processor instruction set is the union of the functionalities of all its modules. Table 4 compares some functional aspects of the instruction sets of STM32F429 and FE310.
Table 4 STM32F429 and FE310 Instruction Set Functional Aspects
When generating applications for RV32 MCU, selecting different instruction sets or combinations of instruction sets will affect program performance.
3.1 RV32i vs RV32im
Embedded Studio defaults to the RV32i instruction set module when creating FE310 application projects. RV32i does not include multiplication and division instructions. Multiplication and division operations in the program still compile normally, because the compiler implements them by calling library functions written with RV32i instructions. If the RV32im instruction set module combination is selected during compilation, the compiler uses the multiplication and division instructions directly, which speeds up program execution. Table 5 compares the assembly instructions generated by compiling C programs with the RV32i and RV32im instruction sets.
Table 5 RV32i and RV32im Assembly Instructions
Table 6 lists the CoreMark scores obtained by running programs built with the RV32i and RV32im instruction sets on the FE310 simulator. The results show that if the application contains multiplication and division operations, changing from the RV32i to the RV32im instruction set improves program execution speed.
Table 6 RV32i and RV32im CoreMark Scores
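The cost of the RV32i library call can be pictured with a shift-and-add multiplication written in C. This is only a sketch of the general technique, not the source of the actual runtime helper (with GCC, 32-bit multiplication is routed to a helper such as `__mulsi3`); the function name `soft_mul32` is hypothetical.

```c
#include <stdint.h>

/* Multiply two 32-bit values using only shifts, adds, and branches,
 * i.e. operations available in the base RV32i instruction set.
 * Returns the low 32 bits of the product. */
uint32_t soft_mul32(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u) {
            result += a;   /* add the current shifted multiplicand */
        }
        a <<= 1;           /* next bit weight */
        b >>= 1;           /* consume one multiplier bit */
    }
    return result;
}
```

The loop runs once per significant bit of the multiplier, so a single multiplication can take dozens of instructions plus the call overhead, whereas RV32im performs it with a single `mul` instruction.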
3.2 RV32im vs RV32imc
Because MCU storage resources are limited, ROM and RAM demand is a key concern when developing MCU applications. Cortex-M uses the Thumb instruction mode with 16-bit instructions to reduce application size. RV32i and RV32im instructions are 32 bits long. For the same source program, the binary target program generated with the RV32im instruction set is therefore larger than the Cortex-M target program, which increases the demand for storage resources. Selecting the RV32imc instruction set combination allows many common instructions to be encoded in 16 bits instead of 32, which reduces the size of the generated binary target program. Table 7 lists the binary code lengths generated with different instruction sets for the 4 main files of CoreMark.
Table 7 CoreMark Main Files Binary Code Length (Bytes)
As seen in Table 7, the average length of the RV32i programs is 2.05 times that of Cortex-M4, while the average length of the RV32imc programs is 1.2 times that of Cortex-M4. Therefore, when porting programs from Cortex-M to RV32, selecting the RV32imc instruction set combination will largely satisfy the storage resource constraints of the original system.
When the instruction set module “A” is added and the RV32imac instruction set combination is selected, the binary code length of the main files after compilation is exactly the same as with RV32imc, and the CoreMark score is 2.34 CoreMark/MHz, which is close to the score obtained with the RV32imc instruction set combination.
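How 16-bit and 32-bit instructions coexist in RV32imc code can be illustrated with the length rule of the RISC-V encoding: an instruction whose two lowest bits are not 11 is a 16-bit compressed instruction, otherwise it is 32 bits long. The following C sketch counts both kinds in a code fragment. The names `rv_insn_bytes` and `rv_count_mix` are hypothetical, and the sketch assumes that no encodings longer than 32 bits occur.

```c
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of an instruction, given its first (lowest-addressed)
 * 16-bit parcel. Bits [1:0] != 11 marks a 16-bit compressed instruction.
 * Assumption: encodings longer than 32 bits are not used. */
static inline unsigned rv_insn_bytes(uint16_t first_parcel)
{
    return ((first_parcel & 0x3u) == 0x3u) ? 4u : 2u;
}

typedef struct {
    unsigned n16;  /* number of 16-bit instructions */
    unsigned n32;  /* number of 32-bit instructions */
} rv_insn_mix_t;

/* Walk a code image stored as consecutive 16-bit parcels and count
 * how many 16-bit and 32-bit instructions it contains. */
rv_insn_mix_t rv_count_mix(const uint16_t *parcels, size_t n_parcels)
{
    rv_insn_mix_t mix = {0u, 0u};
    size_t i = 0;
    while (i < n_parcels) {
        if (rv_insn_bytes(parcels[i]) == 2u) {
            mix.n16++;
            i += 1;   /* compressed instruction: one parcel */
        } else {
            mix.n32++;
            i += 2;   /* full-size instruction: two parcels */
        }
    }
    return mix;
}
```

For example, the fragment `{0x4501, 0x0513, 0x0000, 0x0001}` (`c.li a0, 0`, then `addi a0, x0, 0`, then `c.nop`) contains two 16-bit instructions and one 32-bit instruction and occupies 8 bytes instead of the 12 bytes that three uncompressed instructions would need.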
4 Procedure Calling Convention
The procedure calling convention is part of the application binary interface (ABI). It typically defines how parameters are passed and results are returned in function or procedure calls, how processor registers are used, and how data types are handled. When porting Cortex-M MCU applications to RV32 MCU, the differences between the ARM and RISC-V calling conventions must be considered. This section discusses how the differences in parameter passing during function calls affect the ported program.
The ARM architecture calling convention (AAPCS)[24] stipulates that during function and procedure calls, the caller passes parameters to the called function (the callee) through the 4 registers r0-r3, also known as a0-a3; parameters that do not fit in these 4 registers are passed via the stack; the function returns results through a0-a1. The RISC-V procedure calling convention specifies that the caller passes parameters to the callee through the 8 registers x10-x17, also known as a0-a7; parameters that do not fit in these 8 registers are passed via the stack; the callee returns results through a0-a1. Table 8 lists the assembly functions generated by compiling a C language function with 6 integer parameters for Cortex-M and RV32imac.
Table 8 Assembly Functions Generated by C Functions
Since accessing the stack takes longer than accessing registers, functions in Cortex-M MCU applications are usually designed so that their parameters fit within 4 × 32 bits. When porting Cortex-M MCU applications to RV32 MCU, taking advantage of the up to 8 × 32-bit parameter registers reduces the delay caused by function and procedure calls.
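The effect of the two register budgets can be sketched in C. The helper `stack_args` and the function `sum6` are hypothetical names used only for illustration, and the count assumes that every parameter is a 32-bit integer or pointer occupying exactly one register.

```c
/* Number of integer parameter registers in each calling convention. */
enum { AAPCS_ARG_REGS = 4, RISCV_ARG_REGS = 8 };

/* Number of word-sized parameters that spill to the stack when a
 * function takes n_args parameters and n_arg_regs registers are
 * available for passing them. */
static inline unsigned stack_args(unsigned n_args, unsigned n_arg_regs)
{
    return (n_args > n_arg_regs) ? (n_args - n_arg_regs) : 0u;
}

/* A function with 6 integer parameters, similar in shape to the one
 * compiled for Table 8 (the body is an assumption). */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

For `sum6`, `stack_args(6, AAPCS_ARG_REGS)` gives 2, so on Cortex-M the last two parameters are stored to and loaded from the stack, while `stack_args(6, RISCV_ARG_REGS)` gives 0, so on RV32 all six parameters travel in a0-a5.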
5 Summary and Outlook
This article compares the interrupt handling mechanisms, instruction set module combinations, and procedure calling conventions of Cortex-M MCU and RV32 MCU, and analyzes the impact of their differences on porting applications from Cortex-M MCU to RV32 MCU. To keep interrupt handling programs compatible, the RV32 MCU is set to the vector interrupt response mode, and statements that save and restore context are added to the interrupt service routine. When generating applications, the RV32imc instruction set combination is selected to obtain high-performance and compact applications. By using more MCU registers to pass parameters during function and procedure calls, the call delay is reduced.
The instructions of RV32 MCU and Cortex-M MCU differ significantly, and these differences make it challenging to optimize program performance during porting. Future research will further explore performance optimization in RV32 program porting.
References
[1] Waterman, A. (2016) Design of the RISC-V Instruction Set Architecture. Ph.D. Dissertation, University of California.https://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html
[2] SiFive FE310-G002 Manual v1p5.https://www.sifive.com/
[3] RISC-V.https://www.gigadevice.com/product/mcu/risc-v
[4] Qingke RISC-V General Series[EB/OL].https://www.wch.cn/products/productsCenter/mcuInterfacecategoryId=70, 2024-06-10.
[5] Microcontroller[EB/OL].https://www.hpmicro.com/product/product.html?id=07e8d638-1c6c-44d1-9e53-a48d8560ad78, 2024-06-10.
[6] Renesas Introduces Industry’s First General-Purpose 32-bit RISC-V MCUs with Internally Developed CPU Core.https://www.renesas.com/us/en/about/press-room/renesas-introduces-industry-s-first-general-purpose-32-bit-risc-v-mcus-internally-developed-cpu-core
[7] Embedded-Studio.https://www.segger.com/downloads/embedded-studio/
[8] The Leading Commercial Tools for RISC-V.https://www.iar.com/products/architectures/risc-v/
[9] Eclipse Embedded CDT (C/C++ Development Tools).https://projects.eclipse.org/projects/iot.embed-cdt
[10] Singhal, S.P., Sridevi, M., Narayanan, N.S., et al. (2021) Porting of eChronos RTOS on RISC-V Architecture. In: Hura, G.S., Singh, A.K. and Siong Hoe, L., Eds., Advances in Communication and Computational Technology, Springer, 1269-1279.https://doi.org/10.1007/978-981-15-5341-7_96
[11] Pieper, P., Herdt, H. and Drechsler, R. (2022) Advanced Embedded System Modeling and Simulation in an Open Source RISC-V Virtual Prototype. Journal of Low Power Electronics and Applications, 12, Article 52.https://doi.org/10.3390/jlpea12040052
[12] Using FreeRTOS on RISC-V[EB/OL].https://www.freertos.org/zh-cn-cmn-s/Using-FreeRTOS-on-RISC-V.html, 2024-06-13.
[13] Zou Yang, Han Changgang, Quan Yu, Yu Jiageng, Wu Yanjun. Based on QEMU RISC-V Architecture OpenHarmony Standard System Porting [J]. Computer System Applications, 2023, 32(11): 21-28.
[14] Gordon, N., Pedretti, K. and Lange, J.R. (2022) Porting the Kitten Lightweight Kernel Operating System to RISC-V. IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), Dallas, 13-18 November 2022, 1-7.https://doi.org/10.1109/ROSS56639.2022.00008
[15] Zhang, L. (2022) The Porting and Optimization of RISC-V UEFI Boot. 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, 15-17 April 2022, 840-843.https://doi.org/10.1109/ICSP54964.2022.9778600
[16] Balas, R. and Benini, L. (2021) RISC-V for Real-Time MCUs—Software Optimization and Microarchitectural Gap Analysis. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, 1-5 February 2021, 874-877.https://doi.org/10.23919/DATE51398.2021.9474114
[17] Arm Holdings Market Share across Key Technology Markets Worldwide 2020-2022.https://www.statista.com/statistics/1132112/arm-market-share-targets/
[18] 21Century Economic Report. RISC-V is Growing Rapidly, Which Application Areas Are Landing Faster and More Completely? [EB/OL].https://new.qq.com/rain/a/20221130A0AATG00.html, 2022-11-30.
[19] ARMv7-M Architecture Application Level Reference Manual.https://www.arm.com/
[20] RISC-V Foundation, the RISC-V Instruction Set Manual, Volume I: Unprivileged ISA.https://riscv.org/
[21] GigaDevice, GD32VF103 User Manual V1.2.https://www.gd32mcu.com/data/documents/userManual/GD32VF103_User_Manual_Rev1.5.pdf
[22] Coremark 1.0.https://github.com/eembc/coremark
[23] STM32 Cortex®-M4 MCUs and MPUs Programming Manual.https://www.st.com.cn/resource/en/programming_manual/pm0214-stm32-cortexm4-mcus-and-mpus-programming-manual-stmicroelectronics.pdf
[24] Procedure Call Standard for the ARM Architecture.https://eecs.umich.edu/courses/eecs373/readings/ARM-AAPCS-EABI-v2.08.pdf
[25] Understanding RISC-V Calling Convention.https://cs.sfu.ca/~ashriram/Courses/CS7ARCH/assets/notebooks/RISCV/RISCV_CALL.pdf
(Author Affiliation: Peking University School of Software and Microelectronics, Beijing)
This article is published with the authorization of “Embedded Technology and Intelligent Systems”; the original text was published in 2024, Issue 1. The journal “Embedded Technology and Intelligent Systems” is published by Hans Chinese Open Source Journal Academic Exchange Platform and focuses on the latest developments in traditional embedded technology and emerging intelligent systems. Its editorial team brings together well-known embedded system experts and scholars from China.