Common Performance Metrics for MCU CPUs
1. Basic Performance Metrics
Clock Frequency
- System Clock Frequency: 48MHz, 72MHz, 168MHz, 400MHz, etc.
- Internal Oscillator Accuracy: ±1%, ±2%, etc.
- External Crystal Support: Range of 4-25MHz
- PLL Multiplication Capability: Maximum multiplication factor
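The relationship between the crystal frequency, the PLL factors, and the resulting system clock can be sketched as follows. This is a minimal illustration that assumes a common divider/multiplier/divider layout; the function name `pll_sysclk_hz` and the parameter names `m`, `n`, and `p` are hypothetical, and real parts impose additional limits on the intermediate VCO frequency.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PLL layout: input is divided by m, multiplied by n (VCO),
 * then divided by p to give the system clock. */
uint32_t pll_sysclk_hz(uint32_t input_hz, uint32_t m, uint32_t n, uint32_t p) {
    assert(m > 0 && p > 0);
    /* 64-bit intermediate so the VCO frequency cannot overflow. */
    uint64_t vco_hz = (uint64_t)input_hz / m * n;
    return (uint32_t)(vco_hz / p);
}
```

With these assumptions, an 8 MHz crystal with m = 8, n = 336, and p = 2 yields a 168 MHz system clock.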
Instruction Execution Performance
- DMIPS/MHz: Dhrystone MIPS per MHz
  - ARM Cortex-M0: ~0.9 DMIPS/MHz
  - ARM Cortex-M3: ~1.25 DMIPS/MHz
  - ARM Cortex-M4: ~1.25 DMIPS/MHz
  - ARM Cortex-M7: ~2.14 DMIPS/MHz
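- CoreMark/MHz: EEMBC CoreMark score normalized by clock frequency; a more modern alternative to Dhrystone
- Instruction Set Efficiency: Code density and work done per instruction (Thumb-2, RISC-V RV32I, etc.)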
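The per-MHz figures above are derived from raw benchmark rates. A minimal sketch of that arithmetic is shown below; the function names are hypothetical, and the constant 1757 is the Dhrystones-per-second rate of the VAX 11/780 reference machine that defines 1 DMIPS.

```c
#include <assert.h>

/* 1 DMIPS is defined as the VAX 11/780 rate of 1757 Dhrystones per second. */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Convert a measured Dhrystone rate into DMIPS. */
double dmips_from_rate(double dhrystones_per_sec) {
    assert(dhrystones_per_sec >= 0.0);
    return dhrystones_per_sec / VAX_DHRYSTONES_PER_SEC;
}

/* Normalize any benchmark score (DMIPS, CoreMark) by the core clock in MHz. */
double score_per_mhz(double score, double cpu_mhz) {
    assert(cpu_mhz > 0.0);
    return score / cpu_mhz;
}
```

For example, a core that completes 158130 Dhrystones per second at 72 MHz works out to 90 DMIPS, or 1.25 DMIPS/MHz.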
2. Memory Performance
Flash Performance
- Read Wait States: 0-15 wait states
- Prefetch Buffer: Support for instruction prefetch
- Flash Accelerator: Vendor-specific technologies such as ST's ART Accelerator
- Code Execution Speed: Performance of executing directly from Flash
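The number of wait states follows from how much faster the core runs than the Flash array can be read. The sketch below assumes a single maximum zero-wait-state Flash frequency (`flash_max_hz`, a hypothetical parameter); actual limits come from the datasheet and usually depend on supply voltage.

```c
#include <assert.h>
#include <stdint.h>

/* Wait states needed so that each Flash access spans enough CPU cycles.
 * flash_max_hz is the assumed highest clock at which Flash needs 0 wait states. */
uint32_t flash_wait_states(uint32_t cpu_hz, uint32_t flash_max_hz) {
    assert(cpu_hz > 0 && flash_max_hz > 0);
    /* Ceiling division gives CPU cycles per Flash access. */
    uint64_t cycles_per_access =
        ((uint64_t)cpu_hz + flash_max_hz - 1u) / flash_max_hz;
    return (uint32_t)(cycles_per_access - 1u);
}
```

Assuming a 30 MHz limit, a 168 MHz core would need 5 wait states, while a 24 MHz core would need none.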
SRAM Performance
- Zero Wait Access: SRAM typically has 0 wait states
- TCM (Tightly Coupled Memory): Memory attached directly to the core for deterministic, low-latency access
- Multi-Bank Design: Parallel access capability
Cache Performance (High-end MCUs)
- Instruction Cache Hit Rate: Typically >90%
- Data Cache Hit Rate: Varies by application
- Cache Size: 4KB-64KB
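To see how hit rate translates into execution speed, the average cost of a memory access can be estimated from the hit rate and the hit and miss costs. This is a simplified model with hypothetical names; it ignores effects such as prefetching and write buffering.

```c
#include <assert.h>

/* Average cycles per access for a single-level cache.
 * hit_rate is a fraction between 0.0 and 1.0. */
double average_access_cycles(double hit_rate, double hit_cycles,
                             double miss_cycles) {
    assert(hit_rate >= 0.0 && hit_rate <= 1.0);
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles;
}
```

With a 90% hit rate, a 1-cycle hit, and a 6-cycle miss, the average access costs about 1.5 cycles.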
3. Real-Time Performance Metrics
Interrupt Response
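- Interrupt Latency: Typically 12-16 clock cycles on Cortex-M cores (12 on Cortex-M3/M4, 16 on Cortex-M0), assuming zero-wait-state memory
- Interrupt Nesting Depth: Supported levels of nesting
- Number of Priorities: 4-256 priority levels
- Tail-Chaining: Skips the register restore and re-save between back-to-back interrupts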
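Latency in clock cycles only becomes comparable across parts once it is converted to time at the actual core frequency. A minimal sketch of that conversion, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count into nanoseconds at the given core clock. */
double cycles_to_ns(uint32_t cycles, double cpu_mhz) {
    assert(cpu_mhz > 0.0);
    return (double)cycles * 1000.0 / cpu_mhz;
}
```

For instance, 12 cycles at 72 MHz is roughly 167 ns, whereas 16 cycles at 48 MHz is roughly 333 ns.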
Task Switching
- Context Switch Time: Typically on the order of microseconds
- Register Save/Restore: Automatic (hardware stacking) or manual (software)
- Stack Pointer Switching: PSP/MSP switch time
4. Mathematical Operation Performance
Integer Operations
- 32-bit Multiplication: Single-cycle or multi-cycle
- Division Performance: Hardware divider support
- Bit Operations: CLZ, REV, etc.
Floating Point Operations (MCUs with FPU)
- Single Precision Floating Point: IEEE 754 standard
- Double Precision Support: Available in some high-end MCUs
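- Floating Point Instruction Latency: 1-14 clock cycles (on the Cortex-M4 FPU, for example, add and multiply take 1 cycle while divide and square root take 14)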
- DSP Instructions: SIMD, MAC, etc.
5. Power Consumption Performance
Dynamic Power Consumption
- Operating Mode Power Consumption: µA/MHz
- Peripheral Power Consumption: Power contribution from various peripherals
Low Power Modes
- Sleep Mode: CPU stops, peripherals run
- Stop Mode: Most clocks stop
- Standby Mode: Lowest power mode
- Wake-Up Time: Time to recover from low power mode
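For duty-cycled applications, the run-mode and low-power figures combine into an average current, which in turn gives a rough battery life. The sketch below is a simplified estimate with hypothetical names; it ignores wake-up energy, peripheral current, and battery self-discharge.

```c
#include <assert.h>

/* Time-weighted average current over one active/sleep period, in mA. */
double average_current_ma(double active_ma, double active_ms,
                          double sleep_ma, double sleep_ms) {
    double period_ms = active_ms + sleep_ms;
    assert(period_ms > 0.0);
    return (active_ma * active_ms + sleep_ma * sleep_ms) / period_ms;
}

/* Idealized battery life in hours for a given capacity in mAh. */
double battery_life_hours(double capacity_mah, double average_ma) {
    assert(average_ma > 0.0);
    return capacity_mah / average_ma;
}
```

As an illustration, 10 mA for 10 ms followed by 2 µA for 990 ms averages to about 0.102 mA, which would drain a 220 mAh cell in roughly 2157 hours under these assumptions.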
Open Source Performance Testing Tools
1. Comprehensive Performance Benchmarking
CoreMark
Dhrystone
// Classic integer performance test
// Outputs DMIPS (Dhrystone MIPS)
#include "dhry.h"
void dhrystone_test(void) {
    // Execute Dhrystone benchmark
}
Whetstone
// Floating point performance test (for MCUs with FPU)
// Outputs MWIPS (Million Whetstone Instructions Per Second)
2. Real-Time Performance Testing Tools
rt-tests
# Real-time latency testing on Linux (for Linux-capable targets; not applicable to bare-metal MCUs)
git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
cd rt-tests
make
./cyclictest -p 80 -n -i 10000 -l 10000
MCU-Specific Interrupt Latency Testing
// Custom interrupt response time test
void interrupt_latency_test(void) {
    // Configure GPIO and timer
    // Measure delay from external signal to interrupt handling
}
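Once trigger and handler-entry timestamps have been captured, they need to be reduced to minimum, maximum, and mean latency. The sketch below assumes both timestamps come from the same free-running 32-bit timer; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t min_ticks;
    uint32_t max_ticks;
    double   mean_ticks;
} latency_stats_t;

/* Reduce paired timestamps (trigger time, ISR entry time) to statistics. */
latency_stats_t latency_stats(const uint32_t *trigger,
                              const uint32_t *isr_entry, size_t count) {
    assert(trigger != NULL && isr_entry != NULL && count > 0);
    latency_stats_t s = { UINT32_MAX, 0u, 0.0 };
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        /* Unsigned subtraction stays correct across one timer wraparound. */
        uint32_t delta = isr_entry[i] - trigger[i];
        if (delta < s.min_ticks) s.min_ticks = delta;
        if (delta > s.max_ticks) s.max_ticks = delta;
        sum += delta;
    }
    s.mean_ticks = (double)sum / (double)count;
    return s;
}
```

Reporting the maximum alongside the mean matters for real-time work, since the worst case is what determines whether a deadline can be met.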
3. Memory Performance Testing
STREAM Benchmark (adapted for MCUs)
// Memory bandwidth test
void stream_test(void) {
    // Copy: a[i] = b[i]
    // Scale: a[i] = q*b[i]
    // Add: a[i] = b[i] + c[i]
    // Triad: a[i] = b[i] + q*c[i]
}
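As an illustration of one of these kernels, the Triad operation and the matching bandwidth calculation might look like the sketch below. It uses `float` arrays to suit small MCUs (an assumption; the original STREAM uses double precision), and the function names are hypothetical. Timing is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Triad kernel: a[i] = b[i] + q * c[i]. */
void stream_triad(float *a, const float *b, const float *c, float q,
                  size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + q * c[i];
    }
}

/* Bytes moved by one Triad pass: two reads and one write per element. */
size_t stream_triad_bytes(size_t n) {
    return 3u * n * sizeof(float);
}

/* Bandwidth in MB/s (1 MB = 10^6 bytes) for a measured duration. */
double bandwidth_mb_per_s(size_t bytes, double seconds) {
    assert(seconds > 0.0);
    return (double)bytes / seconds / 1.0e6;
}
```

The arrays should be larger than any cache on the device, otherwise the result reflects cache bandwidth rather than memory bandwidth.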
Memory Test
// Memory access pattern test
void memory_pattern_test(void) {
    // Sequential access
    // Random access
    // Stride access
}
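Sequential and strided access can share one routine, since sequential access is simply a stride of 1. The sketch below is a minimal example with a hypothetical function name; the buffer is read through a `volatile` pointer so the compiler cannot remove the loads, and the returned checksum gives the caller something to verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read every stride-th byte of the buffer and return the sum of the
 * bytes that were touched. A stride of 1 is plain sequential access. */
uint32_t touch_strided(const volatile uint8_t *buf, size_t len,
                       size_t stride) {
    assert(buf != NULL && stride > 0);
    uint32_t checksum = 0;
    for (size_t i = 0; i < len; i += stride) {
        checksum += buf[i];
    }
    return checksum;
}
```

Timing the same buffer with different strides exposes the effect of cache line size and Flash or SRAM burst behavior.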
4. Floating Point Performance Testing Tools
Linpack (MCU version)
// Linear algebra operation performance
void linpack_test(void) {
    // Solve linear equations
    // Output MFLOPS
}
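The MFLOPS figure is computed from the conventional Linpack operation count for solving an n-by-n system, 2/3·n³ + 2·n², divided by the measured time. A minimal sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Conventional Linpack operation count for solving an n x n system. */
double linpack_flop_count(uint32_t n) {
    double dn = (double)n;
    return (2.0 / 3.0) * dn * dn * dn + 2.0 * dn * dn;
}

/* Millions of floating point operations per second. */
double mflops(double flop_count, double seconds) {
    assert(seconds > 0.0);
    return flop_count / seconds / 1.0e6;
}
```

For n = 100 the count is about 686,667 operations, so a run that takes 0.1 s corresponds to roughly 6.87 MFLOPS.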
FFT Benchmark
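// Using CMSIS-DSP library
#include "arm_math.h"
#include "arm_const_structs.h"  // declares arm_cfft_sR_f32_len1024
// 1024 interleaved complex samples (2048 floats), as in the CMSIS-DSP FFT example
extern float32_t testInput_f32_10khz[2048];
void fft_performance_test(void) {
    // In-place 1024-point forward FFT (ifftFlag = 0) with bit reversal enabled
    arm_cfft_f32(&arm_cfft_sR_f32_len1024, testInput_f32_10khz, 0, 1);
}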
5. Power Analysis Tools
EnergyTrace (TI)
- Hardware tool, but has open-source analysis software
- Real-time power monitoring
Power Profiler Kit (Nordic)
- Open-source host software
- Accurate current measurement
Custom Power Testing
// Software power benchmark test
void power_benchmark(void) {
    // Power testing under different workloads
    // CPU-intensive tasks
    // I/O-intensive tasks
    // Mixed load
}
6. Open Source MCU Benchmark Suites
EEMBC IoTMark
# IoT device performance benchmark
git clone https://github.com/eembc/iotmark.git
# Needs to be ported to target MCU platform
MiBench (Embedded version)
# Embedded system benchmark suite
git clone https://github.com/MiBench/MiBench.git
# Includes tests for various application scenarios
MLPerf Tiny (formerly TinyMLPerf)
# Machine learning inference performance testing
git clone https://github.com/mlcommons/tiny.git
# Suitable for MCUs that run ML inference, with or without AI acceleration
7. Custom Performance Testing Framework
MCU Performance Test Suite
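#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *test_name;
    void (*test_func)(void);
    uint32_t iterations;
    uint32_t expected_cycles;
} perf_test_t;

// Test bodies and the measurement routine are defined elsewhere in the project
void integer_math_test(void);
void float_math_test(void);
void memory_copy_test(void);
void gpio_toggle_test(void);
void measure_performance(const perf_test_t *test);

void run_performance_suite(void) {
    perf_test_t tests[] = {
        {"Integer Math", integer_math_test, 10000, 0},
        {"Float Math", float_math_test, 1000, 0},
        {"Memory Copy", memory_copy_test, 1000, 0},
        {"GPIO Toggle", gpio_toggle_test, 100000, 0}
    };
    for (size_t i = 0; i < sizeof(tests)/sizeof(tests[0]); i++) {
        measure_performance(&tests[i]);
    }
}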
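The suite above calls `measure_performance` without showing it. One possible shape for that routine is sketched below; it is not a definitive implementation. The struct is repeated so the block compiles on its own, `read_ticks` is a hypothetical tick source that uses the standard `clock()` as a portable stand-in (on Cortex-M3 and later cores a hardware cycle counter such as the DWT cycle counter is the usual choice), and the body of `integer_math_test` is a trivial placeholder workload.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

typedef struct {
    const char *test_name;
    void (*test_func)(void);
    uint32_t iterations;
    uint32_t expected_cycles;
} perf_test_t;

/* Hypothetical tick source; swap in a hardware cycle counter on target. */
static uint32_t read_ticks(void) {
    return (uint32_t)clock();
}

/* Placeholder workload; volatile keeps the compiler from removing it. */
static volatile uint32_t accumulator;
void integer_math_test(void) {
    accumulator += 3u;
}

/* Run the test body `iterations` times and report the elapsed ticks. */
void measure_performance(const perf_test_t *test) {
    assert(test != NULL && test->test_func != NULL && test->iterations > 0);
    uint32_t start = read_ticks();
    for (uint32_t i = 0; i < test->iterations; i++) {
        test->test_func();
    }
    /* Unsigned subtraction tolerates one wraparound of the tick counter. */
    uint32_t elapsed = read_ticks() - start;
    printf("%-16s %lu ticks for %lu iterations\n", test->test_name,
           (unsigned long)elapsed, (unsigned long)test->iterations);
}
```

Measuring the whole loop rather than each call keeps the timer overhead small relative to the workload, at the cost of including the loop overhead in the result.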
8. Cross-Compilation and Porting Considerations
# Compilation options for ARM Cortex-M4 with FPU
CFLAGS += -mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=hard
CFLAGS += -O2 -g -Wall
# Project-specific macro (upstream CoreMark's own build uses ITERATIONS)
CFLAGS += -DCOREMARK_ITERATIONS=1000
These tools and metrics can help you comprehensively evaluate the CPU performance of MCUs, select the appropriate MCU, and optimize application performance.