Understanding CoreMark: A Lightweight Benchmark for CPU Performance

Introduction

Friends, what is the most worrying thing when writing code and bringing up chips? That’s right: not knowing how fast your processor really is. Running a few demos and skimming a few lines of logs is simply not a reliable measure. That is why the industry uses CoreMark, a lightweight benchmark designed specifically to measure CPU core performance. Today I will explain, in plain language, what it is, what it can do, how to install and run it, and its pros and cons, so you can get started quickly.

What is CoreMark?

CoreMark is developed by EEMBC (the Embedded Microprocessor Benchmark Consortium) and has a very clear focus: it measures the computational power of the CPU core, with as little interference as possible from peripherals, OS scheduling, I/O, and other noise. Its core algorithms include:

Module | Main Function
Linked List | Linked list search, merge sort, CRC check
Matrix Multiply | Small matrix multiplication (typically memory- and arithmetic-intensive)
State Machine | State machine that parses strings, exercising branch prediction

Together these three pieces of code amount to only a few thousand lines of readable, well-commented C. In other words, running CoreMark makes the CPU loop through these three “small tasks” thousands of times and counts how many iterations it completes per second. That figure is Iterations/Sec, commonly referred to as the “score”.
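The score itself is plain arithmetic: iterations divided by elapsed seconds. A minimal sketch, assuming `awk` is available; the function name `coremark_score` is made up for illustration and is not part of CoreMark, and the numbers are placeholders:

```shell
# Hypothetical helper, not part of CoreMark: iterations per second
# from a total iteration count ($1) and the elapsed time in seconds ($2).
coremark_score() {
    awk -v iters="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", iters / secs }'
}

# Example: 100000 iterations completed in 25.875 seconds
coremark_score 100000 25.875   # prints 3864.73
```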

What Pain Points Does CoreMark Address?

Pain Point | Traditional Approach | CoreMark’s Breakthrough
Inconsistent scores | Each vendor writes its own benchmark, so results are not comparable | Unified source code and scoring criteria, with the same rules used globally
Results affected by system interference | OS scheduling and I/O delays are included, so scores fluctuate significantly | Measures only the core code and is almost unaffected by the OS (can run on bare metal)
High porting costs | Running on a new platform requires extensive changes to the source code | Provides a core_portme abstraction layer, so only a few port files need to change
Confusing reporting rules | Some report iterations, some MIPS, some MHz, which makes comparisons difficult | A strict official report syntax that can be understood at a glance
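As I understand the CoreMark README, a reported result takes the shape `CoreMark 1.0 : N / C [/ P] [/ M]`: the score, the compiler and flags, optional memory parameters, and an optional parallelism tag. The sketch below assembles such a line; the function name `coremark_report_line` and the sample values are illustrative only, so check the README for the authoritative rules before publishing a score.

```shell
# Hypothetical helper: assemble a result line in the shape
# "CoreMark 1.0 : N / C [/ P] [/ M]".
#   $1 = N, iterations per second
#   $2 = C, compiler version and flags
#   $3 = P, memory/allocation parameters (optional)
#   $4 = M, parallel method and context count (optional)
coremark_report_line() {
    line="CoreMark 1.0 : $1 / $2"
    if [ -n "${3:-}" ]; then line="$line / $3"; fi
    if [ -n "${4:-}" ]; then line="$line / $4"; fi
    printf '%s\n' "$line"
}

# Placeholder values, not a measured result
coremark_report_line 128 "GCC 4.1.2 -O2 -fprofile-use" "Heap in TCM" "FORK:2"
# prints: CoreMark 1.0 : 128 / GCC 4.1.2 -O2 -fprofile-use / Heap in TCM / FORK:2
```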

Installation and Running Made Easy

Linux/macOS (systems with `make`)

$ git clone https://github.com/eembc/coremark
$ cd coremark
$ make            # Builds, then runs twice by default: run1.log (performance) and run2.log (validation)
$ cat run1.log    # Contains Iterations/Sec, Compiler, CRC, etc.
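If you only want the score rather than the whole log, a one-line `awk` filter is enough. This is a rough sketch that assumes the log contains a line of the form `Iterations/Sec   : <number>`; the function name `coremark_log_score` is hypothetical.

```shell
# Hypothetical helper: print the value of the "Iterations/Sec" line
# from a CoreMark log such as run1.log ($1 = path to the log).
coremark_log_score() {
    awk -F: '/^Iterations\/Sec/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Usage, after running make in the coremark directory:
#   coremark_log_score run1.log
```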

Bare Metal/Microcontroller (no OS)

1. Copy an existing porting directory (e.g., `linux`) to your platform folder, such as `my_mcu`.
2. Modify `core_portme.mak` and `core_portme.h` (mainly compiler and linker options) and adapt the timing functions, which typically live in `core_portme.c`.
3. Execute `make PORT_DIR=my_mcu` to build the executable `coremark.exe`, which can then be flashed onto the chip and run.
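The copy step can be wrapped in a small guard so an existing port is never overwritten by accident. A minimal sketch; `coremark_new_port` is a made-up name, and `linux` and `my_mcu` are just the example directory names used above:

```shell
# Hypothetical helper: start a new port by copying an existing port directory.
#   $1 = existing port directory (for example linux)
#   $2 = new port directory (for example my_mcu)
coremark_new_port() {
    if [ ! -d "$1" ]; then
        echo "port directory not found: $1" >&2
        return 1
    fi
    if [ -e "$2" ]; then
        echo "refusing to overwrite: $2" >&2
        return 1
    fi
    cp -R "$1" "$2"
}

# Usage, from the top of the coremark checkout:
#   coremark_new_port linux my_mcu
#   make PORT_DIR=my_mcu
```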

Custom Iteration Count (for shorter or longer scoring)

$ make ITERATIONS=10      # Runs only 10 iterations; suitable for simulators or power testing
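Going the other way, a reportable run has to last at least 10 seconds, so a fast core needs a larger iteration count. One rough way to pick it is to take the score from a short trial run and multiply by the target duration. The helper name `coremark_min_iterations` is hypothetical, and the rate below is a placeholder:

```shell
# Hypothetical helper: smallest whole iteration count that keeps the
# run going for at least the target time, given a rough score.
#   $1 = estimated iterations per second
#   $2 = target run time in seconds (defaults to 10)
coremark_min_iterations() {
    awk -v rate="$1" -v secs="${2:-10}" 'BEGIN {
        n = rate * secs
        whole = int(n)
        if (whole < n) whole++      # round up to a whole iteration
        printf "%.0f\n", whole
    }'
}

coremark_min_iterations 3864.73      # prints 38648

# Usage:
#   make ITERATIONS=$(coremark_min_iterations 3864.73)
```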

Multi-core Parallelism (for multi-thread testing)

$ make XCFLAGS="-DMULTITHREAD=4 -DUSE_PTHREAD -pthread"

This makes CoreMark execute in parallel on 4 POSIX threads, and the reported result then carries a parallelism tag such as `PTHREAD:4` (or `FORK:4` when the fork method is used instead).

Validation

As long as `run2.log` contains `Correct operation validated.`, your port did not alter the core algorithm, and the score can be trusted.
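This check is easy to script, for example in a CI job. A minimal sketch, assuming the validation message appears verbatim in the log; the function name `coremark_validated` is made up for illustration:

```shell
# Hypothetical helper: succeed only if the log ($1) reports a validated run.
coremark_validated() {
    grep -q "Correct operation validated" "$1"
}

# Usage:
#   if coremark_validated run2.log; then
#       echo "validation OK"
#   else
#       echo "validation FAILED"
#   fi
```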

Pros and Cons Overview

Pros | Description
Lightweight: a small C code base whose default working data set is about 2 KB | Suitable for resource-constrained MCUs
Cross-platform: supports Linux, Windows, bare metal, RTOS, etc. | Low porting costs
Standardized reporting: an official, unified scoring format for easy comparison | Widely recognized in the industry
Scalable: supports multi-threading, and the separate CoreMark-PRO suite covers larger data sets | Meets needs from low-end to high-end
Open source: complete source code on GitHub, can be forked and improved at any time | Active community where issues can be raised

Cons | Description
Only measures the core: does not cover cache, memory bandwidth, I/O, and other system characteristics | Overall system performance evaluation still requires other benchmarks
Compiler sensitive: different optimization options can change the score significantly | Compiler version and flags must be specified in the report
Run time ≥ 10 seconds: short simulations do not yield valid scores | Special settings are needed in low-power scenarios
Reporting rules are complicated: newcomers may overlook validating seeds and buffer sizes | The official documentation is detailed and worth reading carefully

Practical Summary

  • Want to quickly understand the computational power of an MCU? Run CoreMark once, and you’re done in minutes.
  • If you want to boast in the product datasheet that “CoreMark = XXXX iterations/s”, remember to include the compiler, optimization parameters, thread count, and memory allocation method, or you may be challenged.
  • If you are a chip manufacturer, it is recommended to include CoreMark-PRO (with larger data sets) in the testing pipeline to provide customers with a more complete performance picture.
  • Finally, don’t forget to submit the generated `run1.log` to the official submission page to compare your chip’s performance with the rest of the world and see if you are “average” or a “top performer”.

Project Address: https://github.com/eembc/coremark
