How to Calculate MIPS Data

MIPS stands for Million Instructions Per Second: the number of instructions, in millions, that a processor executes each second. Most DSPs are RISC-style designs with a small instruction set in which the majority of instructions complete in a single cycle; only a few, such as jumps and function calls, take multiple cycles. For this reason, a DSP chip's MIPS rating is usually taken from its maximum supported clock frequency, for example 400MHz for the SHARC 21489. Assuming single-cycle instructions, a 400MHz DSP has a maximum of 400*10^6 / 10^6 = 400MIPS. Some engineers prefer MCPS (Million Cycles Per Second), which is the more accurate term because what is actually counted is cycles, but in practice it is almost the same as MIPS.

In practical projects, power consumption, heat dissipation, hardware settings, and other requirements may force a DSP that can theoretically run at 400MHz to operate at a lower frequency, which reduces the MIPS it can deliver accordingly. The reverse also happens: the SHARC 21489 can actually run at up to 450MHz, but it requires a higher operating voltage to do so. In battery-powered products, lower MIPS consumption means longer battery life. Sometimes the MIPS a DSP still has to spare, together with its available memory, becomes a deciding factor in chip selection.

Occasionally, IC suppliers exaggerate the MIPS capability of their chips. For instance, a 160MHz HiFi5 DSP would normally be understood to deliver a maximum of 160MIPS. A supplier may instead count HiFi5's parallel instructions, such as 8 MACs in a single cycle or 8 single-cycle 32-bit floating-point multiplications, and claim a maximum of 160*8=1280MIPS. In reality, our DSP programs can use such parallel instructions in only a small portion of the code; most of the time they execute ordinary single-cycle instructions.

In fact, almost all DSPs support parallel instructions. For example, the SHARC 21489 can perform one floating-point multiplication, one floating-point addition, and two memory read/write operations in a single cycle. If we also counted its SIMD features, the MIPS figure would become 400*4*2=3200, but in practice DSP programs cannot consist entirely of such operations.

Therefore, the practical measure of a DSP's MIPS capability is its configured operating frequency. Returning to the main question of how to calculate the MIPS an algorithm requires, there are two methods:

1. Count the DSP cycles required for the algorithm to run.

Many DSPs have a register that counts core cycles. Its value increments or decrements by 1 on every cycle. An incrementing counter automatically wraps to 0 when it reaches its upper limit; a decrementing counter reloads a preset maximum value when it reaches its lower limit, usually 0. If we can obtain the cycles consumed by the algorithm, we can calculate the required MIPS with the following formula. Audio DSP processing is commonly done frame by frame, so:

MIPS = CYCLES_USED * sample_rate / frame_size / 10^6

Where CYCLES_USED is the number of DSP core cycles consumed by processing one frame. To obtain it, read the cycles register at the start of the algorithm, say a, and read it again at the end, say b. For an incrementing counter, CYCLES_USED = b - a; for a decrementing counter, CYCLES_USED = a - b.

frame_size is the number of PCM samples in this frame. It is important to set the correct frame length. For example, in traditional speech signal processing, we use STFT, where the FFT length is 512, and there may be a 50% overlap, so the corresponding frame_size should be 512-512*50%=256.

The sample code is as follows:

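a = get_ccount(); // cycle count before the algorithm

algorithm_exec();

b = get_ccount(); // cycle count after the algorithm

// use 1e6 here: in C, 10^6 is a bitwise XOR, not a power
mips = (double)(b - a) * sample_rate / frame_size / 1e6;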

When measuring MIPS this way on actual hardware, keep in mind that higher-priority activity, such as interrupts and other tasks, can preempt the algorithm and inflate the cycle count, so it should be disabled for the duration of the measurement.

Counting MIPS in this way does not require knowing the CPU's clock frequency, because the result is derived from the number of cycles consumed rather than from elapsed time.
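The formula and the frame_size rule above can be wrapped in two small helpers. This is only a minimal sketch: the function names hop_size and mips_from_cycles are made up for illustration and do not come from any DSP vendor library.

```c
#include <stdint.h>

/* Number of new PCM samples consumed per frame when consecutive
 * FFT windows overlap. overlap_percent is expected to be 0..99.
 * (Hypothetical helper name, for illustration only.) */
uint32_t hop_size(uint32_t fft_len, uint32_t overlap_percent)
{
    return fft_len - (fft_len * overlap_percent) / 100u;
}

/* MIPS = cycles_used * sample_rate / frame_size / 10^6
 * cycles_used is the cycle count measured for ONE frame.
 * (Hypothetical helper name, for illustration only.) */
double mips_from_cycles(uint32_t cycles_used,
                        uint32_t sample_rate_hz,
                        uint32_t frame_size)
{
    return (double)cycles_used * (double)sample_rate_hz
           / (double)frame_size / 1.0e6;
}
```

For instance, with a 512-point FFT and 50% overlap, hop_size(512, 50) gives a frame_size of 256. If one such frame costs 40000 cycles at a 48000Hz sample rate, mips_from_cycles(40000, 48000, 256) works out to 7.5 MIPS.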

Taking the HiFi5 DSP as an example, its cycle counting register is CCOUNT, which is cleared when the processor is reset. The XT_RSR_CCOUNT() intrinsic reads the value of this register. The counter increments and wraps to 0 after reaching its upper limit of 0xFFFFFFFFUL.
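Because an incrementing 32-bit counter wraps to 0, the second reading can be numerically smaller than the first. A minimal sketch of how the difference can still be computed is shown here; elapsed_up32 is a hypothetical helper name, and the sketch assumes the counter wrapped at most once between the two reads.

```c
#include <stdint.h>

/* Elapsed cycles for a free-running 32-bit up-counter.
 * Unsigned subtraction is performed modulo 2^32, so a single
 * wrap between the two reads needs no special handling.
 * (Hypothetical helper name, for illustration only.) */
uint32_t elapsed_up32(uint32_t start, uint32_t end)
{
    return end - start;
}
```

This only holds while the measured region is shorter than 2^32 cycles, which is roughly 10.7 seconds at 400MHz. Longer measurements would need to be split up or timed another way.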

As for the SHARC 21489, its CYCLES counting register is TCOUNT, and the following inline assembly instruction can be used to read it into a variable named variable_name.

asm("%0 = TCOUNT;" : "=d"(variable_name));

This register counts down. Its upper limit is set by the TPERIOD register, and the timer must be enabled separately before it starts counting. The complete process is:

// set TPERIOD and TCOUNT in init

timer_set(MAX_COUNT, MAX_COUNT);

timer_on(); // start timer

// test MIPS

asm("%0 = TCOUNT;" : "=d"(start_cnt));

algorithm_exec();

asm("%0 = TCOUNT;" : "=d"(end_cnt));

cycles_used = start_cnt - end_cnt; // TCOUNT counts down, so start minus end
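The subtraction above is only valid if the counter did not reload between the two reads. A minimal sketch of a reload-aware version follows. It assumes that the counter reloads to the period value on the tick after it reaches 0 (so one full cycle is period + 1 ticks) and that it reloaded at most once; both assumptions should be checked against the hardware reference for the actual chip. The name elapsed_down is hypothetical.

```c
#include <stdint.h>

/* Elapsed ticks for a down-counter that reloads to `period`
 * after reaching 0. ASSUMPTIONS: one full cycle is period + 1
 * ticks, and at most one reload happened between the reads.
 * (Hypothetical helper name, for illustration only.) */
uint32_t elapsed_down(uint32_t start, uint32_t end, uint32_t period)
{
    if (start >= end) {
        return start - end;              /* no reload in between */
    }
    /* start ticks down to 0, one tick to reload, then down to end */
    return start + (period - end) + 1u;
}
```

Setting the period as large as the timer allows, as MAX_COUNT does in the code above, makes a reload during the measurement less likely in the first place.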

2. Count the time required for the algorithm to run.

For some DSPs that do not have cycle counting registers, we can also obtain MIPS by measuring the actual time. The process is similar to the above:

MIPS = time_used_in_us * sample_rate * cpu_clk_in_MHz / frame_size / 10^6

Where time_used_in_us is the measured execution time in microseconds, and cpu_clk_in_MHz is the CPU clock frequency in MHz.
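The time-based formula works because microseconds multiplied by MHz gives cycles, after which the calculation is the same as in the first method. A minimal sketch, with mips_from_time as a hypothetical helper name:

```c
/* MIPS from a measured execution time for ONE frame.
 * time_used_us * cpu_clk_mhz is the number of cycles consumed,
 * since 1 us at 1 MHz is exactly one cycle.
 * (Hypothetical helper name, for illustration only.) */
double mips_from_time(double time_used_us, double cpu_clk_mhz,
                      double sample_rate_hz, double frame_size)
{
    double cycles_used = time_used_us * cpu_clk_mhz;
    return cycles_used * sample_rate_hz / frame_size / 1.0e6;
}
```

For example, a frame of 256 samples at 48000Hz that takes 100 microseconds on a 400MHz DSP consumes 40000 cycles, which is 7.5 MIPS.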

If the DSP has no timer that software can read, we can instead toggle a GPIO pin at the start and end of the algorithm and measure the resulting pulse width with an oscilloscope.

Once we have set up a method to quickly count MIPS, we can then optimize the DSP code in a software simulation environment to minimize the consumed CYCLES as much as possible.
