Essential Average Algorithms for Embedded Development: Principles and Pros & Cons Analysis

In embedded software development, a significant portion of the workload involves data collection, processing, and output. The average value algorithm plays a crucial role in this data processing, serving as a core tool for handling sensor data, signal filtering, and error compensation.

The core requirements are to balance accuracy, real-time performance, resource usage (memory / computational power), and anti-interference capability. Different algorithms suit different hardware conditions, data characteristics (such as stable data / impulse noise), and usage scenarios. Below is a detailed analysis of six commonly used average algorithms:

1. Arithmetic Mean

1. Algorithm Principle

The most basic average algorithm sums N consecutive sampled values x₁, x₂, ..., xₙ and divides by the number of data points, with the formula:

Avg = (x₁ + x₂ + ... + xₙ) / N

Optimized Implementation: In embedded systems, it is common to use “accumulation – single division”: accumulate all samples first and divide once, so that repeated divisions do not consume computational power. If N is a power of 2 (such as 8 or 16), the division can be replaced with a right shift (for unsigned values, /8 is the same as >>3), further improving efficiency.
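
The “accumulation – single division” idea can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the names mean_block, AVG_SHIFT and AVG_COUNT are made up for this example, and it assumes N = 8 unsigned 16-bit samples (for instance raw ADC readings).

```c
#include <stdint.h>

#define AVG_SHIFT 3u                     /* assumed: N = 2^3 = 8 samples */
#define AVG_COUNT (1u << AVG_SHIFT)

/* Average AVG_COUNT unsigned samples with one accumulation loop and a
 * single right shift in place of the division. */
uint16_t mean_block(const uint16_t samples[AVG_COUNT])
{
    uint32_t sum = 0;                    /* 8 x 16-bit values always fit in 32 bits */
    for (uint8_t i = 0; i < AVG_COUNT; i++) {
        sum += samples[i];
    }
    return (uint16_t)(sum >> AVG_SHIFT); /* same as sum / 8, truncated */
}
```

The shift is only a safe replacement for division on unsigned (or known non-negative) values; right-shifting a negative signed integer is implementation-defined in C. The result is also truncated, so [1,2,3,100,1,2,3,100] yields 26 rather than 26.5.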

2. Advantages

Extremely Simple Calculation: requires only “addition + division (or shifting)”, with very low computational requirements;
No Additional Memory: does not require storing historical data (only one accumulation variable), resulting in low memory usage;
Intuitive and Understandable: results align with common sense, suitable for scenarios with low accuracy requirements.

3. Usage Scenarios

Data is stable without sudden noise: such as indoor temperature monitoring (slow changes, no pulse interference);
Low-end embedded systems with limited hardware resources: such as the 8051-family (“51”) microcontroller we used previously;
High real-time requirements but low accuracy requirements: such as simple current sampling, counting the duration of button presses.

4. Limitations

Poor Anti-Interference Capability: if there is one extreme value in the data (such as a pulse caused by sensor mis-triggering), it will severely skew the result (e.g., the average of [1,2,3,100] is 26.5, far from the true level);
Requires Fixed Data Volume N: must wait for N data points to be collected before calculating, so it cannot “output the current average in real-time”.

2. Sliding Window Average

1. Algorithm Principle

Maintains a “data window” of fixed size N (usually stored in an array). For each new data point collected, the oldest data in the window is removed, the new data is added, and the average of all data in the window is calculated. The formula is:

New_Avg = (Old_Sum - Oldest_Data + New_Data) / N

(Optimization point: by maintaining the “window sum”, it avoids recalculating the sum each time the average is computed, requiring only 3 arithmetic operations per update: one subtraction, one addition, and one division)
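
A hedged sketch of the running-sum update in C, using a ring buffer as the window. The type and function names (sliding_avg_t, sliding_avg_init, sliding_avg_update) and the window size of 4 are assumptions made for this example. One deliberate deviation from the formula: while the window is still filling, the sketch divides by the number of samples seen so far instead of N, so the first outputs are not dragged toward zero.

```c
#include <stdint.h>

#define WIN_SIZE 4u                      /* assumed window size N */

typedef struct {
    uint16_t buf[WIN_SIZE];              /* ring buffer holding the last N samples */
    uint32_t sum;                        /* running sum of everything in buf */
    uint8_t  head;                       /* index of the oldest sample */
    uint8_t  count;                      /* valid samples so far, capped at WIN_SIZE */
} sliding_avg_t;

void sliding_avg_init(sliding_avg_t *f)
{
    f->sum = 0;
    f->head = 0;
    f->count = 0;
    for (uint8_t i = 0; i < WIN_SIZE; i++) {
        f->buf[i] = 0;
    }
}

/* Push one sample and return the average of the samples now in the window. */
uint16_t sliding_avg_update(sliding_avg_t *f, uint16_t x)
{
    f->sum -= f->buf[f->head];           /* drop the oldest sample (0 while filling) */
    f->sum += x;                         /* add the newest sample */
    f->buf[f->head] = x;                 /* overwrite the oldest slot */
    f->head = (uint8_t)((f->head + 1u) % WIN_SIZE);
    if (f->count < WIN_SIZE) {
        f->count++;                      /* window still filling */
    }
    return (uint16_t)(f->sum / f->count);
}
```

Feeding 10, 20, 30, 40, 50 into a freshly initialized filter returns 10, 15, 20, 25, 35: the fifth call drops the 10 and averages [20,30,40,50].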

2. Advantages

Strong Real-Time Performance: the average can be updated with each new data point collected, without waiting for N data points;
Resistance to Short-Term Interference: extreme values are removed as the window slides, so they have limited impact on the result;
Controllable Stability: the window size N can be flexibly adjusted (a larger N gives smoother results but slower response; a smaller N gives faster response but greater fluctuations).

3. Usage Scenarios

Data changes slowly but occasionally has short-term interference: such as vehicle water temperature monitoring (may have short-term fluctuations when the engine starts), indoor light sensor data processing;
Scenarios requiring real-time output of smooth results: such as smart home temperature and humidity display screens (need to update in real-time without noticeable jumps).

4. Limitations

Requires Additional Memory: needs to store N historical data points, so a larger N means higher memory usage;
Not Effective Against Sustained Extreme Values: if the interference persists (such as a sensor fault outputting a fixed maximum value), all data in the window will be extreme values, and the result will still be severely biased.

3. Weighted Average

1. Algorithm Principle

Assigns different “weights” to different data (usually newer data has a higher weight, older data has a lower weight), emphasizing the influence of recent data, with the formula:

Avg = (w₁x₁ + w₂x₂ + ... + wₙxₙ) / (w₁ + w₂ + ... + wₙ)

where w₁ < w₂ < ... < wₙ (the weight of the newest data xₙ is the largest). Common weight designs are “linearly increasing” (e.g., w=[1,2,3,4]) or “exponentially increasing” (e.g., w=[1,2,4,8]).
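
As an illustration of the formula, here is a small C sketch using the linearly increasing weights [1,2,3,4]. The names weighted_avg, weights and W_COUNT are hypothetical, and the sample ordering (oldest first) is an assumption of this example.

```c
#include <stdint.h>

#define W_COUNT 4u

/* Linearly increasing weights, oldest sample first. */
static const uint8_t weights[W_COUNT] = {1, 2, 3, 4};

/* samples[0] is the oldest value, samples[W_COUNT - 1] the newest. */
uint16_t weighted_avg(const uint16_t samples[W_COUNT])
{
    uint32_t weighted_sum = 0;           /* w1*x1 + w2*x2 + ... */
    uint32_t weight_sum = 0;             /* w1 + w2 + ... */
    for (uint8_t i = 0; i < W_COUNT; i++) {
        weighted_sum += (uint32_t)weights[i] * samples[i];
        weight_sum += weights[i];
    }
    return (uint16_t)(weighted_sum / weight_sum);
}
```

For the samples [10,10,10,20] this returns 14 (140 / 10), whereas the plain arithmetic mean is 12.5, so the newest value pulls the result further. Because the weights are fixed, the weight sum is a constant and could be precomputed to save one accumulation per call.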

2. Advantages

Dynamic Response: new data has a higher weight, allowing for quick tracking of data changes (faster response than sliding window average);
Balance Between Stability and Responsiveness: avoids the slow response of the arithmetic mean while preventing “instantaneous value jumps” (old data still contributes, ensuring smooth transitions);
Flexible Weights: can be adjusted based on the scenario (e.g., in industrial control, assign higher weights to “critical period data”).

3. Usage Scenarios

Data changes quickly but requires smooth transitions: such as motor speed control (speed adjustments need to respond quickly but cannot jump instantly), lithium battery voltage monitoring (voltage rises quickly during charging, requiring real-time tracking without fluctuations);
Scenarios requiring emphasis on recent data value: such as real-time heart rate monitoring (recent heartbeat data reflects the current state better than data from 10 seconds ago);
Embedded systems with high accuracy requirements: such as blood oxygen sensor data processing in medical devices.

4. Limitations

High Computational Complexity: requires additional calculations for the “weighted sum” and the “weight sum”, which places a noticeable load on low-end MCUs;
Weight Design Relies on Experience: if the weight distribution is unreasonable (e.g., too large a gap between weights), it may lead to result fluctuations or delayed responses.

4. Median Average (also known as “Impulse Interference Resistant Average”)

1. Algorithm Principle

Belongs to the family of “robust algorithms”: the core idea is to first eliminate extreme values and then calculate the average. A typical process is as follows:

Collect N data points (N is usually an odd number, such as 5 or 7);

Sort the data (in ascending / descending order);

Remove the K smallest values and the K largest values after sorting (usually K=1 or K=2, i.e., eliminate 1-2 extreme values at each end);

Calculate the arithmetic average of the remaining N-2K data points. Example: with N=5 and K=1, the data [1,2,100,3,4] sorts to [1,2,3,4,100]; removing 1 and 100 leaves [2,3,4], whose average is 3.
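
The four steps above can be sketched in C as follows. This is an illustrative version only: median_avg, MED_N and MED_K are names chosen for the example, N=5 and K=1 are assumed, and an insertion sort is used for the sorting step simply because it is compact for such small N (the choice of sort is not prescribed by the algorithm).

```c
#include <stdint.h>

#define MED_N 5u                         /* samples per batch (assumed) */
#define MED_K 1u                         /* values trimmed from each end (assumed) */

/* Sort a copy of the batch, drop the MED_K smallest and MED_K largest
 * values, and return the arithmetic mean of what remains. */
uint16_t median_avg(const uint16_t samples[MED_N])
{
    uint16_t sorted[MED_N];
    for (uint8_t i = 0; i < MED_N; i++) {
        sorted[i] = samples[i];          /* work on a copy, keep the input intact */
    }

    /* Insertion sort, ascending. */
    for (uint8_t i = 1; i < MED_N; i++) {
        uint16_t key = sorted[i];
        uint8_t j = i;
        while (j > 0 && sorted[j - 1] > key) {
            sorted[j] = sorted[j - 1];
            j--;
        }
        sorted[j] = key;
    }

    uint32_t sum = 0;
    for (uint8_t i = MED_K; i < MED_N - MED_K; i++) {
        sum += sorted[i];                /* skip the trimmed ends */
    }
    return (uint16_t)(sum / (MED_N - 2u * MED_K));
}
```

Called with [1,2,100,3,4], the function returns 3, matching the worked example.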

2. Advantages

Strong Resistance to Impulse Interference: even with multiple extreme values (e.g., N=7 with K=2 can eliminate the 2 largest and 2 smallest values), it filters effectively, producing values close to the true value;
High Robustness: does not depend on the data distribution (such as normal or uniform distribution), and has good fault tolerance for sudden sensor failures (such as instantaneously outputting maximum values).

3. Usage Scenarios

Data contains a lot of impulse noise: such as current sampling in industrial environments (strong electromagnetic interference during motor start-stop causes data jumps), outdoor light sensors (cloud cover / direct sunlight causes instantaneous fluctuations);
Scenarios requiring extremely high accuracy and stability: such as blood pressure monitoring in medical devices, wheel speed data processing in automotive ABS systems (extreme values may lead to misjudgment).

4. Limitations

Poor Real-Time Performance: requires collecting N data points and sorting them before calculating, and the sorting operation (such as bubble sort) consumes significant computational power on MCUs (N=7 requires about 21 comparison operations, i.e., N(N-1)/2);
Requires Storing N Data Points: memory usage is comparable to the sliding window, but with the added time overhead of sorting.

5. Exponential Moving Average (EMA)

1. Algorithm Principle

A type of “infinite window” weighted average that only requires the current data, the last EMA result, and the smoothing coefficient α (0<α<1) to calculate, with the formula:

EMAₙ = α × xₙ + (1-α) × EMAₙ₋₁

where α determines the degree of smoothing: a larger α (e.g., 0.8) gives higher weight to new data, resulting in faster response; a smaller α (e.g., 0.1) gives smoother results but slower response.

Embedded Optimization Implementation: to avoid floating-point operations (reducing computational power consumption), α is expressed as a fraction (e.g., α=1/8), and the formula is converted to integer operations: EMAₙ = (xₙ + 7×EMAₙ₋₁) >> 3 (>>3 is equivalent to /8).
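
A minimal C sketch of the integer form with α=1/8. The names ema_t, ema_update and EMA_SHIFT are assumptions for this example, as is the choice to seed the filter with the first sample and to work on unsigned 16-bit inputs.

```c
#include <stdint.h>
#include <stdbool.h>

#define EMA_SHIFT 3u                     /* alpha = 1 / 2^3 = 1/8 */

typedef struct {
    uint16_t value;                      /* last EMA result */
    bool     seeded;                     /* false until the first sample arrives */
} ema_t;

/* Feed one sample and return the updated EMA. */
uint16_t ema_update(ema_t *f, uint16_t x)
{
    if (!f->seeded) {
        f->value = x;                    /* use the first sample as the initial EMA */
        f->seeded = true;
    } else {
        /* (x + 7 * EMA_prev) >> 3, computed in 32 bits to avoid overflow */
        uint32_t acc = (uint32_t)x + ((1u << EMA_SHIFT) - 1u) * (uint32_t)f->value;
        f->value = (uint16_t)(acc >> EMA_SHIFT);
    }
    return f->value;
}
```

Starting from `ema_t f = {0, false};`, the inputs 80, 160, 160 give 80, 90, 98. Note that the truncating shift can leave the output resting a few counts below a constant input (for example, it stalls at 153 for a steady input of 160); keeping the state scaled up by 2^EMA_SHIFT is one common way to reduce this.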

2. Advantages

Extremely Low Memory Usage: does not require storing historical data, only the last EMA result (1 variable) needs to be saved, suitable for MCUs with very small memory (such as 8-bit AVR microcontrollers);
Strong Real-Time Performance: the average can be updated with each new data point collected, requiring only 2 multiplications (or shifts) + 1 addition, resulting in fast computation speed;
Adjustable Smoothness: by adjusting α, the balance between response speed and smoothness can be tuned without modifying a window size.

3. Usage Scenarios

Scenarios with extremely limited memory and computational power: such as IoT sensor nodes (e.g., low-power mode of ESP8266, requiring minimal resource usage), step counting in smart bands (real-time updates with limited memory);
Data changes slowly and requires real-time smoothing: such as indoor temperature and humidity monitoring, mobile phone battery level estimation (requiring real-time display without noticeable jumps);
Scenarios unsuitable for storing historical data: such as one-time use sensor modules (only need to output real-time smooth values without needing to backtrack data).

4. Limitations

Sensitive to Initial Value: the initial EMA value must be set manually for the first calculation (usually to the first sampled value), and the results may deviate from the true value for the first few iterations, stabilizing after roughly 3-5 iterations when α is large; because the initial error shrinks by a factor of (1-α) each step, a small α (such as 1/8) takes considerably longer;
Sensitive to Sustained Extreme Values: if new data is persistently extreme, the EMA will gradually converge to that value (the share of the old state shrinks by a factor of (1-α) each step), and it cannot eliminate the outlier outright the way the median average does.

6. Cumulative Average

1. Algorithm Principle

Calculates the average of “all data from the start of sampling to the current point”, with the formula:

Avgₙ = (Avgₙ₋₁ × (n-1) + xₙ) / n

where n is the current sampling count (for the first sampling n=1, Avg₁=x₁; for the second sampling n=2, Avg₂=(x₁+x₂)/2, and so on).
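
A short C sketch of the update, storing only the last average and the count. It uses the algebraically equivalent form Avgₙ = Avgₙ₋₁ + (xₙ - Avgₙ₋₁) / n, which avoids forming the product Avgₙ₋₁ × (n-1). The names cum_avg_t and cum_avg_update are hypothetical, and the use of float is an assumption made for readability; on an MCU without an FPU a fixed-point variant may be preferable.

```c
#include <stdint.h>

typedef struct {
    float    avg;                        /* average of all samples so far */
    uint32_t n;                          /* number of samples so far */
} cum_avg_t;

/* Fold one new sample into the cumulative average and return it. */
float cum_avg_update(cum_avg_t *f, float x)
{
    f->n++;
    /* Same result as (avg * (n - 1) + x) / n, rearranged. */
    f->avg += (x - f->avg) / (float)f->n;
    return f->avg;
}
```

Starting from `cum_avg_t f = {0.0f, 0};`, the inputs 10, 20, 30 produce 10, 15, 20. The correction term is divided by n, which shows directly why each new sample matters less as n grows.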

2. Advantages

No Fixed Window Limitations: as the amount of data increases, the result gradually approaches the true average, suitable for “long-term monitoring” scenarios;
Low Memory Usage: only the last average value and the sampling count n need to be saved, without storing historical data.

3. Usage Scenarios

Long-Term Data Trend Analysis: such as monthly average temperature statistics at environmental monitoring stations, average current consumption calculations during equipment operation (need to accumulate all data to reflect long-term status);
Scenarios that do not require real-time responses to rapid changes: such as average voltage monitoring during battery charge and discharge cycles (focusing on the average level over the entire cycle rather than instantaneous fluctuations).

4. Limitations

Poor Real-Time Performance: the influence of new data on the average diminishes as n increases (e.g., at n=1000, a new data point only accounts for 0.1% of the average), making it unable to track short-term changes;
Poor Anti-Interference Capability: early extreme values have a long-term impact on the result (e.g., at n=1000, an extreme value from the first sampling still accounts for 0.1% of the average).

Comparison of Algorithms and Selection Recommendations

Algorithm Type	Core Advantages	Core Disadvantages	Typical Scenarios
Arithmetic Mean	Extremely Simple, Low Resource	Poor Anti-Interference, Poor Real-Time Performance	Stable Data, Low-End Devices
Sliding Window Average	Good Real-Time Performance, Resistance to Short-Term Interference	Requires Storing Historical Data	Slowly Changing Data, Real-Time Display
Weighted Average	Fast Response, Balances Smoothness and Dynamics	Complex Calculation, Difficult Weight Design	Fast Changing Data, High Accuracy Requirements
Median Average	Extremely Strong Resistance to Impulse Interference, High Robustness	Poor Real-Time Performance, Requires Sorting	Strong Interference Environments, High Reliability Requirements
Exponential Moving Average	Low Memory, High Real-Time Performance	Sensitive to Initial Value, Weak Against Sustained Interference	Low Resource Devices, Real-Time Smoothing
Cumulative Average	Long-Term Trend Accuracy, Low Memory	Poor Short-Term Response, Weak Anti-Interference	Long-Term Data Statistics, Trend Analysis

Core Principles for Selection

Resource Priority: if using an 8-bit MCU or memory is insufficient, prioritize Arithmetic Mean or EMA;
Anti-Interference Priority: if there is impulse noise, prioritize Median Average; for short-term interference, choose Sliding Window;
Real-Time Performance Priority: if real-time updates are needed, choose EMA or Sliding Window; if delays are acceptable, choose Arithmetic Mean or Median Average;
Accuracy Priority: if balancing dynamic response and smoothness is needed, choose Weighted Average or EMA; if filtering extreme values is needed, choose Median Average.
