
In embedded software development, a significant portion of the workload involves data collection, processing, and output. The average value algorithm plays a crucial role in this data processing, serving as a core tool for handling sensor data, signal filtering, and error compensation.
The core requirements are to balance accuracy, real-time performance, resource usage (memory / computational power), and anti-interference capability. Different algorithms suit different hardware conditions, data characteristics (such as stable data / impulse noise), and usage scenarios. Below is a detailed analysis of six commonly used average algorithms:
1. Arithmetic Mean
1. Algorithm Principle
The most basic average algorithm sums <span>N</span> consecutive samples <span>x₁, x₂, ..., xₙ</span> and divides by the number of data points, with the formula:
Avg = (x₁ + x₂ + ... + xₙ) / N
Optimized Implementation: In embedded systems, it is common to use “accumulate first, divide once” (avoiding repeated divisions that consume computational power). If <span>N</span> is a power of 2 (such as 8 or 16), the division can be replaced with a right shift for unsigned or non-negative values (such as <span>/8 = >>3</span>), further improving efficiency.
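The “accumulate first, divide once” idea with a shift can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name `mean_pow2_u16`, the 16-bit unsigned sample type, and the `shift` parameter (block size N = 2^shift) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arithmetic mean of 2^shift unsigned samples.
 * Hypothetical helper: one accumulation pass, then a single right shift. */
uint16_t mean_pow2_u16(const uint16_t *samples, unsigned shift)
{
    /* shift <= 15 keeps the 32-bit sum from overflowing with 16-bit samples */
    assert(samples != NULL && shift <= 15u);

    const size_t n = (size_t)1 << shift;
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    /* Same as sum / n for unsigned values; the fraction is truncated */
    return (uint16_t)(sum >> shift);
}
```

Calling it with `shift = 3` averages 8 samples. Because the shift truncates, the result is rounded down; adding half of N to the sum before shifting would round to nearest instead.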
2. Advantages
- Extremely Simple Calculation: requires only “addition + division (or shifting)”, with very low computational requirements;
- No Additional Memory: does not require storing historical data (only one accumulation variable), resulting in low memory usage;
- Intuitive and Understandable: results align with common sense, suitable for scenarios with low accuracy requirements.
3. Usage Scenarios
- Data is stable without sudden noise: such as indoor temperature monitoring (slow changes, no pulse interference);
- Low-end embedded systems with limited hardware resources: such as the 51 microcontroller we used previously;
- High real-time requirements but low accuracy requirements: such as simple current sampling, counting the duration of button presses.
4. Limitations
- Poor Anti-Interference Capability: if there is one extreme value in the data (such as a pulse caused by sensor mis-triggering), it will severely skew the result (e.g., the average of <span>[1,2,3,100]</span> is 26.5, far from the true level);
- Requires Fixed Data Volume <span>N</span>: must wait for <span>N</span> data points to be collected before calculating, so it cannot “output the current average in real-time”.
2. Sliding Window Average
1. Algorithm Principle
Maintains a fixed-size “data window” of <span>N</span> samples (usually stored in an array). For each new data point collected, the oldest data point in the window is removed, the new one is added, and the average of all data in the window is calculated. The formula is:
New_Avg = (Old_Sum - Oldest_Data + New_Data) / N
(Optimization point: by maintaining the “window sum”, it avoids recalculating the sum each time the average is computed, requiring only 3 arithmetic operations)
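The window-sum update above can be sketched with a ring buffer. The names `sliding_avg_t`, `sliding_avg_init`, `sliding_avg_update` and the window length `WIN_SIZE = 4` are hypothetical, chosen only for illustration. One design choice not stated in the text: while the window is still filling, this sketch divides by the number of samples received so far rather than by N.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WIN_SIZE 4u /* window length N (hypothetical value) */

typedef struct {
    uint16_t buf[WIN_SIZE]; /* ring buffer holding the last N samples */
    uint32_t sum;           /* running window sum */
    uint8_t  head;          /* next slot to overwrite (oldest sample once full) */
    uint8_t  count;         /* samples received so far, saturates at WIN_SIZE */
} sliding_avg_t;

void sliding_avg_init(sliding_avg_t *f)
{
    assert(f != NULL);
    for (uint8_t i = 0; i < WIN_SIZE; i++) {
        f->buf[i] = 0;
    }
    f->sum = 0;
    f->head = 0;
    f->count = 0;
}

uint16_t sliding_avg_update(sliding_avg_t *f, uint16_t sample)
{
    assert(f != NULL);

    if (f->count == WIN_SIZE) {
        f->sum -= f->buf[f->head]; /* Old_Sum - Oldest_Data */
    } else {
        f->count++;                /* window is still filling up */
    }
    f->buf[f->head] = sample;      /* new sample replaces the oldest slot */
    f->sum += sample;              /* ... + New_Data */
    f->head = (uint8_t)((f->head + 1u) % WIN_SIZE);

    return (uint16_t)(f->sum / f->count);
}
```

Feeding 10, 20, 30, 40, 50 into a freshly initialized filter yields 10, 15, 20, 25, 35: the fifth sample pushes the 10 out of the window.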
2. Advantages
- Strong Real-Time Performance: the average can be updated with each new data point collected, without waiting for <span>N</span> data points;
- Resistance to Short-Term Interference: extreme values are removed as the window slides, so their impact on the result is limited;
- Controllable Stability: the window size <span>N</span> can be flexibly adjusted (a larger <span>N</span> gives smoother results but slower response; a smaller <span>N</span> gives faster response but greater fluctuations).
3. Usage Scenarios
- Data changes slowly but occasionally has short-term interference: such as vehicle water temperature monitoring (may have short-term fluctuations when the engine starts), indoor light sensor data processing;
- Scenarios requiring real-time output of smooth results: such as smart home temperature and humidity display screens (need to update in real-time without noticeable jumps).
4. Limitations
- Requires Additional Memory: needs to store <span>N</span> historical data points, so a larger <span>N</span> means higher memory usage;
- Not Effective Against Sustained Extreme Values: if the interference persists (such as a sensor fault outputting a fixed maximum value), all data in the window will be extreme values, and the result will still be severely biased.
3. Weighted Average
1. Algorithm Principle
Assigns different “weights” to different data (usually newer data has a higher weight, older data has a lower weight), emphasizing the influence of recent data, with the formula:
Avg = (w₁x₁ + w₂x₂ + ... + wₙxₙ) / (w₁ + w₂ + ... + wₙ)
where <span>w₁ < w₂ < ... < wₙ</span> (the newest data point <span>xₙ</span> has the largest weight). Common weight designs are “linearly increasing” (e.g., <span>w=[1,2,3,4]</span>) or “exponentially increasing” (e.g., <span>w=[1,2,4,8]</span>).
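The weighted formula can be sketched in integer arithmetic as below. The function name `weighted_avg_u16`, the 8-bit weight type, and the oldest-first sample order are assumptions for illustration; the caller supplies whatever weight table suits the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weighted average of n samples.
 * samples[0] is the oldest value, samples[n-1] the newest (assumed order).
 * The 32-bit accumulators are safe for small n (up to a few hundred samples). */
uint16_t weighted_avg_u16(const uint16_t *samples,
                          const uint8_t *weights,
                          size_t n)
{
    assert(samples != NULL && weights != NULL && n > 0u);

    uint32_t weighted_sum = 0; /* w1*x1 + w2*x2 + ... + wn*xn */
    uint32_t weight_sum = 0;   /* w1 + w2 + ... + wn */

    for (size_t i = 0; i < n; i++) {
        weighted_sum += (uint32_t)weights[i] * samples[i];
        weight_sum += weights[i];
    }
    assert(weight_sum > 0u);   /* at least one weight must be non-zero */

    return (uint16_t)(weighted_sum / weight_sum);
}
```

With samples `[10, 20, 30, 40]` and the linear weights `[1, 2, 3, 4]`, the result is 300 / 10 = 30, compared with a plain arithmetic mean of 25: the newer, larger values pull the average upward. If the weights are fixed at compile time, the weight sum can be precomputed so that only the weighted sum is calculated at run time.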
2. Advantages
- Dynamic Response: new data has a higher weight, allowing for quick tracking of data changes (faster response than sliding window average);
- Balance Between Stability and Responsiveness: avoids the slow response of the arithmetic mean while preventing “instantaneous value jumps” (old data still contributes, ensuring smooth transitions);
- Flexible Weights: can be adjusted based on the scenario (e.g., in industrial control, assign higher weights to “critical period data”).
3. Usage Scenarios
- Data changes quickly but requires smooth transitions: such as motor speed control (speed adjustments need to respond quickly but cannot jump instantly), lithium battery voltage monitoring (voltage rises quickly during charging, requiring real-time tracking without fluctuations);
- Scenarios requiring emphasis on recent data value: such as real-time heart rate monitoring (recent heartbeat data reflects the current state better than data from 10 seconds ago);
- Embedded systems with high accuracy requirements: such as blood oxygen sensor data processing in medical devices.
4. Limitations
- High Computational Complexity: requires additional calculations for the “weighted sum” and “weight sum”, which places some demand on the computational power of low-end MCUs;
- Weight Design Relies on Experience: if the weight distribution is unreasonable (e.g., too large a gap between weights), it may lead to result fluctuations or delayed responses.
4. Median Average (also known as “Impulse Interference Resistant Average”)
1. Algorithm Principle
This belongs to the family of “robust algorithms”: the core idea is to eliminate extreme values first and then calculate the average. A typical process is as follows:
- Collect <span>N</span> data points (<span>N</span> is usually an odd number, such as 5 or 7);
- Sort the data (in ascending or descending order);
- Remove the <span>K</span> smallest values and the <span>K</span> largest values after sorting (usually <span>K=1</span> or <span>K=2</span>, i.e., eliminate 1-2 extreme values at each end);
- Calculate the arithmetic average of the remaining <span>N-2K</span> data points.
Example: with <span>N=5, K=1</span>, the data <span>[1,2,100,3,4]</span> sorts to <span>[1,2,3,4,100]</span>; removing 1 and 100 leaves <span>[2,3,4]</span>, whose average is 3.
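The collect, sort, trim, and average steps above can be sketched as follows. The function name `median_avg_u16` and the buffer limit `MA_MAX_N` are hypothetical. This sketch sorts a local copy with insertion sort rather than bubble sort, which is a substitution made here because it is compact for small N; any sort would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MA_MAX_N 16u /* largest supported N (hypothetical limit) */

/* Drop the k smallest and k largest of n samples, then average the rest. */
uint16_t median_avg_u16(const uint16_t *samples, size_t n, size_t k)
{
    assert(samples != NULL && n <= MA_MAX_N && n > 2u * k);

    uint16_t sorted[MA_MAX_N];

    /* Insertion sort into a local copy so the caller's data is untouched */
    for (size_t i = 0; i < n; i++) {
        uint16_t v = samples[i];
        size_t j = i;
        while (j > 0u && sorted[j - 1u] > v) {
            sorted[j] = sorted[j - 1u];
            j--;
        }
        sorted[j] = v;
    }

    /* Sum only the middle n - 2k values (indices k .. n-k-1) */
    uint32_t sum = 0;
    for (size_t i = k; i < n - k; i++) {
        sum += sorted[i];
    }
    return (uint16_t)(sum / (n - 2u * k));
}
```

For the example data `[1,2,100,3,4]` with `n = 5` and `k = 1`, the function returns 3, matching the worked example.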
2. Advantages
- Strong Resistance to Impulse Interference: even with multiple extreme values (e.g., <span>N=7</span> with <span>K=2</span> can eliminate the 2 largest and 2 smallest values), it filters effectively, producing values close to the true value;
- High Robustness: does not depend on the data distribution (such as normal or uniform distribution), and tolerates sudden sensor failures well (such as instantaneously outputting the maximum value).
3. Usage Scenarios
- Data contains a lot of impulse noise: such as current sampling in industrial environments (strong electromagnetic interference during motor start-stop causes data jumps), outdoor light sensors (cloud cover / direct sunlight causes instantaneous fluctuations);
- Scenarios requiring extremely high accuracy and stability: such as blood pressure monitoring in medical devices, wheel speed data processing in automotive ABS systems (extreme values may lead to misjudgment).
4. Limitations
- Poor Real-Time Performance: requires collecting <span>N</span> data points and sorting them before calculating, and the sorting operation (such as bubble sort) consumes significant computational power on MCUs (<span>N=7</span> requires about 21 comparison operations);
- Requires Storing <span>N</span> Data Points: memory usage is comparable to the sliding window, with the added time overhead of sorting.
5. Exponential Moving Average (EMA)
1. Algorithm Principle
A type of “infinite window” weighted average that only requires the current data, the last EMA result, and the smoothing coefficient α (<span>0<α<1</span>) to calculate, with the formula:
EMAₙ = α × xₙ + (1-α) × EMAₙ₋₁
where <span>α</span> determines the degree of smoothing: a larger <span>α</span> (e.g., 0.8) gives higher weight to new data, resulting in faster response; a smaller <span>α</span> (e.g., 0.1) gives smoother results but slower response.
Embedded Optimization Implementation: to avoid floating-point operations (reducing computational power consumption), <span>α</span> is expressed as a fraction (e.g., <span>α=1/8</span>), and the formula is converted to integer operations: <span>EMAₙ = (xₙ + 7×EMAₙ₋₁) >> 3</span> (<span>>>3</span> is equivalent to <span>/8</span>).
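The integer form of the EMA can be sketched as below, using <span>α=1/8</span> as in the text. The names `ema_t`, `ema_update`, and `EMA_SHIFT`, and the `primed` flag that seeds the filter with the first sample, are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EMA_SHIFT 3u /* alpha = 1 / 2^3 = 1/8 */

typedef struct {
    uint16_t value;  /* last EMA result */
    bool     primed; /* false until the first sample has been seen */
} ema_t;

uint16_t ema_update(ema_t *f, uint16_t sample)
{
    assert(f != NULL);

    if (!f->primed) {
        /* Seed with the first sample instead of starting from zero */
        f->value = sample;
        f->primed = true;
    } else {
        /* EMA = (x + 7 * EMA_prev) >> 3, computed in 32 bits to avoid overflow */
        uint32_t acc = (uint32_t)sample
                     + (((uint32_t)1 << EMA_SHIFT) - 1u) * f->value;
        f->value = (uint16_t)(acc >> EMA_SHIFT);
    }
    return f->value;
}
```

Starting from `ema_t f = {0, false}`, the inputs 100, 200, 200 produce 100, 112, 123. One caveat of this direct form: the shift truncates, so with a constant input the output can settle up to 7 counts below it (for example, it stays at 93 for an input of 100 when approaching from below). A common remedy is to store the state scaled by 8 and shift only when reading the result.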
2. Advantages
- Extremely Low Memory Usage: does not require storing historical data, only the last EMA result (1 variable) needs to be saved, suitable for MCUs with very small memory (such as 8-bit AVR microcontrollers);
- Strong Real-Time Performance: the average can be updated with each new data point collected, requiring only 2 multiplications (or shifts) + 1 addition, resulting in fast computation speed;
- Adjustable Smoothness: by adjusting <span>α</span>, the balance between response speed and smoothness can be tuned without modifying a window size.
3. Usage Scenarios
- Scenarios with extremely limited memory and computational power: such as IoT sensor nodes (e.g., low-power mode of ESP8266, requiring minimal resource usage), step counting in smart bands (real-time updates with limited memory);
- Data changes slowly and requires real-time smoothing: such as indoor temperature and humidity monitoring, mobile phone battery level estimation (requiring real-time display without noticeable jumps);
- Scenarios unsuitable for storing historical data: such as one-time use sensor modules (only need to output real-time smooth values without needing to backtrack data).
4. Limitations
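- Sensitive to Initial Value: the initial EMA value must be set manually for the first calculation (usually to the first sampled value), and the first few results may deviate from the true value. The influence of the initial value shrinks by a factor of <span>(1-α)</span> per iteration, so the output stabilizes after 3-5 iterations for a large <span>α</span> but takes longer for a small one;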
- Sensitive to Sustained Extreme Values: if new data is persistently extreme, the EMA gradually converges to that value (the weight of old data decays by a factor of <span>(1-α)</span> at each step), and it cannot discard extreme values outright the way the median average does.
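The settling behaviour behind both limitations can be made explicit. As a sketch, assume the input holds a constant value c from the first update onward (an idealization introduced here for illustration). Subtracting c from both sides of the EMA formula and unrolling the recursion gives:

```latex
\mathrm{EMA}_n - c = (1-\alpha)\,(\mathrm{EMA}_{n-1} - c)
\quad\Longrightarrow\quad
\mathrm{EMA}_n - c = (1-\alpha)^{n}\,(\mathrm{EMA}_0 - c)
```

The gap to the input therefore shrinks geometrically. For <span>α=0.8</span>, only 0.2³ = 0.8% of the initial gap remains after 3 updates, while for <span>α=1/8</span> about 51% of the gap is still left after 5 updates, since (7/8)⁵ ≈ 0.51. The same expression describes how a sustained extreme value gradually takes over the result.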
6. Cumulative Average
1. Algorithm Principle
Calculates the average of “all data from the start of sampling to the current point”, with the formula:
Avgₙ = (Avgₙ₋₁ × (n-1) + xₙ) / n
where <span>n</span> is the current sampling count (for the first sampling <span>n=1</span>, <span>Avg₁=x₁</span>; for the second sampling <span>n=2</span>, <span>Avg₂=(x₁+x₂)/2</span>, and so on).
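The recursive formula can be sketched as below. The names `cum_avg_t` and `cum_avg_update` are hypothetical. The sketch uses the algebraically equivalent incremental form Avgₙ = Avgₙ₋₁ + (xₙ - Avgₙ₋₁) / n, and it stores the average as a `float`, which is an assumption made here: with an integer average, small corrections would be truncated to zero once <span>n</span> grows large.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    float    avg; /* average of all samples seen so far */
    uint32_t n;   /* number of samples seen so far */
} cum_avg_t;

/* Fold one new sample into the running average and return the new average. */
float cum_avg_update(cum_avg_t *s, float sample)
{
    assert(s != NULL);

    s->n++;
    /* Equivalent to (avg * (n - 1) + sample) / n, without the large product */
    s->avg += (sample - s->avg) / (float)s->n;
    return s->avg;
}
```

Starting from `cum_avg_t s = {0.0f, 0u}`, the inputs 10, 20, 30 give 10, 15, 20. On MCUs without a floating-point unit, a fixed-point representation of the average would be the usual alternative.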
2. Advantages
- No Fixed Window Limitations: as the amount of data increases, the result gradually approaches the true average, suitable for “long-term monitoring” scenarios;
- Low Memory Usage: only the last average value and the sampling count <span>n</span> need to be saved, without storing historical data.
3. Usage Scenarios
- Long-Term Data Trend Analysis: such as monthly average temperature statistics at environmental monitoring stations, average current consumption calculations during equipment operation (need to accumulate all data to reflect long-term status);
- Scenarios that do not require real-time responses to rapid changes: such as average voltage monitoring during battery charge and discharge cycles (focusing on the average level over the entire cycle rather than instantaneous fluctuations).
4. Limitations
- Poor Real-Time Performance: the influence of new data on the average diminishes as <span>n</span> increases (e.g., at <span>n=1000</span>, new data only accounts for 0.1% of the average), making it unable to track short-term changes;
- Poor Anti-Interference Capability: early extreme values have a long-term impact on the result (e.g., at <span>n=1000</span>, an extreme value from the first sampling still accounts for 0.1% of the average).
Comparison of Algorithms and Selection Recommendations
| Algorithm Type | Core Advantages | Core Disadvantages | Typical Scenarios |
|---|---|---|---|
| Arithmetic Mean | Extremely Simple, Low Resource | Poor Anti-Interference, Poor Real-Time Performance | Stable Data, Low-End Devices |
| Sliding Window Average | Good Real-Time Performance, Resistance to Short-Term Interference | Requires Storing Historical Data | Slowly Changing Data, Real-Time Display |
| Weighted Average | Fast Response, Balances Smoothness and Dynamics | Complex Calculation, Difficult Weight Design | Fast Changing Data, High Accuracy Requirements |
| Median Average | Extremely Strong Resistance to Impulse Interference, High Robustness | Poor Real-Time Performance, Requires Sorting | Strong Interference Environments, High Reliability Requirements |
| Exponential Moving Average | Low Memory, High Real-Time Performance | Sensitive to Initial Value, Weak Against Sustained Interference | Low Resource Devices, Real-Time Smoothing |
| Cumulative Average | Long-Term Trend Accuracy, Low Memory | Poor Short-Term Response, Weak Anti-Interference | Long-Term Data Statistics, Trend Analysis |
Core Principles for Selection
- Resource Priority: if using an 8-bit MCU or memory is insufficient, prioritize Arithmetic Mean or EMA;
- Anti-Interference Priority: if there is impulse noise, prioritize Median Average; for short-term interference, choose Sliding Window;
- Real-Time Performance Priority: if real-time updates are needed, choose EMA or Sliding Window; if delays are acceptable, choose Arithmetic Mean or Median Average;
- Accuracy Priority: if balancing dynamic response and smoothness is needed, choose Weighted Average or EMA; if filtering extreme values is needed, choose Median Average.