Next-Generation MCU Development Towards Edge AI and Real-Time Control

Real-time control systems are a key driving force behind modern electronic systems, and embedded processing sits at their core. Their first task is to sense the physical world precisely. In a motor control system, for example, analog signals such as current and voltage waveforms and position-sensor outputs are captured by sampling circuits and analog-to-digital converters (ADCs), then passed as digital values to the real-time control microcontroller (MCU). After a series of mathematical transformations and calculations, the MCU drives the actuators through pulse-width modulation (PWM) outputs.
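The sense, compute, actuate chain described above can be sketched in a few lines. This is a minimal illustration, not TI's implementation: it assumes a three-phase motor with two sampled phase currents, uses the textbook Clarke and Park transforms followed by a PI controller, and collapses the output stage into a single duty value (a real drive would add an inverse Park transform and a modulator). All function names, gains, and limits are hypothetical.

```python
import math

def clarke(ia, ib):
    """Map two phase currents of a balanced three-phase system to alpha/beta."""
    return ia, (ia + 2.0 * ib) / math.sqrt(3.0)

def park(alpha, beta, theta):
    """Rotate alpha/beta into the d/q frame at electrical angle theta (radians)."""
    d = alpha * math.cos(theta) + beta * math.sin(theta)
    q = -alpha * math.sin(theta) + beta * math.cos(theta)
    return d, q

class PI:
    """Discrete PI controller with a clamped integrator and clamped output."""
    def __init__(self, kp, ki, dt, limit):
        self.kp, self.ki, self.dt, self.limit = kp, ki, dt, limit
        self.integral = 0.0

    def step(self, error):
        self.integral += self.ki * error * self.dt
        self.integral = max(-self.limit, min(self.limit, self.integral))
        out = self.kp * error + self.integral
        return max(-self.limit, min(self.limit, out))

def to_duty(v_cmd, v_bus):
    """Convert a voltage command in [-v_bus/2, +v_bus/2] to a duty cycle in [0, 1]."""
    return max(0.0, min(1.0, 0.5 + v_cmd / v_bus))

def control_step(ia, ib, theta, iq_ref, pi_q, v_bus):
    """One pass of the loop: sampled currents in, simplified duty command out."""
    alpha, beta = clarke(ia, ib)
    _, iq = park(alpha, beta, theta)
    return to_duty(pi_q.step(iq_ref - iq), v_bus)
```

On a real MCU this function body would run inside the ADC end-of-conversion interrupt, once per PWM period, which is why both math throughput and interrupt latency matter.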

In this electronic real-time control world, the MCU or digital signal processor (DSP) undoubtedly serves as the intelligent brain of the entire system. The sensing part acts like the sensory organs, the actuators are akin to the muscular system, while communication modules such as EtherCAT, Ethernet, CAN bus, or serial communication form the lifeblood of this foundational real-time control system.

In practice, especially in industrial and automotive applications, motor drives and digital power conversion are the most common real-time control systems. Both demand strong real-time capability from the processor: not only powerful mathematical computation and real-time processing, but also high-quality ADC and PWM peripherals, all tightly linked to form an efficient control system.

Challenges in Optimizing Real-Time Control Performance and Power Consumption

So, what challenges do engineers face when designing real-time systems? “How to utilize high-level real-time control MCUs to create a control system that is both precise and safe, while also being cost-effective, is undoubtedly a problem they face,” said Shi Ying, Director of Technical Support for Texas Instruments (TI) China, at the recent launch of the C2000 MCU series. “The C2000 series MCU products from TI have demonstrated outstanding performance in power conversion, solar energy, inverter systems, servo drives, AC motor drives, DC brushless motor drives, as well as automotive conversion and traction inverters, thanks to their rich application experience.”

Launch of Two New C2000 MCU Products

In response to these technical demands and challenges, Texas Instruments recently launched two new product series: the TMS320F28P55x and the F29H85x. According to Shi Ying, the C2000 family has evolved continuously over the 30 years since its inception, and these two series mark a major step beyond earlier generations.

The TMS320F28P55x series introduces a neural processing unit (NPU) to the C2000 family for the first time. This AI accelerator, also known as a neural network inference engine, can execute the common operators of AI workloads independently of the CPU. Convolution dominates neural network inference; a regular CPU can compute it, but relatively inefficiently. Offloading it to the NPU improves performance by 5 to 10 times compared with running it on the C2000 CPU.
The F29H85x series represents a significant upgrade of the C2000 core: the step from the C28 to the C29 comes 23 years after the C28. The C29 uses a very long instruction word (VLIW) architecture that can execute up to 8 instructions in parallel within a single machine cycle, which more than doubles basic computational performance compared to the C28. For fast Fourier transform (FFT) calculations, the C29 performs 5 times better than the C28; for the mathematical computations required in motor control, its average performance is 1.8 times that of the C28; and for some mathematical computations in digital power conversion, it is 2.8 times that of the C28.
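To see why convolution dominates inference cost, and why it is a natural target for a dedicated accelerator such as the NPU, consider a plain 1-D convolution layer. The sketch below is illustrative only and says nothing about how the NPU is implemented; the function name and the MAC counter are hypothetical. It shows that the work is a dense nest of multiply-accumulate (MAC) operations, one per output sample per kernel tap.

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D convolution in the cross-correlation form used by CNN layers.

    Returns the output samples and the number of multiply-accumulates performed.
    """
    n_out = len(signal) - len(kernel) + 1
    out = []
    macs = 0
    for i in range(n_out):
        acc = 0.0
        for j, w in enumerate(kernel):
            acc += signal[i + j] * w  # one multiply-accumulate per kernel tap
            macs += 1
        out.append(acc)
    return out, macs
```

The MAC count is simply the number of outputs times the kernel length, so it grows quickly with window size, kernel size, and channel count. A general-purpose CPU issues these operations a few at a time, while hardware built for MAC arrays can do many in parallel.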

TI C2000 MCU Development History

To clarify how the C2000 CPU has evolved, Shi Ying presented the development history of the C2000 series. As shown in the following figure, TI launched the first TMS320C10 processor in 1994, one of the very few powerful single-chip DSPs available at the time, and the C2000 DSP has since gone through 30 years of development. From an initial 16-bit computing width to today's 64-bit width, the series has been upgraded continuously, adding floating-point units, mathematical co-processors, trigonometric function accelerators, vector computation accelerators, and steadily improved PWM and ADC peripherals. Notably, in 2023 TI implemented a lock-step CPU core in the F280015x, in which two CPUs run in lock-step to support the device's functional safety. The latest F28P55x adds an NPU that performs AI convolution calculations 5 to 10 times faster than the CPU alone. The next-generation C2000 core reaches a computing width of 64 bits.

TMS320F28P55x Series: The First Real-Time MCU with Integrated Neural Processing Unit (NPU) in the Industry

In industrial and automotive real-time control systems, more and more tasks are moving to smarter, AI-based methods. Several practical applications of the integrated NPU have already emerged, such as arc detection in solar energy and power supply systems. High-voltage wires and contacts often generate arcing, a hazard that can lead to fires, which makes effective detection and prevention of arcs particularly important.

Predicting the operating condition of motors is also crucial, with the aim of foreseeing failures before they happen. AI plays an indispensable role in both fault detection and fault prediction. With the NPU of the F28P55x, fault detection accuracy can reach as high as 99%. Whether in motor drives or power conversion, the F28P55x can add fault detection on a single chip while keeping its original control functions, which greatly reduces the size and cost of the electronic system.

The F28P55X series boasts impressive specifications, with built-in flash memory of up to 1.1MB. For real-time systems, ADC and high-precision PWM are two core peripherals. Specifically, the F28P55X provides 24 high-precision PWM channels and up to 39 ADC channels. The F28P55X is not a single device but a series covering multiple configurations. TI has launched different memory specifications, automotive-grade certifications, and functional safety levels. It is expected that the series will eventually offer more than 40 models to meet diverse market needs.

Edge AI Aids Faster and Safer Decision-Making

The NPU exists to execute AI computations. AI is now being adopted across many industries, and application examples keep emerging. Depending on where the computation takes place, AI can be divided into cloud AI and edge AI. For embedded and real-time control systems, edge AI is the necessary choice. Edge AI means that all neural network inference runs on the device itself (that is, at the edge). Its advantages fall into three areas:

First, real-time performance is significantly improved, as there is no need to upload data to the cloud, thus avoiding transmission delays.
Second, through algorithm optimization and the addition of the NPU, the overall power consumption of the system can be reduced. The NPU excels at handling convolution calculations and neural network computations, enabling higher efficiency and lower power consumption.
Finally, from the perspective of safety and reliability, avoiding the process of data collection and transmission to the cloud helps enhance the safety of devices.

Real-Time Optimization of System Fault Detection

The core processing capability of the NPU lies in running convolutional neural network (CNN) models. As mentioned earlier, the NPU can improve computational efficiency by 5 to 10 times compared to the CPU, and in fault detection applications, its accuracy can reach 99%.

Arc Fault Detection: Integrating Edge AI to Enhance System Efficiency and Safety

How can the TMS320F28P55x further reduce system cost and size? Shi Ying took traditional arc fault detection in a solar inverter system as a typical example. As shown in the figure below, the electrical energy generated by solar panels passes through DC/DC converters, DC/AC inverters, and maximum power point tracking (MPPT) control before being fed to the grid. These functions are typically handled by a C2000 real-time control MCU. In traditional schemes, however, a second MCU (MCU-2) is needed for arc detection.

Traditional non-NPU solutions sample the DC bus voltage and current and apply a set of trigger thresholds or rules to decide whether an arc has occurred. This approach has many limitations, and its detection accuracy is hard to improve, typically reaching only around 85%. Inaccurate detection has two consequences. One is a missed detection, where an arc occurs but goes unnoticed, increasing the risk of fire or downtime. The other is a false alarm, where no arc occurs but an alarm is raised, which can cause unnecessary downtime and reduce production efficiency.
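The rule-based approach and its two failure modes (an arc that goes undetected, and an alarm raised with no arc) can be made concrete with a small sketch. The detector below is a deliberately naive stand-in, not any vendor's algorithm: it flags an arc whenever the sample-to-sample current jump exceeds a fixed limit. The function names, the limit, and the sample windows are all hypothetical.

```python
def threshold_detector(samples, limit):
    """Rule-based check: flag an arc if any sample-to-sample jump exceeds limit."""
    return any(abs(b - a) > limit for a, b in zip(samples, samples[1:]))

def score(predictions, labels):
    """Return (accuracy, missed_detections, false_alarms) for boolean lists."""
    missed = sum(1 for p, y in zip(predictions, labels) if y and not p)
    false_alarms = sum(1 for p, y in zip(predictions, labels) if p and not y)
    correct = len(labels) - missed - false_alarms
    return correct / len(labels), missed, false_alarms

# Hypothetical current windows (amps) with ground-truth labels.
windows = [
    ([10.0, 10.1, 12.5, 9.0], True),    # strong arc: large jumps
    ([10.0, 10.4, 9.8, 10.3], True),    # weak arc: jumps stay under the limit
    ([10.0, 10.0, 12.0, 12.0], False),  # normal load step: looks like an arc
    ([10.0, 10.1, 10.0, 10.1], False),  # quiet operation
]
predictions = [threshold_detector(w, limit=1.0) for w, _ in windows]
labels = [y for _, y in windows]
accuracy, missed, false_alarms = score(predictions, labels)
```

With this toy data the fixed threshold misses the weak arc and trips on the load step. Lowering the limit trades one error for the other, which is the tuning dilemma that motivates a learned classifier.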

In the F28P55x-based solution, the DC/DC converters, DC/AC inverters, and MPPT control continue to run on the C2000 core, so the real-time control topology and hardware configuration remain largely unchanged, and the control software algorithms need no adjustment. The only change is that the built-in NPU of the F28P55x is dedicated to arc detection. Why can it reach a detection accuracy of 99%? The answer lies in TI's offline edge AI tool, TI Edge AI Tools. The tool trains a CNN model on a large volume of current and voltage data recorded during arcing. Once trained, the model is deployed to the NPU of the F28P55x through dedicated software development tools. Because detection is learned from a large dataset rather than from hand-written rules and trigger thresholds, its accuracy can reach as high as 99%.
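What "deploying a trained CNN" means at inference time can be sketched as a tiny forward pass: one convolution layer with ReLU, global max pooling, and a sigmoid output. This is only a structural illustration under strong assumptions. The kernel, biases, and output weights below are hand-picked for the example rather than trained, the function names are hypothetical, and nothing here reflects the actual model produced by TI's tools or the NPU's internal data format.

```python
import math

def conv_relu(window, kernel, bias):
    """One 1-D convolution channel followed by ReLU."""
    out = []
    for i in range(len(window) - len(kernel) + 1):
        acc = bias + sum(window[i + j] * w for j, w in enumerate(kernel))
        out.append(max(0.0, acc))
    return out

def arc_probability(window, kernel=(-1.0, 2.0, -1.0), bias=-0.5,
                    w_out=3.0, b_out=-2.0):
    """Score one window of current samples; all weights are illustrative, not trained."""
    feature = max(conv_relu(window, kernel, bias))  # global max pooling
    return 1.0 / (1.0 + math.exp(-(w_out * feature + b_out)))

smooth = [10.0, 10.1, 10.2, 10.3, 10.4]  # steady ramp, no arc
spiky = [10.0, 10.1, 13.0, 10.2, 10.3]   # sharp transient, arc-like
```

In a trained model the kernel values come from the dataset instead of being chosen by hand, and there are many channels and layers, but the runtime work on the device has the same shape: convolutions, nonlinearities, and a final score compared against a decision level.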

Product Application Case – Arc Detection Module

This is not merely theoretical. Although the F28P55x has only just been officially released, customers have already put it into practical use. For example, TI's partner SOLAX has developed an arc detection module built around the TMS320F28P55x chip. The module shortens a single arc judgment to 5 ms and automatically disconnects the circuit within 0.2 seconds of detecting an arc, with a false alarm rate of almost zero. SOLAX achieved this performance by applying the principles described above.

At the same time, Li Xinfu, Chairman and General Manager of SOLAX, also commented on this product, saying, “We use TI’s edge AI technology to improve the accuracy of arc fault detection in various solar devices. Traditional arc fault detection methods are limited in adaptability and sensitivity, leading to false alarms or missed actual arc fault events, negatively impacting productivity, maintenance costs, and operator safety. With the support of TI’s edge AI MCU, we can locally train and execute neural network algorithms to identify patterns and detect anomalies, thereby enhancing the safety and reliability of our operations.”

F29H85x Series: The New C29 Core Doubles Real-Time Signal Chain Performance

The F29H85x series is built on the new C29 core. Shi Ying emphasized again that this is a significant upgrade of the C2000 CPU after many years: the processing width jumps from 32 bits to 64 bits, and the very long instruction word (VLIW) architecture allows up to 8 instructions to complete in parallel in a single instruction cycle. Parallel computing is a major strength of the DSP architecture and one of the significant differences between DSPs and general-purpose CPUs.

Beyond the gain in CPU performance, the F29H85x series natively supports two major domains: functional safety and information security. For functional safety, the F29H85x can reach ASIL-D, the highest level of the automotive ISO 26262 standard, and in the industrial field it meets SIL-3 of the IEC 61508 standard. For information security, it integrates a hardware security module (HSM) of a kind widely used in the industry, which enables the devices to meet the strict information security requirements of different regions worldwide, whether these are international or regional standards. Functional safety and information security are two indispensable features of the F29H85x series.

To ensure functional safety and information security inside the CPU, TI has built multiple isolation mechanisms into the F29 architecture that work much like a firewall. In particular, the added safety and security unit (SSU) isolates application code, application data, confidential content, and ordinary content from one another, fully meeting functional safety and information security requirements.

New C29 Core Using VLIW Architecture

The following figure shows the most critical part, the brain, of a real-time control system: the computing unit. Why is it necessary to keep raising its performance with co-processors and architectural optimization to strengthen real-time mathematical computing? Shi Ying explained that in industrial and automotive applications, demands on execution efficiency keep growing and motor speeds keep rising. At the same time, the adoption of new-generation power semiconductors has pushed switching modulation frequencies higher. Against this backdrop, the computing efficiency of real-time MCUs needs to improve significantly.
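The link between switching frequency and required computing efficiency comes down to simple arithmetic: if the control loop runs once per PWM period, the number of CPU cycles available per iteration is the CPU clock divided by the PWM frequency. The sketch below makes that explicit. The function names are hypothetical, and the clock rate and cycle counts in the usage note are assumed figures chosen for round numbers, not device specifications.

```python
def cycle_budget(cpu_hz, pwm_hz):
    """CPU cycles available in one PWM period (one control-loop iteration)."""
    return cpu_hz // pwm_hz

def fits(loop_cycles, cpu_hz, pwm_hz):
    """True if a control loop of loop_cycles finishes within one PWM period."""
    return loop_cycles <= cycle_budget(cpu_hz, pwm_hz)
```

For an assumed 200 MHz clock, a 20 kHz PWM leaves 10,000 cycles per period, while 100 kHz leaves only 2,000. A loop that needs 3,000 cycles fits comfortably in the first case and not at all in the second; halving its cycle count through a faster core brings it back inside the budget.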

As mentioned earlier, the C29 delivers a 2 to 3 times improvement in signal chain performance over the C28. For the mathematical and real-time calculations used in motor drives, performance improves by about 2 times; in power conversion, by about 3 times. For FFT calculations alone, the C29 is 5 times faster than the C28, and 6 times faster than a Cortex-M7. The C29 CPU thus brings a marked improvement in mathematical computing capability.

Compared to the C28, the C29's interrupt response is also 4 times faster. Interrupt response has to be judged against the entire real-time loop, from the sensor signal entering the ADC to the PWM output. Fast calculation alone is not enough; data acquisition and control output must also be accelerated before the performance of a real-time control system improves substantially, so interrupt response speed is equally critical. General code performance, by contrast, refers mainly to tasks such as memory copying, data movement, and communication. The C28 core is essentially a DSP, and by design it is not adept at housekeeping code such as data movement instructions. This issue has been well addressed in the F29.
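The point that fast calculation alone is not enough can be shown by adding up the stages of the loop from ADC input to PWM output. The sketch below applies the article's improvement factors (about 2 times for control math, 4 times for interrupt response) to a baseline whose stage times are purely hypothetical; the nanosecond figures are assumptions picked to make the arithmetic easy to follow, not measurements of any device.

```python
def loop_latency_ns(adc_ns, irq_ns, compute_ns, pwm_ns):
    """Total delay from sampling to actuation: ADC, interrupt entry, math, PWM update."""
    return adc_ns + irq_ns + compute_ns + pwm_ns

# Hypothetical baseline stage times in nanoseconds.
baseline = loop_latency_ns(300, 200, 1000, 100)

# Only the math gets 2x faster.
faster_math = loop_latency_ns(300, 200, 1000 // 2, 100)

# Math 2x faster and interrupt response 4x faster.
faster_math_and_irq = loop_latency_ns(300, 200 // 4, 1000 // 2, 100)
```

Under these assumed numbers the loop shrinks from 1,600 ns to 1,100 ns with faster math alone, and to 950 ns once interrupt response improves as well. The stages that are not accelerated set a floor on the total, which is why every link in the chain has to be considered.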

Typical Applications

Thanks to the improvement in CPU performance, the F29H85x can be applied across many real-time control fields. In vehicles, for example, it performs well in integrated architectures that combine the onboard charger (OBC), the DC/DC converter, and the main MCU. It is also suitable for multi-motor traction inverters and power steering systems. Beyond automotive, the F29H85x applies to photovoltaic inverters, the latest power topologies, online UPS, and robotics. In short, the F29H85x can serve a wide range of real-time control applications.

Enhancing Real-Time Control Capabilities of Electric Vehicles (EVs)

Thanks to its improved interrupt efficiency and its support for functional safety and information security, the F29H85x can deliver the three-in-one combination of OBC + DC/DC + main MCU with a single MCU. This improves efficiency, reduces size, and lowers cost. How is the efficiency gain achieved? Faster computation and interrupt response let the MCU keep up with third-generation semiconductor power devices, which run at significantly higher PWM switching frequencies and thereby improve system efficiency. A higher switching frequency also allows smaller magnetic components, which shrinks the overall size of the system.
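The relationship between switching frequency and magnetics size can be illustrated with the standard ripple equation of an ideal buck converter, where the inductance needed for a given peak-to-peak ripple current is inversely proportional to the switching frequency. This is a simplified sketch: the buck topology is chosen only because its formula is compact, the article's OBC and DC/DC stages may use other topologies, and the voltages, ripple target, and function name are hypothetical.

```python
def buck_inductance(v_in, v_out, f_sw, ripple_a):
    """Inductance in henries for a target peak-to-peak ripple in an ideal buck converter.

    Uses L = V_out * (1 - D) / (f_sw * delta_I) with duty cycle D = V_out / V_in.
    """
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (f_sw * ripple_a)

# Same 400 V to 200 V conversion and 2 A ripple target at two switching frequencies.
l_50k = buck_inductance(400.0, 200.0, 50e3, 2.0)
l_200k = buck_inductance(400.0, 200.0, 200e3, 2.0)
```

Quadrupling the switching frequency from 50 kHz to 200 kHz cuts the required inductance from 1 mH to 0.25 mH in this example. A smaller inductance generally means a smaller, lighter core, provided the controller can complete its loop within the shorter period.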

In power conversion, a variety of innovative power topologies have emerged in recent years. The most common is the matrix converter, which can perform complex AC/AC and DC/DC conversion with a single topology. On the one hand, this changes the requirements for power stages, magnetic components, and bulk capacitors compared with traditional topologies; on the other hand, it places new demands on the real-time performance and the PWM and ADC channels of the embedded processor. The F29H85x is therefore well suited to the new matrix converter topologies. In all of these systems, the F29H85x also meets the highest levels of functional safety and information security requirements.

High-Voltage Integrated Electric Vehicle Demonstration

The following figure is an example block diagram of an onboard charger (OBC) + high-voltage to low-voltage DC converter (HV-LV DC/DC) + main application. In traditional designs, the OBC + HV-LV DC/DC + main functions typically require three MCUs. Because the F29H85x integrates a pair of lock-step CPUs (CPU1 and CPU2) and an independent C29 core (CPU3), one MCU is enough for the entire system. CPU1 and CPU2 running in lock-step support ASIL-D functional safety requirements and can run AUTOSAR, which almost all such devices require. Meanwhile, CPU3 independently handles the OBC and DC/DC control loops, so the whole system runs efficiently on a single chip.
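The idea behind lock-step execution is that two copies of the same computation receive the same inputs and any disagreement between their results signals a fault. The sketch below is only a software analogy: in real lock-step hardware the comparison is done by dedicated logic on every cycle, not by application code. The function names, the step function, and the fault-injection parameter are all hypothetical.

```python
def lockstep(step, state, inputs, fault_at=None):
    """Run two redundant copies of step on the same inputs and compare each result.

    fault_at optionally injects a single-bit error into the shadow copy at that
    input index, to show how a mismatch is caught.
    """
    main = shadow = state
    for i, x in enumerate(inputs):
        main = step(main, x)
        shadow = step(shadow, x)
        if fault_at is not None and i == fault_at:
            shadow ^= 1  # hypothetical single-bit upset in the shadow copy
        if main != shadow:
            return ("mismatch", i)
    return ("ok", main)

def checksum_step(state, x):
    """Toy 16-bit state update standing in for one unit of real work."""
    return (state * 31 + x) & 0xFFFF
```

Running `lockstep(checksum_step, 0, [1, 2, 3])` yields `("ok", 1026)`, while the same call with `fault_at=1` reports a mismatch at index 1. Catching the divergence at the step where it occurs, rather than after a wrong output reaches an actuator, is what makes the scheme useful for functional safety.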

Demonstration of Traction Inverter Using a Single MCU

Automotive traction often involves not just one motor but several working in coordination, and the most common configuration is a dual-motor application. In traditional dual-motor systems, each traction motor requires its own controller for the motor drive control loop, along with a host to manage functional safety and AUTOSAR operations. Expensive resolver decoder circuits are also needed to detect the rotor position of each traction motor, so a dual-motor system requires two such controllers and two resolver decoder circuits. With the F29H85x, CPU1 and CPU2 running in lock-step handle the functional safety and AUTOSAR tasks, while CPU3 controls both motors. Notably, the F29H85x integrates resolver decoder functionality, and it can alternatively work with another magnetic position sensing solution provided by TI, so that all functions are completed on one chip within the overall system.
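The core calculation in resolver decoding is recovering the rotor angle from the sine and cosine channels. The sketch below shows only that final step and assumes the two channels have already been demodulated from the excitation carrier and share the same amplitude; real decoders, whether external circuits or integrated peripherals, also handle excitation, demodulation, and error tracking. The function name is hypothetical.

```python
import math

def resolver_angle(sin_sig, cos_sig):
    """Rotor angle in radians, in [0, 2*pi), from demodulated resolver channels.

    atan2 uses the ratio of the two channels, so a common amplitude cancels out.
    """
    return math.atan2(sin_sig, cos_sig) % (2.0 * math.pi)
```

For example, channels of `0.8 * sin(4.0)` and `0.8 * cos(4.0)` decode back to an angle of 4.0 radians. Doing this in software on the MCU is what allows the separate decoder chip to be removed.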

The C2000 Ecosystem

The above is an introduction to the main performance, features, and typical application scenarios of the F29H85x. As an MCU product, having the device itself is not enough; a robust ecosystem is also needed to support it. To enable customers to more easily develop applications based on C2000, TI provides rich ecosystem support, ranging from reference designs to hardware and software design source files, as well as software and tool support. Especially during the migration from F28 to F29, dedicated migration tools are provided to help customers quickly transition their F28-based designs to F29. Additionally, there is native support for FreeRTOS.

In terms of functional safety, the entire F29 series and some automotive-grade F28P55x products have received ISO 26262 functional safety certification. Development tools are also in place, including diagnostic libraries and safety-oriented development tools. For modeling and simulation, tools from ETAS and MathWorks are supported. Development boards in various configurations are available, together with the free CCS and SysConfig visual configuration tools. For AUTOSAR, internationally renowned suppliers such as VECTOR and ETAS are supported, and TI is actively working with Chinese AUTOSAR suppliers to develop a Chinese version of AUTOSAR in parallel.

Finally, in terms of information security, the most commonly used solution, especially in the automotive sector, is still VECTOR's HSM library. At the same time, TI is actively collaborating with information security partners in China. For example, partners such as Yishi Intelligent will support TI's HSM development to provide TI customers with encryption libraries that meet China's commercial cryptography standards.

Related Resources

Currently, all information regarding the F29H85x and F28P55X series MCUs has been published on TI.com. Customers can place orders for these products on the product page and obtain relevant evaluation kits, technical articles, and application notes.

Conclusion

The launch of Texas Instruments' two new C2000 MCU series, the TMS320F28P55x and the F29H85x, marks another advance in real-time control technology. The TMS320F28P55x series integrates an edge AI hardware accelerator, enabling more intelligent real-time control with a fault detection accuracy of up to 99%. The new 64-bit C29 core in the F29H85x series more than doubles real-time control performance compared to previous generations, and the series supports functional safety up to ASIL-D and SIL-3.

Through these innovations, TI not only meets the demands of modern real-time control systems for high performance, low power consumption, and high reliability, but also provides engineers with rich ecosystem support, including reference designs, development tools, and AUTOSAR compatibility. Practical cases from partners such as SOLAX further demonstrate the performance and broad application potential of these new technologies.