Introduction
Hello everyone, I am Engineer Li. I have been working in the automation industry for 15 years, mainly focusing on the design and maintenance of industrial automation control systems. Over the years, I have visited many factories and dealt with various PLC faults. From initially being flustered to now handling issues calmly, I have found that a good fault diagnosis system can really save engineers a lot of time and effort.
Today, I would like to share my experience with fault diagnosis systems for Siemens and Allen-Bradley (AB) PLCs. I learned this method from my American colleagues, and after years of refinement in practice it has become a standard configuration for our team. Whether you are a newcomer to the industry or an experienced veteran, I believe you will find some practical tips here.

Why Do We Need a Fault Diagnosis System?
I still remember that when I first started in the industry, every production line shutdown meant rushing to the site with my laptop, connecting to the PLC, and troubleshooting step by step. Sometimes it took hours to discover that the cause was a minor issue, all while my boss and the production supervisor watched over my shoulder. I believe every engineer can relate to that kind of anxiety.
A good fault diagnosis system can help us:
- Quickly identify problems and locate fault points
- Reduce downtime and improve production efficiency
- Create fault records to support preventive maintenance
- Reduce the work pressure and intensity for engineers
Hardware Configuration and Environmental Requirements
Basic Hardware Configuration
For Siemens S7 series and AB CompactLogix/ControlLogix series PLCs, we need the following hardware support:
- CPU: Enough program memory for the diagnostic program (I recommend keeping more than 20% of capacity in reserve)
- Storage Card: For fault log recording (more than 4 MB is recommended for Siemens, a 2 GB SD card for AB)
- Communication Module: Ethernet module for remote monitoring and alarm push
- HMI Panel: For on-site display of fault information (a touch screen of 7 inches or larger is recommended)
Tip: Don’t underestimate the choice of storage card! I once encountered a situation where a low-quality SD card caused the log files to become corrupted. It is recommended to use industrial-grade storage cards; although they are a bit more expensive, their reliability is much higher.
Software Environment
- Siemens: TIA Portal V15.1 or higher
- AB: Studio 5000 V30 or higher
- Database: It is recommended to use SQL Server Express (the free version is sufficient)
- Alarm Push: OPC UA or MQTT (optional)
Core Principles of the Diagnosis System
Our fault diagnosis system is designed based on a three-layer architecture:
- Data Acquisition Layer: Collect various signals and statuses through the PLC
- Logical Analysis Layer: Perform preliminary analysis and fault location in the PLC
- Display and Interaction Layer: Display results on the HMI, a mobile app, or a web interface
Key Design Concepts
Through years of practice, I have found that the most effective approach to fault diagnosis is “reverse thinking”: instead of checking that everything is normal, define precisely what counts as abnormal. Specifically:
- Define clear conditions for each possible abnormal state
- Use state machine models to manage device operating states
- Use timestamps to record the exact time of fault occurrence
- Establish a fault priority mechanism to distinguish between critical errors and warnings
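These concepts can be prototyped off the PLC before any SCL is written. The following is a minimal Python sketch, not PLC code, of abnormal-condition rules feeding a small state machine. The signal names, the oil temperature rule (code 3001), and the WARNING state are assumptions for illustration. The rule that priority 1-2 faults stop the device mirrors the priority check used later in this article.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional

Signals = Dict[str, float]


class DeviceState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    WARNING = "warning"  # still running, operator is notified
    FAULTED = "faulted"  # latched until a reset


@dataclass(frozen=True)
class AbnormalCondition:
    code: int
    priority: int  # 1 = highest, 5 = lowest
    is_abnormal: Callable[[Signals], bool]  # True when the rule is violated


# Hypothetical rules: each one defines what "abnormal" means for one signal.
CONDITIONS: List[AbnormalCondition] = [
    AbnormalCondition(1001, 2, lambda s: s["motor_current"] > s["overload_threshold"]),
    AbnormalCondition(2001, 1, lambda s: s["air_pressure"] < s["min_air_pressure"]),
    AbnormalCondition(3001, 4, lambda s: s["oil_temp"] > s["oil_temp_warning"]),
]


def worst_active(signals: Signals) -> Optional[AbnormalCondition]:
    """Return the active condition with the highest priority, or None."""
    active = [c for c in CONDITIONS if c.is_abnormal(signals)]
    return min(active, key=lambda c: c.priority, default=None)


def next_state(state: DeviceState, signals: Signals) -> DeviceState:
    """Compute the state for the next scan from the current state and signals."""
    if state in (DeviceState.IDLE, DeviceState.FAULTED):
        return state  # idle devices are not checked; faults stay latched
    worst = worst_active(signals)
    if worst is None:
        return DeviceState.RUNNING
    return DeviceState.FAULTED if worst.priority <= 2 else DeviceState.WARNING
```

Keeping the rules in one table makes it easy to review what "abnormal" means for each signal, and the state function stays the same when rules are added.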
Code Implementation Details
Below is the fault diagnosis code framework for the Siemens S7-1500 series PLC (the implementation principles for the AB platform are similar):
1. Data Structure Definition
First, we need to define the data structure for fault information:
// Fault information data block (SCL source)
DATA_BLOCK "DB_FaultInfo"
    STRUCT
        FaultRecord : ARRAY[0..49] OF STRUCT   // Maximum of 50 fault records
            TimeStamp    : DWORD;        // Timestamp
            FaultCode    : INT;          // Fault code
            DeviceID     : INT;          // Device ID
            Priority     : INT;          // Priority (1-highest, 5-lowest)
            Acknowledged : BOOL;         // Whether acknowledged
            Message      : STRING[100];  // Fault description
        END_STRUCT;
        CurrentIndex : INT;              // Current record index
    END_STRUCT;
BEGIN
END_DATA_BLOCK
2. Fault Detection and Recording
In the main loop, we need to continuously check for various possible abnormal states:
// Called in OB1
FUNCTION "CheckFaults" : VOID
BEGIN
    // Check for motor overload (only while the motor is running)
    IF "MotorCurrent" > "MotorOverloadThreshold" AND
       "Motor_Running" = TRUE THEN
        "RecordFault"(
            FaultCode := 1001,
            DeviceID := 1,
            Priority := 2,
            Message := 'Motor overload detected'
        );
    END_IF;
    // Check for low air pressure (only while the system is running)
    IF "AirPressure" < "MinAirPressure" AND
       "System_Running" = TRUE THEN
        "RecordFault"(
            FaultCode := 2001,
            DeviceID := 0,
            Priority := 1,
            Message := 'Low air pressure, system halted'
        );
    END_IF;
    // More fault detection...
END_FUNCTION
Tip: In actual projects, I usually create a dedicated fault detection function block (FB) for each device unit, which facilitates modular management and code reuse.
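The one-block-per-device idea can be illustrated with a small Python class, where each object plays the role of one FB instance with its own instance data. This is only a sketch of the structure. The class name, the second motor, and the threshold values are assumptions, and the fault tuple reuses the fields passed to "RecordFault" above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Fault = Tuple[int, int, int, str]  # (fault_code, device_id, priority, message)


@dataclass
class MotorDiagnostics:
    """One instance per motor, comparable to one FB instance per device unit."""
    device_id: int
    overload_threshold: float  # hypothetical per-device setting, in amps

    def check(self, current: float, running: bool) -> List[Fault]:
        """Return the faults detected for this motor in the current scan."""
        faults: List[Fault] = []
        if running and current > self.overload_threshold:
            faults.append((1001, self.device_id, 2, "Motor overload detected"))
        return faults


# The same detection logic is reused with different instance data.
conveyor_motor = MotorDiagnostics(device_id=1, overload_threshold=12.0)
press_motor = MotorDiagnostics(device_id=2, overload_threshold=30.0)
```

With this layout, a 15 A reading is an overload for the conveyor motor but not for the press motor, without duplicating any detection logic.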
3. Fault Recording Function
FUNCTION "RecordFault" : VOID
VAR_INPUT
    FaultCode : INT;
    DeviceID  : INT;
    Priority  : INT;
    Message   : STRING[100];
END_VAR
VAR_TEMP
    i   : INT;   // Loop counter
    idx : INT;   // Slot for the new record
END_VAR
BEGIN
    // Check if the same fault already exists and is not yet acknowledged
    FOR i := 0 TO 49 DO
        IF "DB_FaultInfo".FaultRecord[i].FaultCode = FaultCode AND
           "DB_FaultInfo".FaultRecord[i].DeviceID = DeviceID AND
           "DB_FaultInfo".FaultRecord[i].Acknowledged = FALSE THEN
            RETURN; // This fault already exists, do not record again
        END_IF;
    END_FOR;
    // Record new fault; the index wraps, so the oldest record is overwritten
    idx := ("DB_FaultInfo".CurrentIndex + 1) MOD 50;
    "DB_FaultInfo".CurrentIndex := idx;
    "DB_FaultInfo".FaultRecord[idx].TimeStamp := "System_Time";
    "DB_FaultInfo".FaultRecord[idx].FaultCode := FaultCode;
    "DB_FaultInfo".FaultRecord[idx].DeviceID := DeviceID;
    "DB_FaultInfo".FaultRecord[idx].Priority := Priority;
    "DB_FaultInfo".FaultRecord[idx].Message := Message;
    "DB_FaultInfo".FaultRecord[idx].Acknowledged := FALSE;
    // Decide whether to stop based on priority (1 and 2 stop the machine)
    IF Priority <= 2 THEN
        "Emergency_Stop" := TRUE;
    END_IF;
    // Trigger alarm
    "Alarm_Trigger" := TRUE;
END_FUNCTION
Feature Extensions
Remote Monitoring and Push Notifications
I strongly recommend integrating the fault diagnosis system with remote monitoring. In the projects we have implemented, we usually add the following features:
- Mobile Push Notifications: Push important faults to engineers’ mobile phones via MQTT protocol
- Remote Fault Confirmation: Engineers can acknowledge known faults through a mobile app
- Maintenance Guidance: The system can push fault repair guides and videos to the tablet devices of on-site operators
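For the push notifications, the part worth designing early is the message itself. The sketch below builds an MQTT topic and a JSON payload from a fault record using only the Python standard library. The topic layout, the field names, the line name, and the rule that only priorities 1-2 are pushed are all assumptions. Publishing is left to whichever MQTT client is used on site and is not shown.

```python
import json
from typing import Optional, Tuple

PUSH_MAX_PRIORITY = 2  # assumption: only priority 1-2 faults go to phones


def build_push(fault_code: int, device_id: int, priority: int,
               message: str, timestamp: str,
               line: str = "line1") -> Optional[Tuple[str, str]]:
    """Return (topic, payload) for a fault worth pushing, else None."""
    if priority > PUSH_MAX_PRIORITY:
        return None  # low-priority faults stay on the HMI only
    topic = f"factory/{line}/device/{device_id}/fault"  # hypothetical layout
    payload = json.dumps(
        {
            "fault_code": fault_code,
            "device_id": device_id,
            "priority": priority,
            "message": message,
            "timestamp": timestamp,
        },
        sort_keys=True,
    )
    return topic, payload
```

Putting the device ID in the topic lets each engineer subscribe only to the equipment they are responsible for.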
Data Analysis and Predictive Maintenance
After collecting enough fault data, we can conduct deeper analysis:
- Analyze fault frequency statistically to identify weak points
- Analyze the time patterns of fault occurrences to predict potential fault points
- Correlate faults with production parameters to identify the factors that lead to them
Personal experience: Don’t expect to achieve everything at once with data analysis. Start with basic fault Pareto analysis to identify the most frequent 20% of fault types and prioritize solving these issues, which can usually eliminate 80% of downtime.
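A basic Pareto analysis needs nothing more than a list of recorded fault codes. The following Python sketch counts faults by code and selects the most frequent codes until a target share is covered. It counts occurrences only, whereas weighting by downtime minutes would be a natural refinement. The function names and the sample history are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pareto(fault_codes: Iterable[int]) -> List[Tuple[int, int, float]]:
    """Return (fault_code, count, cumulative_share), most frequent first."""
    counts = Counter(fault_codes)
    total = sum(counts.values())
    rows: List[Tuple[int, int, float]] = []
    running = 0
    for code, count in counts.most_common():
        running += count
        rows.append((code, count, running / total))
    return rows


def vital_few(fault_codes: Iterable[int], target_share: float = 0.8) -> List[int]:
    """Return the most frequent codes that together reach target_share."""
    selected: List[int] = []
    for code, _count, cumulative in pareto(fault_codes):
        selected.append(code)
        if cumulative >= target_share:
            break
    return selected


# Hypothetical history: ten recorded faults across three codes.
history = [2001] * 7 + [1001] * 2 + [3001]
priority_codes = vital_few(history)  # codes to tackle first
```

In this sample, codes 2001 and 1001 together account for nine of the ten faults, so they are the ones returned.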
Real Application Case
Automotive Seat Production Line
I was responsible for an automated line producing automotive seats, controlled by a Siemens S7-1500 PLC. The line produces about 3,000 seats a day and previously suffered 3-4 unexplained shutdowns per week, each taking about 30-45 minutes to recover from.
After implementing the fault diagnosis system, we found that:
- 70% of faults were due to pressure fluctuations in the pneumatic system
- 20% of faults were related to operator errors
- The remaining 10% were distributed among various minor issues
To address the main issues, we added a pressure stabilization system and ran operator training. Three months later, the average number of shutdowns had dropped to 0.5 per week, and the average recovery time to under 10 minutes. Most importantly, engineers no longer needed to be on-site every time; 80% of the issues could be diagnosed, and their repair guided, remotely.
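As a back-of-the-envelope check, the figures above translate into weekly downtime as follows. The calculation uses only the numbers quoted in this case and treats "under 10 minutes" as an upper bound of 10. The variable names are illustrative.

```python
def weekly_downtime(shutdowns_per_week: float, minutes_per_shutdown: float) -> float:
    """Downtime in minutes per week."""
    return shutdowns_per_week * minutes_per_shutdown


before_best = weekly_downtime(3, 30)    # fewest and shortest shutdowns before
before_worst = weekly_downtime(4, 45)   # most and longest shutdowns before
after_bound = weekly_downtime(0.5, 10)  # upper bound after the improvements

# Relative reduction against both ends of the "before" range
reduction_vs_best = 1 - after_bound / before_best
reduction_vs_worst = 1 - after_bound / before_worst
```

That is roughly 90 to 180 minutes of downtime per week before, against at most about 5 minutes afterwards, a reduction of about 94% or more.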
Common Problems and Solutions
1. System Response Slows Down
Problem: After implementing the diagnosis system, the PLC scan cycle lengthens, and the system response slows down.
Solution:
- Execute diagnostic code in a low-priority cyclic interrupt instead of the main cycle
- Optimize data structures to reduce unnecessary array queries
- Consider using a separate diagnostic PLC or edge computing device
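One way to reduce array queries is to keep a separate index of active faults, so that the duplicate check no longer scans every record on every call. The Python sketch below models that idea with a set of (fault code, device ID) keys. The class and method names are assumptions. On a PLC, a comparable effect could be achieved with one "active" flag per fault definition.

```python
from typing import Dict, List, Set, Tuple

FaultKey = Tuple[int, int]  # (fault_code, device_id)


class FaultLog:
    """Fault records plus an index of unacknowledged faults."""

    def __init__(self) -> None:
        self.records: List[Dict[str, object]] = []
        self._active: Set[FaultKey] = set()  # constant-time duplicate check

    def record(self, fault_code: int, device_id: int,
               priority: int, message: str) -> bool:
        """Store a new fault; return False if it is already active."""
        key = (fault_code, device_id)
        if key in self._active:
            return False  # no scan over the records is needed
        self._active.add(key)
        self.records.append({
            "fault_code": fault_code,
            "device_id": device_id,
            "priority": priority,
            "message": message,
            "acknowledged": False,
        })
        return True

    def acknowledge(self, fault_code: int, device_id: int) -> None:
        """Mark matching records as acknowledged and free the key."""
        self._active.discard((fault_code, device_id))
        for rec in self.records:
            if rec["fault_code"] == fault_code and rec["device_id"] == device_id:
                rec["acknowledged"] = True
```

The full scan now happens only on acknowledgement, which is rare, instead of on every detection in every scan cycle.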
2. Excessive False Alarms
Problem: The system generates too many false alarms, causing engineers to be overwhelmed.
Solution:
- Add intelligent filtering algorithms, such as only alarming after detecting an anomaly three times in a row
- Dynamically adjust alarm thresholds based on production status
- Establish a “gray list” mechanism to allow temporary suppression of specific alarms
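The first and third measures can be sketched together in Python. The filter below raises an alarm only after an anomaly has been seen on a configurable number of consecutive checks, and it stays silent for fault codes on the gray list. The class and attribute names are assumptions. On a PLC, an on-delay timer is a common time-based equivalent of the consecutive counter.

```python
from typing import Dict, Set


class AlarmFilter:
    """Suppress one-off glitches and temporarily gray-listed alarms."""

    def __init__(self, required_consecutive: int = 3) -> None:
        self.required_consecutive = required_consecutive
        self._streak: Dict[int, int] = {}  # consecutive anomalies per code
        self.gray_list: Set[int] = set()   # codes suppressed for now

    def update(self, fault_code: int, anomaly_detected: bool) -> bool:
        """Call once per check; return True while the alarm should be active."""
        if not anomaly_detected:
            self._streak[fault_code] = 0  # any normal reading resets the streak
            return False
        self._streak[fault_code] = self._streak.get(fault_code, 0) + 1
        if fault_code in self.gray_list:
            return False  # still counted, but not reported
        return self._streak[fault_code] >= self.required_consecutive
```

Because the streak keeps counting while a code is gray-listed, the alarm appears immediately after the suppression is lifted if the anomaly is still present.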
3. Data Storage Issues
Problem: After long-term operation, the SD card may become full or damaged.
Solution:
- Implement circular buffer management to automatically overwrite the oldest records
- Regularly upload data to the server and clear local storage
- Use data compression algorithms to reduce storage requirements
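On the PC or edge side, these three measures fit in a few lines of Python. The sketch below uses a fixed-size deque as the circular buffer and gzip-compressed JSON lines as the archive format. The buffer size matches the 50-record array used earlier, while the record fields and function names are assumptions.

```python
import gzip
import json
from collections import deque
from typing import Deque, Dict, List

MAX_LOCAL_RECORDS = 50  # same size as the FaultRecord array

# A deque with maxlen drops the oldest record when a new one is appended.
local_log: Deque[Dict[str, object]] = deque(maxlen=MAX_LOCAL_RECORDS)


def archive(log: Deque[Dict[str, object]]) -> bytes:
    """Serialize the records as JSON lines and compress them for upload."""
    lines = "\n".join(json.dumps(rec, sort_keys=True) for rec in log)
    return gzip.compress(lines.encode("utf-8"))


def restore(blob: bytes) -> List[Dict[str, object]]:
    """Inverse of archive(), for use on the server side."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

Calling `local_log.clear()` only after the server has confirmed receipt of the archive avoids losing records when an upload fails.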
Tip: My current standard practice is to automatically back up fault data once a month and format the SD card. This small action greatly reduces the likelihood of card damage.
Summary and Insights
After implementing over a dozen fault diagnosis systems, I would like to share a few insights:
- Start Simple: Don’t design overly complex systems from the start. First, implement basic functions and gradually improve them during use.
- Value User Experience: The best diagnosis systems are not only technically accurate but also consider the needs of users. A clear and concise interface and clear fault descriptions are more important than complex functions.
- Continuous Improvement: Treat the fault diagnosis system as an evolving tool, regularly review fault records, and update diagnostic logic.
- Knowledge Accumulation: Encourage engineers to annotate and document solutions for each fault, gradually building a knowledge base for the team.

I remember once, after implementing this system in a factory, an experienced worker told me, “Xiao Li, this system means I no longer have to get up in the middle of the night to run to the factory!” This was perhaps the best feedback I received. The value of technology ultimately lies in its ability to solve real problems for people.
I hope this article is helpful to everyone. If you have any questions or suggestions for improvement, feel free to discuss them. After all, collective wisdom can make the system better!