Event-Driven Programming for Fault Diagnosis in Smart Home PLC Systems

Over the years, I have worked on numerous smart home projects and discovered a common issue: when a fault occurs, everyone is completely at a loss! Homeowners call every day, and for a single minor issue I end up making ten trips. It’s incredibly frustrating. Ultimately, this comes down to poor diagnostic system design; most contractors just install things haphazardly without considering future maintenance.

Let me first discuss those terrible solutions. Many people prefer to detect faults by polling, meaning the PLC continuously reads the status of each device one by one. Just think about it: a villa can have hundreds of points, and doing it this way sends CPU usage through the roof! In 2022, I took over a villa renovation where the original system was painfully slow. Upon reviewing the polling code, I found it checked all devices every 100ms, and the old Siemens CPU was almost burnt out…
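To put rough numbers on that polling load, here is a back-of-the-envelope sketch in Python (run on a PC, not on the PLC). The 300-point count and the 20 state changes per hour are assumed figures for illustration only; the 100ms period is the one from the villa above.

```python
def reads_per_second(points: int, period_ms: float) -> float:
    """Status reads per second when every point is polled once per period."""
    return points * (1000.0 / period_ms)


# Assumed for illustration: 300 device points, each polled every 100 ms.
polling_load = reads_per_second(points=300, period_ms=100)

# Assumed for illustration: with change-driven reporting, the whole house
# produces about 20 state changes per hour.
event_load = 20 / 3600

print(f"polling:      {polling_load:.0f} reads/s")
print(f"event-driven: {event_load:.4f} events/s")
```

Even if the real change rate is a hundred times higher than assumed, the PLC still handles far fewer status updates than it would by polling every point.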

By the way, most of those who constantly brag about Industry 4.0 and digital twins can’t even handle basic IO collection smoothly. So let’s not talk about anything high-end; can we first get the basics right??

Back to the main topic. In fact, using event-driven programming for smart home fault diagnosis is the right approach. What does that mean? Simply put, it means “whoever has a problem, shouts out,” eliminating the need for continuous polling.
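The "whoever has a problem, shouts out" idea can be modeled in a few lines of Python. This is only a sketch of the pattern, not PLC code: `EventBus`, `DevicePoint`, and the `"state_changed"` event name are hypothetical names made up for this example.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe hub (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, device_id, healthy):
        for handler in self._handlers[event]:
            handler(device_id, healthy)


class DevicePoint:
    """A point that reports only when its health state changes."""

    def __init__(self, device_id, bus):
        self.device_id = device_id
        self._bus = bus
        self._healthy = True  # assumed healthy at startup

    def update(self, healthy):
        # Change of state: stay quiet while nothing changes.
        if healthy != self._healthy:
            self._healthy = healthy
            self._bus.publish("state_changed", self.device_id, healthy)


received = []
bus = EventBus()
bus.subscribe("state_changed", lambda dev, ok: received.append((dev, ok)))

lamp = DevicePoint("hall_lamp", bus)
for reading in (True, True, False, False, True):
    lamp.update(reading)

print(received)  # [('hall_lamp', False), ('hall_lamp', True)]
```

Five readings come in, but the diagnostic side only hears about the two that matter: the fault and the recovery.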

Last summer, I helped a high-end community in Suzhou with a system built on an old Siemens S7-1200. That event-driven solution has not had a single issue in the past six months. The key is that I added an HMI to display all fault history, allowing remote identification of which device has a problem. Can you imagine how much maintenance time that saves??

The technical implementation is actually not complicated. First, define a status flag and fault counter for each device point. When a device status is abnormal, the counter increments and triggers an event flag. Siemens supports interrupt triggers, and AB’s ControlLogix does too, but some domestic PLCs simply lack this functionality, and anyone choosing such brands for this job is clueless.

Here’s a snippet of code to illustrate the basic idea:

// Basic logic for event-driven fault diagnosis
EVENT_ON_FAULT:
    IF Sensor_1_Status <> Expected_Status AND Sensor_1_Enabled THEN
        Fault_Counter_1 := Fault_Counter_1 + 1;
        IF Fault_Counter_1 >= Fault_Threshold THEN
            // Trigger fault event and write to history
            SET_EVENT(EVT_FAULT_SENSOR_1);
            // Notify HMI and alarm system
            SET(ALARM_BIT[1]);
        END_IF;
    END_IF;
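If you want to try the flag-and-counter idea off the PLC first, the same logic can be modeled in Python. This is a rough sketch, not the project code: `PointDiagnostics` and its field names are hypothetical, the threshold of 3 is an assumed value, and the "raise the alarm only once" latch is my addition for the example. As in the snippet above, the counter is not reset when the status returns to normal.

```python
from dataclasses import dataclass


@dataclass
class PointDiagnostics:
    """Status flag and fault counter for one device point (hypothetical model)."""

    name: str
    expected: bool
    enabled: bool = True
    fault_threshold: int = 3  # assumed value for illustration
    fault_counter: int = 0
    alarm: bool = False

    def on_status(self, status: bool) -> bool:
        """Handle one status report; return True when the alarm is first raised."""
        if not self.enabled or status == self.expected:
            return False
        self.fault_counter += 1
        # Assumption: latch the alarm so it is raised once, not on every report.
        if self.fault_counter >= self.fault_threshold and not self.alarm:
            self.alarm = True
            return True
        return False


sensor = PointDiagnostics(name="Sensor_1", expected=True)
raised = [sensor.on_status(s) for s in (False, True, False, False, False)]

print(raised)                # [False, False, False, True, False]
print(sensor.fault_counter)  # 4
```

The alarm fires on the third abnormal report and stays quiet afterwards, while the counter keeps counting for the fault history.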

To be honest, this seems simple to explain, but there are many pitfalls in practice. The biggest one is setting the trigger conditions: too sensitive and you get false alarms, too insensitive and problems go undetected. Back in 2021, I had a project in Shanghai where the broadband router kept dropping out intermittently, and the resulting communication interruptions were mistaken for device faults. The owner almost beat me up… Later, I added communication status checks and timeout tolerances to resolve it.

Now, take a look at this optimized version:

// Improved version - with communication status detection and time tolerance
COMM_CHECK_BLOCK:
    // Check communication status instead of device status
    IF NOT Device_Responsive AND Device_Should_Respond THEN
        Comm_Failure_Timer := Comm_Failure_Timer + Scan_Time;
        // Only alarm if sustained beyond the set time, and only once per outage
        // (Comm_Failure_Reported is an added latch so the log is not written every scan)
        IF Comm_Failure_Timer > Comm_Failure_Threshold AND NOT Comm_Failure_Reported THEN
            SET_EVENT(EVT_COMM_FAILURE_DEV_1);
            LOG_Entry(Time_Stamp, "Communication Interrupted", Device_ID);
            Comm_Failure_Reported := TRUE;
        END_IF;
    ELSE
        // Communication restored, reset timer and latch
        Comm_Failure_Timer := 0;
        Comm_Failure_Reported := FALSE;
    END_IF;
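To see how the timeout tolerance behaves with a flaky router, here is a small Python simulation of the same idea. Treat it as a sketch under assumptions: `CommWatchdog` is a hypothetical name, the 5-second threshold and 1-second scan time are made-up values, and the `reported` latch is added here so the alarm fires once per outage.

```python
class CommWatchdog:
    """Raise a communication alarm only after a sustained outage (hypothetical model)."""

    def __init__(self, threshold_ms: int) -> None:
        self.threshold_ms = threshold_ms
        self.failure_ms = 0
        self.reported = False  # assumption: one alarm per outage

    def scan(self, responsive: bool, scan_time_ms: int) -> bool:
        """Call once per scan; return True on the scan where the alarm is raised."""
        if responsive:
            # Communication restored: reset timer and latch.
            self.failure_ms = 0
            self.reported = False
            return False
        self.failure_ms += scan_time_ms
        if self.failure_ms > self.threshold_ms and not self.reported:
            self.reported = True
            return True
        return False


watchdog = CommWatchdog(threshold_ms=5000)

# Assumed scenario: a 3 s router hiccup, one good scan, then an 8 s outage.
pattern = [False] * 3 + [True] + [False] * 8
alarms = [watchdog.scan(ok, scan_time_ms=1000) for ok in pattern]

print(alarms.count(True))  # 1
print(alarms.index(True))  # 9
```

The short hiccup never reaches the threshold, so it raises nothing. Only the long outage produces an alarm, and only one.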

This industry has no perfect solutions, and it’s frustrating that many homeowners and clients have no idea how complex this is. They think that spending a few thousand dollars will buy them a system that never has issues. Dream on! There are countless pitfalls in on-site debugging; I could write a book about the bizarre problems I’ve encountered: power noise interference, temperature sensors sitting in direct sunlight, children throwing toys at infrared detectors… all of these can lead to system performance bottlenecks or false alarms.

Another important point: when recording fault logs, always include timestamps! Don’t ask me how I know this; it was a hard lesson! In the past, without timestamps, I had a pile of fault records in no clear order, making it impossible to find the cause. Now my standard practice is to record the date and time of the fault the moment it occurs, then classify by severity: A-level faults are pushed to the mobile app immediately, while B-level faults are checked the next day.
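Here is a rough Python sketch of what a timestamped, severity-classified fault log could look like. `FaultRecord` and `route` are hypothetical names, and the three sample records are invented data for illustration, not entries from a real project.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FaultRecord:
    timestamp: datetime
    device_id: str
    message: str
    severity: str  # "A" = push to the app now, "B" = check the next day


def route(records):
    """Sort records by time, then split them by severity level."""
    ordered = sorted(records, key=lambda r: r.timestamp)
    push_now = [r for r in ordered if r.severity == "A"]
    review_later = [r for r in ordered if r.severity == "B"]
    return push_now, review_later


# Invented sample data: the records arrive out of order.
records = [
    FaultRecord(datetime(2024, 7, 1, 14, 5, 0), "gateway", "Communication Interrupted", "A"),
    FaultRecord(datetime(2024, 7, 1, 14, 4, 58), "router", "Link down", "A"),
    FaultRecord(datetime(2024, 7, 1, 9, 30, 0), "blind_3", "Slow response", "B"),
]

push_now, review_later = route(records)

print([r.device_id for r in push_now])      # ['router', 'gateway']
print([r.device_id for r in review_later])  # ['blind_3']
```

With timestamps, the order tells the story: in this made-up case the router dropped two seconds before the gateway complained, so the router is the first thing to check.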

This solution is vastly superior to polling! CPU load is lower, fault response is faster, maintenance personnel are under less stress, and homeowners are satisfied.

Some may ask, is this method suitable for all projects? Of course not. For small projects with three to five points, you can do whatever you want. But for projects with more than 20 device points, if you don’t use event-driven programming, you’ll eventually dig your own grave. Remember, don’t wait until you’re criticized to think about optimization; by then, it will be too late!

Disagree? Just watch me
