Having worked in the Internet of Things (IoT) field for two and a half years, I want to document the pitfalls I’ve encountered, both to reflect on them myself and to help other colleagues. I am still a novice, so please treat this content with caution. My main focus is the smart electricity and smart healthcare industries, so some of it may not apply to other sectors. Everything below is written from the perspective of IoT platform design rather than IoT concepts in general.
1. IoT Platform Architecture
1.1 Perception Layer
IoT devices are the cornerstone of the IoT, because all data originates from them. For example, a temperature and humidity sensor can collect the temperature and humidity of a server room, and a power meter can collect voltage, current, power consumption, etc. Categorized by purpose or industry, there are countless kinds of devices. Categorized by how they collect and deliver data, IoT devices can be divided into: edge gateways, gateway sub-devices, and standalone devices.
1.1.1 Edge Gateway
IoT devices collect and deliver data in many ways and over many protocols. Some work by polling (the server sends a command and the device responds with data), while others can only collect data and cannot upload it to a server on their own, which is very inconvenient. Edge gateways solve this problem: they adapt to many common protocols, which simplifies both data collection and reception on the server side. Once the various IoT devices are connected to an edge gateway, their data can be reported in one unified format (protocol).
The working mode of the edge gateway can be understood as a powerful media player; regardless of whether you have mp4, avi, rtsp, rtmp, or other video formats, they can all be played through the player. We only need to learn how to operate the player without needing to study the specific video formats.
- The role of the edge gateway goes far beyond simply collecting different protocols; those interested can search for more information. However, the system design must be based on the gateway purchased by the company, as some companies may buy very cheap gateways with limited functionality.
- A DTU (Data Terminal Unit) is not strictly an edge gateway, but its function can often be seen as a simplified version of an edge gateway.
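The "unified format" idea can be sketched roughly as follows. This is only an illustration, not how any particular gateway works: the `Reading` record, the two raw payload shapes, and the `normalize_*` helpers are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Reading:
    """Unified record the gateway reports upstream (hypothetical format)."""
    device_id: str
    metric: str
    value: float


def normalize_modbus(device_id: str, raw: dict) -> Reading:
    # Assumed raw shape: an integer register scaled by 10, e.g. 253 -> 25.3
    return Reading(device_id, raw["metric"], raw["register"] / 10)


def normalize_json(device_id: str, raw: dict) -> Reading:
    # Assumed raw shape: the device already reports a name and a number
    return Reading(device_id, raw["name"], float(raw["val"]))


# One adapter per source protocol; the server only ever sees Reading objects.
ADAPTERS: Dict[str, Callable[[str, dict], Reading]] = {
    "modbus": normalize_modbus,
    "json": normalize_json,
}


def report(protocol: str, device_id: str, raw: dict) -> Reading:
    """Convert a protocol-specific payload into the unified format."""
    return ADAPTERS[protocol](device_id, raw)
```

Supporting another protocol then means adding one adapter, while the server-side code that consumes `Reading` stays unchanged.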
1.1.2 Gateway Sub-Devices
Gateway sub-devices are the devices connected under a gateway. They may be the original devices or conversion devices placed in front of them, such as DTUs. There is nothing special to understand here, but the relationship between a gateway and its sub-devices must be recorded during system design, because customers need to know which device triggered an alarm.
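One possible way to record that relationship is sketched below. The `Gateway` class, the idea of keying sub-devices by a bus address, and the `alarm_source` helper are assumptions made for illustration; a real platform would keep this mapping in its database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Gateway:
    gateway_id: str
    # Maps the address a sub-device uses on the gateway's bus to its device id.
    sub_devices: Dict[int, str] = field(default_factory=dict)

    def register(self, address: int, device_id: str) -> None:
        self.sub_devices[address] = device_id

    def resolve(self, address: int) -> Optional[str]:
        """Return the sub-device behind a bus address, or None if unknown."""
        return self.sub_devices.get(address)


def alarm_source(gateway: Gateway, address: int) -> str:
    """Build a readable alarm source so the customer sees the real device."""
    device_id = gateway.resolve(address)
    if device_id is None:
        return f"{gateway.gateway_id}/unregistered@{address}"
    return f"{gateway.gateway_id}/{device_id}"
```

With this mapping in place, an alarm arriving through gateway `gw-1` from address 3 can be shown as `gw-1/meter-A` rather than as an anonymous gateway alarm.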
1.1.3 Standalone Devices
Some devices have the capability to upload data independently, such as certain smart meters that can upload data directly through specific protocols.
1.2 Network Layer
No matter how powerful a device is, it is meaningless if it cannot transmit its data to the server. As developers, we do not need to focus on the devices' communication protocols; we only need to focus on the transmission protocols. Based on how they are handled on the server side, I generally group transmission protocols into: MQTT, TCP, UDP, HTTP.
The above classification is subjective. Strictly speaking, MQTT runs on top of TCP, but we generally handle the two separately on the server, so I have listed them separately. Master-slave protocols such as Modbus and IEC104 are usually processed through gateways and rarely transmitted on their own; I will introduce them separately when I have the opportunity.
1.3 Data Layer
We generally refer to this layer as the data center, which is responsible for data reception, processing, cleaning, transformation, and storage.
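The stages can be pictured as a small pipeline. The sketch below covers reception, cleaning, transformation, and storage only; the JSON message shape, the required field names, and the in-memory `STORE` list are assumptions standing in for a real message format and database.

```python
import json
from typing import Dict, List, Optional

STORE: List[Dict] = []  # stand-in for a real time-series database


def receive(payload: bytes) -> Dict:
    """Reception: decode the raw message (assumed to be UTF-8 JSON)."""
    return json.loads(payload.decode("utf-8"))


def clean(record: Dict) -> Optional[Dict]:
    """Cleaning: drop records with missing fields or non-numeric values."""
    if not {"device_id", "metric", "value"} <= set(record):
        return None
    try:
        float(record["value"])
    except (TypeError, ValueError):
        return None
    return record


def transform(record: Dict) -> Dict:
    """Transformation: coerce the value to float, normalize the metric name."""
    return {
        "device_id": record["device_id"],
        "metric": str(record["metric"]).strip().lower(),
        "value": float(record["value"]),
    }


def ingest(payload: bytes) -> bool:
    """Run one message through the pipeline; return True if it was stored."""
    record = clean(receive(payload))
    if record is None:
        return False
    STORE.append(transform(record))  # Storage
    return True
```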
1.4 Business Layer
This is relatively simple; it mainly involves the application of data, such as displaying device data, alarms, generating reports, etc.
2. Classification of IoT Data
Our company started in the electric power business, so we classify data the way the power industry does. This classification is not authoritative, but the categories are easy to tell apart; it is for reference only.
2.1 Telemetry
This type of data is the most common, representing the numerical parameters of devices, such as temperature, humidity, current, and voltage.
2.2 Remote Signal
These are status parameters, such as those from switches, smoke detectors, water immersion sensors, and access control. Any state involving on/off or alarm/no alarm falls under remote signals.
2.3 Remote Pulse
Mainly used to represent cumulative parameters, such as electricity consumption, water consumption, and gas consumption; these are cumulative energy consumption data.
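The three categories could be modeled like this. The `DataKind` enum, the `DataPoint` record, and the `consumption` helper are hypothetical names for illustration; the only behavior assumed is that remote pulse values are cumulative, so usage over a period is the difference between two readings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class DataKind(Enum):
    TELEMETRY = "telemetry"          # numerical values: temperature, voltage
    REMOTE_SIGNAL = "remote_signal"  # states: on/off, alarm/no alarm
    REMOTE_PULSE = "remote_pulse"    # cumulative counters: kWh, water volume


@dataclass(frozen=True)
class DataPoint:
    device_id: str
    name: str
    kind: DataKind
    value: Union[float, bool]


def consumption(previous: DataPoint, current: DataPoint) -> float:
    """Usage over a period: difference between two cumulative readings."""
    for point in (previous, current):
        if point.kind is not DataKind.REMOTE_PULSE:
            raise ValueError("consumption only applies to remote pulse data")
    return current.value - previous.value
```

Tagging every point with its kind lets later stages pick the right handling: limits for telemetry, state changes for remote signals, and differences for remote pulses.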
3. IoT Alarms
3.1 Communication Alarms
Communication alarms record when devices go online and offline. We generally detect this in two ways: the device reports online/offline signals immediately, or the device is judged offline when it has not uploaded data for an extended period (and is considered online whenever it uploads data).
3.1.1 Immediate Reporting of Online/Offline Signals by the Device
This is easy to understand: whether we use MQTT or build TCP/UDP servers with Netty, the device reports a signal when it goes online or offline, and we only need to log these records.
3.1.2 Determining Offline Status from a Long Period Without Data Uploads
I hope all colleagues consider this method when designing IoT systems; it is very practical. Here are three scenarios: ① Master-slave devices do not have online/offline signals. ② Some devices (especially those using the MQTT protocol) may restart automatically after running for a long time, producing a brief offline signal followed by an online signal and therefore two alarms in the system; this is not a real problem, but customers may question the stability of the device. ③ Some devices may freeze after running for a long time; for example, a device may send an online signal but then fail to report data, and only upload normally again after a restart. If we do not judge by time, we cannot identify such issues promptly.
The best way to determine offline status is to base it on the device’s upload interval; for example, if a device uploads data every 30 seconds, and it has not uploaded data for 300 seconds, it can be determined to be offline.
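A minimal sketch of this rule follows. The `OfflineDetector` class and its method names are hypothetical, and the multiplier of 10 is simply taken from the 30-second/300-second example above; timestamps are passed in explicitly so the logic is easy to test.

```python
from typing import Dict, List


class OfflineDetector:
    """Marks a device offline when it has been silent for too long."""

    def __init__(self, timeout_factor: int = 10) -> None:
        # Assumption: timeout = upload interval * factor (30 s * 10 = 300 s).
        self.timeout_factor = timeout_factor
        self.intervals: Dict[str, float] = {}
        self.last_seen: Dict[str, float] = {}

    def register(self, device_id: str, upload_interval_s: float, now: float) -> None:
        self.intervals[device_id] = upload_interval_s
        self.last_seen[device_id] = now

    def on_data(self, device_id: str, now: float) -> None:
        """Any uploaded data counts as proof that the device is online."""
        if device_id in self.intervals:
            self.last_seen[device_id] = now

    def offline_devices(self, now: float) -> List[str]:
        """Return devices whose silence has reached their timeout."""
        return [
            device_id
            for device_id, interval in self.intervals.items()
            if now - self.last_seen[device_id] >= interval * self.timeout_factor
        ]
```

In practice `offline_devices` would be called by a periodic job with the current time, and each returned device would produce one communication alarm.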
3.2 Limit Exceeding Alarms
We usually set limits for telemetry data, and it is recommended to set two levels on each side: upper upper limit, upper limit, lower limit, and lower lower limit. For example, the temperature in a server room is generally recommended to be maintained at 25-30℃, and when the indoor temperature reaches around 55℃ it can affect the operation of the servers. The limit therefore cannot be set at 55℃, because by then it would no longer serve as a warning. If we instead set the upper limit at 40℃ and the upper upper limit at 50℃, an alarm triggers at 40℃ to draw attention, and a second one at 50℃ signals that the situation has become very dangerous.
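The two-level check might look like the sketch below. The `Limits` record and `check_limits` function are hypothetical names; the upper values come from the server room example, while the lower values are invented purely to complete the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    lower_lower: float
    lower: float
    upper: float
    upper_upper: float


def check_limits(value: float, limits: Limits) -> Optional[str]:
    """Return the most severe limit the value has reached, or None."""
    # Check the outer (more severe) limits before the inner ones.
    if value >= limits.upper_upper:
        return "upper_upper"
    if value >= limits.upper:
        return "upper"
    if value <= limits.lower_lower:
        return "lower_lower"
    if value <= limits.lower:
        return "lower"
    return None


# Upper values follow the server room example; the lower values are invented.
SERVER_ROOM = Limits(lower_lower=5.0, lower=10.0, upper=40.0, upper_upper=50.0)
```

Checking the outer limits first ensures that a reading of 52℃ is reported as `upper_upper` rather than merely `upper`.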
3.3 State Change Alarms
State change alarms mainly target remote signal data, such as switch tripping and smoke alarms, which are very dangerous situations.
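A state change alarm can be sketched by remembering the last known state of each signal and alarming only when it differs. The `StateChangeMonitor` class is hypothetical, and treating the first report of a signal as a baseline (no alarm) is a design assumption; a real system might instead alarm immediately if the first reported state is already an alarm state.

```python
from typing import Dict, Optional, Tuple


class StateChangeMonitor:
    """Raises an alarm only when a remote signal changes state."""

    def __init__(self) -> None:
        self.last_state: Dict[Tuple[str, str], bool] = {}

    def update(self, device_id: str, signal: str, state: bool) -> Optional[str]:
        """Record the new state; return an alarm message if it changed."""
        key = (device_id, signal)
        previous = self.last_state.get(key)
        self.last_state[key] = state
        if previous is None or previous == state:
            # First report or unchanged state: nothing to alarm on.
            return None
        return f"{device_id}: {signal} changed from {previous} to {state}"
```

Because repeated reports of the same state return `None`, a smoke detector that keeps reporting its alarm state every few seconds produces one alarm instead of a flood.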