Developing IoT Gateways: Design Process Based on MQTT Message Bus (Part 1)

Dao Ge’s 21st Original Article

  • 1. Introduction

  • 2. The Role of the Gateway

  • 3. Communication Between Internal Processes of the Gateway

  • 4. Communication Between the Gateway and the Cloud Platform

  • 5. Conclusion

1. Introduction

In the previous article, we discussed how to use an MQTT message bus for communication between processes in an embedded system. Article link: “My Favorite Inter-Process Communication Method – Message Bus”.

This communication model has been used in several projects, and for products other than industrial control, its speed is entirely sufficient. In my earlier tests on both x86 and ARM platforms, a piece of data could travel from the local device to the cloud and back within milliseconds.

The previous article only briefly introduced this design idea without discussing some details. This time, we will specifically discuss how to design the internal programs of the IoT gateway.

By reading this article, you can gain the following insights:

  1. How devices communicate within the IoT system;
  2. The message bus communication model between processes in the gateway;
  3. How data on the internal message bus of the gateway communicates with the server;
  4. As a pastime, gain some basic knowledge about IoT systems;

2. The Role of the Gateway

The term IoT is too broad; it seems that any hardware device that can connect to the internet can be called an IoT product, as if the term IoT can encompass everything.

Such a vague term does not help the explanation, so we will use a concrete, easy-to-picture scenario instead: the smart home system, a typical product of the IoT era.

2.1 Command Forwarding

In a smart home system, let’s assume there are several devices.

If the communication modules of these devices are WiFi or Bluetooth, they can generally be controlled directly via a mobile phone (of course, the manufacturer needs to provide the corresponding mobile APP), and the phone acts as a central node controlling all devices.

Currently, some smart devices on the market use this communication method, such as air conditioners, vacuum cleaners, air purifiers, and refrigerators. All it takes is adding a wireless communication module (for example, an ESP8266 module) to the device.

If the devices use another type of communication module, such as RF433, ZigBee, or ZWave, a gateway is needed to “forward” commands, because the mobile phone does not have these modules.

The mobile phone and the gateway are both connected to the home router, so they are in the same local area network. The phone sends control commands to the gateway, which then forwards them to the corresponding devices.

2.2 External Network Communication

In the above communication model, since the mobile phone and the gateway are in the same local area network, they can communicate directly. But what if the mobile phone is not in the local area network? Then it needs to be forwarded through a cloud server, and the communication model is as follows:

  1. The mobile phone sends the command to the server;
  2. The server forwards the command to the gateway;
  3. The gateway sends the command to the specified device;

The above describes the flow of control commands; if it is an alarm message sent by a device, the data flow will be reversed.

It can be seen that the gateway is the central node for communication between all devices and also serves as the intermediary for communication between the internal and external networks, acting as a connector for various smart devices to the internet.

2.3 Protocol Conversion

As mentioned above, the communication module on a hardware device is fixed (RF, ZigBee, ZWave, etc.), and each of these modules implements what can generally be called a wireless communication protocol. In most smart home systems, all devices use the same wireless communication protocol.

So, can devices with different types of wireless communication protocols coexist in the same system?

The answer is yes. As long as the gateway integrates the corresponding wireless communication modules, this goal can be achieved.

From the mobile APP perspective, all devices appear the same, and it does not concern itself with what the wireless communication protocol of the device is, thus all control commands sent are protocol-agnostic.

When the gateway receives the control command, it first identifies the target device based on the command content, then determines the wireless communication protocol of the target device, and finally sends the command to the corresponding hardware communication module, which transmits the control command to the device via wireless signals.

In this command transmission process, the gateway plays the role of protocol converter.
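The three steps just described (identify the device, determine its protocol, hand the command to the matching module) can be sketched roughly as follows. The device table, the device IDs, and the sender functions are all hypothetical placeholders; a real gateway would drive actual radio modules here.

```python
# Hypothetical device table: device ID -> wireless protocol of that device.
DEVICE_PROTOCOL = {
    "bulb-01": "ZigBee",
    "door-01": "RF",
    "plug-01": "ZWave",
}


def send_zigbee(device_id, action):
    # Placeholder for writing a frame to the ZigBee module.
    return f"ZigBee frame -> {device_id}: {action}"


def send_rf(device_id, action):
    # Placeholder for writing a frame to the RF433 module.
    return f"RF frame -> {device_id}: {action}"


def send_zwave(device_id, action):
    # Placeholder for writing a frame to the ZWave module.
    return f"ZWave frame -> {device_id}: {action}"


SENDERS = {"ZigBee": send_zigbee, "RF": send_rf, "ZWave": send_zwave}


def forward_command(command):
    """Identify the target device, look up its protocol, and dispatch."""
    device_id = command["device"]
    protocol = DEVICE_PROTOCOL.get(device_id)
    if protocol is None:
        raise KeyError(f"unknown device: {device_id}")
    return SENDERS[protocol](device_id, command["action"])


# The command itself is protocol-agnostic, as seen from the mobile app.
print(forward_command({"device": "bulb-01", "action": "on"}))
```

The command carries no protocol information at all; only the gateway's own table decides which module is used.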

Additionally, there is another communication scenario: when an “input” device is bound/associated with an “output” device, for example:

  1. An infrared sensor is bound to a sound-light alarm: when the infrared sensor detects a person, it sends a signal, which in turn triggers the sound-light alarm;
  2. A door magnet (magnetic door sensor) is bound to a light: when the door opens, the door magnet sends a signal that automatically turns on the light;

If the “input” device and the “output” device use different types of wireless communication protocols, a gateway is also needed for protocol conversion.


2.4 Device Management

In a smart home system, the number of devices can vary, and managing these devices is also very important. As the central node of the system, the gateway naturally takes on the responsibility of managing devices.

Device management functions include:

  1. Adding and removing devices;
  2. Managing device status (battery, device disconnection, loss of connection, etc.);
  3. Managing the device tree;
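As a rough illustration of these functions, a device registry could look something like the sketch below. The class names, fields, and device IDs are assumptions made for the example, and the device tree is left out.

```python
from dataclasses import dataclass


@dataclass
class Device:
    device_id: str
    protocol: str          # "ZigBee", "RF", "ZWave", ...
    battery: int = 100     # percent
    online: bool = True


class DeviceRegistry:
    """Hypothetical in-memory registry: add/remove devices, track status."""

    def __init__(self):
        self._devices = {}

    def add(self, device_id, protocol):
        if device_id in self._devices:
            raise ValueError(f"device already registered: {device_id}")
        self._devices[device_id] = Device(device_id, protocol)
        return self._devices[device_id]

    def remove(self, device_id):
        # Returns True if the device existed, False otherwise.
        return self._devices.pop(device_id, None) is not None

    def update_status(self, device_id, battery=None, online=None):
        device = self._devices[device_id]   # KeyError for unknown devices
        if battery is not None:
            device.battery = battery
        if online is not None:
            device.online = online
        return device

    def offline_devices(self):
        return sorted(d.device_id for d in self._devices.values() if not d.online)


registry = DeviceRegistry()
registry.add("door-01", "RF")
registry.add("bulb-01", "ZigBee")
registry.update_status("door-01", battery=15, online=False)
print(registry.offline_devices())
```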

2.5 Edge Computing (Automated Control)

Under normal circumstances, the gateway can maintain a persistent (long-lived) connection with the server through the router. If the server has strong processing capabilities, all tasks in the smart home system can be handed over to the server for computation, and the server sends the results back to the gateway. This seems like a perfect idea!

However, consider the following two scenarios:

  1. The router has a problem, and the gateway cannot connect to the server, so it cannot report local data in time;
  2. An abnormal situation arises in the system that requires urgent processing. If the information is reported to the server, and the server computes a response and returns instructions to the gateway, the round trip may take longer than can be tolerated. How should this be handled? (You can picture this scenario with a connected-vehicle system: in automated driving, if a car encounters an emergency, should it upload all information to the server and then wait for the server’s next instruction?)

For the above scenarios, it may be more appropriate to handle some computations and processing tasks on the gateway side! This has also become a popular trend in recent years known as edge computing.

  1. Edge computing refers to an open platform that integrates network, computing, storage, and application core capabilities on the side close to the object or data source, providing the nearest service.
  2. Its application programs initiate at the edge side, generating faster network service responses, meeting the basic needs of industries in real-time business, application intelligence, security, and privacy protection.
  3. Edge computing is positioned between physical entities and industrial connections, or at the top of physical entities. Cloud computing, however, can still access the historical data of edge computing.
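For the first scenario above (the server is temporarily unreachable), one possible approach is to buffer reports locally and upload them once the link returns. The sketch below only illustrates that idea; the class names, the `send` callable, and the buffer size are assumptions, not part of the original design.

```python
from collections import deque


class ReportUploader:
    """Buffers reports while the server is unreachable, then flushes in order."""

    def __init__(self, send, max_pending=1000):
        self._send = send                          # raises ConnectionError on failure
        self._pending = deque(maxlen=max_pending)  # oldest entries drop when full

    @property
    def pending(self):
        return len(self._pending)

    def report(self, message):
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Try to upload everything that is queued; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                              # keep the rest for later
            self._pending.popleft()
            sent += 1
        return sent


class FakeLink:
    """Stand-in for the connection to the server, for demonstration only."""

    def __init__(self):
        self.up = False
        self.sent = []

    def send(self, message):
        if not self.up:
            raise ConnectionError("server unreachable")
        self.sent.append(message)


link = FakeLink()
uploader = ReportUploader(link.send)
uploader.report({"device": "door-01", "state": "open"})    # buffered, link is down
link.up = True
uploader.report({"device": "door-01", "state": "closed"})  # flushes both, in order
print(link.sent)
```

Local decisions (the second scenario) would still be made on the gateway itself; the buffer only makes sure the server eventually sees the data.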

3. Communication Between Internal Processes of the Gateway

An application can be structured around multiple threads or multiple processes. Everyone has different habits, and both approaches have their advantages. We will not debate which is superior; I prefer the multi-process design philosophy, so we will go straight to the multi-process architecture.

3.1 What Processes Are Needed in the Gateway

Which processes run in the gateway depends on the functions the gateway has to provide. Let’s assume the following:

(1) Process for connecting to the external network: Proc_Bridge

The gateway needs to connect to the cloud server, so a process is required to maintain a persistent connection with the server. This lets it promptly receive control commands sent by the server and report internal data back to the server.

This process needs to forward the commands received from the server to the internal gateway system and forward the information received from the internal system to the server, similar to a bridging function, hence named Proc_Bridge.

(2) Device management process: Proc_DevMgr

This process is used to execute device management functions, such as adding (joining) and deleting (leaving) devices.

(3) Protocol conversion process: Proc_Protocol

Downlink: Converts the application layer’s unified communication protocol into different types of wireless communication protocols and sends it to the corresponding wireless module.

Uplink: Converts the different types of wireless communication protocols reported by devices into the application layer’s unified communication protocol.
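To make the uplink and downlink directions concrete, here is a small sketch for a single protocol. The RF frame layout (a 2-byte address followed by a 1-byte value) and the fields of the unified message are invented for the example; real modules define their own frame formats.

```python
import struct

# Hypothetical RF433 frame: 2-byte big-endian device address, 1-byte value.
RF_FRAME = struct.Struct(">HB")


def rf_to_unified(frame: bytes) -> dict:
    """Uplink: protocol-specific frame -> unified application-layer message."""
    address, value = RF_FRAME.unpack(frame)
    return {"device": f"rf-{address:04x}", "protocol": "RF", "value": value}


def unified_to_rf(message: dict) -> bytes:
    """Downlink: unified application-layer message -> protocol-specific frame."""
    address = int(message["device"].split("-", 1)[1], 16)
    return RF_FRAME.pack(address, message["value"])


uplink = rf_to_unified(b"\x00\xa1\x01")
print(uplink)
print(unified_to_rf({"device": "rf-00a1", "value": 0}))
```

Each additional protocol would get its own pair of conversion functions, while the unified message format stays the same for the rest of the system.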

(4) Edge computing process (automated control): Proc_Auto

Clearly, an independent process is needed to handle various computations, and this process acts as the brain of the system.

(5) Processes related to wireless communication protocols: Proc_ZigBee, Proc_RF, Proc_ZWave

On the hardware side, each wireless communication module communicates with the gateway’s CPU via a serial port or another hardware connection, so each type of wireless communication module requires a corresponding process to handle it.

(6) Other “soft device” processes: Proc_Xxx

In previous projects, I encountered some hardware devices that are logically at the same level as door magnets, sockets, etc., but connect to the gateway via TCP. For such devices, an independent process can also be used for management.

All of these processes run side by side in the gateway.

3.2 MQTT Message Bus

These processes need to communicate with each other, which is not a simple point-to-point communication, but a mesh communication model. For example:

  1. The device management process Proc_DevMgr: When any device is added to the system, it needs to be processed here, so it needs to communicate with processes like Proc_ZigBee, Proc_RF, Proc_ZWave;
  2. When a device reports data (for example, through Proc_ZigBee), the Proc_Protocol process needs to convert the data protocol. Then the Proc_Bridge process reports the converted data to the server, while the Proc_Auto process checks whether the reported data triggers any associated devices;

This means that communication among these processes is many-to-many. With traditional IPC methods (shared memory, named pipes, message queues, sockets), this can be quite complex to handle.

After introducing the MQTT message bus, each process only needs to attach to the bus and listen to the topics it is interested in to receive the corresponding data.
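The idea of “attach to the bus and listen to topics” can be illustrated with a tiny in-process stand-in. This is not an MQTT client and involves no broker; in a real gateway each process would connect to a broker through an MQTT client library. The `MessageBus` class is purely illustrative.

```python
from collections import defaultdict


class MessageBus:
    """Minimal in-process publish/subscribe bus (exact topic match only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver payload to every subscriber of topic; return the count."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


bus = MessageBus()
received = []

# Two "processes" listen to the same topic without knowing about each other.
bus.subscribe("$iot/v1/Device/Report",
              lambda topic, payload: received.append(("Proc_Bridge", payload)))
bus.subscribe("$iot/v1/Device/Report",
              lambda topic, payload: received.append(("Proc_Auto", payload)))

delivered = bus.publish("$iot/v1/Device/Report", {"device": "door-01", "state": "open"})
print(delivered, received)
```

The publisher never addresses a receiver directly, which is what keeps the many-to-many relationships manageable.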


Since the communication relationships among these processes are quite complex, a good topic design specification becomes very important!

3.3 Topic Design

MQTT communication is based on the publish/subscribe model. After a client (process) connects to the message bus, it subscribes to the topics it is interested in. When other clients (processes) publish messages to such a topic, every subscriber of that topic receives them.

The topic is a string separated by slashes (/) to represent a multi-layer hierarchical structure. For example, the following two topics are related to online upgrades (OTA) on the Amazon AWS platform:

  1. $aws/things/MyThing/jobs/get/accepted
  2. $aws/things/MyThing/jobs/get/rejected

In our example scenario, we can design the topics as follows:

(1) Proc_DevMgr

Subscribed topics:

  $iot/v1/ZigBee/Register
  $iot/v1/ZigBee/UnRegister
  $iot/v1/RF/Register
  $iot/v1/RF/UnRegister
  $iot/v1/ZWave/Register
  $iot/v1/ZWave/UnRegister

(2) Proc_Bridge

Subscribed topics:

$iot/v1/Device/Report

Published data topics:

  $iot/v1/Device/Control
  $iot/v1/Device/Remove
  $iot/v1/Auto/AddRule
  $iot/v1/Auto/RemoveRule

(3) Proc_Protocol

Subscribed topics:

  $iot/v1/Device/Control
  $iot/v1/Device/Remove
  $iot/v1/ZigBee/Report
  $iot/v1/RF/Report
  $iot/v1/ZWave/Report

Published data topics:

  $iot/v1/Device/Report
  $iot/v1/ZigBee/Control
  $iot/v1/ZigBee/Remove
  $iot/v1/RF/Control
  $iot/v1/RF/Remove
  $iot/v1/ZWave/Control
  $iot/v1/ZWave/Remove

(4) Proc_Auto

Subscribed topics:

  $iot/v1/Auto/AddRule
  $iot/v1/Auto/RemoveRule
  $iot/v1/Device/Report

Published data topics:

$iot/v1/Device/Control

(5) Proc_ZigBee

Subscribed topics:

  $iot/v1/ZigBee/Control
  $iot/v1/ZigBee/Remove

Published data topics:

  $iot/v1/ZigBee/Register
  $iot/v1/ZigBee/UnRegister
  $iot/v1/ZigBee/Report

(6) Proc_RF

Subscribed topics:

  $iot/v1/RF/Control
  $iot/v1/RF/Remove

Published data topics:

  $iot/v1/RF/Register
  $iot/v1/RF/UnRegister
  $iot/v1/RF/Report

(7) Proc_ZWave

Subscribed topics:

  $iot/v1/ZWave/Control
  $iot/v1/ZWave/Remove

Published data topics:

  $iot/v1/ZWave/Register
  $iot/v1/ZWave/UnRegister
  $iot/v1/ZWave/Report
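With this many topics it is easy to publish something nobody listens to. One way to sanity-check a design like this is to write the tables above down as data and compare the two sides. The sketch below does that for the topics listed in this section; the helper names (`t`, `receivers`) are made up for the example.

```python
PROTOCOLS = ("ZigBee", "RF", "ZWave")


def t(*levels):
    """Build a topic under the $iot/v1 prefix used in this article."""
    return "$iot/v1/" + "/".join(levels)


SUBSCRIBES = {
    "Proc_DevMgr": {t(p, e) for p in PROTOCOLS for e in ("Register", "UnRegister")},
    "Proc_Bridge": {t("Device", "Report")},
    "Proc_Protocol": {t("Device", "Control"), t("Device", "Remove")}
                     | {t(p, "Report") for p in PROTOCOLS},
    "Proc_Auto": {t("Auto", "AddRule"), t("Auto", "RemoveRule"), t("Device", "Report")},
}
PUBLISHES = {
    "Proc_Bridge": {t("Device", "Control"), t("Device", "Remove"),
                    t("Auto", "AddRule"), t("Auto", "RemoveRule")},
    "Proc_Protocol": {t("Device", "Report")}
                     | {t(p, e) for p in PROTOCOLS for e in ("Control", "Remove")},
    "Proc_Auto": {t("Device", "Control")},
}
for p in PROTOCOLS:
    SUBSCRIBES[f"Proc_{p}"] = {t(p, "Control"), t(p, "Remove")}
    PUBLISHES[f"Proc_{p}"] = {t(p, "Register"), t(p, "UnRegister"), t(p, "Report")}


def receivers(topic):
    """Names of all processes subscribed to the given topic."""
    return sorted(name for name, topics in SUBSCRIBES.items() if topic in topics)


published = set().union(*PUBLISHES.values())
subscribed = set().union(*SUBSCRIBES.values())

print("published but never received:", sorted(published - subscribed))
print("subscribed but never published:", sorted(subscribed - published))
print("Device/Report goes to:", receivers(t("Device", "Report")))
```

For the tables in this article both difference sets come out empty, so every published topic has at least one subscriber and vice versa.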

The design of these topics is still somewhat rough. By using the wildcards “#” and “+” in subscriptions, a more flexible hierarchical structure can be designed.

  1. Multi-level wildcard: “#” matches the parent level and any number of child levels. It must be the last character of the topic filter.
  2. Single-level wildcard: “+” matches exactly one topic level and can be used at any level in the topic filter, including the first and last levels.
  3. “$” is not a wildcard. Topics that begin with “$” are reserved for the broker’s own use (for example, the $SYS/ topics), and a topic filter that begins with “#” or “+” does not match them. The “$iot” prefix used here resembles the $aws prefix shown earlier; with a general-purpose broker, a plain prefix such as iot/v1/ may be the safer choice.
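To see how these matching rules behave, here is a small matcher written from the rules as stated in the MQTT specification. It is only a sketch for experimenting with topic designs: it does not validate malformed filters, and a real broker performs this matching for you.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # A filter that starts with a wildcard never matches a "$" topic.
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # "#" must be the last level; it covers the parent and all children.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Proc_DevMgr's six subscriptions could collapse into two filters:
for name in ("ZigBee", "RF", "ZWave"):
    assert topic_matches("$iot/v1/+/Register", f"$iot/v1/{name}/Register")
    assert topic_matches("$iot/v1/+/UnRegister", f"$iot/v1/{name}/UnRegister")

print(topic_matches("$iot/v1/#", "$iot/v1/Device/Report"))
print(topic_matches("#", "$iot/v1/Device/Report"))
```

The last two lines show the “$” rule in action: the same topic matches “$iot/v1/#” but not a bare “#”.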

We can take a control command as an example to illustrate how data flows through topics:

  1. The Proc_Bridge process receives a control command from the server and sends it to the topic on the message bus: $iot/v1/Device/Control.
  2. Since the Proc_Protocol process is subscribed to this topic, it immediately receives the command.
  3. The Proc_Protocol analyzes the command content, finds it is a ZigBee device, thus performs protocol conversion and sends a ZigBee control command to the topic on the message bus: $iot/v1/ZigBee/Control.
  4. Since the Proc_ZigBee process is subscribed to this topic, it receives the control command.
  5. The Proc_ZigBee process converts the control command into the format required by the ZigBee wireless communication module and sends it to the target device (a light bulb).
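The five steps above can be traced with a toy simulation. Everything runs in a single Python process here, with plain functions standing in for the gateway processes and a dictionary standing in for the bus; the device ID and command fields are hypothetical.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # topic -> handlers (stand-in for the bus)
log = []                          # topics in the order they were published
sent_frames = []                  # what "reached" the ZigBee module

DEVICE_PROTOCOL = {"bulb-01": "ZigBee"}   # hypothetical device table


def subscribe(topic, handler):
    subscribers[topic].append(handler)


def publish(topic, payload):
    log.append(topic)
    for handler in subscribers[topic]:
        handler(payload)


def proc_bridge_on_server_command(command):
    # Step 1: the command from the server goes onto the bus.
    publish("$iot/v1/Device/Control", command)


def proc_protocol_on_control(command):
    # Steps 2-3: find the device's protocol and republish on that topic.
    protocol = DEVICE_PROTOCOL[command["device"]]
    publish(f"$iot/v1/{protocol}/Control", command)


def proc_zigbee_on_control(command):
    # Steps 4-5: hand the command to the ZigBee module (recorded here).
    sent_frames.append(("zigbee", command["device"], command["action"]))


subscribe("$iot/v1/Device/Control", proc_protocol_on_control)
subscribe("$iot/v1/ZigBee/Control", proc_zigbee_on_control)

proc_bridge_on_server_command({"device": "bulb-01", "action": "on"})
print(log)
print(sent_frames)
```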

Now let’s analyze the device data reporting scenario.

First, follow the path that reports the data to the server (the red arrows in the original figure), ignoring the automation path (the blue arrows):

  1. The door magnet opens and reports the information to the Proc_RF process through wireless communication.
  2. The Proc_RF process receives the data reported by the RF433 communication module and sends the “door magnet opened” information to the topic on the message bus: $iot/v1/RF/Report.
  3. Since the Proc_Protocol process is subscribed to this topic, it receives the reported door magnet data.
  4. The Proc_Protocol analyzes the data, converts the RF433 protocol data into the unified application layer protocol data, and sends it to the topic on the message bus: $iot/v1/Device/Report.
  5. Since the Proc_Bridge process is subscribed to this topic, it receives the reported data.
  6. The Proc_Bridge process reports the data to the server.

Now let’s look at the automation path (the blue arrows in the original figure):

In the fourth step mentioned above, after the Proc_Protocol process converts the RF433 protocol data into the unified application layer protocol, it sends the data to the message bus topic: $iot/v1/Device/Report , and at the same time, the Proc_Auto process also performs the following operations:

  1. Since the Proc_Auto is also subscribed to this topic, it receives the application layer protocol data reported by the door magnet.
  2. The Proc_Auto looks up its configuration information (assuming the user has previously configured a rule: when the door magnet opens, trigger the sound-light alarm), finds a match with the “door magnet -> alarm” rule, and sends a control command to the alarm, which is sent to the topic on the message bus: $iot/v1/Device/Control.

The subsequent steps 7, 8, 9, and 10 are exactly the same as the control command flow above.
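The reporting path and the automation path can be traced together in the same toy style. As before, plain functions stand in for the processes, and the rule table, device IDs, and message fields are assumptions made for the illustration.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # topic -> handlers (stand-in for the bus)
log = []                          # topics in the order they were published
uploaded = []                     # what Proc_Bridge would send to the server

# Hypothetical user rule: when door-01 opens, switch alarm-01 on.
RULES = {("door-01", "open"): {"device": "alarm-01", "action": "on"}}


def subscribe(topic, handler):
    subscribers[topic].append(handler)


def publish(topic, payload):
    log.append(topic)
    for handler in subscribers[topic]:
        handler(payload)


def proc_rf_on_radio(device, raw_state):
    # The RF module delivered a report; put it on the bus.
    publish("$iot/v1/RF/Report", {"device": device, "raw": raw_state})


def proc_protocol_on_rf_report(message):
    # Convert to the unified application-layer format.
    publish("$iot/v1/Device/Report",
            {"device": message["device"], "state": message["raw"]})


def proc_bridge_on_report(message):
    uploaded.append(message)          # report path: up to the server


def proc_auto_on_report(message):
    # Automation path: look for a matching rule and issue a control command.
    command = RULES.get((message["device"], message["state"]))
    if command is not None:
        publish("$iot/v1/Device/Control", command)


subscribe("$iot/v1/RF/Report", proc_protocol_on_rf_report)
subscribe("$iot/v1/Device/Report", proc_bridge_on_report)
subscribe("$iot/v1/Device/Report", proc_auto_on_report)

proc_rf_on_radio("door-01", "open")
print(log)
print(uploaded)
```

One published report fans out to two subscribers, and the second one feeds a new command back onto the bus, which is where the control flow described earlier takes over.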

3.4 Comparison with DBUS

From the above descriptions of the three data flow scenarios, do you feel that using topics as a “data pipeline” for communication is very similar to the DBUS bus in Linux systems?

The DBUS bus is also used for inter-process communication. In my personal understanding, DBUS actually organizes two types of communication between processes:

  1. Data transmission based on signals;
  2. RPC remote calls based on methods;

The concepts included in DBUS are more complex, including paths, objects, interfaces, methods, etc., which are organized together to locate a specific service provider.

In comparison, I feel that the MQTT method is simpler.

A remote procedure call (RPC) calls a function that lives in another process, often on a remote machine. It mainly has to solve two problems:

  1. Network connection;
  2. Data serialization and deserialization;
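The second problem, serialization and deserialization, can be shown without any network at all. The sketch below uses JSON from the standard library rather than protobuf, purely to keep it self-contained; the message layout and the `add` method are invented for the example, and the bytes would normally travel over a connection between the two sides.

```python
import json


def add(a, b):
    return a + b


METHODS = {"add": add}   # functions the "remote" side is willing to execute


def encode_request(method, *args):
    """Caller side: serialize the call into bytes."""
    return json.dumps({"method": method, "args": list(args)}).encode("utf-8")


def handle_request(raw: bytes) -> bytes:
    """Callee side: deserialize, run the function, serialize the result."""
    request = json.loads(raw.decode("utf-8"))
    result = METHODS[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def decode_response(raw: bytes):
    """Caller side: deserialize the result."""
    return json.loads(raw.decode("utf-8"))["result"]


raw_request = encode_request("add", 2, 3)
raw_response = handle_request(raw_request)   # would cross the network in practice
print(decode_response(raw_response))
```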

I will write a separate article later to implement RPC calls using the protobuf framework.

4. Communication Between the Gateway and the Cloud Platform

The design process discussed above pertains to the communication methods between various functional modules within the gateway, which is also the area where we as embedded developers can fully utilize our abilities.

The communication method between the gateway and the cloud platform is generally specified by the client, and there are only a few common choices (Alibaba Cloud, Huawei Cloud, Tencent Cloud, Amazon AWS). Generally, the gateway and the cloud platform are required to maintain a persistent connection, so that commands from the cloud can be sent to the gateway at any time.

Of course, these cloud platforms will provide corresponding SDK development packages, and the MQTT protocol is more commonly used to connect to cloud platforms. In some documents, the MQTT server located in the cloud is referred to as a Broker, which is essentially just a server.


The function of the Proc_Bridge process mainly includes two points:

  1. Data transmission channel with the cloud platform;
  2. Protocol conversion: converting the protocols related to the cloud platform into the internal protocols of the gateway and vice versa.

In other words, the Proc_Bridge process needs to connect simultaneously to the MQTT Broker of the cloud platform and the MQTT message bus of the gateway. In the next article, we will specifically discuss this part and provide a code template for implementing bridging functions.
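Until then, the topic-level part of the bridging can be pictured as two lookup tables, one per direction. This is only a sketch under stated assumptions: the cloud-side topic names and the gateway ID are invented, the payload is passed through unchanged, and the two `publish_*` callables stand in for the two MQTT connections.

```python
GATEWAY_ID = "gw-001"   # hypothetical gateway identifier

# Hypothetical cloud-side topics; a real cloud platform defines its own.
CLOUD_TO_INTERNAL = {
    f"cloud/{GATEWAY_ID}/control": "$iot/v1/Device/Control",
    f"cloud/{GATEWAY_ID}/remove": "$iot/v1/Device/Remove",
    f"cloud/{GATEWAY_ID}/rule/add": "$iot/v1/Auto/AddRule",
    f"cloud/{GATEWAY_ID}/rule/remove": "$iot/v1/Auto/RemoveRule",
}
INTERNAL_TO_CLOUD = {
    "$iot/v1/Device/Report": f"cloud/{GATEWAY_ID}/report",
}


def on_cloud_message(topic, payload, publish_internal):
    """Downlink: a message from the cloud broker goes onto the internal bus."""
    internal_topic = CLOUD_TO_INTERNAL.get(topic)
    if internal_topic is None:
        return False                     # not a topic this bridge forwards
    publish_internal(internal_topic, payload)
    return True


def on_internal_message(topic, payload, publish_cloud):
    """Uplink: a message from the internal bus goes to the cloud broker."""
    cloud_topic = INTERNAL_TO_CLOUD.get(topic)
    if cloud_topic is None:
        return False
    publish_cloud(cloud_topic, payload)
    return True


internal_out, cloud_out = [], []
on_cloud_message(f"cloud/{GATEWAY_ID}/control", {"device": "bulb-01", "action": "on"},
                 lambda topic, payload: internal_out.append((topic, payload)))
on_internal_message("$iot/v1/Device/Report", {"device": "door-01", "state": "open"},
                    lambda topic, payload: cloud_out.append((topic, payload)))
print(internal_out)
print(cloud_out)
```

Only the four downlink topics and the one uplink topic that Proc_Bridge publishes and subscribes to in section 3.3 appear in the tables; everything else stays inside the gateway.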

5. Conclusion

As an embedded software developer, merely filling code into a framework designed by others can become monotonous over time, and one may not know where to improve their skills. Upon careful reflection, there are actually many directions: Linux kernel, file systems, algorithms, application program design, etc.

The content discussed in this article does not rise to the level of architectural design; it is merely a simple communication model for various functional modules within the IoT gateway. If you have the opportunity to design similar products, you might as well try this communication model, and of course, you will definitely design it even better!

[Original Declaration]


Reprint: You are welcome to reprint this article. Unless the author agrees otherwise, this statement must be retained and a link to the original must be provided in the article.
