
Checksum: A checksum is usually the last digit of a group of numbers, derived from the preceding numbers through some operation, used to verify the correctness of the data set. When code is inputted as data into a computer or other devices, input errors can easily occur. To reduce such input errors, coding experts have invented various checksum error detection methods and set checksums based on these methods. Common checksums include: Sum, XOR, LRC, CRC…
Discrete Input: Mainly used to read the status of a single bit of data, such as the state of an I/O;
Coil: A switch (on/off) output signal, mainly used to write a single bit of data (its state can also be read back); it is the writable counterpart of the discrete input;
Input Register: Mainly used to read 16 bits, which is two bytes of data;
Holding Register: Mainly used to write 16 bits of data (the value can also be read back);
PLC: Programmable Logic Controller, a digital electronic device that uses programmable memory to store programs internally and to execute user-oriented instructions for logical operations, sequential control, timing, counting, and arithmetic, controlling various types of machinery or production processes through digital or analog inputs and outputs.
Serial Communication: As computer systems and microcomputer networks have spread, communication has become increasingly important. Here, communication means the exchange of information between a computer and the outside world, which covers both computer-to-peripheral and computer-to-computer exchange. Because serial communication transmits information one bit at a time over a single line, it needs fewer wires and can even use existing telephone networks, which makes it particularly suitable for long-distance transmission. Serial transfer is also common for human-machine interaction devices and serial storage peripherals that sit close to the computer, such as terminals, printers, logic analyzers, and disks. In real-time control and management, the CPUs in a hierarchical distributed control system built from multiple microprocessors generally communicate serially as well, so serial interfaces are very common in microcomputer application systems. Note that “serial” describes how information travels between the peripheral and the interface circuit; the CPU and the interface still exchange data in parallel.
Serial Port: A serial port is a very common interface for device communication on a computer, not to be confused with the Universal Serial Bus (USB). Traditionally, most desktop computers included two RS-232-based serial ports. The serial port is also a common communication interface for instrumentation; many GPIB-compatible devices also have RS-232 ports. Serial communication can also be used to retrieve data from remote data acquisition devices.
The concept of serial communication is very simple: the port sends and receives bytes one bit at a time. Although this is slower than byte-wide parallel communication, a serial port can send data on one wire while receiving data on another. It is simple and works over long distances. For example, the IEEE 488 specification for parallel communication states that the total cable length must not exceed 20 meters and that the length between any two devices must not exceed 2 meters; a serial link, however, can reach up to 1200 meters.
Typically, serial ports are used for transmitting ASCII characters. Communication is completed using three lines: ground, transmit, and receive. Since serial communication is asynchronous, the port can send data on one line while receiving data on another. Other lines are used for handshaking but are not mandatory. The most important parameters of serial communication are baud rate, data bits, stop bits, and parity. For two ports communicating, these parameters must match:
a. Baud Rate: This parameter measures communication speed. For the two-level signaling used on a serial port, it equals the number of bits transmitted per second; for example, 300 baud means sending 300 bits per second. When we refer to the clock period, we mean the baud rate: if the protocol requires a baud rate of 4800, the clock is 4800 Hz, which means the data line is sampled at 4800 Hz. Typical baud rates for telephone-line modems are 14400, 28800, and 33600. Baud rates can be much higher than these values, but the achievable baud rate drops as the cable gets longer. High baud rates are therefore mostly used between instruments placed very close together, a typical example being GPIB device communication.
b. Data Bits: This parameter gives the number of actual data bits in a transfer. When a computer sends a packet of information, the actual data is not necessarily 8 bits; standard values are 5, 7, and 8 bits. The right setting depends on the information you want to transmit. For example, standard ASCII covers 0-127 (7 bits) and extended ASCII covers 0-255 (8 bits), so plain text in standard ASCII needs only 7 data bits per packet. Here a packet means a single byte transfer, including start/stop bits, data bits, and the parity bit. Since the actual number of data bits depends on the chosen communication protocol, the term “packet” is used for all of these cases.
c. Stop Bits: These mark the end of a single packet. Typical values are 1, 1.5, and 2 bits. Because data is timed on the transmission line and each device has its own clock, two devices can drift slightly out of sync during communication. Stop bits therefore not only indicate the end of a transfer but also give the receiver a chance to resynchronize its clock. The more stop bits are used, the greater the tolerance for clock differences, but the slower the data transmission rate.
d. Parity Bit: A simple error detection method in serial communication. There are four parity modes: even, odd, high (mark), and low (space); it is also possible to use no parity bit at all. For even and odd parity, the serial port sets the parity bit (one bit after the data bits) so that the transmitted data has an even or odd number of logical-high bits. For example, if the data is 011, then for even parity the parity bit is 0, keeping the number of logical-high bits even; for odd parity the parity bit is 1, giving three logical-high bits. High and low parity do not actually check the data; they simply set the parity bit to logical high or logical low. This lets the receiving device know the expected state of one bit, giving it a chance to determine whether noise has interfered with the communication or whether the transmitted and received data do not match.
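The parity rule described in item d can be checked with a few lines of Python. This is only a minimal sketch: the function name `parity_bit` and the mode strings are chosen here for illustration (with "mark" standing for the always-high mode and "space" for the always-low mode) and do not belong to any serial library.

```python
def parity_bit(data: int, mode: str = "even") -> int:
    """Return the parity bit a sender would append to the given data bits."""
    ones = bin(data).count("1")  # number of logical-high bits in the data
    if mode == "even":
        return ones % 2          # total number of 1s (data + parity) becomes even
    if mode == "odd":
        return (ones + 1) % 2    # total number of 1s (data + parity) becomes odd
    if mode == "mark":
        return 1                 # parity bit is always high, data is not checked
    if mode == "space":
        return 0                 # parity bit is always low, data is not checked
    raise ValueError(f"unknown parity mode: {mode}")


# The example from the text: data bits 011 contain two logical-high bits.
print(parity_bit(0b011, "even"))  # 0
print(parity_bit(0b011, "odd"))   # 1
```

A receiver can run the same function on the received data bits and compare the result with the received parity bit; a mismatch indicates that an odd number of bits was corrupted.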
Modicon first introduced the Modbus protocol in 1979, and Modbus RTU and Modbus ASCII grew out of it. Later, Schneider Electric acquired Modicon and launched the Modbus TCP protocol in 1997. In 2004, the National Standardization Administration of China officially made Modbus a national standard, marking the beginning of Modbus’s contribution to industrial communication in China.
Through this protocol, controllers can communicate with each other and with other devices over a network. The Modbus protocol is standardized and open, supports multiple electrical interfaces, and has a simple, compact data frame format that can carry a large amount of data with good real-time performance. These properties have made it widely used in industrial control systems and a universal industrial standard. A thorough analysis of how the Modbus protocol works and of its security is therefore of practical importance for improving the security of industrial control systems.
Modbus RTU and Modbus ASCII are mainly used in the field of serial communication, while Modbus TCP is commonly used in Ethernet communication. Today, Modbus has become a communication protocol standard in the industrial field and is now a commonly used connection method between industrial electronic devices.
Modbus uses a simple Master and Slave protocol (client/server protocol) for communication. The client acts as the master, sending requests to the server; the server (slave) receives the request, analyzes it, and responds. The communication frame used is called the Application Data Unit (ADU), which includes the communication address segment, function code segment, data segment, and checksum segment, as shown in the figure below:
Generally, monitoring systems (HMI) are the masters, while PLCs, electric meters, instruments, etc., are slaves. The HMI system continuously polls the relays and registers of the slaves for their latest values, then displays them and performs the required logic calculations and control adjustments.
Within the ADU, the function code segment and the data segment together are called the Protocol Data Unit (PDU). The function code segment occupies one byte, with a value range of 1-255: 1-127 are function codes, of which 65-72 and 100-110 are user-defined, and 128-255 are reserved for exception responses.
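The ranges above can be expressed as a small lookup. The following is an illustrative sketch only; the function name `classify_function_code` and the returned labels are assumptions made for this example, not names from the Modbus specification.

```python
def classify_function_code(code: int) -> str:
    """Classify a one-byte Modbus function code using the ranges from the text."""
    if not 1 <= code <= 255:
        raise ValueError("a function code is one byte in the range 1-255")
    if code >= 128:
        return "exception response"   # 128-255: reserved for exception responses
    if 65 <= code <= 72 or 100 <= code <= 110:
        return "user-defined"         # 65-72 and 100-110
    return "public or reserved"       # every other code in 1-127


print(classify_function_code(3))     # public or reserved
print(classify_function_code(0x83))  # exception response
print(classify_function_code(100))   # user-defined
```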
The Modbus protocol is an application layer message transmission protocol that includes three message types: ASCII, RTU, and TCP. The protocol itself does not define the physical layer; it only defines the message structure that controllers can recognize and use, regardless of the network used for communication.
When using the Modbus protocol for serial transmission, either RTU or ASCII mode can be selected, specifying message, data structure, command, and response methods, and requiring data verification. The ASCII mode uses LRC checks, while the RTU mode employs 16-bit CRC checks. When transmitted over Ethernet, TCP is used, and this mode does not use checks since the TCP protocol is a reliable connection-oriented protocol.
Modbus is an application layer protocol that defines data units (ADU) independent of the underlying network, allowing communication over Ethernet (TCP/IP) or serial links (RS232, RS485, etc.) (Ethernet ADU and serial ADU are slightly different). In serial links, the Modbus protocol has two transmission modes—ASCII mode and RTU mode. ASCII is an abbreviation for American Standard Code for Information Interchange, while RTU stands for Remote Terminal Unit.
First, let’s look at how Modbus works.
Modbus adopts a master-slave communication mode, where only the master device can initiate a transaction, and a slave device only responds to requests from the master. Typical master devices include monitoring hosts and display panels (HMIs), while typical slave devices are Programmable Logic Controllers (PLCs) and field instruments.
In master-slave communication over a serial link, a Modbus master device can connect to one or N (up to 247) slave devices. Communication between master and slave devices includes both unicast and broadcast modes.
In broadcast mode, the Modbus master device can simultaneously send requests to multiple slave devices (device address 0 is used for broadcast mode), and slave devices do not respond to broadcast requests.
In unicast mode, the master device sends a request to a specific slave device (each Modbus slave device has a unique address), and the request message frame includes function codes and data, such as function code “01” used to read the status of discrete coils. Upon receiving the request, the slave device responds and sends the message back to the master device.
In communication between master and slave devices, either ASCII mode or RTU mode can be used. In ASCII transmission mode, the message frame starts with a colon (“:”, ASCII 3A Hex) and ends with a carriage return and line feed (CRLF, ASCII 0D and 0A Hex); the characters allowed in between are the hexadecimal digits 0-9 and A-F. The slave devices on the network watch the line for a colon (“:”); when one arrives, each slave decodes the message frame and checks whether the address in the message matches its own. If it matches, the slave accepts the data; if not, it ignores the frame.
In ASCII mode, each 8-bit byte is split into two ASCII characters for transmission. For example, the hexadecimal number 0xAF is sent as the ASCII characters “A” and “F”, so twice as many characters are sent as in RTU mode. The advantage of ASCII mode is that it allows a gap of up to 1 second between two characters without causing a communication failure. This mode uses the Longitudinal Redundancy Check (LRC) for error detection. When a controller is set to communicate over the Modbus network in RTU mode, each 8-bit byte in the message carries two 4-bit hexadecimal digits, and there are no start and end marker characters; frames are separated by silent intervals on the line instead. The advantage of RTU mode is that it transmits more data at the same baud rate.
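The ASCII framing described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the helper names (`lrc`, `build_ascii_frame`, `parse_ascii_frame`) are invented for this example, and the LRC is computed in the usual way, as the two's complement of the 8-bit sum of the address and PDU bytes.

```python
from typing import Tuple


def lrc(data: bytes) -> int:
    """Longitudinal Redundancy Check: two's complement of the 8-bit byte sum."""
    return (-sum(data)) & 0xFF


def build_ascii_frame(address: int, pdu: bytes) -> bytes:
    """Encode address + PDU + LRC as ':' + hex characters + CRLF."""
    body = bytes([address]) + pdu
    payload = body + bytes([lrc(body)])
    # Each byte becomes two ASCII characters (0-9, A-F).
    return b":" + payload.hex().upper().encode("ascii") + b"\r\n"


def parse_ascii_frame(frame: bytes) -> Tuple[int, bytes]:
    """Return (address, pdu) after checking the markers and the LRC."""
    if not (frame.startswith(b":") and frame.endswith(b"\r\n")):
        raise ValueError("missing ':' start marker or CRLF end marker")
    raw = bytes.fromhex(frame[1:-2].decode("ascii"))
    if len(raw) < 3:
        raise ValueError("frame too short")
    body, received = raw[:-1], raw[-1]
    if lrc(body) != received:
        raise ValueError("LRC mismatch")
    return body[0], body[1:]


# Slave 1, function code 03, start address 0, quantity 10.
frame = build_ascii_frame(1, bytes.fromhex("030000000A"))
print(frame)  # b':01030000000AF2\r\n'
```

The 6-byte message grows to 17 characters on the wire, which illustrates why ASCII mode carries roughly half as much data as RTU mode at the same baud rate.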
In RTU (Remote Terminal Unit) mode, each byte can transmit two hexadecimal characters, for example, the hexadecimal number 0xAF is sent directly as hexadecimal 0xAF (binary: 10101111), resulting in a transmission density that is twice that of ASCII mode; RTU mode uses cyclic redundancy check (CRC), summarized as follows:
The specific format is shown in the figure below:
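To complement the RTU description, here is a minimal Python sketch of RTU framing. The function names are hypothetical. The CRC routine uses the parameters commonly used for Modbus RTU (initial value 0xFFFF, reflected polynomial 0xA001, low byte transmitted first); treat it as an illustration and not as a substitute for a tested library.

```python
def crc16_modbus(data: bytes) -> int:
    """Bitwise CRC-16 with initial value 0xFFFF and reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def build_rtu_frame(address: int, pdu: bytes) -> bytes:
    """Return address + PDU + CRC, with the CRC low byte sent first."""
    body = bytes([address]) + pdu
    return body + crc16_modbus(body).to_bytes(2, "little")


def check_rtu_frame(frame: bytes) -> bool:
    """True if the last two bytes match the CRC of everything before them."""
    if len(frame) < 4:  # address + function code + 2 CRC bytes at minimum
        return False
    return crc16_modbus(frame[:-2]) == int.from_bytes(frame[-2:], "little")


# Slave 1, function code 03, start address 0, quantity 10.
frame = build_rtu_frame(1, bytes.fromhex("030000000A"))
print(frame.hex(" ").upper())  # 01 03 00 00 00 0A C5 CD
print(check_rtu_frame(frame))  # True
```

The same request needs 8 bytes in RTU mode, compared with 17 characters in ASCII mode.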
6.1 Master and Slave, Server and Client
[In the Modbus protocol]
The master sends the Modbus request, and the slave responds based on the request content. In the Modbus protocol, the master is always the active party, while the slave is always the passive party.
[In network applications]
In network applications, there are clients and servers; clients (such as browsers) send requests to the server, which returns content (such as HTML text) to the client.
[In Modbus TCP]
The master is the client, while the slave is the server. Do not assume that the master must be the server because it seems more important: the master is the party that initiates requests, so the master is the client.
6.2 Can There Be Multiple Masters?
Based on the previous analysis, since the master is the client, Modbus TCP supports multiple masters, and there can be multiple masters and slaves on one local area network. The number of masters a slave can serve at the same time is limited by the maximum number of TCP connections supported by its TCP/IP stack (here, uIP).
6.3 Brief Overview of Modbus TCP Protocol
Modbus TCP is fundamentally similar to Modbus RTU, but there are some differences:
a. The slave address becomes less important and is often ignored. In a sense, the slave address is replaced by the IP address;
b. CRC checks become less important and can even be ignored. Since there is already a check in the TCP data packet, Modbus TCP simply eliminates CRC checks to avoid redundancy.
TCP mode was developed so that Modbus data can be carried over Ethernet, and it uses TCP port 502. The physical, data link, network, and transport layers are those of the standard Ethernet and TCP/IP stack; only the application layer wraps the Modbus protocol. The receiver unpacks the TCP segment to recover the original Modbus frame, parses it according to the Modbus protocol specification, and then wraps the response in TCP to return it to the sender. Unlike the format used on serial links, TCP mode removes the additional address and checksum fields and adds a message header instead, with the specific format shown in Figure 4.
In Modbus TCP, there is an MBAP header that consists of the following parts:
Area | Length | Description | Client | Server
Transmission flag (Transaction Identifier) | 2 bytes | MODBUS request and response transmission sequence number | Generated by the client | Copied by the server upon response
Protocol flag (Protocol Identifier) | 2 bytes | 0 for the Modbus protocol | Generated by the client | Copied by the server upon response
Length | 2 bytes | Length of the remaining part (number of bytes that follow) | Generated by the client | Generated by the server upon response
Unit flag (Unit Identifier) | 1 byte | Slave flag (slave address) | Generated by the client | Copied by the server upon response
Note:
a. The transmission flag (transaction identifier) can be understood as a sequence number that keeps MODBUS TCP requests and responses aligned, for example when a later response arrives at the master before an earlier one;
b. The unit flag (unit identifier) can be understood as the slave address. It is no longer important on a pure TCP network, although a gateway to a serial line still uses it to select the slave behind it.
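The MBAP header described above is 7 bytes long and can be packed with Python's standard `struct` module. The sketch below is illustrative: `build_tcp_adu` and `parse_tcp_adu` are names invented for this example, and all multi-byte fields are assumed to be big-endian, as is usual for Modbus.

```python
import struct
from typing import Tuple

# Big-endian: transaction id (2), protocol id (2), length (2), unit id (1).
MBAP_FORMAT = ">HHHB"
MBAP_SIZE = struct.calcsize(MBAP_FORMAT)  # 7 bytes


def build_tcp_adu(transaction_id: int, unit_id: int, pdu: bytes) -> bytes:
    """Prefix a PDU with an MBAP header; no CRC is appended in TCP mode."""
    length = 1 + len(pdu)  # the bytes after the length field: unit id + PDU
    return struct.pack(MBAP_FORMAT, transaction_id, 0, length, unit_id) + pdu


def parse_tcp_adu(adu: bytes) -> Tuple[int, int, bytes]:
    """Return (transaction_id, unit_id, pdu) after basic consistency checks."""
    if len(adu) < MBAP_SIZE + 1:
        raise ValueError("ADU too short")
    transaction_id, protocol_id, length, unit_id = struct.unpack(
        MBAP_FORMAT, adu[:MBAP_SIZE]
    )
    if protocol_id != 0:
        raise ValueError("protocol identifier must be 0 for Modbus")
    if length != len(adu) - 6:  # 6 = bytes before and including the length field
        raise ValueError("length field does not match the ADU size")
    return transaction_id, unit_id, adu[MBAP_SIZE:]


# Transaction 1, unit 1, function code 03, start address 0, quantity 10.
adu = build_tcp_adu(1, 1, bytes.fromhex("030000000A"))
print(adu.hex(" ").upper())  # 00 01 00 00 00 06 01 03 00 00 00 0A
```

A client would compare the transaction identifier of each response with the one it sent, which is how an out-of-order response is matched to the right request.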
6.4 Relationship Between Modbus TCP and TCP IP
Modbus TCP can be understood as an application layer protocol that runs on top of TCP. Because it is carried by TCP, a complete MODBUS TCP message on the wire also includes the TCP header, the IP header, and the Ethernet header.
The client initiating the Modbus transaction creates the Modbus Application Data Unit. The function code (in the PDU) indicates to the server which operation will be executed.
The function code field of the Modbus data unit is encoded in one byte. The valid range is decimal 1-255 (128-255 are reserved for exception responses). When the client sends a message to the server, the function code field tells the server which operation to execute.
The data field of the message sent from the client to the server includes additional information, which the server uses to execute the operation defined by the function code. This field also includes addresses of discrete items and registers, the number of items to be processed, and the actual number of data bytes in the field.
In some requests, the data field may not exist; in this case, the server does not need any additional information. The function code only indicates the operation.
Types of Function Codes:
Function codes are mainly divided into valid function codes, exception function codes, and error codes. If a correctly received Modbus ADU contains no error related to the requested Modbus function, the server's response to the client carries the normal function code from the request. If there is an error related to the requested function, the response carries an exception function code and an error code.
For example, the client can read the on/off status of a group of discrete outputs or inputs, or read/write the contents of a group of registers. When the server responds to the client, it uses the function code field to indicate either a normal (error-free) response or that some error occurred (known as an exception response). For a normal response, the server simply echoes the original function code.
For an exception response, the server returns the original function code with its most significant bit set to logical 1 (the exception function code), followed by an error code that tells the client the reason for the exception.
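The normal and exception cases can be told apart by testing the most significant bit of the returned function code. The sketch below is a hedged illustration: the function name is hypothetical, and the table lists only the four most common error codes (called exception codes in the Modbus specification).

```python
from typing import Tuple, Union

# A few common error (exception) codes; devices may return others.
EXCEPTION_CODES = {
    1: "illegal function",
    2: "illegal data address",
    3: "illegal data value",
    4: "server device failure",
}


def parse_response_pdu(request_function: int, pdu: bytes) -> Tuple[str, Union[bytes, int]]:
    """Return ("ok", data) for a normal response or ("exception", code) otherwise."""
    if not pdu:
        raise ValueError("empty PDU")
    function = pdu[0]
    if function == request_function:
        return "ok", pdu[1:]                 # server echoed the original code
    if function == (request_function | 0x80):
        if len(pdu) < 2:
            raise ValueError("exception response without an error code")
        return "exception", pdu[1]           # MSB set: next byte is the error code
    raise ValueError("function code does not match the request")


status, code = parse_response_pdu(0x03, bytes([0x83, 0x02]))
print(status, EXCEPTION_CODES.get(code, "unknown"))  # exception illegal data address
```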
Valid Function Codes
There are more than twenty valid function codes, but eight are used most often: 1, 2, 3, 4, 5, 6, 15, and 16. Two special-purpose codes, 20 and 21, access General Reference Registers, which most Modbus devices do not provide. The main control data on a PLC falls into the following four types, and the eight common function codes operate on this data as detailed below:
Four Types of Control Data:
DI: Digital Input (Discrete Input). One address holds one data bit; the user can only read its status and cannot modify it. The bit represents On/Off and records the status of control signal inputs, such as switches, contact points, motor operation, limit switches, etc. On PLCs it is called an input relay, input coil, etc.
DO: Digital Output (Coil Output). One address holds one data bit; the user can set, reset, and read back the status. The bit represents On/Off and outputs control signals to start or stop motors, alarms, lights, etc. On PLCs it is called an output relay, output coil, etc.
AI: Analog Input (Input Register). One address holds 16 bits of data; the user can only read it and cannot modify it. The 16-bit integer records numerical control signal inputs, such as temperature, flow, material amount, speed, rotation speed, valve opening, liquid level, weight, etc. On PLCs it is called an input register.
AO: Analog Output (Holding Register). One address holds 16 bits of data; the user can write it and read it back. The 16-bit integer outputs numerical control signal values, such as temperature, flow, speed, rotation speed, valve opening, feed amount, etc. On PLCs it is called an output register or holding register.
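The relationship between the eight common function codes and the four data types can be written down as a table. The sketch below is illustrative: the function names come from the public Modbus specification, while the dictionary layout and the helper names are assumptions made for this example.

```python
import struct
from typing import List

# function code -> (name, data type from the list above, access)
FUNCTION_CODES = {
    1: ("Read Coils", "DO", "read"),
    2: ("Read Discrete Inputs", "DI", "read"),
    3: ("Read Holding Registers", "AO", "read"),
    4: ("Read Input Registers", "AI", "read"),
    5: ("Write Single Coil", "DO", "write"),
    6: ("Write Single Register", "AO", "write"),
    15: ("Write Multiple Coils", "DO", "write"),
    16: ("Write Multiple Registers", "AO", "write"),
}


def build_read_request(function: int, start: int, quantity: int) -> bytes:
    """Build a read PDU: function code, start address, quantity (big-endian)."""
    name, _data_type, access = FUNCTION_CODES[function]
    if access != "read":
        raise ValueError(f"{name} is not a read function")
    return struct.pack(">BHH", function, start, quantity)


def unpack_bits(data: bytes, quantity: int) -> List[bool]:
    """Decode DI/DO status bytes; the first item is the lowest bit of byte 0."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(quantity)]


def unpack_registers(data: bytes) -> List[int]:
    """Decode AI/AO data as unsigned 16-bit big-endian integers."""
    return list(struct.unpack(f">{len(data) // 2}H", data))


print(build_read_request(3, 0, 10).hex())         # 030000000a
print(unpack_registers(bytes.fromhex("022B0000")))  # [555, 0]
```

Note that DI and AI appear only with read functions, which matches the read-only description of those two data types.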
The Modbus protocol is a typical industrial control network protocol, and studying its security is significant for enhancing the security of industrial control networks. Generally, security issues related to protocols can be divided into two types: one is security issues arising from the design and description of the protocol itself; the other is security issues arising from incorrect implementations of the protocol. The Modbus protocol also has both of these issues.
8.1 Inherent Issues of the Modbus Protocol
The vast majority of industrial control protocols were designed with a focus on functional implementation, efficiency, and reliability, without considering security. The Modbus protocol is no exception, despite having become a de facto industrial standard. From the analysis of its principles above, its inherent security issues are: lack of authentication, lack of authorization, lack of encryption, and function code misuse.
(1) Lack of Authentication
The purpose of authentication is to ensure that received information comes from legitimate users, so that commands sent to devices by unauthenticated users are not executed. The Modbus protocol defines nothing regarding authentication; an attacker only needs to find a legitimate address to establish a Modbus communication session with function codes, and can then disrupt all or part of the control process.
(2) Lack of Authorization
Authorization ensures that different privileged operations must be performed by authenticated users with different permissions, significantly reducing the probability of operational errors and internal attacks. Currently, the Modbus protocol lacks a role-based access control mechanism, does not classify users, and does not delineate user permissions, which allows any user to execute any function.
(3) Lack of Encryption
Encryption ensures that the information exchanged between the two parties cannot be illegally obtained by third parties. In Modbus communication, addresses and commands are all transmitted in plaintext, so attackers can easily capture and analyze the data.
(4) Function Code Misuse
Function codes are an essential component of the Modbus protocol, and nearly every message contains one. Function code misuse is currently a major cause of abnormal behavior in Modbus networks. For instance, illegal or incorrect message lengths, useless commands repeated at short intervals, and delayed acknowledgment of exception codes can all lead to denial of service.
8.2 Issues Arising from Protocol Implementation
Although the Modbus protocol is widely used, developers implementing specific industrial control systems often lack security knowledge or are unaware of security issues. This leads to various security vulnerabilities in systems that use the Modbus protocol.
(1) Design Security Issues
Modbus system developers primarily focus on functional implementation issues, and security issues are rarely considered during design. Design security refers to fully considering security during design to address potential exceptions and illegal operations in Modbus systems. For instance, if a node is maliciously controlled during communication and sends illegal data, it is necessary to consider the identification and handling of such data.
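As one hedged illustration of identifying illegal data before acting on it, the sketch below validates a read request on the slave side. The quantity limits (2000 bits, 125 registers) follow the public Modbus specification; the function name, the `table_size` parameter, and the choice to answer a malformed length with code 3 are assumptions made for this example.

```python
import struct

# Maximum quantity per read request, by function code.
MAX_QUANTITY = {1: 2000, 2: 2000, 3: 125, 4: 125}


def validate_read_request(pdu: bytes, table_size: int) -> int:
    """Return 0 if the request is acceptable, otherwise an error code (1, 2 or 3)."""
    if len(pdu) < 1 or pdu[0] not in MAX_QUANTITY:
        return 1  # illegal function: not a read function handled here
    if len(pdu) != 5:
        return 3  # assumption: treat a malformed length as an illegal data value
    function, start, quantity = struct.unpack(">BHH", pdu)
    if not 1 <= quantity <= MAX_QUANTITY[function]:
        return 3  # illegal data value: quantity outside the allowed range
    if start + quantity > table_size:
        return 2  # illegal data address: request runs past the end of the table
    return 0


print(validate_read_request(bytes.fromhex("030000000A"), table_size=100))  # 0
print(validate_read_request(bytes.fromhex("030000007E"), table_size=100))  # 3
print(validate_read_request(bytes.fromhex("03005F000A"), table_size=100))  # 2
```

Checking lengths and ranges before indexing into device memory also reduces the risk of the buffer overflow problems discussed next.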
(2) Buffer Overflow Vulnerabilities
Buffer overflow refers to filling data into a buffer that exceeds its capacity, causing the overflow data to overwrite legitimate data. This is one of the most common and dangerous vulnerabilities in software development, potentially leading to system crashes or being exploited by attackers to gain control of the system.
Many Modbus system developers lack secure development knowledge, which can leave buffer overflow vulnerabilities in their implementations; once these are exploited by malicious actors, the consequences can be severe.
(3) Modbus TCP Security Issues
Currently, the Modbus protocol can be implemented on general computers and operating systems, running over TCP/IP to meet developmental needs. Thus, the inherent security issues of the TCP/IP protocol inevitably affect the security of industrial control networks. Common attack methods in the IP internet, such as unauthorized data acquisition, man-in-the-middle attacks, denial of service, IP spoofing, and viruses, will impact Modbus system security.
8.3 Security Recommendations
Currently, security measures taken by Modbus systems are generally insufficient. Here are some security recommendations based on research in the information security industry and the specific security issues of industrial control systems, which can effectively reduce the threats faced by industrial control systems.
(1) Start from the Source
Many vulnerabilities in industrial control networks arise during implementation. Addressing security at the source, by applying security design, secure coding, and security testing throughout the lifecycle of the Modbus system (requirements and design, development, implementation, internal testing, and deployment), can eliminate many security vulnerabilities and reduce the overall security risk of the Modbus system.
(2) Anomalous Behavior Detection
Anomalous behavior indicates a potential threat, whether or not an attacker is present. Developing dedicated anomalous behavior detection devices for Modbus systems can therefore greatly enhance the security of industrial control networks. For Modbus systems, this means analyzing the various operational behaviors, describing each behavior with the six-tuple model of “subject, location, time, access method, operation, object,” deciding whether the behavior is anomalous, and finally taking a measure such as recording it or raising an alert.
(3) Security Audits
The security audit of Modbus involves in-depth decoding and analysis of protocol data, recording key information such as the time, location, operator, and operational behavior, achieving security audit log recording and audit functionality for the Modbus system, thus providing the ability to trace security events after they occur.
(4) Use Network Security Devices
Use network security devices such as firewalls and intrusion prevention systems. A firewall is an in-line device that can be configured to allow only specific addresses to reach the Modbus server and to block all other external addresses, which effectively prevents external intrusions. Intrusion prevention devices can analyze the specific operations carried in Modbus traffic, detecting and blocking abnormal internal or external operations and penetration attacks, thereby protecting the internal network.
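The filtering idea can be sketched as a simple allow-list check on Modbus TCP requests. Everything in this example is hypothetical: the IP addresses, the policy table, and the function name are invented for illustration, and a real firewall or intrusion prevention device performs far deeper inspection.

```python
import ipaddress

READ_ONLY = {1, 2, 3, 4}
READ_WRITE = READ_ONLY | {5, 6, 15, 16}

# Hypothetical policy: which function codes each client address may use.
POLICY = {
    ipaddress.ip_address("192.168.1.10"): READ_WRITE,  # engineering station
    ipaddress.ip_address("192.168.1.20"): READ_ONLY,   # monitoring display
}


def allow_request(source_ip: str, adu: bytes) -> bool:
    """Allow a Modbus TCP request only for known clients and permitted functions."""
    if len(adu) < 8:  # 7-byte MBAP header plus the function code
        return False
    allowed = POLICY.get(ipaddress.ip_address(source_ip))
    if allowed is None:
        return False  # unknown source address: block
    function_code = adu[7]  # first byte after the MBAP header
    return function_code in allowed


read_req = bytes.fromhex("00010000000601030000000A")   # read holding registers
write_req = bytes.fromhex("000100000006010600000001")  # write single register
print(allow_request("192.168.1.20", read_req))   # True
print(allow_request("192.168.1.20", write_req))  # False
print(allow_request("10.0.0.5", read_req))       # False
```

Because Modbus itself has no authentication, such a check relies on the source address only and should be combined with network segmentation, since addresses can be spoofed.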
