When it comes to the protocols we encounter most in computer networks, we cannot ignore the TCP/IP protocol, which is also the most famous protocol on the Internet. Let’s talk about the TCP/IP protocol together.
Historical Background of TCP/IP
Before the TCP/IP protocol existed, in the 1960s, many countries and regions recognized the importance of communication technology. The U.S. Department of Defense wanted a technology that could keep communicating over alternative routes even if some communication lines were destroyed. Packet networks emerged to meet this need.
Even if several nodes between two communicating endpoints are damaged, the endpoints can still communicate by switching to other lines.
This packet network facilitated the birth of ARPANET (Advanced Research Projects Agency Network), the first wide-area packet-switched network with distributed control and one of the first networks to implement the TCP/IP protocol.
ARPANET was planned and established by the U.S. Department of Defense’s Advanced Research Projects Agency.
Thus, the emergence of computer networks was initially for military research purposes.
In the 1990s, ISO pursued the OSI international standardization process but made little substantial progress, while the TCP/IP protocol became widely used.
The rapid development of the TCP/IP protocol may be due to the way it is standardized: TCP/IP standardization has characteristics that OSI lacks, and these characteristics are the main focus of our discussion next.
First, let’s understand the TCP/IP protocol. The TCP/IP protocol refers not only to the TCP and IP protocols but actually to a protocol suite. What is a protocol suite? In simple terms, it is a comprehensive set of protocols. If someone asks you what protocols are in TCP/IP next time, you can show them the diagram below.
The protocols summarized above constitute the TCP/IP protocol suite.
TCP/IP Standards
Compared to other protocol standards, TCP/IP emphasizes two points: openness and practicality, that is, whether the standard can actually be used in practice.
Openness means that TCP/IP is discussed and formulated by the IETF, an organization that anyone can join to take part in the discussion.
Practicality means that if a framework only exists in theory without practical implementation, it can never become mainstream.
The standard protocols of TCP/IP are published as what we know as RFC documents, which you can find online. RFCs not only specify the protocols but also include implementation and usage information.
For more RFCs, you can refer to the official index at https://www.rfc-editor.org/rfc-index.html
We will not elaborate on this here; the focus of our article is on the study of TCP/IP.
TCP/IP Protocol Suite
Now let’s start discussing the TCP/IP protocol suite.
TCP/IP is the protocol suite programmers encounter most often. The OSI model has seven layers: physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer. That is quite complex, so in TCP/IP they are simplified into four layers.
Let’s start by introducing the communication link layer and the protocols between the layers.
Communication Link Layer
If we must subdivide it, the communication link layer can also be divided into the physical layer and the data link layer.
Physical Layer
The physical layer is the lowest layer of TCP/IP. It is responsible for transmission over hardware, such as Ethernet cables, telephone lines, and other physical-layer devices.
Data Link Layer
The other layer is the data link layer, which is located between the physical layer and the network layer, defining how to transmit data over a single link.
Network Layer
The network layer mainly uses the IP protocol, which forwards packet data based on IP addresses.
The main function of the IP protocol is to send packet data to the target host.
The functions of the Internet layer and transport layer in TCP/IP layering are usually provided by the operating system.
IP also hides the differences between data links: hosts can communicate with each other through IP regardless of which underlying data link they use.
Although IP is also a packet-switching protocol, it lacks a retransmission mechanism. Even if data does not reach the other end, it will not be resent, so IP is considered an unreliable protocol.
Another protocol in the network layer is ICMP, which sends an error notification to the sender when an IP packet cannot reach the target address because of a problem in transit. ICMP can therefore also be used to diagnose network conditions.
Transport Layer
After introducing the most important IP protocol of the TCP/IP protocol, let’s now introduce the transport layer protocol, of which TCP is one.
The transport layer is like a highway, connecting roads between two cities. The following is the logical channel of the Internet, which you can imagine as a highway.
The main function of the transport layer is to enable communication and data exchange between application programs at the application layer. Many application programs run inside the computer, each corresponding to a port number, which we generally use to distinguish these applications.
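As a rough illustration of how a port number picks out one application among many, here is a minimal sketch. The table `PORT_TABLE` and the helper `deliver` are hypothetical names invented for this example; the port numbers are the well-known ones for HTTP, DNS, and SMTP.

```python
# Hypothetical table mapping destination port numbers to applications.
PORT_TABLE = {
    80: "web server",    # HTTP
    53: "DNS resolver",  # DNS
    25: "mail server",   # SMTP
}


def deliver(dest_port: int) -> str:
    """Return the application registered for dest_port, if any."""
    return PORT_TABLE.get(dest_port, "no listener: data rejected")


print(deliver(80))
print(deliver(9999))
```

A real operating system keeps a similar mapping per transport protocol, so TCP port 53 and UDP port 53 are separate entries.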
The transport layer protocols are mainly divided into the connection-oriented protocol TCP and the connectionless protocol UDP.
TCP
TCP is a reliable protocol that guarantees the delivery of data packets. TCP correctly handles packet loss, out-of-order arrival, and other exceptional situations during transmission. In addition, TCP provides congestion control to alleviate network congestion.
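To make the handling of out-of-order arrival concrete, here is a simplified sketch of what a receiver does with sequence numbers. The function `reassemble` is a hypothetical helper for illustration only; a real TCP implementation also handles acknowledgments, retransmission timers, and overlapping segments.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild the in-order byte stream from (seq, payload) pairs.

    Returns the bytes that can be delivered so far and the next
    sequence number the receiver is waiting for.
    """
    buffered = dict(segments)  # seq -> payload; exact duplicates collapse
    expected = initial_seq
    stream = bytearray()
    while expected in buffered:
        payload = buffered.pop(expected)
        stream += payload
        expected += len(payload)  # the sequence number counts bytes
    return bytes(stream), expected


# Segments arrive out of order but are delivered in order.
arrived = [(5, b" worl"), (0, b"hello"), (10, b"d")]
print(reassemble(arrived))

# A missing segment stops delivery at the gap.
print(reassemble([(0, b"he"), (4, b"o")]))
```

In the second call the receiver holds back the segment at sequence number 4 until the bytes at 2 and 3 arrive, which is the point at which a real sender would retransmit.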
UDP
UDP is an unreliable protocol that cannot guarantee the reliable delivery of data. Compared to TCP, UDP does not check whether packets have arrived or whether the network is congested, but UDP is more efficient.
UDP is commonly used for small packets, for broadcast and multicast, and in video communication and other multimedia applications.
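Part of the reason UDP is lightweight is that its header is only 8 bytes: source port, destination port, length, and checksum. The sketch below packs such a header with Python’s `struct` module. The helper name `build_udp_datagram` and the port numbers are assumptions for illustration, and the checksum is left at 0, which over IPv4 means that no checksum was computed.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to payload."""
    length = 8 + len(payload)  # header plus data, in bytes
    checksum = 0               # 0 = checksum not computed (allowed over IPv4)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


datagram = build_udp_datagram(50000, 53, b"query")
print(len(datagram))                         # 13
print(struct.unpack("!HHHH", datagram[:8]))  # (50000, 53, 13, 0)
```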
Application Layer
In the TCP/IP protocol suite, the session layer and presentation layer of the OSI model are folded into the application layer. Most application-layer architectures follow the client/server model, where the program providing the service is called the server and the program receiving the service is called the client. In this architecture, the server program is usually deployed on a host in advance, waiting for client connections so that it can provide services.
The Process of Sending Packets
Next, let’s introduce how a data packet is sent from one host to another through the application layer, transport layer, network layer, and communication link layer.
Packet Structure
First, let’s understand the structure of a data packet. Here, cxuan will give you a brief introduction, and later articles will provide more detailed information.
At each of the layers above, a header is added to the data being sent, containing the information that layer needs. Each layer processes the data and attaches its own information to the data packet. Now let’s talk about the process of sending a packet.
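The layering idea can be sketched in a few lines. The bracketed byte strings below are placeholders standing in for real headers, and `encapsulate` is a hypothetical helper, so treat this as a picture of the nesting rather than real packet construction.

```python
def encapsulate(data: bytes) -> bytes:
    """Wrap application data in placeholder headers, one per layer."""
    segment = b"[TCP hdr]" + data     # transport layer adds its header
    packet = b"[IP hdr]" + segment    # network layer wraps the segment
    frame = b"[ETH hdr]" + packet     # link layer wraps the packet
    return frame


print(encapsulate(b"hello"))  # b'[ETH hdr][IP hdr][TCP hdr]hello'
```

The header added last sits at the very front, which is why the receiver can strip the headers in the opposite order.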
Packet Sending Process
Suppose Host A and Host B are communicating. What steps does Host A go through to send a packet to Host B?
Application Layer Processing
The user on Host A opens an application or a chat window, types cxuan, and clicks send. This cxuan becomes a data packet traveling through the network. Before that, the application layer still needs to process the data, including character encoding, formatting, and so on. This work corresponds to the presentation layer in the OSI model, but in TCP/IP it is all classified as the application layer.
At the moment of sending the data packet, a TCP connection is established, which acts as a channel. After this, other data packets will also use this channel to transmit data.
Transport Layer Processing
To ensure that the information is accurately delivered to the other party, we describe the process using TCP. TCP is responsible for establishing connections, sending data, and disconnecting based on the application’s instructions.
TCP attaches a TCP header to the front of the application data. The TCP header contains the source port number and destination port number, which indicate which application sent the data packet and which application it should be delivered to. It also contains a sequence number, which gives the position of the segment’s first byte within the entire data sent by the sender, and a checksum, which is used to determine whether the data is damaged. With the TCP header attached, the data packet is handed to IP.
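Here is a hedged sketch of the fields just described, packed into the 20-byte layout of a TCP header without options. The helper `build_tcp_header`, the port numbers, and the default values are assumptions for illustration; the checksum is left as a 0 placeholder because the real TCP checksum also covers a pseudo-header built from the IP addresses.

```python
import struct


def build_tcp_header(src_port, dst_port, seq, ack=0, flags=0x18, window=65535):
    """Pack a 20-byte TCP header (no options)."""
    data_offset = 5 << 4  # header length: 5 x 32-bit words, upper 4 bits
    checksum = 0          # placeholder, not a valid TCP checksum
    urgent = 0
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,  # which application sent it, which receives it
        seq, ack,            # byte position in the stream, acknowledgment
        data_offset, flags,  # 0x18 = PSH + ACK
        window, checksum, urgent,
    )


header = build_tcp_header(50000, 80, seq=1000)
segment = header + b"cxuan"
print(len(header))  # 20
```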
Network Layer Processing
The network layer mainly processes the data packet using the IP protocol. IP takes the TCP header and data handed down by TCP and adds its own IP header in front of the TCP header. In the resulting IP packet, the IP header is therefore followed by the TCP header, and then by the data itself. The IP header contains the destination and source addresses, as well as a field indicating whether the payload that follows is TCP or UDP.
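A minimal sketch of an IPv4 header without options follows, assuming hypothetical addresses and a helper named `build_ipv4_header`. The protocol field carries 6 for TCP and 17 for UDP, and the header checksum is the standard Internet checksum (one’s-complement sum of 16-bit words).

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int, protocol: int = 6) -> bytes:
    """Pack a 20-byte IPv4 header; protocol 6 = TCP, 17 = UDP."""
    version_ihl = (4 << 4) | 5  # IPv4, header length 5 x 32-bit words
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, 20 + payload_len,  # version/IHL, TOS, total length
        0, 0,                              # identification, flags/fragment
        64, protocol, 0,                   # TTL, protocol, checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


ip_header = build_ipv4_header("192.168.1.10", "192.168.1.20", payload_len=25)
print(len(ip_header), ip_header[9])  # 20 6
```

A receiver can validate the header by running the same checksum over all 20 bytes: a correct header yields 0.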
Once the IP packet is generated, the routing control table is consulted to decide which router or host should receive it. The packet, now carrying its IP header, is then passed to the network interface driver that connects to that router or host, and this is where the data is actually transmitted.
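The routing decision can be sketched as a longest-prefix match over a small table. The routes, gateway addresses, and the helper `next_hop` below are all made up for illustration; real routing tables also carry interface names and metrics.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),    # default route
    (ipaddress.ip_network("192.168.1.0/24"), "direct"),    # local subnet
    (ipaddress.ip_network("10.0.0.0/8"), "192.168.1.254"),
]


def next_hop(dst: str) -> str:
    """Pick the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


print(next_hop("192.168.1.20"))  # direct
print(next_hop("10.2.3.4"))      # 192.168.1.254
print(next_hop("8.8.8.8"))       # 192.168.1.1
```

Because the table contains a default route, every destination matches at least one entry.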
If the MAC address of the target host is unknown, ARP (Address Resolution Protocol) can be used to look it up from the target’s IP address.
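ARP maps a known IP address to the MAC address that owns it. As a sketch, the 28-byte ARP request for IPv4 over Ethernet can be packed as follows. The helpers `mac_to_bytes` and `build_arp_request`, and the addresses, are hypothetical; the field layout follows the ARP specification.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request asking: who has target_ip?"""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,          # hardware type: Ethernet
        0x0800,     # protocol type: IPv4
        6, 4,       # MAC length, IP address length
        1,          # operation: 1 = request
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target MAC: unknown, to be filled in
        socket.inet_aton(target_ip),
    )


request = build_arp_request("aa:bb:cc:dd:ee:01", "192.168.1.10", "192.168.1.1")
print(len(request))  # 28
```

The request is broadcast on the local link, and the host that owns the target IP address answers with its MAC address.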
Communication Link Layer Processing
The data packet handed down by IP has an Ethernet header added and is then processed by Ethernet. The Ethernet header includes the MAC address of the receiving end, the MAC address of the sending end, and an Ethernet type field that identifies the protocol carried in the frame.
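A small sketch of the 14-byte Ethernet header: destination MAC, source MAC, and the type field. The helper names and MAC addresses are assumptions for illustration, and the preamble and trailing frame check sequence, which the hardware normally handles, are left out.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_ethernet_frame(dst_mac: str, src_mac: str, ethertype: int, payload: bytes) -> bytes:
    """Prepend destination MAC, source MAC, and type to payload."""
    header = struct.pack(
        "!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), ethertype
    )
    return header + payload


frame = build_ethernet_frame(
    "aa:bb:cc:dd:ee:02", "aa:bb:cc:dd:ee:01", ETHERTYPE_IPV4, b"ip packet bytes"
)
print(len(frame))  # 29
```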
Below is the complete processing and parsing process.
As shown in the figure above, the left side is the data sending process, where the application layer data is processed layer by layer into a data packet that can be sent, which is then transmitted to the designated host through the physical medium.
The packet receiving process is the reverse of the sending process, and the parsing of the data packet will also go through the following steps.
Communication Link Parsing
After the target host receives the data packet, it will first find the MAC address from the Ethernet header to determine whether it is the intended recipient. If not, the data packet will be discarded.
If the received data packet is intended for itself, it will check the Ethernet type to determine which protocol it carries. If it is the IP protocol, it will be handed over to IP for processing; if it is the ARP protocol, it will be handed over to ARP for processing. If the protocol type is not recognized, the data packet will be discarded.
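The two checks above can be sketched as one function. `MY_MAC` and `handle_frame` are hypothetical names; the sketch also accepts the broadcast address, which the text does not mention but which is how ARP requests reach every host on the link.

```python
import struct

MY_MAC = bytes.fromhex("aabbccddee01")  # hypothetical address of this host
BROADCAST = b"\xff" * 6                 # frames addressed to every host


def handle_frame(frame: bytes) -> str:
    """Decide what to do with a received Ethernet frame."""
    dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if dst not in (MY_MAC, BROADCAST):
        return "discard: not addressed to this host"
    if ethertype == 0x0800:
        return "pass to IP"
    if ethertype == 0x0806:
        return "pass to ARP"
    return "discard: unknown protocol type"


sender = bytes.fromhex("aabbccddee02")
print(handle_frame(MY_MAC + sender + b"\x08\x00" + b"payload"))  # pass to IP
print(handle_frame(sender + sender + b"\x08\x00" + b"payload"))
```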
Network Layer Parsing
After the data packet has been processed by Ethernet, it is handed over to the network layer for processing. Assuming the protocol type is IP, when IP receives the data packet, it will parse the IP header to check whether the IP address in the header matches its own IP address. If it matches, it will receive the data and determine whether the upper-layer protocol is TCP or UDP; if it does not match, it will be discarded directly.
Note: A router often receives packets whose destination IP address is not its own; in that case, it consults its routing table to decide where to forward the packet.
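As a sketch of the receiving side at the network layer, the function below unpacks a 20-byte IPv4 header, compares the destination address with the host’s own, and reads the protocol field. `MY_IP`, `handle_ip_packet`, and `demo_packet` are hypothetical names, and the demo packets leave the checksum at 0 because this sketch does not verify it.

```python
import socket
import struct

MY_IP = "192.168.1.20"            # hypothetical address of this host
PROTOCOLS = {6: "TCP", 17: "UDP"}


def handle_ip_packet(packet: bytes) -> str:
    """Check the destination address, then pick the upper-layer protocol."""
    fields = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    protocol = fields[6]
    dst_ip = socket.inet_ntoa(fields[9])
    if dst_ip != MY_IP:
        return "discard (a router would consult its routing table instead)"
    return "pass to " + PROTOCOLS.get(protocol, "unknown upper-layer protocol")


def demo_packet(dst: str, protocol: int) -> bytes:
    """Build a bare 20-byte header for the demo (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, protocol, 0,
        socket.inet_aton("192.168.1.10"), socket.inet_aton(dst),
    )


print(handle_ip_packet(demo_packet("192.168.1.20", 6)))   # pass to TCP
print(handle_ip_packet(demo_packet("192.168.1.20", 17)))  # pass to UDP
```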
Transport Layer Processing
In the transport layer, we again assume the TCP protocol. TCP first calculates the checksum to determine whether the data is damaged. Then it checks whether the data has been received in order, and finally it checks the port number to determine which application program the data belongs to.
Once the data is completely identified, it will be passed to the application program identified by the port number for processing.
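The three checks (checksum, order, port) can be put together in a simplified sketch. Everything named here (`LISTENERS`, `receive_segment`, the port number) is hypothetical, and the checksum is computed over the payload only, whereas real TCP also covers the TCP header and a pseudo-header taken from the IP layer.

```python
import struct

LISTENERS = {8080: "chat application"}  # hypothetical port registrations


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def receive_segment(seq, dst_port, payload, checksum, expected_seq):
    """Apply the checksum, ordering, and port checks in turn."""
    if internet_checksum(payload) != checksum:  # simplified: payload only
        return "discard: checksum mismatch"
    if seq != expected_seq:
        return "hold: out of order"
    app = LISTENERS.get(dst_port)
    if app is None:
        return "reject: no application on this port"
    return f"deliver to {app}"


payload = b"cxuan"
good = internet_checksum(payload)
print(receive_segment(0, 8080, payload, good, expected_seq=0))
print(receive_segment(0, 8080, payload, good ^ 1, expected_seq=0))
```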
Application Program Processing
The designated application program at the receiving end will process the data sent by the sender, recognize the content of the data through decoding and other operations, then store the corresponding data on the disk, and return a success message to the sender. If saving fails, it will return an error message.
The above is a complete process of sending and receiving a data packet. In this data sending and receiving process, various addresses, port numbers, protocol types, etc., between different layers are involved, so now let’s analyze them.
After passing through each layer, the layer protocol will attach a header to the data packet. Below is a complete header diagram.
In the process of sending a data packet, each layer in turn adds its header information to the data packet. Each header contains the sender and receiver addresses as well as the protocol type of the upper layer. Ethernet uses MAC addresses and IP uses IP addresses to identify the two hosts, while TCP/UDP uses port numbers to identify the applications on those hosts.
In addition, the header at each layer contains an identification field, which is used to identify the type of the upper-layer protocol.
