Understanding HTTP, TCP, IP, and Socket Protocols


HTTP

The HyperText Transfer Protocol (HTTP) is an application layer protocol that runs on top of a TCP connection. HTTP is traditionally described as using short connections: the client sends a request to the server, the server responds, and the connection is then closed.

In HTTP 1.0, each request from the client requires a separate connection to be established, which is automatically released after the request is processed.

HTTP 1.1 allows multiple requests to be handled over a single connection, and requests can be pipelined, that is, the next request can be sent without waiting for the previous response to arrive.
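The difference between the two versions can be seen with a small local experiment. The sketch below is only an illustration using Python's standard library: the handler class, the paths, and the function name are made up for this example. It starts a throwaway HTTP/1.1 server on the loopback interface and sends two requests through one `http.client.HTTPConnection`, checking that both used the same local port (and therefore the same TCP connection).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    # Answering as HTTP/1.1 lets the server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"hello from {self.path}".encode()
        self.send_response(200)
        # Content-Length marks where this response ends, so the same
        # connection can carry the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    # Port 0 asks the OS for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        results = []
        local_ports = set()
        for path in ("/first", "/second"):
            conn.request("GET", path)
            # The local port identifies the underlying TCP connection.
            local_ports.add(conn.sock.getsockname()[1])
            response = conn.getresponse()
            results.append((response.status, response.read().decode()))
        return results, len(local_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

Calling `fetch_twice_on_one_connection()` returns the two responses plus the number of distinct local ports used; a count of 1 means the second request reused the first connection.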

Because HTTP is a “short connection,” to maintain the online status of the client program, it must continually initiate connection requests to the server. The general practice is that even without needing to retrieve any data, the client still sends a “keep-alive” request to the server at fixed intervals. Upon receiving this request, the server replies to indicate that it acknowledges the client is “online.” If the server does not receive a request from the client for a long time, it considers the client “offline,” and if the client does not receive a reply from the server for an extended period, it assumes the network has been disconnected.
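The server-side bookkeeping described here can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: `PresenceTracker`, its method names, and the timeout value are all hypothetical, and the clock is injectable only so the behaviour can be checked without waiting.

```python
import time


class PresenceTracker:
    """Remembers when each client last sent a keep-alive request."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_seen = {}

    def record_keepalive(self, client_id):
        # Called whenever a keep-alive request arrives from client_id.
        self._last_seen[client_id] = self._clock()

    def is_online(self, client_id):
        # A client counts as online only if it was heard from recently.
        last = self._last_seen.get(client_id)
        if last is None:
            return False
        return self._clock() - last <= self.timeout_seconds
```

The client side would mirror this logic: if no reply to its keep-alive arrives within its own timeout, it treats the network as disconnected.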

TCP/IP

TCP/IP is a protocol suite rather than a single protocol: IP works at the network layer, while TCP and UDP work at the transport layer, and together they address how data is transmitted over the network. “IP” stands for Internet Protocol, and TCP and UDP use this protocol to transfer data packets from one network to another. Think of IP as a highway that allows other protocols to travel on it and find exits to other computers. TCP and UDP are the “trucks” on this highway, carrying goods such as HTTP and File Transfer Protocol (FTP).

TCP and UDP are the transport layer protocols that carry application protocols such as FTP, HTTP, and SMTP. Although both TCP and UDP are used to transmit other protocols, they have a significant difference: TCP guarantees data delivery, while UDP does not. TCP uses acknowledgments, retransmission, and sequence numbers to ensure that data arrives reliably, in order, and without errors from one endpoint to another, whereas UDP provides no such guarantees.
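In code, the difference shows up in how the two kinds of socket are used. The following sketch, which uses only Python's `socket` module on the loopback interface, sends one message each way; the function names are invented for this example. UDP sends a datagram with no connection at all, while TCP must connect first and then delivers a byte stream.

```python
import socket


def udp_send_without_connection(message: bytes) -> bytes:
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the datagram is simply addressed and sent.
        # UDP itself would not report a lost datagram.
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_send_over_connection(message: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(listener.getsockname())  # three-way handshake
        server_side, _addr = listener.accept()
        server_side.settimeout(2.0)
        with server_side:
            client.sendall(message)
            client.close()  # signals end of stream to the server
            chunks = []
            while True:
                chunk = server_side.recv(4096)
                if not chunk:  # empty read: the peer has finished sending
                    break
                chunks.append(chunk)
        return b"".join(chunks)
    finally:
        client.close()
        listener.close()
```

On the loopback interface both calls normally return the message unchanged; the point is that only the TCP path would detect and repair loss on a real network.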

The chart below attempts to show the positions of different TCP/IP and other protocols in the original OSI model:

TCP flags include the following 6 indicators:

1. SYN (synchronize sequence numbers; used to establish a connection)

2. ACK (acknowledgment)

3. PSH (push)

4. FIN (finish)

5. RST (reset)

6. URG (urgent)

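Two TCP header fields are used together with these flags:

Sequence number (the position of the segment’s first data byte in the sender’s byte stream)

Acknowledgment number (the next sequence number the sender of the ACK expects to receive)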
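The six flags listed above each occupy one bit in the TCP header, so a flags value can be decoded with simple bit masks. The sketch below uses the standard bit positions (FIN is the lowest bit, URG the sixth); the dictionary and function names are chosen for this example only.

```python
# Bit positions of the six classic TCP flags, lowest bit first.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_tcp_flags(flag_bits: int) -> list:
    """Return the names of the flags that are set in flag_bits."""
    return [name for name, bit in TCP_FLAGS.items() if flag_bits & bit]


def encode_tcp_flags(*names: str) -> int:
    """Combine flag names into a single integer value."""
    value = 0
    for name in names:
        value |= TCP_FLAGS[name]
    return value
```

For example, `decode_tcp_flags(0x12)` yields `["SYN", "ACK"]`, the combination sent in the second step of connection establishment.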

Client TCP state transitions:

CLOSED->SYN_SENT->ESTABLISHED->FIN_WAIT_1->FIN_WAIT_2->TIME_WAIT->CLOSED

Server TCP state transitions:

CLOSED->LISTEN->SYN_RECEIVED->ESTABLISHED->CLOSE_WAIT->LAST_ACK->CLOSED

The meanings of each state are as follows:

LISTEN – Listening for connection requests from remote TCP ports;

SYN-SENT – Waiting for a matching connection request after sending a connection request;

SYN-RECEIVED – Waiting for acknowledgment of the connection request after having both received and sent a connection request;

ESTABLISHED – Indicates an open connection; received data can be delivered to the user;

FIN-WAIT-1 – Waiting for a remote TCP connection termination request or acknowledgment of a previous connection termination request;

FIN-WAIT-2 – Waiting for a connection termination request from remote TCP;

CLOSE-WAIT – Waiting for a connection termination request from the local user;

CLOSING – Waiting for acknowledgment of the connection termination from remote TCP;

LAST-ACK – Waiting for acknowledgment of the original connection termination request sent to remote TCP;

TIME-WAIT – Waiting long enough to ensure that remote TCP receives the acknowledgment of the connection termination request;

CLOSED – No connection state;
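The two typical state sequences listed earlier can be reproduced with a small transition table. This is a simplified sketch that covers only the common client and server paths (for example, it omits CLOSING and simultaneous open), and the event names such as `active_open` and `recv_fin` are labels invented for this illustration.

```python
# (current state, event) -> next state, for the common paths only.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # active close
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",   # passive close
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def walk(start: str, events: list) -> list:
    """Follow a sequence of events and return every state visited."""
    path = [start]
    for event in events:
        key = (path[-1], event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {path[-1]} on {event}")
        path.append(TRANSITIONS[key])
    return path


CLIENT_EVENTS = ["active_open", "recv_syn_ack", "close",
                 "recv_ack", "recv_fin", "timeout_2msl"]
SERVER_EVENTS = ["passive_open", "recv_syn", "recv_ack",
                 "recv_fin", "close", "recv_ack"]
```

`walk("CLOSED", CLIENT_EVENTS)` and `walk("CLOSED", SERVER_EVENTS)` retrace the client and server sequences given above.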

TCP/IP Three-Way Handshake

First handshake: When establishing a connection, client A sends a SYN packet (sequence number seq=j) to server B and enters the SYN_SENT state, waiting for server B to confirm.

Second handshake: Server B receives the SYN packet and must acknowledge client A’s SYN (ack=j+1) while also sending its own SYN (seq=k), i.e., a SYN+ACK packet. At this point, server B enters the SYN_RECV state.

Third handshake: Client A receives server B’s SYN+ACK packet and sends an acknowledgment packet ACK (ack=k+1) to server B. Client A enters the ESTABLISHED state once this packet is sent, and server B enters it once the packet is received, completing the three-way handshake. After completion, the client and server begin to transmit data.
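The numbers exchanged in the three steps can be written out explicitly. The sketch below only models the arithmetic; it sends nothing on the network. The function name and dictionary keys are made up for this example, and j and k correspond to `client_isn` and `server_isn` (the initial sequence numbers).

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a handshake as plain dictionaries."""
    return [
        # 1. Client -> server: SYN carrying the client's initial number j.
        {"sender": "client", "flags": "SYN",
         "seq": client_isn, "ack": None},
        # 2. Server -> client: its own SYN (k) plus an ACK of j + 1.
        {"sender": "server", "flags": "SYN+ACK",
         "seq": server_isn, "ack": (client_isn + 1) % SEQ_SPACE},
        # 3. Client -> server: ACK of k + 1; the SYN consumed one number,
        #    so the client's own sequence number is now j + 1.
        {"sender": "client", "flags": "ACK",
         "seq": (client_isn + 1) % SEQ_SPACE,
         "ack": (server_isn + 1) % SEQ_SPACE},
    ]
```

With `three_way_handshake(100, 300)`, the server acknowledges 101 and the client acknowledges 301.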

Since TCP connections are full-duplex, each direction must be closed separately. This principle allows one party to send a FIN to terminate the connection in that direction once it has completed its data transmission task. Receiving a FIN only means that no more data will arrive from that direction; the side that received the FIN can still send data. The party that initiates the closure performs an active close, while the other party performs a passive close.

TCP connection termination requires sending four packets, hence it is called a four-way handshake. Either the client or server can actively initiate the handshake. In socket programming, either party executing the close() operation will initiate the handshake.

Client A sends a FIN to close the data transmission from client A to server B.

Server B receives this FIN and sends back an ACK whose acknowledgment number is the received sequence number plus one. Like SYN, a FIN also occupies a sequence number.

Server B closes the connection with client A and sends a FIN to client A.

Client A sends back an ACK to confirm, with the acknowledgment number set to the received sequence number plus one.

In-depth understanding of TCP connection release:


The TCP protocol’s connection is a full-duplex connection, meaning a TCP connection exists with bidirectional read and write channels.

In simple terms, it’s “close read first, then close write,” which involves four stages. Taking the client-initiated connection closure as an example:

1. Server read channel closes

2. Client write channel closes

3. Client read channel closes

4. Server write channel closes

The closing action begins once the initiating party has finished sending data, at which point it sends a FIN (finish) segment to the other party. Communication between both parties is completely finished only after the other party has also sent its own FIN and received the acknowledging ACK; every FIN received must be answered with an ACK segment.

Detailed process:

First stage: After the client has sent data, it sends a FIN data segment to the server with sequence number i;

1. The server receives FIN(i) and returns acknowledgment segment ACK with acknowledgment number i+1, closing the server read channel;

2. The client receives ACK(i+1) and closes the client write channel;

(At this point, the client can still read data from the server through the read channel, and the server can still write data through the write channel.)

Second stage: After the server has sent data, it sends a FIN data segment to the client with sequence number j;

3. The client receives FIN(j) and returns acknowledgment segment ACK with acknowledgment number j+1, closing the client read channel;

4. The server receives ACK(j+1) and closes the server write channel.

This is the standard two-stage TCP closure, where both the server and client can initiate closure, making it completely symmetrical.

The FIN identifier is set when sending the last data segment. In standard examples, the server is still sending data, so it must wait until it has finished sending before setting the FIN (at this point, the TCP connection can be considered in a half-closed state because data can still be transmitted from the passive close side to the active close side). If the server has no data to send when it receives FIN(i), it can set the FIN(j) identifier when returning ACK(i+1), effectively merging the second and third steps.
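The half-closed state can be observed from application code. In the sketch below, which assumes Python 3.8+ for `socket.create_server` and runs entirely on the loopback interface, the client closes only its write direction with `shutdown(SHUT_WR)`, and the server still sends a reply afterwards. The function names and message contents are invented for this illustration.

```python
import socket


def _read_until_eof(sock: socket.socket) -> bytes:
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read: the peer has sent its FIN
            break
        chunks.append(chunk)
    return b"".join(chunks)


def half_close_demo():
    listener = socket.create_server(("127.0.0.1", 0))
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _addr = listener.accept()
    server_side.settimeout(5)
    try:
        client.sendall(b"request")
        # First stage: the client closes only its write channel.
        client.shutdown(socket.SHUT_WR)
        request = _read_until_eof(server_side)
        # Half-closed: data can still flow from server to client.
        server_side.sendall(b"reply after client FIN")
        # Second stage: the server closes, which sends its own FIN.
        server_side.close()
        reply = _read_until_eof(client)
        return request, reply
    finally:
        client.close()
        server_side.close()
        listener.close()
```

`half_close_demo()` returns both messages, showing that the server-to-client direction stayed usable after the client had finished sending.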

TCP’s TIME_WAIT and CLOSE_WAIT states

CLOSE_WAIT:

In this discussion, the party that initiates the TCP connection closure is called the client, while the party that passively closes is called the server. CLOSE_WAIT is the TCP state of the passive closing end after it has received the peer’s FIN but before it has sent its own FIN. Sockets that linger in this state generally point to an issue in the server-side code, typically a connection that the application never closes. If your server shows a large number of sockets in CLOSE_WAIT, you should check the code.

TIME_WAIT:

According to TCP’s four-way handshake for closing a connection, the socket that actively closes enters the TIME_WAIT state. The TIME_WAIT state lasts for 2 MSL (twice the Maximum Segment Lifetime), which defaults to 4 minutes (240 seconds) in Windows. A socket in the TIME_WAIT state cannot be reused until the state expires. In practice, if a server handling a large number of short connections actively closes the clients’ connections, it accumulates a large number of sockets in the TIME_WAIT state, potentially more than are in the ESTABLISHED state. This can severely reduce the server’s processing capacity and even exhaust the available sockets, stopping the service.
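One commonly used mitigation, not discussed in the text above and shown here only as an assumption about what a server might do, is to set `SO_REUSEADDR` on the listening socket so that a restarted server can bind its port while old connections are still in TIME_WAIT. The helper name is hypothetical, and the exact semantics of this option differ between operating systems (notably on Windows), so treat this as a sketch rather than a recommendation.

```python
import socket


def make_restartable_listener(host: str = "127.0.0.1", port: int = 0):
    """Create a listening TCP socket with SO_REUSEADDR enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to have any effect on the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen()
    return sock
```

This does not remove TIME_WAIT sockets; letting the client side perform the active close is the other usual way to keep them off the server.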

Socket

A socket connection is referred to as a long connection; in theory, once a connection is established between the client and server, neither side drops it on its own. In practice the connection may still be interrupted: the server or client host may go down, the network may fail, or a firewall may drop a connection that has carried no data for an extended period in order to free up network resources. Therefore, when a socket connection is idle, heartbeat messages must be sent to keep it alive; the specific heartbeat message format is defined by the developer.
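Since the heartbeat format is up to the developer, the following is just one possible shape. The `PING`/`PONG` messages, the function names, and the timeout are all assumptions made for this sketch; a production protocol would also need message framing, because `recv` on a stream socket may return fewer bytes than requested.

```python
import socket

HEARTBEAT = b"PING\n"        # hypothetical heartbeat message
HEARTBEAT_REPLY = b"PONG\n"  # hypothetical reply


def peer_is_alive(sock: socket.socket, timeout_seconds: float = 1.0) -> bool:
    """Send one heartbeat and report whether the peer answered in time."""
    sock.settimeout(timeout_seconds)
    try:
        sock.sendall(HEARTBEAT)
        return sock.recv(len(HEARTBEAT_REPLY)) == HEARTBEAT_REPLY
    except OSError:
        # Covers timeouts as well as reset or broken connections.
        return False


def answer_one_heartbeat(sock: socket.socket) -> None:
    """Peer side: reply to a single heartbeat message."""
    if sock.recv(len(HEARTBEAT)) == HEARTBEAT:
        sock.sendall(HEARTBEAT_REPLY)
```

A caller would typically invoke `peer_is_alive` on a timer whenever the connection has been idle, and reconnect after a failed check.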

A socket is the cornerstone of communication and is the basic operational unit that supports TCP/IP protocol network communication. It is an abstract representation of the endpoint in the network communication process, containing five essential pieces of information for network communication: the protocol used for the connection, the local host’s IP address, the local process’s protocol port, the remote host’s IP address, and the remote process’s protocol port.
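These five pieces of information can be read back from a connected socket. The sketch below is an illustration for IPv4 TCP or UDP sockets using Python's `socket` module; the function name is invented for this example.

```python
import socket


def connection_five_tuple(sock: socket.socket) -> tuple:
    """Return (protocol, local IP, local port, remote IP, remote port)."""
    protocol = "TCP" if sock.type == socket.SOCK_STREAM else "UDP"
    local_ip, local_port = sock.getsockname()[:2]
    remote_ip, remote_port = sock.getpeername()[:2]
    return (protocol, local_ip, local_port, remote_ip, remote_port)
```

For the two ends of one connection the tuples mirror each other: the client's local address is the server's remote address, and vice versa.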

What we usually refer to as a socket is actually a wrapper around the TCP/IP protocol; the socket itself is not a protocol but a calling interface (API) that allows us to use the TCP/IP protocol. In reality, sockets have no inherent connection to the TCP/IP protocol. When the socket programming interface was designed, it was intended to be adaptable to other network protocols as well. Therefore, the emergence of sockets merely makes it easier for programmers to use the TCP/IP protocol stack, abstracting the TCP/IP protocol to form the basic function interfaces we know, such as create, listen, connect, accept, send, read, and write, etc. There is a saying about the relationship between sockets and the TCP/IP protocol that is quite easy to understand:

“TCP/IP is just a protocol stack, like the operating system’s operating mechanism, which must be implemented and also provide external operational interfaces. This is similar to how operating systems provide standard programming interfaces, such as the Win32 programming interface; TCP/IP must also provide interfaces for programmers to use in network development, which is the socket programming interface.”

In fact, the transport layer’s TCP is based on the network layer’s IP protocol, while the application layer’s HTTP protocol is based on the transport layer’s TCP protocol, and the socket itself is not considered a protocol; as mentioned above, it merely provides an interface for programming with TCP or UDP. A socket is a tool for port communication development and is a bit lower level.

A socket is an interface for programming with TCP and UDP, allowing you to establish TCP connections, etc. TCP and UDP protocols belong to the transport layer. HTTP is an application layer protocol, which is actually built on top of the TCP protocol (HTTP is a car that provides a specific form for encapsulating or displaying data; the socket is the engine that provides network communication capabilities).
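To make the layering concrete, an HTTP request can be written by hand and sent through a plain TCP socket. The sketch below is an illustration only: it starts a small local server from Python's `http.server` module, and the handler class, function names, and response text are made up for this example. The request uses HTTP/1.0 so that the server closes the connection when the response is complete.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PlainTextHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"served over a plain TCP socket"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def http_get_over_raw_socket(host: str, port: int, path: str = "/"):
    # HTTP is just text carried by the TCP byte stream.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        raw = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection: response done
                break
            raw += chunk
    head, _sep, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


def run_demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), PlainTextHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return http_get_over_raw_socket("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

`run_demo()` returns the status line and body of the response, obtained without any HTTP library on the client side.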

A socket is a wrapper around the TCP/IP protocol; the socket itself is not a protocol but a calling interface (API) that allows us to use the TCP/IP protocol. The emergence of sockets merely makes it easier for programmers to use the TCP/IP protocol stack, abstracting the TCP/IP protocol to form the basic function interfaces we know.
