An Overview of HTTP: Protocols, Methods, and Differences with HTTPS

1. Introduction to HTTP

HTTP (Hypertext Transfer Protocol) is an application layer protocol used for transmitting hypertext (such as HTML files) over the network. It is based on the TCP/IP protocol and serves as the foundation for data communication on the World Wide Web. Below, we will detail HTTP from multiple aspects.

2. Development History of HTTP

  • HTTP/0.9: The earliest version, with simple functionality, only supports GET requests for retrieving HTML pages.

  • HTTP/1.0: Introduced POST and HEAD request methods, supports response status codes, HTTP header information, and can transmit various types of data (such as images, audio, etc.).

  • HTTP/1.1: Currently the most widely used version. It introduced persistent connections (connections stay open by default), chunked transfer encoding, caching mechanisms, etc., improving performance and efficiency.

  • HTTP/2: Developed based on the SPDY protocol, it employs binary framing, multiplexing, header compression, server push, and other technologies to significantly enhance transmission performance.

  • HTTP/3: Based on the QUIC protocol, it further optimizes transmission performance, reducing connection establishment and transmission delays.

3. Common HTTP Request Methods

  • GET: Used to request a specified resource, with request parameters typically appended to the URL, such as https://example.com/api?param1=value1&param2=value2. Mainly used for data retrieval and should not have side effects on server data.

  • POST: Used to submit data to a specified resource, with request parameters typically placed in the request body. It can be used to submit form data, upload files, etc., and typically changes server data.

  • PUT: Used to upload the latest content to a specified resource location, typically for updating resources.

  • DELETE: Requests the server to delete a specified resource.

  • HEAD: Similar to GET requests, but the server only returns response headers without the response body, useful for obtaining metadata about resources.

  • OPTIONS: Used to query the server’s capabilities or inquire about options and requirements related to resources.
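The difference between where GET and POST carry their parameters can be sketched in a few lines of Node.js. The helper names `buildGetUrl` and `buildFormBody` are hypothetical and chosen only for this illustration; `URL` and `URLSearchParams` are standard Node.js globals.

```javascript
// Hypothetical helper: append parameters to the URL, as a GET request would.
function buildGetUrl(base, params) {
  const url = new URL(base);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// Hypothetical helper: encode the same parameters as a form body,
// as a POST request would place them in the request body.
function buildFormBody(params) {
  return new URLSearchParams(params).toString();
}

const exampleParams = { param1: 'value1', param2: 'value2' };
console.log(buildGetUrl('https://example.com/api', exampleParams));
console.log(buildFormBody(exampleParams));
```

The parameters are encoded the same way in both cases; only their location in the request differs.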

4. How HTTP Works

HTTP (Hypertext Transfer Protocol) is an application layer protocol based on a request-response model, used for transmitting hypertext data between clients and servers, such as HTML pages, images, etc. Below is a detailed introduction to its working principle.

### Overall Process

The basic flow of HTTP communication can be summarized as the client initiating a request, the server receiving and processing the request, and then returning a response, with the client receiving and processing the response. The entire process involves multiple steps, which will be explained step by step.

### Specific Steps

1. Establishing a Connection

The client (usually a browser) needs to establish a TCP connection with the server. Since HTTP runs over TCP, a reliable connection is established with a three-way handshake before any HTTP communication occurs. For example, when you enter http://www.example.com in the browser and press Enter, the browser resolves the domain name to the server’s IP address and then establishes a TCP connection with that server.

2. Sending a Request

Once the connection is successfully established, the client constructs an HTTP request message and sends it to the server. The HTTP request message consists of three parts: the request line, request headers, and request body:

  • Request Line: Contains the request method (such as GET, POST, etc.), the requested URL, and the HTTP protocol version. For example: GET /index.html HTTP/1.1.

  • Request Headers: Contains a series of key-value pairs that provide additional information about the request, such as the character encoding supported by the client, accepted media types, etc. For example: Accept: text/html and Accept-Charset: utf-8.

  • Request Body: An optional part, typically used in POST requests, carrying data to be sent to the server, such as form data, JSON data, etc.
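To make the three-part layout concrete, here is a minimal sketch that assembles a raw HTTP/1.1 request message as text. The function name `buildRequest` and the sample host and path are assumptions for illustration; real clients (such as Node's `http` module) build this message for you.

```javascript
// Assemble a raw HTTP/1.1 request from request line, headers, and optional body.
function buildRequest({ method, path, headers = {}, body = '' }) {
  const requestLine = `${method} ${path} HTTP/1.1`;
  const allHeaders = { ...headers };
  if (body) {
    // Content-Length is counted in bytes, not characters.
    allHeaders['Content-Length'] = Buffer.byteLength(body);
  }
  const headerLines = Object.entries(allHeaders).map(
    ([name, value]) => `${name}: ${value}`
  );
  // Lines end with CRLF; an empty line separates the headers from the body.
  return [requestLine, ...headerLines, '', body].join('\r\n');
}

console.log(
  buildRequest({
    method: 'GET',
    path: '/index.html',
    headers: { Host: 'www.example.com', Accept: 'text/html' },
  })
);
```

A GET request normally has an empty body, so the message ends right after the blank line that closes the headers.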

3. Server Processes the Request

After receiving the client’s request, the server parses the request and processes it based on the request’s URL, request method, and other information. This may involve accessing a database, reading files, invoking business logic, etc. For example, the server may read the index.html file from disk or query the database for data based on request parameters.
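As a rough sketch of this step, the server can be thought of as looking up a handler by URL path and request method. The route table, paths, and response bodies below are hypothetical; a real server would read files or query a database inside the handlers.

```javascript
// Safe own-property check, so inherited keys such as "constructor" never match.
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical in-memory route table: path -> method -> handler.
const routes = {
  '/index.html': {
    GET: () => ({ status: 200, body: '<h1>Home</h1>' }),
  },
  '/api/items': {
    GET: () => ({ status: 200, body: '[]' }),
    POST: () => ({ status: 201, body: '{"created":true}' }),
  },
};

function handleRequest({ method, path }) {
  if (!hasOwn(routes, path)) {
    return { status: 404, headers: {}, body: 'Not Found' };
  }
  const route = routes[path];
  if (!hasOwn(route, method)) {
    // The path exists but not for this method: list the methods that do.
    const allow = Object.keys(route).join(', ');
    return { status: 405, headers: { Allow: allow }, body: 'Method Not Allowed' };
  }
  return { headers: {}, ...route[method]() };
}

console.log(handleRequest({ method: 'GET', path: '/index.html' }));
console.log(handleRequest({ method: 'DELETE', path: '/api/items' }));
```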

4. Sending a Response

After processing the request, the server constructs an HTTP response message and sends it back to the client. The HTTP response message consists of three parts: the status line, response headers, and response body:

  • Status Line: Contains the HTTP protocol version, status code, and status description. For example: HTTP/1.1 200 OK.

  • Response Headers: Contains a series of key-value pairs that provide additional information about the response, such as the content type and content length.

  • Response Body: Contains the actual data returned by the server to the client, such as the content of an HTML page, image data, etc.
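The response layout mirrors the request layout, which the following sketch shows by splitting a raw response back into its three parts. `parseResponse` is a hypothetical helper that assumes a complete, well-formed text response; it does not handle chunked encoding or binary bodies.

```javascript
// Split a raw HTTP response into status line, headers, and body.
function parseResponse(raw) {
  // The first empty line (CRLF CRLF) ends the header section.
  const separator = raw.indexOf('\r\n\r\n');
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);

  const [statusLine, ...headerLines] = head.split('\r\n');
  // Status line: version, status code, then a description that may contain spaces.
  const [version, code, ...reason] = statusLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(':');
    // Header names are case-insensitive, so store them in lower case.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { version, status: Number(code), reason: reason.join(' '), headers, body };
}

const sampleResponse =
  'HTTP/1.1 200 OK\r\n' +
  'Content-Type: text/html; charset=UTF-8\r\n' +
  'Content-Length: 14\r\n' +
  '\r\n' +
  '<h1>Hello</h1>';
console.log(parseResponse(sampleResponse));
```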

5. Client Receives and Processes the Response

After the client receives the server’s response, it parses the response. Based on the information in the response headers, the client can understand the type and length of the response, and then process the data in the response body accordingly. For example, if the response body is an HTML page, the browser will parse and render it for display to the user.
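How a client might choose its processing based on the Content-Type header can be sketched as follows. `decodeBody` is a hypothetical helper; it assumes lower-case header names and UTF-8 text, and it ignores the charset parameter for simplicity.

```javascript
// Decide how to interpret the response body from its Content-Type header.
function decodeBody(headers, bodyBuffer) {
  const contentType = (headers['content-type'] || '').toLowerCase();
  // Drop parameters such as "; charset=UTF-8" to get the bare media type.
  const mediaType = contentType.split(';')[0].trim();

  if (mediaType === 'application/json') {
    return JSON.parse(bodyBuffer.toString('utf8'));
  }
  if (mediaType.startsWith('text/')) {
    return bodyBuffer.toString('utf8');
  }
  // Anything else (images, audio, ...) is left as raw bytes.
  return bodyBuffer;
}

console.log(decodeBody({ 'content-type': 'application/json' }, Buffer.from('{"ok":true}')));
```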

6. Closing the Connection

After data transmission is complete, the TCP connection between the client and server can be closed. In HTTP/1.0, the connection is closed after each request-response cycle; however, in HTTP/1.1 and later versions, persistent connections are used by default, allowing multiple request-response cycles over the same TCP connection, improving performance. When both parties no longer need to communicate, the TCP connection is closed using a four-way handshake mechanism.
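The version-dependent default can be expressed as a small decision function. This is a sketch for HTTP/1.x only; `isPersistent` is a hypothetical name, and the code assumes header names have already been lower-cased (as Node's `http` module does for incoming headers).

```javascript
// Decide whether the TCP connection stays open after this exchange.
function isPersistent(httpVersion, headers = {}) {
  // The Connection header may hold several comma-separated tokens.
  const tokens = (headers.connection || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());

  if (tokens.includes('close')) return false;
  // HTTP/1.0 closes by default and only persists if keep-alive is requested.
  if (httpVersion === '1.0') return tokens.includes('keep-alive');
  // HTTP/1.1 keeps the connection open by default.
  return true;
}

console.log(isPersistent('1.0'));
console.log(isPersistent('1.1'));
```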

6. HTTP Status Codes

HTTP status codes are three-digit codes returned by the server in response to client requests, indicating the result of processing the request. Based on the first digit of the status code, they can be divided into five categories. Below are common status codes and their meanings.

### 1xx: Informational Status Codes

Indicates that the server has received the request and is continuing to process it. These status codes are rarely seen in practical applications.

  • 100 Continue: The server has received the initial part of the request, and the client should continue sending the remaining part. If the request is already complete, this response can be ignored.

  • 101 Switching Protocols: The server is switching protocols based on the client’s request, such as switching from HTTP to WebSocket.

### 2xx: Success Status Codes

Indicates that the request has been successfully received, understood, and processed by the server.

  • 200 OK: The request was successful, and the server has processed and returned the requested data. This is the most common success status code.

  • 201 Created: The request was successful, and the server created a new resource, typically returned after a POST request.

  • 202 Accepted: The server has accepted the request but has not yet processed it. It may process the request asynchronously later.

  • 204 No Content: The request was successful, but no content is returned in the response, commonly used after delete operations.

### 3xx: Redirection Status Codes

Indicates that the client needs to take further action to complete the request, typically used for redirection.

  • 301 Moved Permanently: The requested resource has been permanently moved to a new URL, and the client should use the new URL for subsequent requests.

  • 302 Found: The requested resource has temporarily moved to a new URL, and the client should use the new URL for this request, but subsequent requests can still use the original URL.

  • 303 See Other: The client is advised to use the GET method to access another URL to obtain the response result, typically used for redirecting to a confirmation page after a POST request.

  • 304 Not Modified: The client sent a conditional request (such as If-Modified-Since or If-None-Match), and the server found that the resource has not changed, so it does not return the resource content. The client can use the cached resource.

  • 307 Temporary Redirect: Similar to 302, the requested resource has temporarily moved to a new URL, but the client should maintain the original request method for the redirect request.

  • 308 Permanent Redirect: Similar to 301, the requested resource has been permanently moved to a new URL, and the client should use the new URL for subsequent requests while maintaining the original request method.
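The main practical difference among these redirect codes is which request method the client uses for the follow-up request. The sketch below uses a hypothetical `redirectMethod` helper. Note one assumption that goes beyond the list above: for 301 and 302, many clients historically rewrite POST to GET, which is the behavior 307 and 308 were introduced to rule out.

```javascript
// Return the method a client would use when following a redirect.
function redirectMethod(status, originalMethod) {
  switch (status) {
    case 303:
      // Always fetch the other URL with GET (HEAD stays HEAD).
      return originalMethod === 'HEAD' ? 'HEAD' : 'GET';
    case 307:
    case 308:
      // The original method must be preserved.
      return originalMethod;
    case 301:
    case 302:
      // Assumption: model the common client behavior of turning POST into GET.
      return originalMethod === 'POST' ? 'GET' : originalMethod;
    default:
      throw new RangeError(`Not a redirect status handled here: ${status}`);
  }
}

console.log(redirectMethod(303, 'POST'));
console.log(redirectMethod(307, 'POST'));
```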

### 4xx: Client Error Status Codes

Indicates that there is an error in the request sent by the client, and the server cannot process the request.

  • 400 Bad Request: The syntax of the request sent by the client is incorrect, and the server cannot understand it.

  • 401 Unauthorized: The request requires user authentication, and the client did not provide valid authentication information.

  • 403 Forbidden: The server understands the request but refuses to execute it; the client does not have permission to access the resource.

  • 404 Not Found: The server cannot find the requested resource, which may be due to an incorrect URL or the resource being deleted.

  • 405 Method Not Allowed: The server does not allow the request method used by the client (such as GET, POST, etc.) for this resource; the response lists the permitted methods in the Allow header.

  • 408 Request Timeout: The server timed out while waiting for the client to finish sending the request.

  • 409 Conflict: The request conflicts with the current state of the server, such as when creating a resource that already exists.

  • 415 Unsupported Media Type: The server does not support the media format used in the client request.

### 5xx: Server Error Status Codes

Indicates that an error occurred on the server while processing the request, and the request could not be completed.

  • 500 Internal Server Error: An internal error occurred on the server, preventing the request from being completed. This is the most common server error status code.

  • 501 Not Implemented: The server does not support the functionality required to fulfill the request, and the client may try other requests.

  • 502 Bad Gateway: The server, acting as a gateway or proxy, received an invalid response from the upstream server.

  • 503 Service Unavailable: The server is temporarily unable to handle the request, usually due to overload or maintenance.

  • 504 Gateway Timeout: The server, acting as a gateway or proxy, did not receive a timely response from the upstream server.
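Because the category is determined by the first digit alone, classifying a status code takes only a few lines. The names `STATUS_CLASSES` and `classifyStatus` are hypothetical, chosen for this sketch.

```javascript
// Map the first digit of a status code to its category.
const STATUS_CLASSES = {
  1: 'Informational',
  2: 'Success',
  3: 'Redirection',
  4: 'Client Error',
  5: 'Server Error',
};

function classifyStatus(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) {
    throw new RangeError(`Invalid HTTP status code: ${code}`);
  }
  return STATUS_CLASSES[Math.floor(code / 100)];
}

for (const code of [101, 204, 304, 404, 503]) {
  console.log(code, classifyStatus(code));
}
```

A client can use the same idea to handle an unfamiliar code: a status it does not recognize can still be treated according to its category.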

7. HTTP Header Information

Both HTTP requests and responses contain header information used to convey additional metadata. Common request and response headers include:

### Request Headers:

  • User-Agent: Identifies the type and version of the client, such as browser information.

  • Accept: Specifies the content types that the client can accept, such as text/html, application/json, etc.

  • Authorization: Used to provide authentication information to the server.

### Response Headers:

  • Content-Type: Specifies the content type of the response body, such as text/html; charset=UTF-8.

  • Content-Length: Specifies the length of the response body in bytes.

  • Cache-Control: Used to control caching policies, such as max-age=3600, indicating a cache validity period of 1 hour.
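As an illustration of how a client might act on Cache-Control, the sketch below extracts max-age and checks whether a cached copy is still fresh. `parseMaxAge` and `isFresh` are hypothetical helpers, and the sketch deliberately ignores other directives such as no-cache and no-store.

```javascript
// Extract the max-age value (in seconds) from a Cache-Control header value.
function parseMaxAge(cacheControl) {
  for (const directive of cacheControl.split(',')) {
    const [name, value] = directive.trim().split('=');
    if (name.toLowerCase() === 'max-age' && /^\d+$/.test(value || '')) {
      return Number(value);
    }
  }
  return null; // no usable max-age directive
}

// A cached response is fresh while its age is below max-age.
function isFresh(cacheControl, ageSeconds) {
  const maxAge = parseMaxAge(cacheControl);
  return maxAge !== null && ageSeconds < maxAge;
}

console.log(isFresh('max-age=3600', 1800)); // cached 30 minutes ago
console.log(isFresh('max-age=3600', 7200)); // cached 2 hours ago
```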

8. Differences Between HTTP and HTTPS

HTTP (Hypertext Transfer Protocol) and HTTPS (Hypertext Transfer Protocol Secure) are commonly used application layer protocols on the internet, with the following main differences:

### 1. Security

  • HTTP: A plaintext transmission protocol, where data exists in plaintext during transmission, making it vulnerable to interception, tampering, or eavesdropping by intermediaries. For example, in a public Wi-Fi environment, an attacker can easily capture sensitive information such as usernames and passwords transmitted over HTTP websites using packet sniffing tools.

  • HTTPS: Based on the SSL/TLS encryption transmission protocol, it encrypts data before transmission. Even if data is intercepted, attackers cannot access the real information without the decryption key, significantly enhancing the security of data transmission.
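The effect of encryption on intercepted data can be illustrated with Node's built-in `crypto` module. This is not TLS itself, only a sketch of the kind of symmetric encryption TLS applies once a connection is set up. The key here is generated randomly for the example, whereas in TLS the two sides agree on keys during the handshake; the helper names are hypothetical.

```javascript
const crypto = require('crypto');

// Encrypt a message with AES-256-GCM using a shared 32-byte key.
function encryptMessage(key, plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Decrypt and verify; throws if the key is wrong or the data was tampered with.
function decryptMessage(key, { iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

const sharedKey = crypto.randomBytes(32);
const intercepted = encryptMessage(sharedKey, 'username=alice&password=secret');
console.log(intercepted.ciphertext.toString('hex')); // what an eavesdropper sees
console.log(decryptMessage(sharedKey, intercepted)); // what the key holder reads
```

Without the key, decryption fails, and because GCM is an authenticated mode, any modification of the ciphertext in transit is detected as well.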

### 2. Port Number

  • HTTP: By default, it uses port 80. When a browser accesses an HTTP website without specifying a port number, it automatically uses port 80 for the connection.

  • HTTPS: By default, it uses port 443. When a browser accesses an HTTPS website, it will attempt to establish a connection with the server through port 443 by default.
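The default-port rule can be observed with Node's standard `URL` class, which reports an empty port when the URL uses the scheme's default. `effectivePort` and `DEFAULT_PORTS` are hypothetical names for this sketch.

```javascript
const DEFAULT_PORTS = { 'http:': 80, 'https:': 443 };

// Return the port a client would actually connect to for a given URL.
function effectivePort(urlString) {
  const url = new URL(urlString);
  // url.port is '' when no port is given or when it equals the scheme default.
  if (url.port !== '') return Number(url.port);
  const port = DEFAULT_PORTS[url.protocol];
  if (port === undefined) {
    throw new Error(`No default port known for ${url.protocol}`);
  }
  return port;
}

console.log(effectivePort('http://example.com/'));
console.log(effectivePort('https://example.com/'));
console.log(effectivePort('http://example.com:8080/'));
```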

### 3. Certificates

  • HTTP: Does not require a certificate, and any server can set up an HTTP service, making the setup cost lower.

  • HTTPS: Requires an SSL/TLS certificate, which is issued by a trusted Certificate Authority (CA). Website operators need to apply for a certificate from the CA, providing relevant authentication information during the application process. The certificate serves to verify the server’s identity and provide a public key for data encryption.

### 4. Connection Method

  • HTTP: Uses a simple request-response model, where the client sends a request to the server, and the server receives the request and returns a response. The entire process is in plaintext, and the connection process is relatively simple.

  • HTTPS: Requires an SSL/TLS handshake process when establishing a connection. The client and server negotiate encryption algorithms, exchange keys, etc., to ensure the security of subsequent data transmission. This process increases the complexity and time overhead of the connection.

### 5. Performance

  • HTTP: Since it does not require encryption and decryption operations, data transmission speed is relatively fast, and the server’s processing burden is lower.

  • HTTPS: Encryption and decryption operations consume certain CPU and memory resources, leading to relatively slower data transmission speeds and increased server processing burden. However, with the improvement of hardware performance and optimization of encryption algorithms, the performance loss of HTTPS has been gradually decreasing.

### 6. Search Engine Optimization (SEO)

  • HTTP: Websites using HTTP have relatively lower weight in search engine rankings, as search engines prefer to recommend secure websites to users.

  • HTTPS: Major search engines (such as Google, Baidu, etc.) give certain ranking advantages to HTTPS websites, helping to increase the visibility of websites in search results.

### Example Code (Node.js Implementation of Simple HTTP and HTTPS Servers)

HTTP Server

const http = require('http');

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, HTTP!\n');
});

const port = 80;
server.listen(port, () => {
  console.log(`Server running at http://localhost:${port}/`);
});

HTTPS Server

const https = require('https');
const fs = require('fs');

const options = {
  key: fs.readFileSync('server.key'), // Private key file
  cert: fs.readFileSync('server.cert') // Certificate file
};

const server = https.createServer(options, (req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello, HTTPS!\n');
});

const port = 443;
server.listen(port, () => {
  console.log(`Server running at https://localhost:${port}/`);
});

HTTPS has significant advantages in security, and as the internet’s security requirements continue to rise, more and more websites are migrating from HTTP to HTTPS.

9. Three-Way Handshake and Four-Way Handshake

The three-way handshake and four-way handshake are the mechanisms TCP (Transmission Control Protocol) uses to establish and close a connection. Let’s first look at the building blocks of both processes: the common flags, sequence numbers, and acknowledgment numbers.

### Flags, Sequence Numbers, and Acknowledgment Numbers

Establishing and closing a TCP connection both rely on fields carried in the header of each TCP segment. Below is a brief introduction to the three that matter here: flags, sequence numbers, and acknowledgment numbers.

1. Flags:

  • SYN (Synchronize): The sender uses the SYN flag to initiate a new connection.
  • ACK (Acknowledgment): Indicates that the acknowledgment number field is valid and is used to confirm received data.
  • FIN (Finish): Sent by the sender to indicate a request to close the connection, used after data transmission is complete.

In the notation used below, ACK = 1 indicates acknowledgment, SYN = 1 indicates synchronization, and FIN = 1 indicates closure.

2. Sequence Number (seq):

  • Used to identify the order of each data segment in the TCP data stream.
  • The sender numbers the bytes it sends, and each segment carries the sequence number of its first byte; the receiver uses these numbers to reassemble the data in order.

3. Acknowledgment Number (ack):

  • Indicates the sequence number of the next byte the receiver expects.
  • The receiver sends an ACK segment carrying this number to tell the sender which data has been received successfully.

Combining these concepts, the establishment of a TCP connection typically begins with a request message that sends the SYN flag. The receiver confirms by sending a response message with both SYN and ACK flags. The correct use of sequence numbers and acknowledgment numbers ensures the orderly transmission and reliable reception of data. When disconnecting, the use of the FIN flag indicates that the sender requests to close the connection, while the receiver confirms the closure request by sending an ACK response with the acknowledgment number.
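The relationship between a received segment and the acknowledgment number sent in reply can be written as simple arithmetic. In this sketch a segment is a plain object with hypothetical field names (`seq`, `syn`, `fin`, `payloadLength`); it relies on two standard TCP facts: SYN and FIN each consume one sequence number, and sequence numbers are 32-bit values that wrap around.

```javascript
// Compute the acknowledgment number to send after receiving a segment.
function nextAck(segment) {
  const { seq, syn = false, fin = false, payloadLength = 0 } = segment;
  // Data bytes advance the sequence number; SYN and FIN count as one each.
  const consumed = payloadLength + (syn ? 1 : 0) + (fin ? 1 : 0);
  // Sequence numbers are 32-bit, so they wrap around at 2^32.
  return (seq + consumed) % 2 ** 32;
}

console.log(nextAck({ seq: 100, syn: true }));           // reply to a SYN
console.log(nextAck({ seq: 101, payloadLength: 500 }));  // reply to 500 data bytes
console.log(nextAck({ seq: 601, fin: true }));           // reply to a FIN
```

This is why the handshake descriptions below use ack = x + 1: the SYN (or FIN) with sequence number x carries no data but still consumes one sequence number.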

### Three-Way Handshake

Meaning

The three-way handshake is the process used by the TCP protocol to establish a connection, ensuring that both the client and server can send and receive data properly by exchanging three packets.

Process

1. First Handshake (SYN): Establishing a connection request, the client sends a SYN packet

  • The client sends a packet with the SYN = 1 flag and a randomly generated initial sequence number seq = x.

  • The client enters the SYN_SENT state, waiting for the server’s confirmation, indicating that the client wants to establish a connection.

2. Second Handshake (SYN-ACK): Confirming the connection request, the server sends a SYN + ACK packet

  • After receiving the client’s request, the server replies with a packet containing the SYN = 1 and ACK = 1 flags, along with the acknowledgment number ack = x + 1 (indicating that the server confirms receipt of the client’s SYN packet) and its own initial sequence number seq = y.

  • The server enters the SYN_RCVD state, waiting for the client’s final confirmation.

3. Third Handshake (ACK): Connection confirmation, the client sends an ACK packet

  • After receiving the server’s SYN + ACK packet, the client sends a packet with the ACK = 1 flag, acknowledgment number ack = y + 1 (indicating that the client confirms receipt of the server’s SYN packet), and sequence number seq = x + 1.

  • Both the client and server enter the ESTABLISHED state, and the connection is successfully established.

                     Client                           Server
                       |                               |
(First Handshake)      | -------- SYN(seq=x) --------> |  Client sends SYN packet
(Second Handshake)     | <-- SYN+ACK(seq=y,ack=x+1) -- |  Server sends SYN + ACK packet
(Third Handshake)      | --- ACK(seq=x+1,ack=y+1) ---> |  Client sends ACK packet
                       |                               |  Connection established successfully
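The three steps can also be traced in code. The sketch below only simulates the exchange with plain objects and sends nothing over the network; `threeWayHandshake` is a hypothetical name, and the initial LISTEN state of the server is standard TCP terminology that the description above leaves implicit.

```javascript
// Simulate the three segments of connection establishment.
// x and y are the initial sequence numbers chosen by client and server.
function threeWayHandshake(x, y) {
  const client = { state: 'CLOSED' };
  const server = { state: 'LISTEN' };
  const segments = [];

  // 1. Client -> Server: SYN with the client's initial sequence number.
  segments.push({ from: 'client', flags: ['SYN'], seq: x });
  client.state = 'SYN_SENT';

  // 2. Server -> Client: SYN + ACK, acknowledging x and announcing y.
  segments.push({ from: 'server', flags: ['SYN', 'ACK'], seq: y, ack: x + 1 });
  server.state = 'SYN_RCVD';

  // 3. Client -> Server: ACK, acknowledging y.
  segments.push({ from: 'client', flags: ['ACK'], seq: x + 1, ack: y + 1 });
  client.state = 'ESTABLISHED';
  server.state = 'ESTABLISHED';

  return { segments, client, server };
}

console.log(threeWayHandshake(1000, 5000).segments);
```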

### Four-Way Handshake

Meaning

The four-way handshake is the process used by the TCP protocol to disconnect a connection, ensuring that both the client and server can properly close the connection and release resources by exchanging four packets.

Process

1. First Handshake (FIN): Initiating a close request, the client sends a FIN packet

  • After completing data transmission, the client sends a packet with the FIN = 1 flag and sequence number seq = x (indicating that the client will no longer send data and is preparing to close the connection).

  • The client enters the FIN_WAIT_1 state.

2. Second Handshake (ACK): Confirming the close request, the server sends an ACK packet

  • After receiving the client’s close request, the server sends a packet with ACK = 1 flag, acknowledgment number ack = x + 1 (ack = client’s sequence number + 1), and its own sequence number seq = y to indicate that it has received and confirmed the close request.

  • The server enters the CLOSE_WAIT state (half-closed state, meaning the client has no more data to send, but the server can still send data, and the client can receive the server’s last data).

  • After receiving the ACK packet, the client enters the FIN_WAIT_2 state, waiting for the server to send the connection release packet. In this state, the server can continue to send any unsent data, and the client can receive the server’s last data.

3. Third Handshake (FIN): Initiating a close confirmation, the server sends a FIN packet

  • After completing data transmission, the server sends a packet with the FIN = 1 and ACK = 1 flags, acknowledgment number ack = x + 1, and its own sequence number seq = z to indicate that the server will also no longer send data.

  • The server enters the LAST_ACK state, waiting for the client’s confirmation.

4. Fourth Handshake (ACK): Completing the connection closure, the client sends an ACK packet

  • After receiving the server’s FIN packet (close request), the client sends a packet with the ACK = 1 flag to confirm, acknowledgment number ack = z + 1 (ack = server’s sequence number + 1), and its own sequence number seq = x + 1.

  • The client enters the TIME_WAIT state, waiting for 2MSL (Maximum Segment Lifetime) time.

  • After receiving the client’s confirmation packet, the server releases its Transmission Control Block (TCB) and enters the CLOSED state.

  • After waiting for 2MSL, the client also releases its Transmission Control Block (TCB) and enters the CLOSED state, and the connection is fully closed.

                     Client                         Server
                       |                             |
(First Handshake)      | ------- FIN(seq=x) -------> |  Client sends FIN packet
(Second Handshake)     | <--- ACK(seq=y,ack=x+1) --- |  Server sends ACK packet
(Third Handshake)      | <--- FIN(seq=z,ack=x+1) --- |  Server sends FIN packet
(Fourth Handshake)     | -- ACK(seq=x+1,ack=z+1) --> |  Client sends ACK packet
                       |                             |  Connection disconnected
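The closing sequence and its state changes can be traced the same way. This sketch is a pure simulation under stated assumptions: `fourWayClose` is a hypothetical name, x, y, and z are the sequence numbers used in the description above, and the 2MSL wait is represented only by the final state change rather than by a real timer.

```javascript
// Simulate the four segments of connection termination.
function fourWayClose(x, y, z) {
  const client = { state: 'ESTABLISHED' };
  const server = { state: 'ESTABLISHED' };
  const steps = [];
  // Record each segment together with both states after it is delivered.
  const record = (segment) =>
    steps.push({ ...segment, clientState: client.state, serverState: server.state });

  // 1. Client -> Server: FIN, the client has no more data to send.
  client.state = 'FIN_WAIT_1';
  record({ from: 'client', flags: ['FIN'], seq: x });

  // 2. Server -> Client: ACK; the server may still send remaining data.
  server.state = 'CLOSE_WAIT';
  client.state = 'FIN_WAIT_2';
  record({ from: 'server', flags: ['ACK'], seq: y, ack: x + 1 });

  // 3. Server -> Client: FIN + ACK, the server is done sending too.
  server.state = 'LAST_ACK';
  record({ from: 'server', flags: ['FIN', 'ACK'], seq: z, ack: x + 1 });

  // 4. Client -> Server: ACK; the server closes on receipt.
  client.state = 'TIME_WAIT';
  server.state = 'CLOSED';
  record({ from: 'client', flags: ['ACK'], seq: x + 1, ack: z + 1 });

  // After waiting 2MSL (not simulated here), the client closes as well.
  client.state = 'CLOSED';
  return { steps, client, server };
}

console.log(fourWayClose(2000, 7000, 7300).steps);
```

In the usage above, z is larger than y because the server is assumed to have sent 300 more bytes between its ACK and its FIN; if it sends nothing in between, z equals y.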
