HTTP request and response messages typically carry entity data. An entity is the actual content carried by an HTTP message: the request body in a request, or the response body in a response. Each entity is described by metadata header fields such as Content-Type and Content-Length. Content-Type indicates the media type (MIME type) of the entity; for example, text/html denotes HTML text and image/jpeg denotes a JPEG image. The client relies on Content-Type to decide how to interpret the entity data. Content-Length indicates the size of the entity body in bytes. Note that this is the size after content encoding: if the text is compressed with gzip, Content-Length is the number of bytes after compression, not the original length. Generally, any HTTP message containing an entity body should carry a Content-Length field unless a transfer encoding is used. In HTTP/1.1, a message with a body is expected to send Content-Length unless it uses chunked encoding (a response without either can only be delimited by closing the connection). Content-Length helps the receiver detect messages truncated by a server crash and correctly delimit messages sent back to back over a persistent connection. In summary, an entity is the actual data carried in an HTTP message, with corresponding headers describing its type, length, and other information.
HTTP Entity Data
In HTTP messages, both requests and responses can carry entity data. Request entities carry the data to be uploaded in requests such as POST and PUT, for example form content or files. Response entities are the resource data returned by the server, such as HTML pages or image files. An entity consists of entity headers and an entity body: the entity headers contain Content-Type, Content-Length, and other descriptive information, while the entity body is the actual data. HTTP itself is a stateless protocol, but by carrying data in requests and responses and using mechanisms such as cookies, stateful session management can be built on top of it.
The type of entity data is specified by Content-Type. The value of Content-Type includes a main type and a sub-type separated by a slash, such as “text/html” or “image/png”. It informs the client of the media type of the entity content so that the client can handle the data correctly. For example, when the server returns an HTML page, it sends Content-Type: text/html, and the browser knows it needs to render it as an HTML document. For file downloads, the server may send the corresponding MIME type (such as application/pdf) to prompt the browser to open it with the appropriate software.
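The structure of a Content-Type value can be sketched in a few lines of Python. The helper name parse_content_type is hypothetical, and the parsing is deliberately simplified: it assumes parameters such as charset contain no quoted semicolons.

```python
def parse_content_type(value):
    """Split a Content-Type value into (main type, sub-type, parameters).

    Simplified sketch: assumes parameter values contain no ';' characters.
    """
    media_type, _, rest = value.partition(";")
    main, _, sub = media_type.strip().lower().partition("/")
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return main, sub, params


main, sub, params = parse_content_type("text/html; charset=UTF-8")
print(main, sub, params)
```

A receiver would typically dispatch on the main type and sub-type and use the charset parameter, if present, to decode text bodies.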
The length of the entity data is specified by Content-Length, in bytes. The client or server uses Content-Length to learn the size of the current entity body and read the message correctly. When the content has been compressed or otherwise encoded, Content-Length refers to the size of the encoded data, not the original size. For example, if the server has enabled gzip compression for the response content, Content-Length is the number of bytes after compression. With Content-Length, the receiver can determine whether the message has arrived completely, and on a persistent connection it can identify where one message body ends and the next message begins.
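The point about encoded size can be checked with a minimal Python sketch. The body below is made-up sample data, and the headers dictionary only illustrates what a server applying gzip would plausibly send.

```python
import gzip

# Hypothetical response body: repetitive HTML compresses well.
original = ("<p>hello</p>" * 200).encode("utf-8")
compressed = gzip.compress(original)

# Headers a server might send when it applies gzip content coding.
headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Content-Encoding": "gzip",
    # Bytes actually sent on the wire, not len(original).
    "Content-Length": str(len(compressed)),
}

print("original bytes:", len(original))
print("Content-Length:", headers["Content-Length"])
```

The receiver reads exactly Content-Length bytes and only then decompresses them to recover the original content.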
HTTP/1.1 allows a message to omit Content-Length and use chunked transfer encoding instead, which is discussed in a later section. In that case the entity is divided into several chunks for transmission, and each chunk carries its own length, so there is no need to declare the total length up front. In non-chunked transfers, however, Content-Length is still needed; otherwise the receiver cannot determine where the message ends on a persistent connection. Therefore, POST requests with a request body and responses with a response body generally carry Content-Length to keep the transmission complete and its boundaries recognizable.
Methods for Transmitting Large Files via HTTP
When transmitting larger data (such as large files or continuous data streams) via HTTP, the standard Content-Length method may be inconvenient or unsuitable. HTTP provides various mechanisms to support large data transmission and streaming:
- Chunked Transfer Encoding (Transfer-Encoding: chunked):
Introduced in HTTP/1.1, this mechanism allows the server to send data without knowing the entire response size in advance. The server adds Transfer-Encoding: chunked to the response headers, then splits the response body into several chunks and sends them in sequence. Each chunk starts with its size in hexadecimal, followed by CRLF, the chunk data, and another CRLF. The receiver reads chunk after chunk until it encounters a chunk of size 0, which marks the end of the entity. Since each chunk declares its own length, the server does not need to calculate the total length, so Content-Length is not needed. This suits scenarios where content is generated and sent at the same time, that is, streaming responses. After the final zero-size chunk, the server can also send optional trailer fields (announced in advance with the Trailer header) to attach metadata that is only known once the body has been sent, such as a checksum of the body (historically Content-MD5, which has since been deprecated). The benefit of chunked encoding is that generation and sending overlap, which reduces latency, while also solving the problem of delimiting a message of unknown total length on a persistent connection.
When the response uses Transfer-Encoding: chunked, the entity is split into several data chunks for transmission, with each chunk starting with a hexadecimal length, and the last chunk with a size of 0 indicates the end. This mechanism allows the server to send content in chunks without knowing the overall length, achieving streaming transmission.
- Range Requests:
The HTTP protocol allows clients to request partial content of a resource, which is very useful for downloading specific segments of large files or resuming interrupted downloads. The client specifies the desired byte range in the Range request header; for example, Range: bytes=0-1023 requests the first 1024 bytes of the file. If the server supports range requests (usually advertised with the response header Accept-Ranges: bytes), it responds with status code 206 Partial Content, returns only the requested range, and states the returned range and the total size of the resource in the Content-Range response header. For example: Content-Range: bytes 0-1023/5000. By making multiple Range requests, the client can fetch different parts of a large file as needed, which enables resumable downloads and parallel (multi-threaded) downloads. If the client requests several non-contiguous ranges, the server can use Content-Type: multipart/byteranges to combine the segments into one response. The range request mechanism makes partial retrieval of large files possible, improving the flexibility and efficiency of large data transmission.
- Streaming Responses and Server Push:
For scenarios where data is generated continuously (such as long-lived message streams, Server-Sent Events, or live video streaming), the server can keep the connection open and keep sending data. In HTTP/1.1 this is usually implemented with chunked transfer: the server sends chunks as data becomes available, so the client receives data gradually instead of waiting for the whole response. Server push is a different mechanism: it is an HTTP/2 feature that lets the server proactively push resources before the client requests them, saving loading time. However, server push targets static resources that the server predicts the client will need and is not meant for typical large file downloads; it has also seen little adoption, and Chrome has removed support for it. Large file downloads still rely mainly on Range requests for resuming and on chunked transfer for real-time transmission. Combining streaming responses with chunked transfer allows the client to download and process data at the same time, improving real-time performance.
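The chunk framing described in the chunked transfer item above can be sketched as a pair of Python functions. The names encode_chunked and decode_chunked are hypothetical, and the decoder is a simplification: it assumes the whole message is already in memory and ignores any trailer fields after the final chunk.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as a chunked message body."""
    out = b""
    for piece in pieces:
        if piece:  # a zero-length chunk would signal the end, so skip empty pieces
            out += f"{len(piece):X}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"  # final zero-size chunk, no trailers


def decode_chunked(data):
    """Reassemble the body from a complete chunked message (trailers ignored)."""
    body = b""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size_field = data[pos:eol].split(b";")[0]  # drop chunk extensions, if any
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return body
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


wire = encode_chunked([b"Hello, ", b"world!"])
print(wire)
print(decode_chunked(wire))
```

Because each piece is framed independently, a server can call the encoder on data as it is produced, which is what makes streaming possible without a total length.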
Interruption and recovery are common issues in large file transmission. HTTP range requests are very practical for resuming downloads: when a download is interrupted, the client records the number of bytes received and, in the next request, uses the Range header to ask for the remaining data starting at the interrupted position. The server returns 206 Partial Content, and the client appends the new data to the part it already has, completing the retrieval of the entire file. This method is widely used in download managers and browser download functions. Modern browsers and servers generally support Range requests and the 206 status. Combined with ETag or Last-Modified validation (for example via the If-Range request header), the client can also ensure that the resumed data belongs to the same version of the file.
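The resume flow can be sketched without a network by simulating the server side. Everything here is illustrative: serve_range is a hypothetical toy that handles only the bytes=N- and bytes=N-M forms, and the 5000-byte resource is made-up data matching the size used in the Content-Range example earlier.

```python
import re


def resume_range_header(bytes_received):
    """Range value asking for everything after the bytes already stored."""
    return f"bytes={bytes_received}-"


def parse_content_range(value):
    """Parse 'bytes first-last/total' into integers (total is None for '*')."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    total = None if m.group(3) == "*" else int(m.group(3))
    return int(m.group(1)), int(m.group(2)), total


def serve_range(resource, range_header):
    """Toy server: returns (status, Content-Range value, partial body)."""
    spec = range_header.split("=", 1)[1]
    first_s, _, last_s = spec.partition("-")
    first = int(first_s)
    last = min(int(last_s), len(resource) - 1) if last_s else len(resource) - 1
    return 206, f"bytes {first}-{last}/{len(resource)}", resource[first:last + 1]


resource = bytes(i % 256 for i in range(5000))
partial = resource[:1024]  # what the client had when the download broke off

status, content_range, body = serve_range(resource, resume_range_header(len(partial)))
first, last, total = parse_content_range(content_range)
complete = partial + body if first == len(partial) else None
print(status, content_range, complete == resource)
```

Checking that the returned range starts exactly where the stored data ends guards against appending bytes at the wrong offset; a real client would additionally validate the ETag or Last-Modified value before appending.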
HTTP Connection Management
The initial design of HTTP adopted a short connection model: each time an HTTP request is made, a new TCP connection is established, and the connection is immediately closed after sending and receiving the response. This was the default in the HTTP/1.0 era (unless using the non-standard Connection: keep-alive extension). Short connections are simple to implement but inefficient, as establishing a TCP connection requires a three-way handshake, along with slow start and other mechanisms, resulting in additional overhead and latency for each new connection. Especially since web pages often contain many resources (images, styles, scripts, etc.) that require multiple requests, if each request establishes an independent connection, it severely slows down loading speed.
To improve performance, HTTP/1.1 introduced the long connection (persistent connection) model. A long connection keeps the TCP connection open after one request-response cycle completes, so that it can be reused for subsequent requests and responses. This avoids the overhead of repeatedly establishing connections and significantly improves efficiency, especially when many requests are made. HTTP/1.1 enables persistent connections by default: unless Connection: close is sent explicitly, the connection remains open for reuse. In HTTP/1.0, by contrast, the default is to close the connection, and a Connection: keep-alive hint is required to keep it open. With long connections, multiple requests can be sent serially over the same connection, reducing latency and system resource consumption. The server will not keep the connection open indefinitely, of course. Generally, if the connection stays idle longer than the timeout configured on the server, the server closes it to free up resources. Many servers also use the Keep-Alive header to tell the client the idle timeout and the maximum number of requests, so that both parties can manage the connection lifecycle sensibly.
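The default rules for whether a connection stays open can be captured in a small Python function. The name connection_persists is hypothetical, and the sketch assumes the headers arrive as a plain dict keyed by the canonical name Connection.

```python
def connection_persists(http_version, headers):
    """Decide whether the connection stays open after this message.

    Sketch of the defaults described above; headers is assumed to be a
    dict with a 'Connection' key when that header is present.
    """
    tokens = {t.strip().lower() for t in headers.get("Connection", "").split(",")}
    if "close" in tokens:
        return False  # explicit close always wins
    if http_version == "HTTP/1.1":
        return True  # persistent by default
    # HTTP/1.0 closes by default and needs the keep-alive hint.
    return http_version == "HTTP/1.0" and "keep-alive" in tokens


for version, hdrs in [
    ("HTTP/1.1", {}),
    ("HTTP/1.1", {"Connection": "close"}),
    ("HTTP/1.0", {}),
    ("HTTP/1.0", {"Connection": "Keep-Alive"}),
]:
    print(version, hdrs, connection_persists(version, hdrs))
```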
Long connections greatly improve HTTP performance, but they also bring some issues and evolution needs:
- Concurrent Requests and Pipelining:
Although HTTP/1.1 supports long connections, requests on the same connection are still processed sequentially, so the head-of-line (HOL) blocking problem remains: if one request is slow to process, it blocks the requests behind it. To address this, HTTP/1.1 introduced pipelining, which allows the next request to be sent without waiting for the previous response to arrive in full, reducing network wait time. However, pipelined responses must be returned in request order and only idempotent requests can safely be pipelined; in addition, compatibility with intermediate proxy servers is poor, which can lead to unstable behavior. Because responses must come back in order, pipelining is still subject to head-of-line blocking: if one response is delayed, the responses behind it cannot be delivered. For these reasons, most browsers do not enable pipelining by default, and it has not been widely adopted in practice.
- HTTP/2 Multiplexing:
The truly effective solution to head-of-line blocking, and the way to fully use the concurrency of a single connection, is HTTP/2 multiplexing. HTTP/2 made significant changes to the underlying transport: all communication takes place over a single long-lived TCP connection, data is transmitted in a binary framing format, and the concept of streams is introduced, where each stream carries one request/response pair. HTTP/2 splits HTTP messages into independent frames, which can be interleaved on the wire and reassembled at the other end based on the stream identifier in each frame, restoring the complete data of each request. This allows multiple requests and responses to proceed concurrently on a single connection, so a slow request does not block another at the HTTP layer (packet loss on the underlying TCP connection can still stall all streams, which HTTP/3 over QUIC addresses). Browsers can issue many requests at once without opening multiple TCP connections, removing the need to work around the limit of roughly six concurrent connections per domain that browsers apply under HTTP/1.x. Compared with the text format and sequential requests of HTTP/1.x, HTTP/2 significantly improves performance through binary transmission, multiplexing, header compression, and other means. For example, in the HTTP/1.1 era, websites often distributed static resources across different domains (domain sharding) to download in parallel and get around the serial limitation of each connection; under HTTP/2 this practice has become a burden, since a single connection can transmit all resources efficiently.
Performance comparison of different connection models in HTTP/1.x: left side short connection, each request establishes a new connection; middle long connection, reuses connection for serial requests; right side pipelining, parallel requests on long connection but still has head-of-line blocking. HTTP/2’s multiplexing achieves true parallelism on the same connection, eliminating head-of-line blocking (not shown in the diagram).
- Keep-Alive Mechanism and Resource Overhead:
While long connections are kept open, idle connections still occupy server resources, and too many idle connections may exhaust resources or be exploited for DoS attacks. Servers therefore usually set a Keep-Alive timeout and close connections that have been idle longer than that. In certain environments (such as high-concurrency workloads made of short requests), long connections may also be closed proactively to reclaim resources quickly. In most Web applications, however, the advantages of long connections far outweigh the overhead, and they should not be disabled unless necessary. Additionally, HTTP/1.1 still allows the response header Connection: close to indicate that the connection will be closed after the current response, preventing further reuse (for example, in some scenarios with high real-time requirements, where old connections should not be reused).
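The idea behind the HTTP/2 multiplexing item above, interleaving frames from several streams and reassembling them by stream identifier, can be illustrated with a toy model in Python. This is not the real HTTP/2 frame format: frames here are plain (stream_id, payload) tuples, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, payload, frame_size):
    """Cut one message into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    return [frame
            for group in zip_longest(*frame_lists)
            for frame in group
            if frame is not None]


def reassemble(frames):
    """Rebuild each message from interleaved frames using the stream id."""
    streams = defaultdict(bytes)
    for stream_id, chunk in frames:
        streams[stream_id] += chunk
    return dict(streams)


wire = interleave(
    split_into_frames(1, b"AAAAAA", 2),
    split_into_frames(3, b"BBBB", 2),
)
print(wire)
print(reassemble(wire))
```

Frames of the same stream keep their relative order on the wire, which is why simple concatenation per stream identifier is enough to restore each message.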
In summary, HTTP connection management has evolved from short connections to long connections and then to multiplexing. Short connections are simple but inefficient. Long connections (Keep-Alive) improve performance and became the default, but still leave room for improvement under high latency and high concurrency. HTTP/2 multiplexing finally raises the throughput and concurrency of a single connection across the board, and the features introduced alongside it, such as header compression and server push, optimize performance further. For developers, the connection management improvements in HTTP/2 are transparent at the application layer, but understanding the principles helps with service tuning. Where possible, prefer HTTP/2 or newer versions (such as HTTP/3, which is based on the QUIC protocol) to take full advantage of more efficient connection management.
HTTP Redirection and Jumping
In web browsing, redirection (Redirect) is often encountered, which occurs when the browser requests a URL and the server returns a response indicating the client should redirect to another URL. HTTP uses the 3XX series of status codes to indicate redirection responses, typically including 301, 302, etc. The process of redirection is straightforward: the server first returns a 30x status code and provides the new URL in the Location field of the response header. The browser automatically initiates a request to the new URL based on the Location it receives, as shown in the following diagram.
Diagram of the browser redirection process. The client requests resource A, the server returns 301 redirection and the target Location, and the client then automatically requests the new Location address B, ultimately obtaining resource B.
Although the process is simple, the HTTP specification defines various redirection status codes, which differ in terms of caching and request method handling, and it is important to understand the differences:
- 301 Moved Permanently (Permanent Redirect):
Indicates that the resource has been permanently moved. Clients (including search engines) may cache the result of a 301 redirection, so that the new URL is used directly in the future. For example, redirecting an old domain to a new domain is usually done with 301 to tell browsers and search engines to update the link. Note that if a POST request receives a 301, most browsers send the follow-up request to the new URL with the GET method; the specification does not strictly require this, but it is common behavior.
- 302 Found (Temporary Redirect):
A temporary redirect inherited from HTTP/1.0. It indicates that the resource is temporarily located elsewhere and that the client should keep using the original URL for future requests, since the target may change. The original specification of 302 required that non-GET requests not be redirected automatically (user confirmation was required), but in practice browsers generally follow the redirect and turn a POST into a GET, so specification and implementations diverged. HTTP/1.1 therefore introduced the more precise 303 and 307 to make the semantics explicit; 302 is retained mainly for backward compatibility, and 303 or 307 is preferable when the method matters.
- 303 See Other:
Explicitly tells the client to use the GET method to access the resource pointed to by Location, regardless of the original request method. This is often used for redirection after a POST request: for example, after a successful form submission the server returns 303 with the Location of a result page, and the browser uses GET to retrieve that page instead of repeating the POST. 303 is a temporary redirect: the redirect itself is not cached by default, so the client goes back to the original URL each time.
- 307 Temporary Redirect:
Newly added in HTTP/1.1, it replaces 302 as a clearly defined temporary redirect. The difference from 302 is that 307 requires the request method and request body to remain unchanged. In other words, if the client makes a POST request to the original URL and receives a 307, it must repeat the POST against the new URL (instead of switching to GET, as commonly happens with 302). A 307 response is not cached by default, and the original URL should still be requested next time.
- 308 Permanent Redirect:
Added after the original HTTP/1.1 specification (RFC 7538), it complements 301. 308 indicates a permanent redirect that preserves the request method. It is functionally similar to 301 but resolves the issue that 301 may change the method: on receiving a 308, the client must use the same method as the original request against the new URL. When a resource is permanently relocated and operations such as POST need to be preserved, 308 can be used. Like 301, 308 is a permanent redirect, and browsers may cache it.
The redirection status codes above can be classified along two dimensions, giving four quadrants: whether the redirect is permanent (which determines whether it is cached) and whether the method may change. 301 and 308 are permanent redirects; browsers may cache them and go directly to the new address next time. 302, 303, and 307 are temporary redirects; they are not cached by default, so the client returns to the original address each time (unless explicit caching headers or application-layer caching say otherwise). As for method handling, 301 and 302 were originally not supposed to change the method, but in practice most implementations switched to GET, which led to 303 and 307: 303 explicitly requires changing to GET, while 307 explicitly forbids changing the method. 308 is the permanent counterpart that does not change the method.
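The method-handling dimension can be written down as a small Python function. This is a sketch of commonly observed browser behavior rather than a normative rule, and the names redirected_method and PERMANENT are hypothetical.

```python
PERMANENT = {301, 308}  # redirects that clients may cache by default


def redirected_method(status, method):
    """Method a typical browser uses for the follow-up request."""
    if status == 303:
        # Everything becomes GET, except HEAD, which stays HEAD.
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302) and method == "POST":
        return "GET"  # historical behavior, tolerated by the specification
    if status in (301, 302, 307, 308):
        return method  # 307 and 308 must always preserve the method
    raise ValueError(f"not a redirect handled by this sketch: {status}")


for status in (301, 302, 303, 307, 308):
    print(status, "permanent" if status in PERMANENT else "temporary",
          "POST ->", redirected_method(status, "POST"))
```

The request body follows the method: when the method is preserved, the body is resent; when it is rewritten to GET, the body is dropped.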
When using redirection in development, choose the status code that matches the requirement. For example, when a website domain changes permanently, use 301 or 308; to guide users to a result page after a form submission, use 303; in load-balancing scenarios where requests are temporarily sent elsewhere, use 307. Also keep in mind that browsers automatically turn a POST into a GET when following 301 or 302; this departs from the original intent of the specification but has become established behavior that current specifications tolerate. Therefore, if the request method must be preserved across the redirect, be sure to use 307 or 308.
Finally, redirection happens automatically between the browser and the server. Users can sometimes notice a jump because the URL in the address bar changes, but in most cases it is imperceptible. What should be avoided is excessive redirection (A jumping to B, B jumping to C, and so on), which increases latency and can even lead to circular redirection errors. Setting the redirection status and Location properly can improve SEO and user experience: for example, using 301 transfers the ranking weight of the old URL to the new URL and helps search engines index the switch from old to new, while temporary redirects do not affect the weight of the original address. In summary, understanding the differences between the redirection codes and applying them correctly gives smooth link transitions and effective traffic guidance.
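Guarding against long chains and loops can be sketched with a simulated redirect follower in Python. No network is involved: responses is a made-up table mapping each URL to a (status, Location) pair, and both the function name and the default limit of 5 hops are assumptions chosen for illustration.

```python
REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(start_url, responses, max_hops=5):
    """Follow Location targets until a non-redirect response is reached.

    Returns the list of URLs visited; raises RuntimeError on a loop or
    when more than max_hops redirects would be followed.
    """
    url = start_url
    visited = [start_url]
    while True:
        status, location = responses[url]
        if status not in REDIRECT_CODES:
            return visited
        if location in visited:
            raise RuntimeError(f"redirect loop back to {location}")
        if len(visited) - 1 >= max_hops:
            raise RuntimeError("too many redirects")
        visited.append(location)
        url = location


chain = {"/a": (301, "/b"), "/b": (302, "/c"), "/c": (200, None)}
print(follow_redirects("/a", chain))
```

Browsers apply a similar hop limit internally, which is what produces the "too many redirects" error page when a site is misconfigured.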