Source: kingszelda,
www.cnblogs.com/kingszelda/p/8988505.html
1. Background
The HTTP protocol is a stateless protocol, meaning each request is independent of others. Therefore, its initial implementation was that each HTTP request would open a TCP socket connection, which would be closed after the interaction was complete.
HTTP runs on top of TCP, which is a full-duplex protocol, so establishing a connection requires a three-way handshake and tearing it down requires a four-way handshake. In this design, each HTTP request therefore pays the extra cost of creating and destroying a connection.
As a result, the HTTP protocol has evolved to use persistent connections to reuse socket connections.

Comparing serial and persistent connections (the original post illustrates this with a figure):
- In serial connections, each interaction opens a connection and closes it again.
- In persistent connections, the first interaction opens the connection, and the connection is not closed when the interaction ends, which saves the connection setup for the next interaction.
There are two implementations of persistent connections: HTTP/1.0+ keep-alive and HTTP/1.1 persistent connections.
2. HTTP/1.0+ Keep-Alive
Since 1996, many HTTP/1.0 browsers and servers have extended the protocol with the “keep-alive” extension.
Note that this extension was introduced as an experimental persistent-connection supplement to 1.0. Keep-alive is now deprecated and the latest HTTP/1.1 specification no longer describes it, although many applications still use it.
Clients using HTTP/1.0 add “Connection: Keep-Alive” to the request headers to ask the server to keep the connection open. If the server agrees, it includes the same header in the response. If the response does not include the “Connection: Keep-Alive” header, the client assumes that the server does not support keep-alive and that the server will close the connection after sending the response message.

Through the keep-alive extension, a persistent connection is established between the client and server; however, some issues still exist:
- In HTTP/1.0, keep-alive is not part of the standard, so the client must send “Connection: Keep-Alive” explicitly to request a keep-alive connection.
- Proxy servers may not support keep-alive. Some proxies are “blind relays” that do not understand the Connection header and forward it unchanged, even though it is a hop-by-hop header. The client and the server may then both believe the connection is persistent, while the proxy, which is waiting for the connection to close, does not accept further data on it.
3. HTTP/1.1 Persistent Connections
HTTP/1.1 replaces Keep-Alive with persistent connections.
By default, connections in HTTP/1.1 are persistent. To close a connection after a message, the sender must add the “Connection: Close” header to that message. In other words, every HTTP/1.1 connection is assumed to be reusable unless stated otherwise.
However, as with Keep-Alive, an idle persistent connection can be closed at any time by either the client or the server. Not sending “Connection: Close” does not mean the server commits to keeping the connection open forever.
4. How HttpClient Generates Persistent Connections
HttpClient implements persistent connections with a connection pool: the pool manages the connections it holds, so that an already established TCP connection can be reused by later requests.
In fact, the “pool” technique is a common design, and its design philosophy is not complex:
- Establish a connection when it is first used.
- When the request ends, do not close the connection, but return it to the pool.
- The next request to the same destination can obtain an available connection from the pool.
- Periodically clean up expired connections.
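The four steps above can be sketched in a few dozen lines. This is a minimal, single-threaded illustration and not HttpClient's implementation: FakeConnection and SimplePool are hypothetical names, a connection is just an object with an expiry time, and the caller passes the current time explicitly so the behavior is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for a TCP connection; not an HttpClient type. */
final class FakeConnection {
    final String destination;
    long expiresAtMillis = Long.MAX_VALUE;
    boolean closed;

    FakeConnection(String destination) {
        this.destination = destination;
    }

    void close() {
        closed = true;
    }
}

/** Minimal pool illustrating the four steps; not thread-safe. */
final class SimplePool {
    private final Map<String, Deque<FakeConnection>> idle = new HashMap<>();
    int created; // how many connections were actually opened

    /** Steps 1 and 3: reuse an idle connection if one exists, otherwise open one. */
    FakeConnection lease(String destination, long nowMillis) {
        Deque<FakeConnection> queue = idle.computeIfAbsent(destination, d -> new ArrayDeque<>());
        while (!queue.isEmpty()) {
            FakeConnection conn = queue.pollFirst();
            if (conn.expiresAtMillis > nowMillis) {
                return conn;
            }
            conn.close(); // expired while idle: discard it and keep looking
        }
        created++;
        return new FakeConnection(destination);
    }

    /** Step 2: hand the connection back to the pool instead of closing it. */
    void release(FakeConnection conn, long validUntilMillis) {
        conn.expiresAtMillis = validUntilMillis;
        idle.computeIfAbsent(conn.destination, d -> new ArrayDeque<>()).addFirst(conn);
    }

    /** Step 4: close and drop idle connections whose validity has passed. */
    int evictExpired(long nowMillis) {
        int evicted = 0;
        for (Deque<FakeConnection> queue : idle.values()) {
            for (Iterator<FakeConnection> it = queue.iterator(); it.hasNext();) {
                FakeConnection conn = it.next();
                if (conn.expiresAtMillis <= nowMillis) {
                    conn.close();
                    it.remove();
                    evicted++;
                }
            }
        }
        return evicted;
    }
}
```

Leasing twice for the same destination with a release in between returns the same object, which is the whole point of the pool; a real pool additionally needs size limits and locking.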
All connection pools follow this idea, but in the HttpClient source code we focus on two points:
- The concrete design of the connection pool, as a reference for writing a custom pool later.
- How the pool maps onto the HTTP protocol, i.e., how the theoretical abstraction is translated into code.
4.1 Implementation of HttpClient Connection Pool
The handling of persistent connections in HttpClient can be concentrated in the following code, extracted from MainClientExec, with other parts removed:
public class MainClientExec implements ClientExecChain {
@Override
public CloseableHttpResponse execute(
final HttpRoute route,
final HttpRequestWrapper request,
final HttpClientContext context,
final HttpExecutionAware execAware) throws IOException, HttpException {
// Obtain a connection request from the connection manager HttpClientConnectionManager
final ConnectionRequest connRequest = connManager.requestConnection(route, userToken);
final HttpClientConnection managedConn;
final int timeout = config.getConnectionRequestTimeout();
// Obtain a managed connection HttpClientConnection from the connection request ConnectionRequest
managedConn = connRequest.get(timeout > 0 ? timeout : 0, TimeUnit.MILLISECONDS);
// Hold the connection manager HttpClientConnectionManager and the managed connection HttpClientConnection with a ConnectionHolder
final ConnectionHolder connHolder = new ConnectionHolder(this.log, this.connManager, managedConn);
try {
HttpResponse response;
if (!managedConn.isOpen()) {
// If the current managed connection is not open, a new connection needs to be established
establishRoute(proxyAuthState, managedConn, route, request, context);
}
// Send the request through the connection HttpClientConnection
response = requestExecutor.execute(request, managedConn, context);
// Determine if the connection can be reused based on the connection reuse strategy
if (reuseStrategy.keepAlive(response, context)) {
// Obtain the connection validity period
final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
// Set the connection validity period
connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
// Mark the current connection as reusable
connHolder.markReusable();
} else {
connHolder.markNonReusable();
}
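final HttpEntity entity = response.getEntity();
if (entity == null || !entity.isStreaming()) {
// Release the current connection back to the pool for next use
connHolder.releaseConnection();
return new HttpResponseProxy(response, null);
} else {
// The body is still being streamed, so the response keeps hold of the connection
return new HttpResponseProxy(response, connHolder);
}
} catch (final IOException | HttpException | RuntimeException ex) {
// Simplified from the original catch blocks: on failure, abort the connection so it is not returned to the pool
connHolder.abortConnection();
throw ex;
}
}
}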
Here we see that the way connections are handled during an HTTP request matches the protocol rules described above. The details of the implementation are examined next.
PoolingHttpClientConnectionManager is the default connection manager for HttpClient. It first obtains a connection request through requestConnection(). Note that this is a request for a connection, not the connection itself.
public ConnectionRequest requestConnection(
final HttpRoute route,
final Object state) {
final Future<CPoolEntry> future = this.pool.lease(route, state, null);
return new ConnectionRequest() {
@Override
public boolean cancel() {
return future.cancel(true);
}
@Override
public HttpClientConnection get(
final long timeout,
final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
final HttpClientConnection conn = leaseConnection(future, timeout, tunit);
if (conn.isOpen()) {
final HttpHost host;
if (route.getProxyHost() != null) {
host = route.getProxyHost();
} else {
host = route.getTargetHost();
}
final SocketConfig socketConfig = resolveSocketConfig(host);
conn.setSocketTimeout(socketConfig.getSoTimeout());
}
return conn;
}
};
}
We can see that the returned ConnectionRequest object actually holds a Future<CPoolEntry>, where CPoolEntry is the actual connection instance managed by the connection pool.
From the above code, we should focus on two calls:
- Future<CPoolEntry> future = this.pool.lease(route, state, null): how an asynchronous handle to a pooled connection, Future<CPoolEntry>, is obtained from the connection pool.
- HttpClientConnection conn = leaseConnection(future, timeout, tunit): how the real connection, HttpClientConnection, is obtained from that Future<CPoolEntry>.
4.2 Future<CPoolEntry>
Let’s take a look at how CPool leases out a Future<CPoolEntry>. The core code, in its parent class AbstractConnPool, is as follows:
private E getPoolEntryBlocking(
final T route, final Object state,
final long timeout, final TimeUnit tunit,
final Future<E> future) throws IOException, InterruptedException, TimeoutException {
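// Compute the wait deadline from the timeout; null means wait indefinitely
Date deadline = null;
if (timeout > 0) {
deadline = new Date(System.currentTimeMillis() + tunit.toMillis(timeout));
}
// First, lock the current connection pool; the lock is a reentrant lock (ReentrantLock)
this.lock.lock();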
try {
// Obtain a connection pool corresponding to the current HttpRoute. For HttpClient’s connection pool, the total pool has a size, and each route corresponds to a pool, so it is a “pool within a pool”
final RouteSpecificPool<T, C, E> pool = getPool(route);
E entry;
for (;;) {
Asserts.check(!this.isShutDown, "Connection pool shut down");
// Infinite loop to obtain a connection
for (;;) {
// Get a connection from the pool corresponding to the route, which may be null or a valid connection
entry = pool.getFree(state);
// If null is obtained, exit the loop
if (entry == null) {
break;
}
// If an expired or closed connection is obtained, release resources and continue to loop to obtain
if (entry.isExpired(System.currentTimeMillis())) {
entry.close();
}
if (entry.isClosed()) {
this.available.remove(entry);
pool.free(entry, false);
} else {
// If a valid connection is obtained, exit the loop
break;
}
}
// Exit the loop if a valid connection is obtained
if (entry != null) {
this.available.remove(entry);
this.leased.add(entry);
onReuse(entry);
return entry;
}
// If no valid connection is obtained, a new one needs to be generated
final int maxPerRoute = getMax(route);
// The maximum number of connections for each route is configurable. If exceeded, some connections need to be cleaned up using LRU
final int excess = Math.max(0, pool.getAllocatedCount() + 1 - maxPerRoute);
if (excess > 0) {
for (int i = 0; i < excess; i++) {
final E lastUsed = pool.getLastUsed();
if (lastUsed == null) {
break;
}
lastUsed.close();
this.available.remove(lastUsed);
pool.remove(lastUsed);
}
}
// The number of connections in the current route pool has not reached the upper limit
if (pool.getAllocatedCount() < maxPerRoute) {
final int totalUsed = this.leased.size();
final int freeCapacity = Math.max(this.maxTotal - totalUsed, 0);
// Determine if the connection pool exceeds the upper limit. If it does, some connections need to be cleaned up using LRU
if (freeCapacity > 0) {
final int totalAvailable = this.available.size();
// If the number of idle connections is greater than the remaining available space, some idle connections need to be cleaned up
if (totalAvailable > freeCapacity - 1) {
if (!this.available.isEmpty()) {
final E lastUsed = this.available.removeLast();
lastUsed.close();
final RouteSpecificPool<T, C, E> otherpool = getPool(lastUsed.getRoute());
otherpool.remove(lastUsed);
}
}
// Establish a connection based on the route
final C conn = this.connFactory.create(route);
// Place this connection into the corresponding “small pool” of the route
entry = pool.add(conn);
// Place this connection into the “large pool”
this.leased.add(entry);
return entry;
}
}
// If no valid connection is obtained from the route pool and the current route connection pool has reached the maximum value, i.e., there are connections in use but not available to the current thread
boolean success = false;
try {
if (future.isCancelled()) {
throw new InterruptedException("Operation interrupted");
}
// Place the future in the route pool to wait
pool.queue(future);
// Place the future in the large connection pool to wait
this.pending.add(future);
// If a signal is received, success will be true
if (deadline != null) {
success = this.condition.awaitUntil(deadline);
} else {
this.condition.await();
success = true;
}
if (future.isCancelled()) {
throw new InterruptedException("Operation interrupted");
}
} finally {
// Remove from the waiting queue
pool.unqueue(future);
this.pending.remove(future);
}
// If no signal is received and the current time has timed out, exit the loop
if (!success && (deadline != null && deadline.getTime() <= System.currentTimeMillis())) {
break;
}
}
// If no signal is received and no available connection is obtained, throw an exception
throw new TimeoutException("Timeout waiting for connection");
} finally {
// Release the lock on the large connection pool
this.lock.unlock();
}
}
The logic in the above code has several important points:
- The connection pool has a maximum number of connections, and each route has its own small connection pool, which also has a maximum number of connections.
- In both the large pool and the small pools, when the number of connections exceeds the limit, the least recently used idle connections are closed to make room.
- If a valid connection is obtained, it is returned to the caller.
- If no valid connection is obtained, HttpClient checks whether the pool for the current route has reached its maximum. If not, a new connection is created and placed in the pool.
- If the upper limit has been reached, the request queues and waits. If it is signalled, it tries again; if the deadline passes first, a timeout exception is thrown.
- Obtaining a connection from the connection pool is guarded by a ReentrantLock to ensure thread safety.
By this point, the program has either obtained a usable CPoolEntry instance or terminated with an exception.
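The combination of two limits, a lock, and a timed wait can be sketched with the JDK alone. The class below is a simplified illustration under stated assumptions, not the AbstractConnPool code: BoundedRoutePool is a hypothetical name, a connection is represented by a String id, and idle connections of other routes are neither counted against the total nor evicted by LRU as they are in HttpClient.

```java
import java.util.ArrayDeque;
import java.util.Date;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical two-level pool: one global limit plus a limit per route. */
final class BoundedRoutePool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();
    private final Map<String, Deque<String>> idleByRoute = new HashMap<>();
    private final Map<String, Integer> leasedByRoute = new HashMap<>();
    private final int maxTotal;
    private final int maxPerRoute;
    private int totalLeased;
    private int nextId;

    BoundedRoutePool(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    /** Blocks until a connection is available or the timeout elapses (timeout <= 0 waits forever). */
    String lease(String route, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        Date deadline = timeout > 0
                ? new Date(System.currentTimeMillis() + unit.toMillis(timeout)) : null;
        lock.lock();
        try {
            for (;;) {
                Deque<String> idle = idleByRoute.computeIfAbsent(route, r -> new ArrayDeque<>());
                int leasedHere = leasedByRoute.getOrDefault(route, 0);
                if (totalLeased < maxTotal) {
                    // Prefer an idle connection; create a new one only below the per-route limit.
                    String conn = idle.pollFirst();
                    if (conn == null && leasedHere < maxPerRoute) {
                        conn = route + "#" + (++nextId);
                    }
                    if (conn != null) {
                        leasedByRoute.put(route, leasedHere + 1);
                        totalLeased++;
                        return conn;
                    }
                }
                // A limit has been reached: wait for a release signal or for the deadline.
                if (deadline == null) {
                    released.await();
                } else if (!released.awaitUntil(deadline)) {
                    throw new TimeoutException("Timeout waiting for connection");
                }
            }
        } finally {
            lock.unlock();
        }
    }

    /** Returns a connection to its route's idle queue and wakes up waiting threads. */
    void release(String route, String conn) {
        lock.lock();
        try {
            leasedByRoute.merge(route, -1, Integer::sum);
            totalLeased--;
            idleByRoute.computeIfAbsent(route, r -> new ArrayDeque<>()).addFirst(conn);
            released.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

With maxPerRoute set to 1, a second lease for the same route waits until the first connection is released or the timeout expires, which mirrors the queue-and-wait branch at the end of getPoolEntryBlocking.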
4.3 HttpClientConnection
protected HttpClientConnection leaseConnection(
final Future<CPoolEntry> future,
final long timeout,
final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
final CPoolEntry entry;
try {
// Obtain CPoolEntry from the asynchronous operation Future<CPoolEntry>
entry = future.get(timeout, tunit);
if (entry == null || future.isCancelled()) {
throw new InterruptedException();
}
Asserts.check(entry.getConnection() != null, "Pool entry with no connection");
if (this.log.isDebugEnabled()) {
this.log.debug("Connection leased: " + format(entry) + formatStats(entry.getRoute()));
}
// Obtain a proxy object for CPoolEntry, all operations use the same underlying HttpClientConnection
return CPoolProxy.newProxy(entry);
} catch (final TimeoutException ex) {
throw new ConnectionPoolTimeoutException("Timeout waiting for connection from pool");
}
}
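The leaseConnection code above boils down to a common pattern: block on a Future with a timeout, and translate the generic TimeoutException into a pool-specific one. A stdlib-only sketch of that pattern follows; LeaseRequest and PoolTimeoutException are hypothetical names, and a String stands in for the pool entry.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical exception playing the role of ConnectionPoolTimeoutException. */
final class PoolTimeoutException extends Exception {
    private static final long serialVersionUID = 1L;

    PoolTimeoutException(String message) {
        super(message);
    }
}

/** Hypothetical counterpart of ConnectionRequest: a handle that can be waited on or cancelled. */
final class LeaseRequest {
    private final Future<String> future; // String stands in for a pool entry

    LeaseRequest(Future<String> future) {
        this.future = future;
    }

    /** Gives up on the request, interrupting the waiting lease if necessary. */
    boolean cancel() {
        return future.cancel(true);
    }

    /** Waits for the pool entry, mapping a generic timeout to a pool-specific exception. */
    String get(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, PoolTimeoutException {
        try {
            String entry = future.get(timeout, unit);
            if (entry == null || future.isCancelled()) {
                throw new InterruptedException();
            }
            return entry;
        } catch (TimeoutException ex) {
            throw new PoolTimeoutException("Timeout waiting for connection from pool");
        }
    }
}
```

Returning a request object rather than the connection itself lets the caller abort a pending lease, for example when the HTTP request is cancelled while it is still waiting for a free connection.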
5. How HttpClient Reuses Persistent Connections
In the previous chapter, we saw that HttpClient obtains connections through a connection pool, acquiring them from the pool when needed.
This corresponds to the pooling steps listed at the start of Chapter 4:
- Establish a connection when it is first used.
- When the request ends, do not close the connection, but return it to the pool.
- The next request to the same destination can obtain an available connection from the pool.
- Periodically clean up expired connections.
Chapter 4 showed how HttpClient handles steps 1 and 3, but how is step 2 handled?
That is, how does HttpClient determine whether to close a connection after it has been used or to place it back in the pool for others to reuse? Let’s take another look at the code in MainClientExec:
// Send the HTTP request over the connection
response = requestExecutor.execute(request, managedConn, context);
// Determine whether to reuse the current connection based on the reuse strategy
if (reuseStrategy.keepAlive(response, context)) {
// If the connection needs to be reused, obtain the connection timeout based on the timeout in the response
final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
if (this.log.isDebugEnabled()) {
final String s;
// The timeout is in milliseconds; if not set, it is -1, meaning no timeout
if (duration > 0) {
s = "for " + duration + " " + TimeUnit.MILLISECONDS;
} else {
s = "indefinitely";
}
this.log.debug("Connection can be kept alive " + s);
}
// Set the timeout; when the request ends, the connection manager will decide whether to close or return it to the pool based on the timeout
connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
// Mark the connection as reusable
connHolder.markReusable();
} else {
// Mark the connection as non-reusable
connHolder.markNonReusable();
}
We can see that after a connection is used, a connection reuse strategy determines whether the connection should be reused. If so, it is handed back to the HttpClientConnectionManager when the request ends, to be placed back in the pool.
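The keep-alive duration used above comes from the response: in HttpClient 4.x the default keep-alive strategy reads the timeout parameter of the Keep-Alive response header (for example "Keep-Alive: timeout=5, max=100") and returns -1 when none is present. The sketch below reproduces that idea with plain string handling; KeepAliveHeader is a hypothetical helper, not a library class, and it ignores quoting and other header syntax details.

```java
/** Hypothetical helper that extracts the keep-alive duration from a Keep-Alive header value. */
final class KeepAliveHeader {

    private KeepAliveHeader() {
    }

    /** Returns the duration in milliseconds, or -1 if the header carries no usable timeout. */
    static long durationMillis(String keepAliveHeaderValue) {
        if (keepAliveHeaderValue == null) {
            return -1;
        }
        for (String element : keepAliveHeaderValue.split(",")) {
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header expresses the timeout in seconds.
                    return Long.parseLong(pair[1].trim()) * 1000;
                } catch (NumberFormatException ignore) {
                    // Malformed value: keep scanning the remaining elements.
                }
            }
        }
        return -1;
    }
}
```

A result of -1 corresponds to the "indefinitely" branch in the debug log above: the connection stays valid in the pool until one side closes it or it is evicted for being idle.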
So what is the logic of the connection reuse strategy?
public class DefaultClientConnectionReuseStrategy extends DefaultConnectionReuseStrategy {
public static final DefaultClientConnectionReuseStrategy INSTANCE = new DefaultClientConnectionReuseStrategy();
@Override
public boolean keepAlive(final HttpResponse response, final HttpContext context) {
// Obtain the request from the context
final HttpRequest request = (HttpRequest) context.getAttribute(HttpCoreContext.HTTP_REQUEST);
if (request != null) {
// Obtain the Connection header
final Header[] connHeaders = request.getHeaders(HttpHeaders.CONNECTION);
if (connHeaders.length != 0) {
final TokenIterator ti = new BasicTokenIterator(new BasicHeaderIterator(connHeaders, null));
while (ti.hasNext()) {
final String token = ti.nextToken();
// If the Connection: Close header is included, it indicates that the request does not intend to keep the connection, and the response’s intention will be ignored. This header is part of the HTTP/1.1 specification
if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
return false;
}
}
}
}
// Use the parent class’s reuse strategy
return super.keepAlive(response, context);
}
}
Let’s take a look at the parent class’s reuse strategy:
if (canResponseHaveBody(request, response)) {
final Header[] clhs = response.getHeaders(HTTP.CONTENT_LEN);
// If the Content-Length of the response is not set correctly, the connection cannot be reused
// On a persistent connection, consecutive responses share one TCP stream, so Content-Length is needed to tell where one response body ends and the next message begins (the “sticky packet” problem)
// Therefore, if the response does not correctly set the Content-Length, the connection cannot be reused
if (clhs.length == 1) {
final Header clh = clhs[0];
try {
final int contentLen = Integer.parseInt(clh.getValue());
if (contentLen < 0) {
return false;
}
} catch (final NumberFormatException ex) {
return false;
}
} else {
return false;
}
}
if (headerIterator.hasNext()) {
try {
final TokenIterator ti = new BasicTokenIterator(headerIterator);
boolean keepalive = false;
while (ti.hasNext()) {
final String token = ti.nextToken();
// If the response has a Connection: Close header, it explicitly indicates to close, so do not reuse
if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
return false;
// If the response has a Connection: Keep-Alive header, it explicitly indicates to persist, so reuse
} else if (HTTP.CONN_KEEP_ALIVE.equalsIgnoreCase(token)) {
keepalive = true;
}
}
if (keepalive) {
return true;
}
} catch (final ParseException px) {
return false;
}
}
// If the response does not have relevant Connection headers, it indicates that connections are reused for versions higher than HTTP/1.0
return !ver.lessEquals(HttpVersion.HTTP_1_0);
To summarize:
- If the request header contains Connection: Close, do not reuse.
- If the Content-Length of the response is not set correctly, do not reuse.
- If the response header contains Connection: Close, do not reuse.
- If the response header contains Connection: Keep-Alive, reuse.
- If none of the above conditions are met, connections are reused if the HTTP version is greater than 1.0.
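The five rules above can be written as one small decision function. This is a sketch of the summary, not the DefaultConnectionReuseStrategy source: ReuseDecision and its parameters are hypothetical, headers are passed in as plain strings (null when absent), and other cases the real strategy may handle, such as chunked transfer encoding, are left out.

```java
/** Hypothetical, simplified version of the reuse rules summarized above. */
final class ReuseDecision {

    private ReuseDecision() {
    }

    /** True if the comma-separated Connection header value contains the given token. */
    static boolean hasToken(String connectionHeader, String token) {
        if (connectionHeader == null) {
            return false;
        }
        for (String part : connectionHeader.split(",")) {
            if (part.trim().equalsIgnoreCase(token)) {
                return true;
            }
        }
        return false;
    }

    /** True if the Content-Length value is a non-negative integer. */
    static boolean isValidLength(String contentLength) {
        if (contentLength == null) {
            return false;
        }
        try {
            return Integer.parseInt(contentLength.trim()) >= 0;
        } catch (NumberFormatException ex) {
            return false;
        }
    }

    static boolean keepAlive(String requestConnection,
                             boolean responseCanHaveBody,
                             String responseContentLength,
                             String responseConnection,
                             boolean newerThanHttp10) {
        if (hasToken(requestConnection, "close")) {
            return false; // rule 1: the request asked to close
        }
        if (responseCanHaveBody && !isValidLength(responseContentLength)) {
            return false; // rule 2: the end of the body cannot be determined
        }
        if (hasToken(responseConnection, "close")) {
            return false; // rule 3: the response asked to close
        }
        if (hasToken(responseConnection, "keep-alive")) {
            return true; // rule 4: the response asked to keep the connection
        }
        return newerThanHttp10; // rule 5: persistent by default after HTTP/1.0
    }
}
```

For example, an HTTP/1.0 response is reused only when it carries Connection: Keep-Alive, while an HTTP/1.1 response with a valid Content-Length and no Connection header is reused by default.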
From the code, we can see that the implementation strategy is consistent with the constraints of the protocol layers discussed in Chapters 2 and 3.
6. How HttpClient Cleans Up Expired Connections
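Before version 4.4 of HttpClient, a reusable connection was checked when it was obtained from the connection pool and cleaned up if it had expired. Later versions can also run a separate thread that scans the pool and closes connections that have expired or have been idle for longer than the configured time. Note that in the code below the scan interval defaults to 10 seconds when no maximum idle time is set; the 2-second default that is often quoted appears to belong to a different setting, the connection manager's validate-after-inactivity period.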
public CloseableHttpClient build() {
// The cleaning thread will only start if specified to clean expired and idle connections
if (evictExpiredConnections || evictIdleConnections) {
// Create a cleaning thread for the connection pool
final IdleConnectionEvictor connectionEvictor = new IdleConnectionEvictor(cm,
maxIdleTime > 0 ? maxIdleTime : 10, maxIdleTimeUnit != null ? maxIdleTimeUnit : TimeUnit.SECONDS,
maxIdleTime, maxIdleTimeUnit);
closeablesCopy.add(new Closeable() {
@Override
public void close() throws IOException {
connectionEvictor.shutdown();
try {
connectionEvictor.awaitTermination(1L, TimeUnit.SECONDS);
} catch (final InterruptedException interrupted) {
Thread.currentThread().interrupt();
}
}
});
// Execute the cleaning thread
connectionEvictor.start();
}
We can see that when HttpClientBuilder builds the client, a connection pool cleaning thread is created and started only if the cleaning function has been enabled.
public IdleConnectionEvictor(
final HttpClientConnectionManager connectionManager,
final ThreadFactory threadFactory,
final long sleepTime, final TimeUnit sleepTimeUnit,
final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
this.connectionManager = Args.notNull(connectionManager, "Connection manager");
this.threadFactory = threadFactory != null ? threadFactory : new DefaultThreadFactory();
this.sleepTimeMs = sleepTimeUnit != null ? sleepTimeUnit.toMillis(sleepTime) : sleepTime;
this.maxIdleTimeMs = maxIdleTimeUnit != null ? maxIdleTimeUnit.toMillis(maxIdleTime) : maxIdleTime;
this.thread = this.threadFactory.newThread(new Runnable() {
@Override
public void run() {
try {
// Infinite loop, the thread keeps executing
while (!Thread.currentThread().isInterrupted()) {
// Sleep for a few seconds before executing, default is 10 seconds
Thread.sleep(sleepTimeMs);
// Clean up expired connections
connectionManager.closeExpiredConnections();
// If a maximum idle time is specified, clean up idle connections
if (maxIdleTimeMs > 0) {
connectionManager.closeIdleConnections(maxIdleTimeMs, TimeUnit.MILLISECONDS);
}
}
} catch (final Exception ex) {
exception = ex;
}
}
});
}
To summarize:
- The background cleaning of expired and idle connections only starts if it is explicitly enabled in HttpClientBuilder.
- Once enabled, a thread runs in an infinite loop: it sleeps for a fixed interval, then calls the cleaning methods of HttpClientConnectionManager to close expired and idle connections.
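The sleep-then-clean loop can be reproduced with a plain daemon thread. The sketch below is an illustration of the pattern rather than the IdleConnectionEvictor source: Cleanable is a hypothetical interface standing in for the two clean-up methods of the connection manager, and SimpleEvictor is a hypothetical name.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical subset of a connection manager's clean-up methods. */
interface Cleanable {
    void closeExpired();

    void closeIdle(long idleTime, TimeUnit unit);
}

/** Hypothetical background cleaner: sleep, clean, repeat until interrupted. */
final class SimpleEvictor {
    private final Thread thread;

    SimpleEvictor(Cleanable target, long sleepMillis, long maxIdleMillis) {
        this.thread = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(sleepMillis);
                    target.closeExpired();
                    // Idle clean-up is optional and only runs when a maximum idle time is set.
                    if (maxIdleMillis > 0) {
                        target.closeIdle(maxIdleMillis, TimeUnit.MILLISECONDS);
                    }
                }
            } catch (InterruptedException ex) {
                // shutdown() interrupts the sleep; let the thread end quietly.
            }
        }, "simple-evictor");
        this.thread.setDaemon(true);
    }

    void start() {
        thread.start();
    }

    void shutdown() {
        thread.interrupt();
    }

    /** Waits up to the given number of milliseconds (must be positive) for the thread to stop. */
    boolean awaitTermination(long millis) throws InterruptedException {
        thread.join(millis);
        return !thread.isAlive();
    }
}
```

Because the thread is a daemon and stops on interrupt, closing the client can shut it down in the same way the Closeable registered in build() does with connectionEvictor.shutdown() and awaitTermination().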
7. Conclusion
- The HTTP protocol uses persistent connections to relieve the excessive connection setup of its early design.
- There are two forms of persistent connection: HTTP/1.0+ Keep-Alive and the default persistent connections of HTTP/1.1.
- HttpClient manages persistent connections through connection pools, which are divided into one total pool and one pool per route.
- HttpClient obtains a pooled connection asynchronously through a Future<CPoolEntry>.
- The default connection reuse strategy is consistent with the constraints of the HTTP protocol: Connection: Close means close, Connection: Keep-Alive means keep the connection open, and otherwise the connection is reused if the version is greater than 1.0.
- The background cleaning of expired and idle connections in the pool only runs when it is explicitly enabled in HttpClientBuilder.
- Since version 4.4 of HttpClient, that cleaning is done by a thread that loops and runs every few seconds.
The above research is based on my personal understanding of the HttpClient source code. If there are any inaccuracies, I hope everyone will actively comment and discuss.
