Essential Knowledge of HTTP

This article is sponsored by Yugang Writing Platform with a sponsorship amount of 200 yuan. Original author: Zhu Qiandai. Copyright statement: This article is copyrighted by the WeChat public account Yugang Says. Unauthorized reproduction in any form is prohibited.

HTTP is an application-layer network protocol that we work with all the time, and its importance is hard to overstate. However, many people, myself included, do not understand it in depth. In this article, I will share what I have learned, with a focus on the caching-related knowledge that I consider essential.

HTTP Message

First, let’s cover the basics and look at the specific format of HTTP messages. HTTP messages can be divided into request messages and response messages, with similar formats. They are mainly divided into three parts:

Start Line
Headers
Body

Request message format:

<method> <request-url> <version>
<headers>

<entity-body>

Response message format:

<version> <status> <reason-phrase>
<headers>

<entity-body>

From the request and response message formats, we can see that the main difference lies in the start line. Here is a brief explanation of each tag:

<method> refers to the request method, commonly used ones are GET, POST, HEAD, and others that we won't discuss here. If you're interested, you can look them up yourself.

<version> refers to the protocol version, which is usually HTTP/1.1 now.

<request-url> is the request address.

<status> refers to the response status code, such as 200, 404, etc.

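<reason-phrase> is the reason phrase, a short textual description that accompanies the status code, such as "OK" for 200 or "Not Found" for 404. It is meant for human readers, and clients should rely on the status code rather than on this text.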
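To make the two formats concrete, here is a minimal sketch that assembles a request message and splits a response start line. The class and method names (HttpMessageSketch, buildRequest, parseStatusLine) are hypothetical helpers written for this illustration and do not come from any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of any library.
final class HttpMessageSketch {

    // Assembles "<method> <request-url> <version>", the headers, an empty line, then the body.
    static String buildRequest(String method, String requestUrl,
                               Map<String, String> headers, String body) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(requestUrl).append(" HTTP/1.1\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // empty line separating the headers from the entity body
        sb.append(body);
        return sb.toString();
    }

    // Splits "<version> <status> <reason-phrase>" into its three parts.
    static String[] parseStatusLine(String statusLine) {
        // Limit of 3 because the reason phrase may itself contain spaces.
        return statusLine.split(" ", 3);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "example.com");
        System.out.print(buildRequest("GET", "/index.html", headers, ""));

        String[] parts = parseStatusLine("HTTP/1.1 404 Not Found");
        System.out.println(parts[1] + " / " + parts[2]);
    }
}
```

Real clients and servers handle many more details (header folding, chunked bodies, and so on); the sketch only mirrors the three-part layout shown above.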

Method

We know that the most commonly used request methods are GET and POST, and interviewers often ask about the differences between the two and when to use them. Let’s briefly discuss this.

The two methods have some differences in their transmission forms. When a request is initiated using the GET method, the request parameters are appended to the end of the request URL, formatted as url?param1=xxx&param2=xxx&[…].

Parameters transmitted this way are visible in the address bar. In addition, a URL may only contain a limited set of ASCII characters, so other characters in the parameters, such as Chinese characters, are percent-encoded before transmission. Also note that although the HTTP protocol itself does not limit URL length, some browsers and servers do, so parameters sent with GET should not be too long. The POST method, by contrast, sends its parameters in the request body, so it is not subject to these limitations.
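As a small illustration of how GET parameters end up in the URL, the sketch below appends encoded parameters to a base URL. QueryStringSketch and withQuery are hypothetical names, and the example URL is made up. It assumes Java 10 or later for URLEncoder.encode(String, Charset). Note that URLEncoder implements HTML form encoding, which is close to but not identical to generic percent-encoding (for example, it turns a space into "+").

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class QueryStringSketch {

    // Appends the parameters as ?key=value&key=value, encoding keys and values as UTF-8.
    static String withQuery(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "\u4e2d\u6587"); // two Chinese characters, written as Unicode escapes
        params.put("page", "1");
        // prints http://example.com/search?name=%E4%B8%AD%E6%96%87&page=1
        System.out.println(withQuery("http://example.com/search", params));
    }
}
```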

The other difference is semantic. GET typically means retrieving the resource at a given URL from the server. It is expected to be safe and idempotent, so repeating the same GET request should not change the state of the server. POST usually means adding or modifying a resource at a given URL, such as submitting a form, which typically inserts a record on the server. Repeating a POST request may therefore add multiple records to the server’s database. Semantically, the two should not be confused.

Status Codes

Common status codes include:

200 OK: the request succeeded, and the entity contains the requested resource.
206 Partial Content: a partial (range) request succeeded, which is what resumable downloads rely on.
301 Moved Permanently: the requested URL has been moved permanently, and the new URL is usually given in the Location header.
304 Not Modified: returned when a conditional request is revalidated and the resource has not changed.
404 Not Found: the resource does not exist.

Header

Both request and response messages can carry additional information which, together with the other parts of the message, enables many powerful features. This information sits between the start line and the entity body in the form of key-value pairs known as headers. Each header line ends with a carriage return and line feed (CRLF), and an empty line (one more CRLF) separates the headers from the entity.

Here we will focus on: Date, Cache-Control, Last-Modified, ETag, Expires, If-Modified-Since, If-None-Match, If-Unmodified-Since, If-Range, If-Match.

There are many other HTTP headers, but due to space limitations, we will not discuss them all. These headers are all related to HTTP caching, and we will discuss their roles in the following sections.

Entity

The resource sent in the request or the resource returned in the response.

HTTP Caching

When we initiate an HTTP request, the server returns the requested resource, and we can store a copy of that resource locally. This way, when we request the same URL resource again, we can quickly retrieve it from local storage. This is known as caching. Caching can save unnecessary network bandwidth and quickly respond to HTTP requests.

Let’s introduce a few concepts:

Freshness Check

Revalidation

Revalidation Hit

We know that some resources corresponding to URLs do not remain unchanged; the resource at that URL on the server may be modified after a certain period. At this point, the resource in local cache will differ from the resource on the server.

Since resources may change after a certain time, we can assume that the resource has not changed before that time, allowing us to confidently use the cached resource. When the request time exceeds that time, we consider that the cached resource may no longer be consistent with the server. Therefore, when we initiate a request, we need to first check the cached resource to see if we can directly use it; this is called Freshness Check. Each resource has an expiration time, just like food, and we need to check if it has expired before consuming it.

If we find that the cached resource has exceeded a certain time, when we make the request again, we will not directly return the cached resource but will first check with the server to see if the resource has changed; this is called Revalidation. If the server finds that the corresponding URL resource has not changed, it will return 304 Not Modified and will not return the corresponding entity. This is known as Revalidation Hit. Conversely, if the revalidation does not hit, it will return 200 OK and return the changed URL resource, at which point the cache can be updated for future requests.

Let’s look at the specific implementation:

Freshness Check: To judge whether the cached resource is still fresh and usable, we need to know whether it has passed a certain point in time. How is that point determined? The server sets it by adding Cache-Control: max-age or Expires to the response message. Cache-Control belongs to the HTTP/1.1 specification, and max-age is a relative time: the number of seconds the response stays fresh, counted from the moment the response was generated (the Date header), not from Last-Modified. Expires belongs to the HTTP/1.0 specification and carries an absolute time. When both are present, max-age takes precedence.
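The freshness check can be sketched roughly as follows. This is a simplified illustration rather than OkHttp's actual algorithm: FreshnessSketch and its methods are hypothetical names, the age is assumed to be simply "now minus the Date header", and -1 is an assumed convention for "header absent". All timestamps are milliseconds since the epoch.

```java
// Hypothetical helper for illustration only; real caches also account for the Age header,
// clock skew, and heuristic freshness.
final class FreshnessSketch {

    // Freshness lifetime in milliseconds. max-age wins over Expires; -1 means "absent".
    static long freshnessLifetimeMillis(long maxAgeSeconds, long expiresMillis, long dateMillis) {
        if (maxAgeSeconds != -1) {
            return maxAgeSeconds * 1000L;                     // relative time from Cache-Control
        }
        if (expiresMillis != -1 && dateMillis != -1) {
            return Math.max(0, expiresMillis - dateMillis);   // absolute time from Expires
        }
        return 0;                                             // no explicit freshness information
    }

    // Simplified age: time elapsed since the response was generated (its Date header).
    static long ageMillis(long nowMillis, long dateMillis) {
        return Math.max(0, nowMillis - dateMillis);
    }

    static boolean isFresh(long ageMillis, long freshnessLifetimeMillis) {
        return ageMillis < freshnessLifetimeMillis;
    }

    public static void main(String[] args) {
        long date = 1_000_000L;                 // when the response was generated
        long now = date + 300_000L;             // five minutes later
        long lifetime = freshnessLifetimeMillis(600, -1, date);   // Cache-Control: max-age=600
        System.out.println(isFresh(ageMillis(now, date), lifetime));
    }
}
```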

Revalidation: If the freshness check shows that we need to revalidate with the server, we have to tell the server which version of the resource we have cached, so that it can decide whether our copy still matches the current resource. So how do we tell the server what we have cached? This is done with a Conditional Request.

HTTP defines five headers for conditional requests: If-Modified-Since, If-None-Match, If-Unmodified-Since, If-Range, If-Match.

If-Modified-Since can be used in conjunction with the Last-Modified response header returned by the server. When we make a conditional request, we pass the value of the Last-Modified header as the value of the If-Modified-Since header to the server, meaning we are querying whether the resource on the server has been modified since we last cached it.

If-None-Match is used in conjunction with the ETag response header returned by the server. The ETag header can be thought of as a version identifier that the server assigns to the document resource. Sometimes a document is modified in ways so minor that caches do not need to re-download it. Or a document is modified so frequently that the one-second resolution of a modification date is no longer sufficient. In such cases the ETag header is used to identify the version of the document. When making a conditional request, we send the cached ETag value as the value of the If-None-Match header. If the ETag of the resource on the server matches the ETag in the conditional request, the revalidation hits.
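From the server's point of view, the revalidation decision might look like the sketch below. RevalidationSketch and statusFor are hypothetical names. The sketch simplifies heavily: it compares a single ETag by exact string equality (real servers also handle lists of tags, "*", and weak validators), and it passes dates as epoch milliseconds, with null or -1 meaning "header absent". It does follow the rule that If-None-Match, when present, takes precedence over If-Modified-Since.

```java
// Hypothetical helper for illustration only.
final class RevalidationSketch {

    // Returns 304 when the client's cached copy is still valid, otherwise 200.
    static int statusFor(String ifNoneMatch, long ifModifiedSinceMillis,
                         String currentEtag, long lastModifiedMillis) {
        if (ifNoneMatch != null) {
            // ETag comparison wins; If-Modified-Since is ignored when If-None-Match is present.
            return ifNoneMatch.equals(currentEtag) ? 304 : 200;
        }
        if (ifModifiedSinceMillis != -1) {
            // Not modified if the last change is not newer than the client's date.
            return lastModifiedMillis <= ifModifiedSinceMillis ? 304 : 200;
        }
        return 200; // unconditional request: always send the full resource
    }

    public static void main(String[] args) {
        System.out.println(statusFor("v1", -1, "v1", 0));      // revalidation hit
        System.out.println(statusFor("v1", -1, "v2", 0));      // resource changed
        System.out.println(statusFor(null, 1000, null, 500));  // not modified since
    }
}
```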

The other three headers are mainly used for range requests (resuming downloads) and for guarding updates against concurrent modification, which we will not discuss in this article. I will write a separate article about resuming downloads later.

OkHttp Caching

The theoretical knowledge of HTTP caching is roughly as described. Let’s look at the source code of OkHttp to see how this well-known open-source library implements caching using the HTTP protocol. Here we assume that the reader has a general understanding of the request execution flow in OkHttp and will only discuss the caching-related parts. For those unfamiliar with OkHttp code, it is recommended to review the relevant code or other articles first.

We know that OkHttp requests go through a series of Interceptors before being sent to the server, one of which is the CacheInterceptor that we need to analyze.

final InternalCache cache;

@Override public Response intercept(Chain chain) throws IOException {
    Response cacheCandidate = cache != null
        ? cache.get(chain.request())
        : null;

    long now = System.currentTimeMillis();

    CacheStrategy strategy = new CacheStrategy.Factory(now, chain.request(), cacheCandidate).get();
    Request networkRequest = strategy.networkRequest;
    Response cacheResponse = strategy.cacheResponse;

    ......
}

The method first looks up the cached response for the request through InternalCache. We will not discuss the implementation of this class here; we just need to know that if the resource for the request URL has been cached before, it can be found through the request object.

The retrieved cache response, current timestamp, and request are passed to CacheStrategy, and then the get method is executed to perform some logic to ultimately obtain strategy.networkRequest and strategy.cacheResponse. If, after the judgment of CacheStrategy, we find that this request cannot directly use the cached data, we need to send a request to the server using the networkRequest constructed by CacheStrategy. Let’s first look at what CacheStrategy does.

CacheStrategy.Factory.java

public Factory(long nowMillis, Request request, Response cacheResponse) {
      this.nowMillis = nowMillis;
      this.request = request;
      this.cacheResponse = cacheResponse;

      if (cacheResponse != null) {
        this.sentRequestMillis = cacheResponse.sentRequestAtMillis();
        this.receivedResponseMillis = cacheResponse.receivedResponseAtMillis();
        Headers headers = cacheResponse.headers();
        for (int i = 0, size = headers.size(); i < size; i++) {
          String fieldName = headers.name(i);
          String value = headers.value(i);
          if ("Date".equalsIgnoreCase(fieldName)) {
            servedDate = HttpDate.parse(value);
            servedDateString = value;
          } else if ("Expires".equalsIgnoreCase(fieldName)) {
            expires = HttpDate.parse(value);
          } else if ("Last-Modified".equalsIgnoreCase(fieldName)) {
            lastModified = HttpDate.parse(value);
            lastModifiedString = value;
          } else if ("ETag".equalsIgnoreCase(fieldName)) {
            etag = value;
          } else if ("Age".equalsIgnoreCase(fieldName)) {
            ageSeconds = HttpHeaders.parseSeconds(value, -1);
          }
        }
      }
    }

The constructor of CacheStrategy.Factory first saves the passed parameters and parses the relevant headers of the cached response. The subsequent get method is as follows:

public CacheStrategy get() {
      CacheStrategy candidate = getCandidate();

      if (candidate.networkRequest != null && request.cacheControl().onlyIfCached()) {
        // We're forbidden from using the network and the cache is insufficient.
        return new CacheStrategy(null, null);
      }

      return candidate;
}

The get method is straightforward; the main logic is in getCandidate. If the returned candidate has a non-null networkRequest, this request needs to go to the server. If the request’s Cache-Control also contains only-if-cached, meaning that only cached data may be used, the two requirements conflict, so get returns a CacheStrategy in which both networkRequest and cacheResponse are null, and the request will fail. We will see how CacheInterceptor handles this later. Next, let’s look at the main logic of getCandidate.

private CacheStrategy getCandidate() {
      // No cached response.
      if (cacheResponse == null) {
        return new CacheStrategy(request, null);
      }

      // Drop the cached response if it's missing a required handshake.
      if (request.isHttps() && cacheResponse.handshake() == null) {
        return new CacheStrategy(request, null);
      }

      // If this response shouldn't have been stored, it should never be used
      // as a response source. This check should be redundant as long as the
      // persistence store is well-behaved and the rules are constant.
      if (!isCacheable(cacheResponse, request)) {
        return new CacheStrategy(request, null);
      }

      CacheControl requestCaching = request.cacheControl();
      if (requestCaching.noCache() || hasConditions(request)) {
        return new CacheStrategy(request, null);
      }
      ......
}

The above code lists four situations in which the cache should be ignored and a request should be sent directly to the server:

The cache itself does not exist.
The request is using HTTPS and the cache does not have handshake data.
The cache itself should not have been stored. There may be issues with the cache implementation that retained data that should not have been cached.
The request itself carries Cache-Control: no-cache or a conditional request header (If-Modified-Since or If-None-Match), which indicates that it does not want the cached data to be used directly.

In these cases, a CacheStrategy object containing a networkRequest but with a null cacheResponse is returned.

private CacheStrategy getCandidate() {
      ......

      CacheControl responseCaching = cacheResponse.cacheControl();
      if (responseCaching.immutable()) {
        return new CacheStrategy(null, cacheResponse);
      }

      long ageMillis = cacheResponseAge();
      long freshMillis = computeFreshnessLifetime();

      if (requestCaching.maxAgeSeconds() != -1) {
        freshMillis = Math.min(freshMillis, SECONDS.toMillis(requestCaching.maxAgeSeconds()));
      }

      long minFreshMillis = 0;
      if (requestCaching.minFreshSeconds() != -1) {
        minFreshMillis = SECONDS.toMillis(requestCaching.minFreshSeconds());
      }

      long maxStaleMillis = 0;
      if (!responseCaching.mustRevalidate() && requestCaching.maxStaleSeconds() != -1) {
        maxStaleMillis = SECONDS.toMillis(requestCaching.maxStaleSeconds());
      }

      if (!responseCaching.noCache() && ageMillis + minFreshMillis < freshMillis + maxStaleMillis) {
        Response.Builder builder = cacheResponse.newBuilder();
        if (ageMillis + minFreshMillis >= freshMillis) {
          builder.addHeader("Warning", "110 HttpURLConnection \"Response is stale\"");
        }
        long oneDayMillis = 24 * 60 * 60 * 1000L;
        if (ageMillis > oneDayMillis && isFreshnessLifetimeHeuristic()) {
          builder.addHeader("Warning", "113 HttpURLConnection \"Heuristic expiration\"");
        }
        return new CacheStrategy(null, builder.build());
      }
      ......
}

If the Cache-Control header of the cached response contains immutable, it indicates that the resource will not change, so the client can use the cached result directly. It is worth noting that immutable is not part of the core HTTP/1.1 caching specification; it is an extension directive that was later standardized in RFC 8246.

Next, we calculate the values of ageMillis, freshMillis, minFreshMillis, and maxStaleMillis. If the cached response is not marked with Cache-Control: no-cache, and the following inequality holds:

ageMillis + minFreshMillis < freshMillis + maxStaleMillis

Then we enter the conditional code block and ultimately return a CacheStrategy constructed using the current cached value with a null networkRequest.

What does this inequality mean? Let’s look at what these four values represent:

ageMillis refers to the time elapsed since the cached resource was generated or validated against the source server until now. Using the analogy of food’s shelf life, it is like how long it has been since the production date.

freshMillis indicates how long this resource is considered fresh. For example, if the shelf life is 18 months, then that 18 months is freshMillis.

minFreshMillis is the minimum remaining freshness that the request demands: the cached response must stay fresh for at least this much longer. For example, if I am particular about food and a product will expire within a month, then even though it has not actually expired, I consider it not fresh enough and do not want to eat it.

maxStaleMillis is like someone who is not so particular; even if the food has expired, as long as it has not expired for too long, say two months, I think it is still okay to eat.

Both minFreshMillis and maxStaleMillis can be set by the request headers, allowing the request to control the strictness or looseness of cache usage by setting:

Cache-Control:min-fresh=xxx, Cache-Control:max-stale=xxx
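A worked example of the inequality may help. The sketch below plugs in concrete numbers; CacheUsableSketch and usable are hypothetical names, and the values (a 600-second freshness lifetime and so on) are made up for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only.
final class CacheUsableSketch {

    // The same inequality as in getCandidate: the cache may be used without contacting the server.
    static boolean usable(long ageMillis, long minFreshMillis, long freshMillis, long maxStaleMillis) {
        return ageMillis + minFreshMillis < freshMillis + maxStaleMillis;
    }

    public static void main(String[] args) {
        long fresh = TimeUnit.SECONDS.toMillis(600);   // response is fresh for 600 s

        // Age 500 s, no request directives: 500 < 600, so the cache is usable.
        System.out.println(usable(TimeUnit.SECONDS.toMillis(500), 0, fresh, 0));

        // Same age with min-fresh=200: 500 + 200 = 700 is not below 600, so it is rejected.
        System.out.println(usable(TimeUnit.SECONDS.toMillis(500),
                TimeUnit.SECONDS.toMillis(200), fresh, 0));

        // Age 650 s (already stale) with max-stale=100: 650 < 600 + 100, so it is still accepted.
        System.out.println(usable(TimeUnit.SECONDS.toMillis(650), 0, fresh,
                TimeUnit.SECONDS.toMillis(100)));
    }
}
```

In other words, min-fresh shifts the left side up and makes the check stricter, while max-stale shifts the right side up and makes it looser.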

Next, we look at the following code in getCandidate:

private CacheStrategy getCandidate() {
        ......

      // Find a condition to add to the request. If the condition is satisfied, the response body
      // will not be transmitted.
      String conditionName;
      String conditionValue;
      if (etag != null) {
        conditionName = "If-None-Match";
        conditionValue = etag;
      } else if (lastModified != null) {
        conditionName = "If-Modified-Since";
        conditionValue = lastModifiedString;
      } else if (servedDate != null) {
        conditionName = "If-Modified-Since";
        conditionValue = servedDateString;
      } else {
        return new CacheStrategy(request, null); // No condition! Make a regular request.
      }

      Headers.Builder conditionalRequestHeaders = request.headers().newBuilder();
      Internal.instance.addLenient(conditionalRequestHeaders, conditionName, conditionValue);

      Request conditionalRequest = request.newBuilder()
          .headers(conditionalRequestHeaders.build())
          .build();
      return new CacheStrategy(conditionalRequest, cacheResponse);
}

If none of the previous conditions is met, our cached response has expired, and we need to revalidate it with the server through a conditional request. The code is straightforward: it builds the conditional request from the validators extracted from the cached response, preferring ETag (sent as If-None-Match), then Last-Modified, then Date (both sent as If-Modified-Since). If none of them is available, it falls back to a regular request.
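The priority order among the validators can be isolated into a few lines. The following is a standalone sketch that mirrors the branch structure of the excerpt above without depending on OkHttp; ValidatorSketch and chooseCondition are hypothetical names.

```java
// Hypothetical helper for illustration only; mirrors the if/else chain in getCandidate.
final class ValidatorSketch {

    // Returns {headerName, headerValue}, or null when the cached response has no validator
    // and a regular (unconditional) request must be made.
    static String[] chooseCondition(String etag, String lastModified, String servedDate) {
        if (etag != null) {
            return new String[] {"If-None-Match", etag};
        }
        if (lastModified != null) {
            return new String[] {"If-Modified-Since", lastModified};
        }
        if (servedDate != null) {
            return new String[] {"If-Modified-Since", servedDate};
        }
        return null;
    }

    public static void main(String[] args) {
        String[] condition = chooseCondition(null, "Tue, 01 May 2018 08:00:00 GMT", null);
        System.out.println(condition[0] + ": " + condition[1]);
    }
}
```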

Next, we return to CacheInterceptor:

// If we're forbidden from using the network and the cache is insufficient, fail.
if (networkRequest == null && cacheResponse == null) {
  return new Response.Builder()
      .request(chain.request())
      .protocol(Protocol.HTTP_1_1)
      .code(504)
      .message("Unsatisfiable Request (only-if-cached)")
      .body(Util.EMPTY_RESPONSE)
      .sentRequestAtMillis(-1L)
      .receivedResponseAtMillis(System.currentTimeMillis())
      .build();
}

As we can see, if both networkRequest and cacheResponse are null, the request demanded cached data only, through Cache-Control: only-if-cached, but there is no cache that can be used directly. In this case, we can only return a 504 response. Continuing down:

// If we don't need the network, we're done.
if (networkRequest == null) {
  return cacheResponse.newBuilder()
      .cacheResponse(stripBody(cacheResponse))
      .build();
}

If networkRequest is null, it indicates that we do not need to perform revalidation; we can directly return cacheResponse as the request result.

Response networkResponse = null;
try {
  networkResponse = chain.proceed(networkRequest);
} finally {
  // If we're crashing on I/O or otherwise, don't leak the cache body.
  if (networkResponse == null && cacheCandidate != null) {
    closeQuietly(cacheCandidate.body());
  }
}

// If we have a cache response too, then we're doing a conditional get.
if (cacheResponse != null) {
  if (networkResponse.code() == HTTP_NOT_MODIFIED) {
    Response response = cacheResponse.newBuilder()
        .headers(combine(cacheResponse.headers(), networkResponse.headers()))
        .sentRequestAtMillis(networkResponse.sentRequestAtMillis())
        .receivedResponseAtMillis(networkResponse.receivedResponseAtMillis())
        .cacheResponse(stripBody(cacheResponse))
        .networkResponse(stripBody(networkResponse))
        .build();
    networkResponse.body().close();

    // Update the cache after combining headers but before stripping the
    // Content-Encoding header (as performed by initContentStream()).
    cache.trackConditionalCacheHit();
    cache.update(cacheResponse, response);
    return response;
  } else {
    closeQuietly(cacheResponse.body());
  }
}

Response response = networkResponse.newBuilder()
    .cacheResponse(stripBody(cacheResponse))
    .networkResponse(stripBody(networkResponse))
    .build();

if (cache != null) {
  if (HttpHeaders.hasBody(response) && CacheStrategy.isCacheable(response, networkRequest)) {
    // Offer this request to the cache.
    CacheRequest cacheRequest = cache.put(response);
    return cacheWritingResponse(cacheRequest, response);
  }

  if (HttpMethod.invalidatesCache(networkRequest.method())) {
    try {
      cache.remove(networkRequest);
    } catch (IOException ignored) {
      // The cache cannot be written.
    }
  }
}

return response;

If networkRequest is not null, this request needs to be sent to the server. There are two situations. If cacheResponse does not exist, we have no usable cache and this is just a normal request. If cacheResponse exists, we have a possibly expired cache, and networkRequest is a conditional request for revalidation.

In either case, we obtain the response from the server with networkResponse = chain.proceed(networkRequest). The difference lies in what happens next. If there is cached data and the server answers 304 Not Modified, the revalidation hits: the cached response is combined with the new headers, the cache is updated with cache.update(cacheResponse, response), and the result is returned. Otherwise, the cached body is closed and the network response is used: if it meets the conditions for caching, it is written to the cache and returned.
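The four combinations of networkRequest and cacheResponse that CacheInterceptor distinguishes can be summarized in a small decision table. The sketch below is a simplification written for this summary: InterceptorOutcomeSketch, Outcome, and decide are hypothetical names, and booleans stand in for "the object is non-null".

```java
// Hypothetical helper for illustration only; booleans stand in for non-null checks.
final class InterceptorOutcomeSketch {

    enum Outcome { FAIL_504, USE_CACHE, PLAIN_NETWORK_REQUEST, CONDITIONAL_REQUEST }

    static Outcome decide(boolean hasNetworkRequest, boolean hasCacheResponse) {
        if (!hasNetworkRequest && !hasCacheResponse) {
            return Outcome.FAIL_504;              // only-if-cached, but nothing usable in the cache
        }
        if (!hasNetworkRequest) {
            return Outcome.USE_CACHE;             // cache is fresh enough, no network needed
        }
        return hasCacheResponse
                ? Outcome.CONDITIONAL_REQUEST     // stale cache: revalidate with the server
                : Outcome.PLAIN_NETWORK_REQUEST;  // no cache: ordinary request
    }

    public static void main(String[] args) {
        for (boolean network : new boolean[] {false, true}) {
            for (boolean cached : new boolean[] {false, true}) {
                System.out.println("networkRequest=" + network + ", cacheResponse=" + cached
                        + " -> " + decide(network, cached));
            }
        }
    }
}
```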

That is roughly how caching works in OkHttp, and we can see that the whole process follows the HTTP caching flow. Finally, let’s summarize the caching process:

Parse the URL and various headers from the received request.
Check if there is a cached copy available locally.
If there is a cache, perform a freshness check. If the cache is fresh enough, use it as the response. If it is not fresh enough, construct a conditional request and send it to the server for revalidation. If there is no cache, send the request directly to the server.
Update or add the response returned from the server to the cache.

OAuth

OAuth is a protocol for authorizing third parties to access resources. Unlike earlier approaches in which users hand their passwords to the third party, OAuth never exposes the user’s password to the third party, which makes it more secure. The OAuth protocol sets up an authorization layer that separates users from third-party applications. Users log in to the service provider with their passwords and can access all of their resources. Third-party applications can only ask the user for authorization and obtain an Access Token, which grants access through the authorization layer to the resources the user has authorized, for a limited time.

Several roles defined by OAuth:

Role	Description
Resource Owner	The entity that can authorize access to certain protected resources, usually the user.
Client	The application that can access protected resources through user authorization, i.e., the third-party application.
Authorization Server	The server that issues Access Tokens to third parties after authenticating users.
Resource Server	The server that holds protected resources and can respond to resource requests using Access Tokens.

     +--------+                               +---------------+
     |        |--(A)- Authorization Request ->|   Resource    |
     |        |                               |     Owner     |
     |        |<-(B)-- Authorization Grant ---|               |
     |        |                               +---------------+
     |        |
     |        |                               +---------------+
     |        |--(C)-- Authorization Grant -->| Authorization |
     | Client |                               |     Server    |
     |        |<-(D)----- Access Token -------|               |
     |        |                               +---------------+
     |        |
     |        |                               +---------------+
     |        |--(E)----- Access Token ------>|    Resource   |
     |        |                               |     Server    |
     |        |<-(F)--- Protected Resource ---|               |
     +--------+                               +---------------+

From the above diagram, we can see that an OAuth authorization process can be divided into six steps:

The client requests authorization from the user.
The user agrees to the authorization.
The client requests an Access Token from the authorization server using the obtained authorization.
The authorization server issues an Access Token after authenticating the authorization.
The client initiates a request to the resource server using the obtained Access Token.
The resource server verifies the Access Token and issues the requested resource.
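The six steps above can be sketched with a toy in-memory model. Everything here is hypothetical and heavily simplified: OAuthFlowSketch, AuthorizationServer, ResourceServer, and their methods are invented for this illustration, grants and tokens are random UUID strings, and there are no HTTP calls, client credentials, scopes, or expiry times. The only point is who hands what to whom.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical toy model for illustration only; not a real OAuth implementation.
final class OAuthFlowSketch {

    static final class AuthorizationServer {
        private final Map<String, String> grantToUser = new HashMap<>();
        private final Map<String, String> tokenToUser = new HashMap<>();

        // Steps A-B: the resource owner approves, which yields an authorization grant.
        String approve(String user) {
            String grant = UUID.randomUUID().toString();
            grantToUser.put(grant, user);
            return grant;
        }

        // Steps C-D: a valid grant is exchanged, once, for an access token.
        String exchange(String grant) {
            String user = grantToUser.remove(grant);
            if (user == null) {
                throw new IllegalArgumentException("unknown or already used grant");
            }
            String token = UUID.randomUUID().toString();
            tokenToUser.put(token, user);
            return token;
        }

        String userFor(String token) {
            return tokenToUser.get(token);
        }
    }

    static final class ResourceServer {
        private final AuthorizationServer auth;

        ResourceServer(AuthorizationServer auth) {
            this.auth = auth;
        }

        // Steps E-F: the token is verified before the protected resource is returned.
        String fetchProfile(String token) {
            String user = auth.userFor(token);
            if (user == null) {
                throw new SecurityException("invalid access token");
            }
            return "profile of " + user;
        }
    }

    public static void main(String[] args) {
        AuthorizationServer auth = new AuthorizationServer();
        ResourceServer resources = new ResourceServer(auth);

        String grant = auth.approve("alice");     // the client never sees alice's password
        String token = auth.exchange(grant);
        System.out.println(resources.fetchProfile(token));
    }
}
```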

HTTPS

In simple terms, HTTP + Encryption + Authentication + Integrity Protection = HTTPS

Traditional HTTP is an application-layer protocol that communicates directly over TCP. It has some drawbacks:

HTTP uses plaintext transmission, making it susceptible to eavesdropping.
HTTP does not authenticate both parties in communication, so neither party can confirm whether the other is a disguised client or server.
HTTP has no means to verify the integrity of transmitted content, making it easy to be hijacked or tampered with during transmission.

Therefore, in scenarios that require security, such as requests involving bank accounts, HTTP cannot withstand these attacks. HTTPS can support encryption of communication content and authentication of both parties through the added SSL/TLS.

HTTPS Encryption

In modern cryptography, there are mainly two types of encryption methods:

Symmetric Key Encryption
Asymmetric Key Encryption

Symmetric key encryption means that the same key is used for both encryption and decryption. The advantage of this method is its fast processing speed, but securely transmitting the key from one party to the other is a problem.

Asymmetric key encryption means that different keys are used for encryption and decryption. One key is called the public key, which can be freely disclosed, while the other is called the private key, which is only used by the holder. A client with the public key can use it to encrypt the transmitted content, and only the holder of the private key can decrypt the content encrypted with the public key. This method overcomes the key exchange problem but is slower than symmetric key encryption.

The SSL/TLS encryption method combines the advantages of both encryption methods. First, it uses asymmetric key encryption to transmit a symmetric key encrypted with the public key to the other party. The other party uses the private key to decrypt the transmitted symmetric key. Then both parties use the symmetric key for communication. This solves the key transmission problem of symmetric key encryption while utilizing the high efficiency of symmetric key encryption for encrypting and decrypting communication content.
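The hybrid idea described here can be demonstrated with the standard Java cryptography APIs. This is only a sketch of the concept, not of the actual TLS handshake: HybridEncryptionSketch and roundTrip are hypothetical names, both "client" and "server" live in one method, and the algorithm choices (2048-bit RSA with OAEP padding, 128-bit AES in GCM mode) are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo for illustration only; not how a real TLS stack is structured.
final class HybridEncryptionSketch {

    // Encrypts and decrypts a message the "hybrid" way and returns the decrypted text.
    static String roundTrip(String message) {
        try {
            // Server side: an asymmetric key pair; the public key is handed to the client.
            KeyPairGenerator keyPairGenerator = KeyPairGenerator.getInstance("RSA");
            keyPairGenerator.initialize(2048);
            KeyPair serverKeys = keyPairGenerator.generateKeyPair();

            // Client side: generate a random symmetric key and encrypt it with the public key.
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(128);
            SecretKey sessionKey = keyGenerator.generateKey();
            Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            rsa.init(Cipher.ENCRYPT_MODE, serverKeys.getPublic());
            byte[] encryptedKey = rsa.doFinal(sessionKey.getEncoded());

            // Server side: recover the symmetric key with the private key.
            rsa.init(Cipher.DECRYPT_MODE, serverKeys.getPrivate());
            SecretKey recoveredKey = new SecretKeySpec(rsa.doFinal(encryptedKey), "AES");

            // From here on, both sides use the fast symmetric cipher for the actual content.
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, sessionKey, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(message.getBytes(StandardCharsets.UTF_8));

            aes.init(Cipher.DECRYPT_MODE, recoveredKey, new GCMParameterSpec(128, iv));
            return new String(aes.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("crypto primitive unavailable or failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello over a hybrid channel"));
    }
}
```

Only the short symmetric key goes through the slow asymmetric cipher; the message itself is handled by the symmetric cipher, which is the trade-off the paragraph above describes.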

HTTPS Authentication

The hybrid encryption method used by SSL/TLS still has a problem: how to ensure that the public key used for encryption is indeed the one issued by the expected server? Perhaps the public key has been tampered with when received. Therefore, we also need the ability to authenticate this key to ensure that the party we are communicating with is the one we expect.

The current approach is to use public key certificates issued by certificate authorities (CAs). A server operator applies to a CA for a public key certificate. After verifying the applicant, the CA digitally signs the server’s public key with its own private key and binds the two together in the certificate. The server then sends this certificate to clients, which use the CA’s public key to verify the signature. Once verification succeeds, the client can trust that the public key really belongs to the server.
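The core of certificate verification is an ordinary digital signature check, which can be sketched with java.security.Signature. This is a conceptual illustration only: CertificateSignatureSketch and its methods are hypothetical names, the "certificate" is reduced to the raw bytes of the server's public key, and real X.509 certificates carry much more (subject names, validity periods, certificate chains).

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical demo for illustration only; a real certificate is far richer than this.
final class CertificateSignatureSketch {

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The CA signs the data (here: the server's public key bytes) with its private key.
    static byte[] sign(PrivateKey caPrivateKey, byte[] data) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(caPrivateKey);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The client checks the signature using only the CA's public key.
    static boolean verify(PublicKey caPublicKey, byte[] data, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(caPublicKey);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair ca = newKeyPair();
        KeyPair server = newKeyPair();
        byte[] serverPublicKey = server.getPublic().getEncoded();

        byte[] signature = sign(ca.getPrivate(), serverPublicKey);
        System.out.println(verify(ca.getPublic(), serverPublicKey, signature));
    }
}
```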

Summary: the HTTPS communication process is as follows:

The client initiates a request.
The server responds to the request and subsequently sends the certificate to the client.
The client uses the public key of the certification authority to authenticate the certificate and extracts the server’s public key from the certificate.
The client uses the public key to encrypt a random key and sends it to the server.
The server uses the private key to decrypt the random key.
Both parties use the random key as the symmetric key for encryption and decryption.
