
The web front end is what users see after typing an address into the browser's address bar. Everything that happens between typing that address and seeing the page, including how the data is obtained, relies on HTTP/HTTPS.
This area is often overlooked, so candidates should deliberately build up their knowledge of it.
1. What is the relationship between HTTP and HTTPS? What are their port numbers?
HTTP typically runs on top of TCP. Adding a security layer (SSL or TLS) between HTTP and TCP yields what we commonly refer to as HTTPS. The default port number for HTTP is 80, while the default port number for HTTPS is 443.
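As a small illustration of the default ports, the following sketch resolves the port a URL would actually use. The helper name effectivePort and the example.com URLs are assumptions made for this example; only the built-in URL class is real Node.js API.

```javascript
// Default ports as stated above: 80 for HTTP, 443 for HTTPS.
const DEFAULT_PORTS = { 'http:': 80, 'https:': 443 };

// effectivePort is a hypothetical helper, not part of any library.
function effectivePort(urlString) {
  const url = new URL(urlString);
  // url.port is an empty string when the URL relies on the scheme's default port.
  if (url.port !== '') {
    return Number(url.port);
  }
  return DEFAULT_PORTS[url.protocol];
}

console.log(effectivePort('http://example.com/'));       // 80
console.log(effectivePort('https://example.com/login')); // 443
console.log(effectivePort('http://example.com:8080/'));  // 8080
```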
2. Why is HTTPS more secure?
In a network request, data is forwarded by many servers and routers, and any of these intermediate nodes could read or tamper with it. HTTPS is more secure than HTTP because it transmits data over SSL/TLS: the traffic is encrypted and the keys are held only by the two endpoints, so intermediate nodes cannot read or silently modify the content. Deploying HTTPS in practice also touches on certificates, TLS offloading, traffic forwarding, load balancing, page adaptation, browser adaptation, and Referer passing.
3. How much do you know about HTTP/2?
HTTP/2 introduces the concept of “server push,” allowing the server to proactively send data to the client cache before the client requests it, thereby improving performance.
HTTP/2 is in practice used over encrypted connections: the specification does not mandate TLS, but mainstream browsers only support HTTP/2 over HTTPS.
HTTP/2 uses multiplexing, allowing multiple request and response messages to be interleaved over a single connection.
It adds header compression (HPACK), so request and response headers occupy only a small share of bandwidth.
4. What common HTTP status codes do you know?
(1) 100 Continue tells the client to continue. It is generally returned after the client has sent the headers of a POST request (with Expect: 100-continue), confirming that the headers have been received and that the request body can now be sent.
(2) 200 OK indicates normal return of information.
(3) 201 Created indicates that the request was successful and that the server has created a new resource.
(4) 202 Accepted indicates that the server has accepted the request but has not yet processed it.
(5) 301 Moved Permanently indicates that the requested webpage has permanently moved to a new location.
(6) 302 Found indicates a temporary redirect.
(7) 303 See Other indicates a temporary redirect, and always uses a GET request for the new URI.
(8) 304 Not Modified indicates that the requested webpage has not been modified since the last request.
(9) 400 Bad Request indicates that the server cannot understand the format of the request, and the client should not try to initiate the request with the same content again.
(10) 401 Unauthorized indicates that the request lacks valid authentication credentials.
(11) 403 Forbidden indicates that the server understood the request but refuses to allow access.
(12) 404 Not Found indicates that the resource matching the URI could not be found.
(13) 500 Internal Server Error indicates a generic server-side error; it is the most common one.
(14) 503 Service Unavailable indicates that the server is temporarily unable to handle the request (possibly due to overload or maintenance).
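The first digit of a status code gives its class, which is often all a client needs to decide what to do next. The sketch below looks up the reason phrase from Node's built-in http.STATUS_CODES table; the describeStatus helper and the category labels are assumptions made for illustration.

```javascript
const http = require('http');

// Status classes keyed by the first digit of the code.
const CATEGORIES = {
  1: 'informational',
  2: 'success',
  3: 'redirection',
  4: 'client error',
  5: 'server error',
};

// describeStatus is a hypothetical helper for this example.
function describeStatus(code) {
  return {
    code,
    reason: http.STATUS_CODES[code] || 'Unknown',
    category: CATEGORIES[Math.floor(code / 100)] || 'unknown',
  };
}

console.log(describeStatus(304));
console.log(describeStatus(503));
```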
5. What is the complete HTTP transaction process?
The basic process is as follows.
(1) Domain name resolution.
(2) Initiate the TCP three-way handshake.
(3) After establishing the TCP connection, initiate the HTTP request.
(4) The server responds to the HTTP request, and the browser receives the HTML code.
(5) The browser parses the HTML code and requests the resources in the HTML code.
(6) The browser renders the page and presents it to the user.
6. Implement a simple HTTP server.
Load the HTTP module in Node.js and create a server that listens on a port. The code is as follows:
const http = require('http'); // Load the HTTP module

http.createServer(function (req, res) {
  // 200 indicates success; Content-Type tells the browser the document type
  res.writeHead(200, { 'Content-Type': 'text/html' });
  // HTML data returned to the client
  res.write('<meta charset="UTF-8"><h1>Welcome to Frontend World</h1>');
  res.end(); // End the output stream
}).listen(3000); // Bind to port 3000
7. What is HTTP?
HTTP, the HyperText Transfer Protocol, is a specification for the format of data transmitted between clients and servers.
8. What is a stateless protocol in HTTP? How to overcome the shortcomings of HTTP’s stateless protocol?
(1) A stateless protocol keeps no memory of previous transactions. Because no state is kept, any information needed for later processing must be sent again with each request.
(2) The way to overcome the shortcomings of a stateless protocol is to save information through cookies and sessions.
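The cookie and session approach can be sketched as follows: the server keeps the state and hands the browser only an identifier in a cookie. The names parseCookies, getSession, the cookie name sid, and the in-memory Map are all assumptions for this example; a real application would normally rely on a session library and a persistent store.

```javascript
const crypto = require('crypto');

// In-memory session store (hypothetical; lost when the process exits).
const sessions = new Map();

// Turn a Cookie request header such as "sid=abc; theme=dark" into an object.
function parseCookies(cookieHeader) {
  const cookies = {};
  if (!cookieHeader) return cookies;
  for (const pair of cookieHeader.split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    cookies[pair.slice(0, index).trim()] = pair.slice(index + 1).trim();
  }
  return cookies;
}

// Return the session for this request, creating one if the cookie is missing
// or unknown. setCookie is the Set-Cookie value to send back, or null.
function getSession(cookieHeader) {
  const sid = parseCookies(cookieHeader).sid;
  if (sid && sessions.has(sid)) {
    return { session: sessions.get(sid), setCookie: null };
  }
  const newSid = crypto.randomBytes(16).toString('hex');
  const session = { visits: 0 };
  sessions.set(newSid, session);
  return { session, setCookie: `sid=${newSid}; HttpOnly; Path=/` };
}

// First request: the browser has no cookie yet.
const first = getSession(undefined);
first.session.visits += 1;

// Second request: the browser sends the cookie back, so state is recovered.
const second = getSession(first.setCookie.split(';')[0]);
second.session.visits += 1;
console.log(second.session.visits); // 2
```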
9. What parts do the HTTP request and response messages contain?
The request message contains three parts.
(1) Request line, which includes request method, URI, and HTTP version information.
(2) Request header fields.
(3) Request content entity.
The response message contains three parts.
(1) Status line, which includes HTTP version, status code, and reason phrase for the status code.
(2) Response header fields.
(3) Response content entity.
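To make the three parts concrete, the sketch below splits a raw request message into its request line, header fields, and entity body. The parseRequest function and the sample request are assumptions for illustration; it ignores details a real parser must handle, such as chunked bodies and repeated headers.

```javascript
// parseRequest is a hypothetical helper: it splits a raw HTTP/1.x request
// into request line fields, header fields, and the entity body.
function parseRequest(raw) {
  // A blank line (CRLF CRLF) separates the head from the body.
  const separator = raw.indexOf('\r\n\r\n');
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? '' : raw.slice(separator + 4);

  const [requestLine, ...headerLines] = head.split('\r\n');
  const [method, uri, version] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const index = line.indexOf(':');
    if (index === -1) continue;
    headers[line.slice(0, index).trim().toLowerCase()] = line.slice(index + 1).trim();
  }
  return { method, uri, version, headers, body };
}

const raw =
  'POST /login HTTP/1.1\r\n' +
  'Host: example.com\r\n' +
  'Content-Type: application/x-www-form-urlencoded\r\n' +
  'Content-Length: 9\r\n' +
  '\r\n' +
  'user=anna';

console.log(parseRequest(raw));
```

A response message can be split the same way; only the first line differs (HTTP version, status code, reason phrase).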
10. What request methods are available in HTTP?
(1) GET: Requests the resource identified by a URI (Uniform Resource Identifier); parameter data can be passed to the server in the URL.
(2) POST: Sends information to the server. It is used much like GET, but the data travels in the message body, so the amount that can be sent is not limited by URL length.
(3) PUT: Transmits files, with the message body containing the file content, saved to the corresponding URI location.
(4) HEAD: Obtains the message header, similar to the GET method, but does not return the message body, generally used to verify if the URI is valid.
(5) DELETE: Deletes a file; the opposite of PUT, it deletes the file at the corresponding URI.
(6) OPTIONS: Queries the HTTP methods supported by the corresponding URI.
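The semantics of these methods can be sketched against an in-memory store. Everything here (the handle function, the Map used as a store, the chosen status codes) is an assumption for illustration, not a definitive server; POST is left out because its meaning depends on the application, so in this sketch it falls through to 405.

```javascript
const ALLOWED = ['GET', 'HEAD', 'PUT', 'DELETE', 'OPTIONS'];

// handle is a hypothetical dispatcher; store is a Map from URI to content.
function handle(store, method, uri, body) {
  switch (method) {
    case 'GET':
    case 'HEAD': {
      if (!store.has(uri)) return { status: 404, headers: {}, body: '' };
      const content = store.get(uri);
      const headers = { 'Content-Length': Buffer.byteLength(content) };
      // HEAD returns the same headers as GET but no message body.
      return { status: 200, headers, body: method === 'HEAD' ? '' : content };
    }
    case 'PUT': {
      // PUT saves the message body at the given URI.
      const created = !store.has(uri);
      store.set(uri, body);
      return { status: created ? 201 : 200, headers: {}, body: '' };
    }
    case 'DELETE':
      // Map.prototype.delete returns true only if the entry existed.
      return { status: store.delete(uri) ? 204 : 404, headers: {}, body: '' };
    case 'OPTIONS':
      return { status: 204, headers: { Allow: ALLOWED.join(', ') }, body: '' };
    default:
      return { status: 405, headers: { Allow: ALLOWED.join(', ') }, body: '' };
  }
}

const store = new Map();
console.log(handle(store, 'PUT', '/notes.txt', 'hello').status); // 201
console.log(handle(store, 'HEAD', '/notes.txt').headers);        // { 'Content-Length': 5 }
console.log(handle(store, 'DELETE', '/notes.txt').status);       // 204
```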
11. What are the differences between HTTP 1.0 and 1.1 specifications?
In HTTP 1.0, after establishing a connection, the client sends a request, and the server returns the response and then closes the connection. When the browser makes the next request, it has to establish a connection again. Clearly, repeatedly establishing connections adds latency and overhead.
In HTTP 1.1, persistent connections were introduced: the connection stays open after a response, so later requests can reuse it. On top of this, pipelining lets the browser send several requests in a row without waiting for each response to arrive.
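The difference in connection handling comes down to a default that the Connection header can override. A minimal sketch of that decision follows; shouldKeepAlive is a hypothetical helper, and it takes the version as the string '1.0' or '1.1'.

```javascript
// shouldKeepAlive is a hypothetical helper: decide whether the connection
// stays open after the response, given the HTTP version and Connection header.
function shouldKeepAlive(httpVersion, connectionHeader) {
  const tokens = (connectionHeader || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());

  // An explicit "close" always ends the connection.
  if (tokens.includes('close')) return false;

  if (httpVersion === '1.0') {
    // HTTP/1.0 closes by default; keep-alive was an opt-in extension.
    return tokens.includes('keep-alive');
  }
  // HTTP/1.1 connections are persistent by default.
  return true;
}

console.log(shouldKeepAlive('1.0', undefined));    // false
console.log(shouldKeepAlive('1.0', 'keep-alive')); // true
console.log(shouldKeepAlive('1.1', undefined));    // true
console.log(shouldKeepAlive('1.1', 'close'));      // false
```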
12. What types of header fields are included in HTTP?
(1) General header fields (used by both request and response messages).
It includes the following parts.
- Date: The time the message was created.
- Connection: Management of the connection.
- Cache-Control: Control of caching.
- Transfer-Encoding: The encoding method for the message body.
(2) Request header fields (used in request messages).
It includes the following parts.
- Host: The server where the requested resource is located.
- Accept: The media types that can be processed.
- Accept-Charset: The character sets that are acceptable.
- Accept-Encoding: The content encodings that are acceptable.
- Accept-Language: The natural languages that are acceptable.
(3) Response header fields (used in response messages).
It includes the following parts.
- Server: Information about the HTTP server software.
(4) Entity header fields (used in the entity part of request and response messages).
It includes the following parts.
- Allow: The HTTP methods supported by the resource.
- Content-Type: The type of the entity body.
- Content-Encoding: The encoding method used for the entity body.
- Content-Language: The natural language of the entity body.
- Content-Length: The size of the entity body in bytes.
- Content-Range: The range of the entity body, generally used for partial requests.
13. What are the disadvantages of HTTP compared to HTTPS?
The disadvantages of HTTP are as follows.
(1) Communication uses plaintext and is unencrypted, meaning the content can be eavesdropped on, or analyzed through packet capture.
(2) It does not verify the identity of the communicating parties, which may lead to impersonation.
(3) It cannot verify the integrity of the message, which may be tampered with.
HTTPS is essentially HTTP + encryption (typically an SSL/TLS secure channel) + authentication + integrity protection.
14. How to optimize HTTP requests?
Use load balancing to distribute and accelerate HTTP application requests, and use HTTP caching to avoid repeated requests for resources that have not changed.
15. What are the characteristics of the HTTP protocol?
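HTTP supports the client/server model; it is simple and fast, flexible (it can carry any type of data, identified by Content-Type), connectionless (in its original design, each connection handles only one request), and stateless.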
16. What are the new features of HTTP 1.1?
The new features are as follows.
(1) Persistent connections by default, which reduce connection overhead: as long as neither the client nor the server explicitly asks to close the TCP connection, it stays open and can carry multiple HTTP requests.
(2) Pipelining, which allows the client to send multiple HTTP requests in succession without waiting for each response.
(3) Resumable (breakpoint) transfers: the client can request part of a resource with the Range header, and the server answers with 206 Partial Content.
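Breakpoint resume relies on the Range request header and a 206 Partial Content response carrying Content-Range. The sketch below handles only a single byte range; parseRange is a hypothetical helper, and returning null stands for "no usable range", in which case a server would fall back to a full 200 response or answer 416.

```javascript
// parseRange is a hypothetical helper: interpret a single-range header such as
// "bytes=500-", "bytes=0-99", or "bytes=-200" for a resource of `size` bytes.
function parseRange(rangeHeader, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(rangeHeader || '');
  if (!match || (match[1] === '' && match[2] === '')) return null;

  let start;
  let end;
  if (match[1] === '') {
    // Suffix form: the last N bytes of the resource.
    const suffixLength = Number(match[2]);
    if (suffixLength === 0) return null;
    start = Math.max(size - suffixLength, 0);
    end = size - 1;
  } else {
    start = Number(match[1]);
    // An open-ended or oversized range is clamped to the last byte.
    end = match[2] === '' ? size - 1 : Math.min(Number(match[2]), size - 1);
  }

  if (start > end || start >= size) return null;
  return { start, end, contentRange: `bytes ${start}-${end}/${size}` };
}

// A download interrupted after 500 of 1000 bytes resumes from byte 500.
console.log(parseRange('bytes=500-', 1000));
```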
17. Explain the TCP three-way handshake and four-way handshake strategies.
To accurately deliver data to the target, TCP uses a three-way handshake strategy. After sending the data packet, TCP will confirm whether it was successfully delivered. The handshake process uses TCP flags, namely SYN and ACK.
The sender first sends a data packet with the SYN flag to the receiver. After receiving it, the receiver sends back a data packet with the SYN/ACK flag to indicate successful transmission and to confirm the information. Finally, the sender sends back a data packet with the ACK flag, representing the completion of the “handshake.” If any stage of the handshake is interrupted, TCP will resend the same data packet in the same order.
Disconnecting a TCP connection requires a “four-way handshake.”
The first handshake: The active closing party sends a FIN to close data transmission from itself to the passive closing party, indicating that it will send no more data (although any data sent before the FIN must still be retransmitted if no corresponding ACK is received). At this point, the active closing party can still receive data.
The second handshake: The passive closing party sends back an ACK whose acknowledgment number is the received sequence number + 1 (a FIN, like a SYN, occupies one sequence number).
The third handshake: The passive closing party sends a FIN to close data transmission from itself to the active closing party, indicating that it has finished sending data and will send no more.
The fourth handshake: After the active closing party receives the FIN, it sends back an ACK whose acknowledgment number is the received sequence number + 1, thus completing the four-way handshake.
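The sequence and acknowledgment numbers in both exchanges can be traced with a small simulation. This is only a model of the segments described above, under stated simplifications: the function names and initial sequence numbers are assumptions, no data is sent in between, 32-bit wraparound is ignored, and FIN segments are shown without the ACK flag that real stacks usually set on them.

```javascript
// Model the three segments that open a connection.
function threeWayHandshake(clientIsn, serverIsn) {
  return [
    { from: 'client', flags: ['SYN'], seq: clientIsn },
    // SYN occupies one sequence number, so the server acknowledges clientIsn + 1.
    { from: 'server', flags: ['SYN', 'ACK'], seq: serverIsn, ack: clientIsn + 1 },
    { from: 'client', flags: ['ACK'], seq: clientIsn + 1, ack: serverIsn + 1 },
  ];
}

// Model the four segments that close a connection.
function fourWayClose(activeSeq, passiveSeq) {
  return [
    { from: 'active', flags: ['FIN'], seq: activeSeq },
    // FIN also occupies one sequence number.
    { from: 'passive', flags: ['ACK'], seq: passiveSeq, ack: activeSeq + 1 },
    { from: 'passive', flags: ['FIN'], seq: passiveSeq },
    { from: 'active', flags: ['ACK'], seq: activeSeq + 1, ack: passiveSeq + 1 },
  ];
}

for (const segment of threeWayHandshake(100, 300)) console.log(segment);
for (const segment of fourWayClose(5000, 9000)) console.log(segment);
```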
18. What are the differences between TCP and UDP?
TCP (Transmission Control Protocol) is a connection-based protocol, meaning that a reliable connection must be established with the other party before formally sending and receiving data. A TCP connection must go through three “dialogues” to be established.
UDP (User Datagram Protocol) is the counterpart to TCP. It is a connectionless protocol, which means it does not establish a connection with the other party but sends data packets directly. UDP is suitable for applications that transmit only small amounts of data and do not require high reliability.
19. What happens from the input of a URL to the completion of the page loading display?
The entire process can be divided into four steps.
(1) When sending a URL request, whether the URL is for a web page or for each resource on the web page, the browser will open a thread to handle the request and initiate a DNS query on the remote DNS server. This allows the browser to obtain the corresponding IP address for the request.
(2) The browser negotiates with the remote web server to establish a TCP connection through the three-way handshake. This handshake includes a synchronization message, a synchronization-acknowledgment message, and an acknowledgment message, which are transmitted between the browser and the server. The handshake begins with the client attempting to establish communication, then the server responds and accepts the client’s request, and finally, the client sends a message acknowledging the request has been accepted.
(3) Once the TCP/IP connection is established, the browser sends an HTTP GET request to the remote server through that connection. The remote server locates the resource and returns it using an HTTP response, with a 200 HTTP response status code indicating a correct response.
(4) At this point, the web server serves the resource and the client begins downloading it. Once the response arrives, it is handed to the browser's rendering engine. The browser parses the HTML to build the DOM tree, builds the CSS rule tree from the CSS, and JavaScript can manipulate the DOM through the DOM API.
20. What are the seven layers of the network layering model?
The seven layers are: Application layer, Presentation layer, Session layer, Transport layer, Network layer, Data Link layer, and Physical layer.
The role of each layer is as follows.
- Application layer: Provides access to the OSI environment.
- Presentation layer: Translates, encrypts, and compresses data.
- Session layer: Establishes, manages, and terminates sessions.
- Transport layer: Provides end-to-end reliable message delivery and error recovery.
- Network layer: Responsible for the transmission of data packets from source to destination and internetworking.
- Data Link layer: Assembles bits into frames and implements point-to-point delivery.
- Physical layer: Transmits bits through media and determines mechanical and electrical specifications.
21. What protocols are you familiar with in the seven-layer network model?
The following protocols are included.
- ICMP, the Internet Control Message Protocol, is a sub-protocol of the TCP/IP protocol suite used to transmit control messages between IP hosts and routers.
- TFTP, the Trivial File Transfer Protocol, is a protocol in the TCP/IP protocol suite used for simple file transfers between clients and servers, providing uncomplicated, low-overhead file transfer services.
- HTTP, the HyperText Transfer Protocol, is an application-layer protocol whose simplicity and speed make it suitable for distributed hypermedia information systems.
- DHCP, the Dynamic Host Configuration Protocol, lets systems connect to a network and obtain the configuration parameters they need.
22. Explain the principle of the 304 cache.
The server first generates an ETag for the response, which can later be used to determine whether the page has been modified. Essentially, the client asks the server to verify, based on that token, whether the client's cached copy is still valid.
304 is an HTTP status code that the server uses to indicate that the file has not been modified, and it does not return content. After receiving this status code, the browser will use the cached file.
The client requests page A. The server returns page A and adds an ETag to it. The client displays that page and caches it along with the ETag.
The client requests page A again and sends the ETag returned by the server during the previous request.
The server checks the ETag and determines that the page has not been modified since the last client request. It directly returns a 304 (Not Modified) response with an empty response body.
When sending a server request, the browser first performs a cache expiration check. The browser determines whether the cached file is expired based on the cache expiration time; if it has not expired, it does not send a request to the server and directly uses the cached result.
At this time, we can see 200 OK (from cache) in the browser console, indicating that the cache is fully used, and there is no interaction between the browser and the server.
If it has expired, the browser sends a request to the server. At this time, the request will include the file modification time and ETag, and then resource update checks will be performed.
The server checks the file modification time sent by the browser to determine if the file has been modified since the browser’s last request. It checks the ETag to determine if the file content has changed since the last request.
If both checks conclude that the file has not been modified, the server does not send new content but directly tells the browser that the file has not been modified, allowing it to continue using the cache—304 Not Modified.
At this point, the browser fetches the content of the requested resource from its local cache. This situation is called negotiated caching (cache revalidation), and there is one request interaction between the browser and the server.
If either the modification time check or the file content check fails, the server processes the request and returns new data. Note that, in general, only responses to GET requests are cached; responses to POST requests are not.
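The browser-side decision described above can be sketched as two small functions: one picks between using the cache, revalidating, and fetching; the other applies the server's answer. The names planRequest and applyResponse, the shape of the cache entry, and the numeric timestamps are assumptions for illustration, not how any particular browser implements its cache.

```javascript
// planRequest is a hypothetical helper. entry is the cached copy (or null);
// now and entry.expiresAt are timestamps in the same unit.
function planRequest(entry, now) {
  if (!entry) return { action: 'fetch', headers: {} };

  // Not expired: use the cache without contacting the server.
  if (now < entry.expiresAt) return { action: 'use-cache', headers: {} };

  // Expired: ask the server whether the cached copy is still valid.
  const headers = {};
  if (entry.etag) headers['If-None-Match'] = entry.etag;
  if (entry.lastModified) headers['If-Modified-Since'] = entry.lastModified;
  return { action: 'revalidate', headers };
}

// On 304 the cached body is reused; otherwise the new body replaces it.
function applyResponse(entry, response) {
  return response.status === 304 ? entry.body : response.body;
}

const entry = {
  body: '<h1>Page A</h1>',
  etag: '"abc"',
  lastModified: 'Wed, 29 Nov 2017 03:39:44 GMT',
  expiresAt: 1000,
};

console.log(planRequest(entry, 500).action);  // use-cache
console.log(planRequest(entry, 1500));        // revalidate, with both validators
console.log(applyResponse(entry, { status: 304, body: '' })); // <h1>Page A</h1>
```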
24. Explain the application of ETag.
ETag is generated by the server, and the client uses the If-Match or If-None-Match condition to validate whether the resource has been modified. Commonly, If-None-Match is used. The process of requesting a file is as follows.
On the first request, the client issues an HTTP GET request for a file. The server processes the request and returns the file content along with the response headers (including ETag) and a status code of 200.
On the second request, the client issues an HTTP GET request for the same file. This time it also sends an If-None-Match header containing the ETag returned by the server in the first response. The server checks whether the ETag sent matches the ETag it computes for the current file.
If they match (that is, the If-None-Match condition evaluates to false), the server returns 304 instead of 200, allowing the client to continue using its local cache.
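If the server also sets Cache-Control: max-age and Expires, and the request carries both If-Modified-Since and If-None-Match, the server described here returns 304 only when both validators match. (The current HTTP specification gives If-None-Match precedence when both are present.)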
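On the server side, the If-None-Match check can be sketched as below. The choice of a SHA-1 content hash for the ETag and the names makeETag and conditionalGet are assumptions for illustration; real servers often derive the ETag from file size and modification time instead, and this sketch does not handle a list of ETags, the * wildcard, or weak validators.

```javascript
const crypto = require('crypto');

// makeETag is a hypothetical helper: a quoted hash of the content.
function makeETag(body) {
  const hash = crypto.createHash('sha1').update(body).digest('hex');
  return `"${hash}"`;
}

// requestHeaders uses lowercase names, as Node does for incoming requests.
function conditionalGet(requestHeaders, body) {
  const etag = makeETag(body);
  if (requestHeaders['if-none-match'] === etag) {
    // The client's copy is current: no body is sent.
    return { status: 304, headers: { ETag: etag }, body: '' };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

// First request: no validator, so the full content is returned.
const firstResponse = conditionalGet({}, 'hello');
console.log(firstResponse.status); // 200

// Second request: the client echoes the ETag and the file is unchanged.
const validator = { 'if-none-match': firstResponse.headers.ETag };
console.log(conditionalGet(validator, 'hello').status); // 304

// The file changed on the server, so the ETag no longer matches.
console.log(conditionalGet(validator, 'hello, world').status); // 200
```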
25. What are the functions of Expires and Cache-Control?
Expires requires strict synchronization of time between the client and server. HTTP 1.1 introduced Cache-Control to overcome the limitations of the Expires header. If max-age and Expires appear together, max-age has a higher priority.
The specific code is as follows.
Cache-Control: no-cache, private, max-age=0
ETag: "8b4c-55f16e2e30000"
Expires: Thu, 02 Dec 2027 11:37:56 GMT
Last-Modified: Wed, 29 Nov 2017 03:39:44 GMT
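The priority of max-age over Expires can be sketched as a freshness calculation. The function name freshnessLifetimeSeconds and the lowercase header object are assumptions for this example; it looks only at max-age, Expires, and Date, and ignores other directives such as no-cache, which forces revalidation regardless of the lifetime.

```javascript
// freshnessLifetimeSeconds is a hypothetical helper: how long a response may
// be served from cache, in seconds, based on its (lowercase-keyed) headers.
function freshnessLifetimeSeconds(headers) {
  const cacheControl = headers['cache-control'] || '';
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);

  // max-age has higher priority than Expires when both are present.
  if (match) return Number(match[1]);

  // Otherwise fall back to Expires, measured from the response's Date.
  if (headers.expires && headers.date) {
    const expires = Date.parse(headers.expires);
    const date = Date.parse(headers.date);
    if (!Number.isNaN(expires) && !Number.isNaN(date)) {
      return Math.max(0, (expires - date) / 1000);
    }
  }
  return 0;
}

// With the headers shown above, max-age=0 wins over the far-future Expires.
console.log(freshnessLifetimeSeconds({
  'cache-control': 'no-cache, private, max-age=0',
  expires: 'Thu, 02 Dec 2027 11:37:56 GMT',
})); // 0
```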
26. What is a reverse proxy?
A reverse proxy is a proxy server that accepts connection requests from the internet, forwards them to servers on the internal network, and returns the results obtained from those servers to the client that made the request. To the outside world, the proxy server itself appears to be the server providing the content.
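A minimal sketch of this forwarding, using only Node's built-in http module, might look like the following. The createReverseProxy function, the 127.0.0.1 address, and the port numbers are assumptions for illustration; production setups normally use a dedicated proxy such as Nginx and handle timeouts, HTTPS, and header rewriting more carefully.

```javascript
const http = require('http');

// createReverseProxy is a hypothetical helper: it returns a server that
// forwards every request to one internal server and relays the response.
function createReverseProxy(targetHost, targetPort) {
  return http.createServer((clientReq, clientRes) => {
    const upstreamReq = http.request(
      {
        host: targetHost,
        port: targetPort,
        method: clientReq.method,
        path: clientReq.url,
        // Pass the original headers on, but address the internal server.
        headers: { ...clientReq.headers, host: `${targetHost}:${targetPort}` },
      },
      (upstreamRes) => {
        // Relay status, headers, and body back to the external client.
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      }
    );

    // If the internal server cannot be reached, answer 502 Bad Gateway.
    upstreamReq.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end('Bad Gateway');
    });

    // Stream the client's request body to the internal server.
    clientReq.pipe(upstreamReq);
  });
}

// Usage (not started here): expose an internal server on port 3000 via port 8080.
// createReverseProxy('127.0.0.1', 3000).listen(8080);
```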