Core Principles of Web Crawlers: How an HTTP Request is Completed

Author: Da Mu Jiang (https://my.oschina.net/luozhou/blog/3003053)

Overview: In the previous article, "Do You Know What Happens Behind the Scenes When You Ping?", we used a real packet capture to analyze what happens during a ping (a common interview question). We learned that ping relies on the ICMP protocol and, on a local area network, also involves ARP requests. …

HTTP Persistent Connections and HttpClient Connection Pool

Background: HTTP is a stateless protocol, meaning each request is independent of the others. Its earliest implementations therefore opened a new TCP socket connection for every HTTP request and closed it once the exchange was complete. Because the underlying TCP connection is full-duplex, establishing it requires a three-way handshake and closing it requires a four-way handshake. …
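To make the cost of per-request connections concrete, here is a minimal Python sketch (the excerpt itself concerns Java's HttpClient, so this is only an illustration, not the article's code). It starts a throwaway local server and compares opening a new TCP connection per request with reusing one persistent connection. The names EchoPortHandler and client_ports are hypothetical, and the server simply replies with the client's source port so that connection reuse is visible.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 makes connections persistent by default on the server side.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's TCP source port; it only changes
        # when the client opens a new connection.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def client_ports(reuse, n=3):
    """Send n GET requests and return the source port seen for each one."""
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    ports = []
    try:
        if reuse:
            # One TCP connection (one handshake) shared by all requests.
            conn = http.client.HTTPConnection(host, port)
            for _ in range(n):
                conn.request("GET", "/")
                ports.append(conn.getresponse().read().decode())
            conn.close()
        else:
            # A new TCP connection, and a new handshake, for every request.
            for _ in range(n):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/")
                ports.append(conn.getresponse().read().decode())
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print("persistent:", client_ports(reuse=True))
    print("per-request:", client_ports(reuse=False))
```

With reuse enabled, every request reports the same source port, which shows that a single TCP connection carried all of them. A connection pool builds on the same idea by keeping such connections open and handing them out to callers.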

Learning MQTT Protocol: 007 – Keep Alive Mechanism and Corresponding Messages (PINGREQ, PINGRESP)

Background: Keep Alive is a field in the variable header of the CONNECT message. As mentioned earlier, the Broker needs to know whether a Client has disconnected abnormally so that it can send the last will message. The Client, in turn, also needs to detect quickly that it has lost its connection to the Broker in order to …
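As a rough illustration of where Keep Alive sits on the wire, the following Python sketch builds the CONNECT variable header and the two ping packets by hand. It assumes MQTT 3.1.1 (protocol level 4), which the excerpt does not state explicitly; the function names are hypothetical, and the 1.5x grace period is taken from the MQTT 3.1.1 specification rather than from the article.

```python
import struct

# PINGREQ and PINGRESP consist of a fixed header only:
# packet type in the high nibble, remaining length 0.
PINGREQ = bytes([0xC0, 0x00])   # type 12
PINGRESP = bytes([0xD0, 0x00])  # type 13


def connect_variable_header(keep_alive_s, clean_session=True):
    """Build the 10-byte CONNECT variable header (assumes MQTT 3.1.1)."""
    if not 0 <= keep_alive_s <= 0xFFFF:
        raise ValueError("Keep Alive is a 16-bit value in seconds")
    flags = 0x02 if clean_session else 0x00
    # Length-prefixed protocol name "MQTT", protocol level 4,
    # connect flags, then Keep Alive as a big-endian 16-bit integer.
    return struct.pack("!H4sBBH", 4, b"MQTT", 4, flags, keep_alive_s)


def packet_type(first_byte):
    """Return the control packet type stored in the high nibble."""
    return first_byte >> 4


def broker_timeout(keep_alive_s):
    """Seconds of silence after which a 3.1.1 Broker drops the Client."""
    return 1.5 * keep_alive_s


if __name__ == "__main__":
    print(connect_variable_header(60).hex())
    print(packet_type(PINGREQ[0]), packet_type(PINGRESP[0]))
    print(broker_timeout(60))
```

A Client that has nothing else to send within the Keep Alive interval sends PINGREQ and expects PINGRESP back; a missing reply is its signal that the connection is gone.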