Implementing a Simple HTTP Proxy in Golang

This article walks through the implementation of a simple HTTP proxy in Golang. It may be a useful starting point if you need one in practice.

A proxy is an important network component that obtains network information on behalf of network users. Figuratively speaking, it is a transfer station for network information. For clients, the proxy acts as a server, receiving request messages and returning response messages; for web servers, the proxy acts as a client, sending request messages and receiving response messages.

There are various types of proxies. If classified according to network users, they can be divided into forward proxies and reverse proxies:

  • Forward Proxy: The client acts as the network user. When the client accesses the server, it first accesses the proxy server, and the proxy server then accesses the server. The client must be configured to use the proxy; the server is unaware of it.

  • Reverse Proxy: The server acts as the network user. The access path is the same as with a forward proxy, but here the proxy is transparent to the client. It is set up on the server side, and the backend server itself may need no configuration at all.

For forward and reverse proxies, there are different proxy protocols, which are the protocols used for communication between the proxy server and the network user:

  • Forward Proxy:

    • http
    • https
    • socks4
    • socks5
    • vpn: In terms of functionality, a VPN can also be considered a proxy
  • Reverse Proxy:

    • tcp
    • udp
    • http
    • https

Next, let’s talk about HTTP proxies.

Overview of HTTP Proxies

An HTTP proxy is one of the simpler kinds of forward proxy. It uses the HTTP protocol as the transport protocol between the client and the proxy server.

HTTP proxies can carry HTTP, HTTPS, FTP protocols, etc. For different protocols, the data format between the client and the proxy server is slightly different.

HTTP Protocol

Let’s first look at the HTTP Header sent from the client to the proxy server under the HTTP protocol:

// Direct connection
GET / HTTP/1.1
Host: staight.github.io
Connection: keep-alive

// HTTP proxy
GET http://staight.github.io/ HTTP/1.1
Host: staight.github.io
Proxy-Connection: keep-alive

Compared with a direct connection, a request sent through an HTTP proxy differs as follows:

  • The request target becomes a complete URL: / -> http://staight.github.io/
  • The Connection field becomes the Proxy-Connection field
  • Everything else remains the same

Why use a complete path?

To identify the target server. Without a complete URL, and without a Host field (which HTTP/1.0 does not require), the proxy server would have no way of knowing the address of the target server.
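The two request-line forms can be reproduced with Go's standard library. This is a minimal sketch: `requestLine` is a hypothetical helper written for this illustration, and it relies on `http.Request.Write` (direct form) and `http.Request.WriteProxy` (proxy form).

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// requestLine returns the first line net/http would put on the wire,
// either in direct form (Write) or in proxy form (WriteProxy).
func requestLine(rawURL string, viaProxy bool) (string, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if viaProxy {
		err = req.WriteProxy(&buf) // complete URL in the request line
	} else {
		err = req.Write(&buf) // path only; the host travels in the Host field
	}
	if err != nil {
		return "", err
	}
	line, _, _ := strings.Cut(buf.String(), "\r\n")
	return line, nil
}

func main() {
	for _, viaProxy := range []bool{false, true} {
		line, err := requestLine("http://staight.github.io/", viaProxy)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("viaProxy=%v: %s\n", viaProxy, line)
	}
}
```

The direct form carries only the path, while the proxy form carries the complete URL, which is what lets the proxy find the target server.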

Why use the Proxy-Connection field instead of the Connection field?

To remain compatible with outdated proxy servers that only speak HTTP/1.0. Long (persistent) connections became the default only in HTTP/1.1; in HTTP/1.0 they were an optional extension requested with Connection: keep-alive. On a direct connection, a Connection: keep-alive field in the client's request header asks the server to keep the connection open. If an outdated proxy server sits in between and forwards this field without understanding it, the server keeps the connection open while the proxy waits for the server to close it as the end-of-response signal. The exchange then stalls until a timeout, wasting time.

Therefore, the Proxy-Connection field is used instead of the Connection field. A proxy server that speaks HTTP/1.1 and recognizes Proxy-Connection converts it into Connection before sending the request to the server. A proxy server that does not recognize it forwards it unchanged; the server does not recognize it either, so a short connection is used.

The interaction process of the HTTP proxy HTTP protocol is illustrated as follows:

[Figure: HTTP Proxy HTTP Protocol]

HTTPS Protocol

Next, let’s look at the HTTP Header sent from the client to the proxy server under the HTTPS protocol:

CONNECT staight.github.io:443 HTTP/1.1
Host: staight.github.io:443
Proxy-Connection: keep-alive

Compared with the HTTP case, a request for an HTTPS site differs as follows:

  • The request method changes from GET to CONNECT
  • The request target is host:port, with no scheme (protocol) part

Because everything the client and server exchange under HTTPS is encrypted after the initial negotiation, the proxy server cannot read or modify the HTTP messages it forwards. Instead, the client first tells the proxy the server's address with CONNECT, and the proxy then forwards the subsequent encrypted TCP data unchanged in both directions.

The interaction process of the HTTP proxy HTTPS protocol is illustrated as follows:

[Figure: HTTP Proxy HTTPS Protocol]

Code Implementation

First, create a TCP listener and call the handle function for each incoming connection:

// TCP connection, listening on port 8080
l, err := net.Listen("tcp", ":8080")
if err != nil {
 log.Panic(err)
}

// Infinite loop, call handle when a connection is encountered
for {
 client, err := l.Accept()
 if err != nil {
  log.Panic(err)
 }

 go handle(client)
}

Then store the obtained data in a buffer:

// Buffer to store client data
var b [1024]byte
// Get data from the client
n, err := client.Read(b[:])
if err != nil {
 log.Println(err)
 return
}

Read the HTTP request method and request target from the first line in the buffer:

var method, URL, address string
// The request line is everything up to the first newline in the data read
end := bytes.IndexByte(b[:n], '\n')
if end == -1 {
 log.Println("request line not found in the first read")
 return
}
// Read method and URL from the request line
if _, err := fmt.Sscanf(string(b[:end]), "%s%s", &method, &URL); err != nil {
 log.Println(err)
 return
}

The target address is obtained differently for HTTP and HTTPS, so the two cases are handled separately:

// If the method is CONNECT, it is the HTTPS protocol
if method == "CONNECT" {
 // The request target is already in host:port form
 address = URL
} else { // Otherwise, it is the HTTP protocol
 hostPortURL, err := url.Parse(URL)
 if err != nil {
  log.Println(err)
  return
 }
 address = hostPortURL.Host
 // If the host does not have a port, default to 80
 if hostPortURL.Port() == "" {
  address = hostPortURL.Host + ":80"
 }
}

Use the obtained address to open a TCP connection to the server. For HTTP, forward the client's request (already read into the buffer) to the server; for HTTPS, reply to the client with an HTTP 200 response to signal that the tunnel has been established:

// With the requested host and port known, open a TCP connection to the server
server, err := net.Dial("tcp", address)
if err != nil {
 log.Println(err)
 return
}
defer server.Close()
// For HTTPS, first tell the client that the connection has been established
if method == "CONNECT" {
 fmt.Fprint(client, "HTTP/1.1 200 Connection established\r\n\r\n")
} else { // For HTTP, forward the request data already read from the client to the server
 if _, err := server.Write(b[:n]); err != nil {
  log.Println(err)
  return
 }
}

Finally, relay everything the client sends to the server and everything the server sends back to the client:

// Relay client data to the server and server data to the client.
// io.Copy blocks until it reads EOF or hits an error, for example when a connection is closed
go io.Copy(server, client)
io.Copy(client, server)

Complete source code:

package main

import (
 "bytes"
 "fmt"
 "io"
 "log"
 "net"
 "net/url"
)

func main() {
 // TCP listener on port 8080
 l, err := net.Listen("tcp", ":8080")
 if err != nil {
  log.Panic(err)
 }

 // Infinite loop, call handle for each accepted connection
 for {
  client, err := l.Accept()
  if err != nil {
   log.Panic(err)
  }

  go handle(client)
 }
}

func handle(client net.Conn) {
 if client == nil {
  return
 }
 defer client.Close()

 log.Printf("remote addr: %v\n", client.RemoteAddr())

 // Buffer to store client data
 var b [1024]byte
 // Get data from the client
 n, err := client.Read(b[:])
 if err != nil {
  log.Println(err)
  return
 }

 var method, URL, address string
 // The request line is everything up to the first newline in the data read
 end := bytes.IndexByte(b[:n], '\n')
 if end == -1 {
  log.Println("request line not found in the first read")
  return
 }
 // Read method and URL from the request line
 if _, err := fmt.Sscanf(string(b[:end]), "%s%s", &method, &URL); err != nil {
  log.Println(err)
  return
 }

 // If the method is CONNECT, it is the HTTPS protocol
 if method == "CONNECT" {
  // The request target is already in host:port form
  address = URL
 } else { // Otherwise, it is the HTTP protocol
  hostPortURL, err := url.Parse(URL)
  if err != nil {
   log.Println(err)
   return
  }
  address = hostPortURL.Host
  // If the host does not have a port, default to 80
  if hostPortURL.Port() == "" {
   address = hostPortURL.Host + ":80"
  }
 }

 // With the requested host and port known, open a TCP connection to the server
 server, err := net.Dial("tcp", address)
 if err != nil {
  log.Println(err)
  return
 }
 defer server.Close()
 // For HTTPS, first tell the client that the connection has been established
 if method == "CONNECT" {
  fmt.Fprint(client, "HTTP/1.1 200 Connection established\r\n\r\n")
 } else { // For HTTP, forward the request data already read from the client to the server
  if _, err := server.Write(b[:n]); err != nil {
   log.Println(err)
   return
  }
 }

 // Relay client data to the server and server data to the client.
 // io.Copy blocks until it reads EOF or hits an error, for example when a connection is closed
 go io.Copy(server, client)
 io.Copy(client, server)
}

To try it out, run the program and configure a client to use it as an HTTP proxy on port 8080.


Link: https://blog.csdn.net/weixin_43507410/article/details/124839308

(Copyright belongs to the original author, infringement will be deleted)
