Mastering HTTP Requests with the Python Requests Library

When it comes to HTTP libraries in Python, requests is the one you can’t overlook. It is powerful and easy to use, and if you’ve ever written a web scraper, debugged an API, or tinkered with anything on the web, you have probably relied on it. Today, let’s walk through the core usage of requests so you can handle HTTP requests with ease.

---

GET Request: The Way to Get Data!

The most common type of HTTP request is GET, which is straightforward and used to retrieve data from the server. Sending a GET request with requests is a piece of cake.

import requests

response = requests.get('https://jsonplaceholder.typicode.com/posts')
print(response.status_code)  # Outputs the status code, e.g., 200
print(response.text)  # Outputs the returned text content

In the code above, requests.get() sends a GET request. response is the response object returned, from which you can obtain the content returned by the server via response.text, or check the HTTP response status code with response.status_code, where 200 indicates a successful request.

Tip:

If you find the returned content is garbled, you can try this:

response.encoding = 'utf-8'
print(response.text)

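requests guesses the encoding from the response headers, and when that guess is wrong the text comes out garbled. Setting response.encoding before reading response.text overrides the guess.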
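To see where garbled text comes from, here is a small sketch that needs no network access. It assumes the server sent UTF-8 bytes and the client decoded them as ISO-8859-1, which is the fallback requests uses for text responses whose Content-Type header names no charset. The sample string is made up for illustration.

```python
# Bytes as a server might send them: the text 'café' encoded as UTF-8.
raw = 'café'.encode('utf-8')

# Decoding with the wrong codec produces the familiar garbled output.
garbled = raw.decode('iso-8859-1')

# Decoding with the right codec recovers the original text; setting
# response.encoding = 'utf-8' makes response.text decode this way too.
fixed = raw.decode('utf-8')

print(garbled)  # cafÃ©
print(fixed)    # café
```

If you don't know the right encoding in advance, response.apparent_encoding gives a guess based on the body itself.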

---

POST Request: Sending Data to the Server

GET is for getting data, while POST is for submitting data to the server. Submitting a login form, for example, is a typical job for POST.

import requests

payload = {'username': 'python', 'password': '123456'}
response = requests.post('https://httpbin.org/post', data=payload)
print(response.json())  # Outputs the JSON data returned by the server

The data= argument sends the payload as form-encoded fields, the same format a browser uses when you submit an HTML form such as a login page.

Tip:

Some APIs require data in JSON format, so use json= instead of data=:

response = requests.post('https://httpbin.org/post', json=payload)
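The difference between data= and json= is easiest to see by looking at what would go over the wire. The sketch below builds both requests with requests.Request(...).prepare(), which constructs the request without sending it, so it runs offline. The payload is the same illustrative one used above.

```python
import json
import requests

payload = {'username': 'python', 'password': '123456'}
url = 'https://httpbin.org/post'

# prepare() builds the request but does not send it.
form_request = requests.Request('POST', url, data=payload).prepare()
json_request = requests.Request('POST', url, json=payload).prepare()

# data= produces a form-encoded body.
print(form_request.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form_request.body)                     # username=python&password=123456

# json= serializes the payload and sets the JSON content type.
print(json_request.headers['Content-Type'])  # application/json
print(json.loads(json_request.body))         # {'username': 'python', 'password': '123456'}
```

If an API rejects your POST, checking which of these two formats it expects is a good first step.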

---

GET Request with Parameters: The Type with a Question Mark in the URL

Sometimes a GET request needs to include parameters. For instance, when you search for “requests” on a search engine, the URL becomes something like https://www.example.com/search?q=requests. Sending such a parameterized GET request with requests is also easy.

params = {'q': 'requests', 'sort': 'relevance'}
response = requests.get('https://httpbin.org/get', params=params)
print(response.url)  # Outputs the final URL

requests URL-encodes the parameters and appends them to the URL for you, so you never have to build the query string by hand.
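Here is a small offline sketch of that encoding step, again using prepare() so nothing is sent. The parameter names and values are made up; the point is how spaces and list values end up in the final URL.

```python
import requests

# A value with a space, and a key with several values.
params = {'q': 'requests tutorial', 'tag': ['http', 'python']}

prepared = requests.Request('GET', 'https://httpbin.org/get', params=params).prepare()

# The space becomes '+', and the list is expanded into repeated keys.
print(prepared.url)  # https://httpbin.org/get?q=requests+tutorial&tag=http&tag=python
```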

---

Handling Responses: We Want More than Just Data

Simply obtaining data is not enough; you need to learn to extract what you want.

Getting Response Headers

print(response.headers)  # The response headers, as a case-insensitive dictionary
print(response.headers['Content-Type'])  # Gets a specific field
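response.headers is a CaseInsensitiveDict, so you don't need to match the exact capitalization the server used. The sketch below builds one by hand as a stand-in for a real response; the header values are made up.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for response.headers; the values are illustrative only.
headers = CaseInsensitiveDict({'Content-Type': 'application/json', 'Server': 'example'})

# Lookups ignore case.
print(headers['content-type'])   # application/json
print(headers['CONTENT-TYPE'])   # application/json

# .get() returns None for a missing header instead of raising KeyError.
print(headers.get('X-Missing'))  # None
```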

Extracting JSON Data

Many APIs return data in JSON format, and handling it with requests is particularly straightforward:

data = response.json()
print(data['args'])  # Assuming the returned data contains an "args" field

Checking Response Status

if response.ok:
    print("Request successful!")
else:
    print("Request failed, status code:", response.status_code)
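It helps to know exactly what response.ok means: it is True for any status code below 400, not only for 200. The sketch below checks this on bare Response objects built by hand, purely for illustration; real responses come from requests.get() and friends. The helper name is_ok is hypothetical.

```python
import requests

def is_ok(status_code):
    # Hand-built Response used only to demonstrate .ok; not how you normally get one.
    response = requests.Response()
    response.status_code = status_code
    return response.ok

for code in (200, 301, 404, 500):
    print(code, is_ok(code))
# 200 True
# 301 True
# 404 False
# 500 False
```

Since requests follows redirects by default, you will rarely see a 3xx status in practice, but the check itself treats it as ok.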

---

File Uploads: Sending Files with POST

File uploads are actually a type of POST request, and requests can handle that too.

# Open the file in binary mode; the with block closes it once the upload finishes
with open('example.txt', 'rb') as f:
    response = requests.post('https://httpbin.org/post', files={'file': f})
print(response.text)

See? Uploading a file only takes the files= parameter; requests builds the multipart/form-data body for you.
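To peek at what requests builds for an upload, the sketch below prepares the request without sending it. An in-memory file stands in for example.txt, and its contents are made up. The tuple form (filename, file object, content type) lets you set the filename and type explicitly.

```python
import io
import requests

# In-memory stand-in for open('example.txt', 'rb'); the contents are illustrative.
fake_file = io.BytesIO(b'hello from example.txt')
files = {'file': ('example.txt', fake_file, 'text/plain')}

prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

# The Content-Type starts with multipart/form-data and carries a generated boundary.
print(prepared.headers['Content-Type'])

# The body contains the filename and the file contents.
print(b'filename="example.txt"' in prepared.body)   # True
print(b'hello from example.txt' in prepared.body)   # True
```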

---

Adding Custom Request Headers

Some websites are sensitive to request headers, such as requiring a specific User-Agent. Don’t worry; requests can handle that too.

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
response = requests.get('https://httpbin.org/headers', headers=headers)
print(response.text)

Pass custom request headers using the headers parameter, simulating browser behavior with ease.
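Why do some sites treat requests differently from a browser? A likely reason is the default User-Agent, which announces the library by name. The sketch below prints that default and then shows a custom header on a prepared (unsent) request; the browser string is the same illustrative one used above.

```python
import requests

# The headers requests sends when you don't specify any.
defaults = requests.utils.default_headers()
default_ua = defaults['User-Agent']
print(default_ua)  # starts with 'python-requests/'

# A custom User-Agent set on a single request.
custom = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
prepared = requests.Request('GET', 'https://httpbin.org/headers', headers=custom).prepare()
print(prepared.headers['User-Agent'])
```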

---

Session Management: Operations After Login

After logging into a website, every later request usually has to carry the same login state. That is what a session is for.

session = requests.Session()
payload = {'username': 'python', 'password': '123456'}

# Login (httpbin.org/post only echoes the form back; a real login endpoint would set a session cookie)
session.post('https://httpbin.org/post', data=payload)

# Access other pages after logging in
response = session.get('https://httpbin.org/cookies')
print(response.text)

A requests.Session() stores the cookies it receives and sends them back on later requests, so the logged-in state carries over automatically.
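The cookie handling can be sketched without a real login. Below, a cookie is placed in the session by hand, as if a login response had set it, and session.prepare_request() shows what the next request would carry. The cookie name and value are made up.

```python
import requests

session = requests.Session()

# Pretend a login response set this cookie; name and value are illustrative.
session.cookies.set('sessionid', 'abc123', domain='httpbin.org', path='/')

# Build (but don't send) a follow-up request to the same site.
same_site = session.prepare_request(requests.Request('GET', 'https://httpbin.org/cookies'))
print(same_site.headers['Cookie'])  # sessionid=abc123

# A request to a different host does not receive the cookie.
other_site = session.prepare_request(requests.Request('GET', 'https://example.com/'))
print('Cookie' in other_site.headers)  # False
```

The session only sends a cookie to the domain it belongs to, which is the behavior you want when one script talks to several sites.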

---

Handling Timeouts and Exceptions: Stay Calm!

Network requests are not always smooth sailing; they may time out or fail outright. Always include some error handling in your code.

Setting a Timeout

try:
    response = requests.get('https://httpbin.org/delay/5', timeout=3)
    print(response.text)
except requests.exceptions.Timeout:
    print("Request timed out!")

Catching Other Exceptions

try:
    response = requests.get('https://example.com', timeout=3)
    response.raise_for_status()  # Raises an exception for 4xx or 5xx response status codes
except requests.exceptions.RequestException as e:
    print("Request failed:", e)
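When you want to react differently to different failures, the order of the checks matters, because ConnectTimeout inherits from both ConnectionError and Timeout. The sketch below uses a hypothetical helper, describe, and exercises it with exceptions built by hand rather than real network failures.

```python
import requests

def describe(exc):
    """Map a requests exception to a short label, most specific check first."""
    if isinstance(exc, requests.exceptions.Timeout):
        return 'timeout'       # checked before ConnectionError on purpose
    if isinstance(exc, requests.exceptions.ConnectionError):
        return 'connection'
    if isinstance(exc, requests.exceptions.HTTPError):
        return 'http'
    if isinstance(exc, requests.exceptions.RequestException):
        return 'other'
    raise TypeError('not a requests exception')

# raise_for_status() on a hand-built 404 response raises HTTPError.
fake = requests.Response()
fake.status_code = 404
try:
    fake.raise_for_status()
except requests.exceptions.RequestException as e:
    http_label = describe(e)

print(http_label)                                         # http
print(describe(requests.exceptions.ConnectTimeout()))     # timeout
print(describe(requests.exceptions.ConnectionError()))    # connection
print(describe(requests.exceptions.TooManyRedirects()))   # other
```

In real code the same checks are usually written as a chain of except clauses, in the same order.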

---

Proxies: Essential Skills for Stealth Operations

If you need to send requests through a proxy server, such as for bypassing restrictions or hiding your IP address, requests also has built-in support.

proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',  # the key is the target URL's scheme; most proxies are themselves reached over http://
}
response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.text)

---

Advanced Techniques: Retry Mechanism

Some failures are temporary and go away if you simply try again. requests can retry automatically when you mount an HTTPAdapter configured with urllib3's Retry class.

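import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 3 times on these status codes, backing off between attempts
retries = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)

try:
    response = session.get('https://httpbin.org/status/500')
    print(response.status_code)
except requests.exceptions.RetryError as e:
    # Raised when every attempt returned a status listed in status_forcelist
    print("Retries exhausted:", e)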
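If you set up retries in more than one place, a small factory function keeps the configuration together. The sketch below is one possible shape, with a hypothetical name make_retrying_session. It passes raise_on_status=False, so that once retries run out the last response is returned instead of an exception being raised; whether you want that is a design choice. Nothing is sent here; the code only inspects the mounted adapter.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_retrying_session(total=3, backoff_factor=1):
    """Return a Session that retries on common transient server errors."""
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
        raise_on_status=False,  # hand back the final response rather than raising
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

session = make_retrying_session()

# get_adapter() returns the adapter that would handle a given URL.
mounted = session.get_adapter('https://httpbin.org/status/500')
print(mounted.max_retries.total)  # 3
```

It is still worth passing a timeout on each call, since retries don't limit how long a single attempt may hang.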

---

By now, you should have a basic understanding of the core functionalities of requests. Have you realized that this tool is not only powerful but also elegant? Using it to handle HTTP requests is simply delightful. When writing code, consider using requests more often, treating it as your network toolbox, and you won’t go wrong!
