Practical Analysis of Python Concurrency: ThreadPoolExecutor


Hello everyone, I am Cai Ge. Today we will talk about concurrency in Python, focusing on ThreadPoolExecutor. Concurrency matters most when a program spends much of its time waiting, for example on network requests or file operations: by overlapping those waits, the program can finish the same work in less time. We will walk through the concept, the basic usage, and some practical application scenarios of ThreadPoolExecutor.

What is Concurrency?

Concurrency refers to the ability of a program to handle multiple tasks within the same time period. In simple terms, it allows the program to execute other tasks while waiting for certain operations (such as network requests) to complete. Python provides various ways to achieve concurrency, among which ThreadPoolExecutor is a very convenient tool.

Tip:

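Concurrency is not the same as parallelism. Concurrency means multiple tasks make progress during the same time period, possibly by taking turns; parallelism means multiple tasks are executing at the same instant, for example on different CPU cores. In standard CPython, the global interpreter lock (GIL) lets only one thread run Python bytecode at a time, so a thread pool mainly helps with tasks that spend their time waiting on I/O.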
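
To make the tip concrete, here is a minimal sketch that runs the same three simulated waits one after another and then in a thread pool. The names wait_for_io, run_sequentially and run_in_threads are made up for illustration, and time.sleep stands in for a real I/O call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_io(seconds):
    """Stand-in for an I/O-bound call (assumption: sleeping models the wait)."""
    time.sleep(seconds)
    return seconds


def run_sequentially(delays):
    """Run the waits one after another and report the elapsed time."""
    start = time.perf_counter()
    results = [wait_for_io(d) for d in delays]
    return results, time.perf_counter() - start


def run_in_threads(delays):
    """Run the waits in a thread pool so they overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        results = list(executor.map(wait_for_io, delays))
    return results, time.perf_counter() - start


delays = [0.2, 0.2, 0.2]
seq_results, seq_time = run_sequentially(delays)
thr_results, thr_time = run_in_threads(delays)
print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

The threaded run should take roughly as long as the single longest wait, because the threads wait at the same time even though they are not computing at the same time.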

Overview of ThreadPoolExecutor

ThreadPoolExecutor is a class in the concurrent.futures module that allows us to create a thread pool to handle multiple tasks. The benefit of a thread pool is that we do not need to manage the lifecycle of threads ourselves; Python handles these details for us.
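
As a rough illustration of the lifecycle the pool manages for us, the sketch below creates an executor, submits one task, and shuts the pool down by hand, then does the same with a with statement. The square function is a made-up placeholder task.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Placeholder task used only for this illustration."""
    return x * x


# Manual lifecycle: we have to remember to shut the pool down ourselves.
executor = ThreadPoolExecutor(max_workers=2)
try:
    manual_result = executor.submit(square, 4).result()
finally:
    executor.shutdown(wait=True)  # block until the worker threads are done

# Context-manager form: shutdown(wait=True) is called when the block exits.
with ThreadPoolExecutor(max_workers=2) as pool:
    managed_result = pool.submit(square, 4).result()

print(manual_result, managed_result)
```

Both forms give the same result; the with form is shorter and cannot forget the shutdown step, which is why the rest of this article uses it.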

Basic Usage

First, we need to import ThreadPoolExecutor, and then we can use the with statement to create a thread pool. Here is a simple example:

from concurrent.futures import ThreadPoolExecutor
import time

def task(n):
    print(f"Task {n} started")
    time.sleep(2)  # Simulate a time-consuming operation
    print(f"Task {n} completed")
    return n * 2

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task, i) for i in range(5)]

for future in futures:
    print(f"Result: {future.result()}")

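In this example, we created a thread pool that can run up to 3 tasks at the same time. We submitted 5 tasks to the pool, so the remaining 2 wait in a queue until a worker thread becomes free. Leaving the with block waits for all submitted tasks to finish, which means every result is already available by the time the for loop calls future.result().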
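
One way to check the max_workers limit is to count how many tasks are running at once. The sketch below is an illustration only: the stats dictionary, the lock and tracked_task are hypothetical bookkeeping added for the measurement, and the sleep is shortened to 0.1 seconds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
stats = {"running": 0, "peak": 0}  # hypothetical counters, guarded by `lock`


def tracked_task(n):
    """Same shape as task(n), plus bookkeeping of concurrently running tasks."""
    with lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.1)  # simulate a time-consuming operation
    with lock:
        stats["running"] -= 1
    return n * 2


with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(tracked_task, i) for i in range(5)]

results = [future.result() for future in futures]
print(f"peak concurrency: {stats['peak']}, results: {results}")
```

The recorded peak can never exceed max_workers, no matter how many tasks are submitted.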

Running Results:

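Tasks 0, 1, and 2 start right away. About two seconds later they complete, and tasks 3 and 4 start on the freed-up threads. The exact interleaving of the "started" and "completed" lines varies from run to run, which is a manifestation of concurrency. The whole run takes about 4 seconds instead of the 10 seconds that running the tasks one by one would need, and the "Result" lines are printed at the end, in submission order.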
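
The loop in the example prints results in submission order. If it is more useful to handle each result as soon as its task finishes, as_completed from the same module is one option. Here is a minimal sketch, in which timed_task and the delay values are made up for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def timed_task(n, delay):
    """Sleep for `delay` seconds, then return n * 2 (illustrative task)."""
    time.sleep(delay)
    return n * 2


delays = {0: 0.3, 1: 0.1, 2: 0.2}  # task number -> simulated duration
completion_order = []

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each Future back to its task number so we can label the results.
    future_to_n = {executor.submit(timed_task, n, d): n for n, d in delays.items()}
    for future in as_completed(future_to_n):
        n = future_to_n[future]
        completion_order.append(n)
        print(f"Task {n} finished with result {future.result()}")
```

With these delays the shortest task will usually be reported first, although the exact order depends on thread scheduling.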

Handling Return Values and Exceptions

When using ThreadPoolExecutor, we can easily obtain the return value of each task. The return value of a task can be obtained through the Future object’s result() method. Additionally, we can handle exceptions that may occur during task execution.

def task_with_error(n):
    if n == 2:
        raise ValueError("Error occurred in task 2")
    return n * 2

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task_with_error, i) for i in range(5)]

for future in futures:
    try:
        print(f"Result: {future.result()}")
    except Exception as e:
        print(f"Caught exception: {e}")

Tip:

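An exception raised inside a task does not crash the program at the moment it occurs. It is stored in the Future and re-raised when result() is called, so wrapping result() in try...except lets us handle the failure of one task without losing the results of the others.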
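
As an alternative to try...except, a Future also offers an exception() method that returns the stored exception (or None) instead of raising it. The sketch below reuses task_with_error and sorts the outcomes into two dictionaries; the names successes and failures are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def task_with_error(n):
    if n == 2:
        raise ValueError("Error occurred in task 2")
    return n * 2


with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {n: executor.submit(task_with_error, n) for n in range(5)}

successes, failures = {}, {}
for n, future in futures.items():
    error = future.exception()  # None if the task finished without raising
    if error is None:
        successes[n] = future.result()
    else:
        failures[n] = error

print(f"Succeeded: {successes}")
print(f"Failed: {failures}")
```

Which style to use is a matter of taste: try...except keeps the happy path short, while exception() makes it easy to collect all failures for a report.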

Practical Application Scenarios

Network Requests

In network programming, ThreadPoolExecutor is very suitable for handling multiple HTTP requests. Suppose we want to scrape data from multiple websites; we can use a thread pool to speed up this process.

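import requests
from concurrent.futures import ThreadPoolExecutor

def fetch_url(url):
    # A timeout keeps one slow server from blocking a worker thread indefinitely
    response = requests.get(url, timeout=10)
    return response.status_code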

urls = ['https://www.example.com', 'https://www.python.org', 'https://www.github.com']
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(fetch_url, url) for url in urls]

for future in futures:
    print(f"Status code: {future.result()}")

In this example, we concurrently request multiple URLs to obtain their status codes.
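
In practice some requests fail, and one unreachable site should not stop the others. The sketch below shows one possible shape for that: fetch_all is a hypothetical helper that takes the fetching function as a parameter, and fake_fetch is an offline stand-in so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=3):
    """Run fetch(url) for every URL in a thread pool.

    Returns (results, errors): two dicts keyed by URL.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one bad URL should not stop the rest
                errors[url] = exc
    return results, errors


def fake_fetch(url):
    """Offline stand-in for a real HTTP call (assumption for this sketch)."""
    if "broken" in url:
        raise ConnectionError(f"cannot reach {url}")
    return 200


results, errors = fetch_all(
    ["https://www.example.com", "https://broken.invalid"], fake_fetch
)
print(results)
print(errors)
```

For real requests, a function such as fetch_url from the example can be passed in place of fake_fetch.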

File Operations

Another common scenario is file operations. We can use a thread pool to concurrently read multiple files, improving efficiency.

def read_file(filename):
    # Assumes UTF-8 text; adjust the encoding to match your files
    with open(filename, 'r', encoding='utf-8') as f:
        return f.read()

files = ['file1.txt', 'file2.txt', 'file3.txt']
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(read_file, file) for file in files]

for future in futures:
    print(f"File content: {future.result()[:50]}")  # Only print the first 50 characters
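
When the results are wanted in the same order as the inputs, executor.map is a compact alternative to submit. The following sketch creates three small temporary files so that it can run anywhere; the file names and contents are made up for illustration.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_file(path):
    """Read a whole text file (assumes UTF-8 content)."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    # Create throwaway sample files so the sketch is self-contained.
    paths = []
    for i in range(3):
        path = Path(tmp) / f"file{i + 1}.txt"
        path.write_text(f"contents of file {i + 1}", encoding="utf-8")
        paths.append(path)

    with ThreadPoolExecutor(max_workers=3) as executor:
        # map() yields results in the same order as `paths`
        contents = list(executor.map(read_file, paths))

for path, text in zip(paths, contents):
    print(f"{path.name}: {text[:50]}")
```

Note that with map, an exception from any task is raised when its result is reached during iteration, so submit plus try...except remains the better fit when individual files may be missing.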

Conclusion

Today we learned the basic usage of ThreadPoolExecutor, including how to create a thread pool, submit tasks, handle return values and exceptions, as well as some practical application scenarios. By using a thread pool, we can overlap the waiting time of I/O-bound tasks such as network requests and file reads, which often shortens the total running time considerably.

In actual programming, remember to set the size of the thread pool appropriately to avoid excessive threads leading to resource contention. I hope everyone can practice and explore more concurrency programming techniques!
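
There is no single right pool size. When max_workers is omitted, Python picks a default based on the CPU count (Python 3.8, for example, documents it as min(32, os.cpu_count() + 4)). If you prefer to choose the size yourself, one simple rule of thumb is sketched below; pick_max_workers and its cap of 32 are assumptions for illustration, not a recommendation from the library.

```python
from concurrent.futures import ThreadPoolExecutor


def pick_max_workers(task_count, cap=32):
    """Hypothetical helper: never more threads than tasks, never more than `cap`."""
    return max(1, min(task_count, cap))


tasks = list(range(5))
workers = pick_max_workers(len(tasks))

with ThreadPoolExecutor(max_workers=workers) as executor:
    doubled = list(executor.map(lambda n: n * 2, tasks))

print(f"workers: {workers}, results: {doubled}")
```

For I/O-bound work the best cap is usually found by measuring: increase it until the total time stops improving or the remote service starts pushing back.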

Take a break, and we will continue tomorrow!
