Python Concurrency Programming: The Ultimate Showdown from Multithreading to Asynchronous IO

1. Concurrency Models: Python’s “Three Pillars”

Python provides three core concurrency models, each with its own strengths:

  • Multithreading: Suitable for I/O-bound tasks
  • Multiprocessing: Suitable for CPU-bound tasks
  • Asynchronous IO: The king of high concurrency

Performance Comparison:

Model             Switching Overhead   Memory Usage   Applicable Scenarios
Multithreading    Low                  Low            File I/O / network requests
Multiprocessing   High                 High           CPU computation / isolation requirements
Asynchronous IO   Very low             Very low       High concurrency / long-lived connections
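
These rankings are easy to probe yourself. Below is a minimal micro-benchmark sketch (the names and the N=100 workload are illustrative assumptions) that overlaps N simulated I/O waits under threads and under asyncio. Both finish in roughly 0.1 s at this size; raising N makes the per-thread creation cost and memory footprint visible, while the coroutine version stays nearly flat.

import asyncio
import threading
import time

N = 100  # number of simulated I/O waits

def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")

def with_threads():
    # One OS thread per wait: creation and switching costs grow with N
    threads = [threading.Thread(target=time.sleep, args=(0.1,)) for _ in range(N)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def with_asyncio():
    # One event loop, N lightweight coroutine objects
    async def main():
        await asyncio.gather(*(asyncio.sleep(0.1) for _ in range(N)))
    asyncio.run(main())

timed("threads", with_threads)
timed("asyncio", with_asyncio)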

2. Basics: Model Principles and Basic Usage

1. Multithreading

import threading
import requests

lock = threading.Lock()
total_size = 0

def download(url):
    global total_size  # the function rebinds the module-level counter
    resp = requests.get(url)  # simulate a download task
    with lock:  # serialize updates to the shared counter
        total_size += len(resp.content)

urls = ["http://example.com"] * 5

threads = []
for url in urls:
    t = threading.Thread(target=download, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Total download size: {total_size} bytes")

2. Multiprocessing

from multiprocessing import Pool

def process_data(data):
    # CPU-bound computation
    return sum(i * i for i in data)

if __name__ == "__main__":
    # The __main__ guard is required where workers are spawned (Windows, macOS)
    with Pool(4) as p:
        results = p.map(process_data, [[1, 2, 3]] * 10)
    print(f"Computation result: {sum(results)}")

3. Asynchronous IO

import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main():
    urls = ["http://example.com"] * 5
    # One shared session lets aiohttp pool connections across requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    print(f"Received {len(results)} responses")

asyncio.run(main())
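
In practice you usually want to cap how many requests run at once; an unbounded gather can overwhelm the target server or exhaust local sockets. A minimal sketch with asyncio.Semaphore (the limit of 10 and the 50-URL list are arbitrary examples):

import asyncio
import aiohttp

async def fetch_limited(session, url, sem):
    async with sem:  # at most the semaphore's limit of requests in flight
        async with session.get(url) as resp:
            return await resp.text()

async def main():
    sem = asyncio.Semaphore(10)
    urls = ["http://example.com"] * 50
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(*(fetch_limited(session, u, sem) for u in urls))
    print(f"Received {len(results)} responses")

asyncio.run(main())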

3. Advanced: Core Challenges of the Models

1. Multithreading: The GIL and Race Conditions

import threading

# Error: data race because no lock protects the shared counter
counter = 0

def increment():
    global counter
    for _ in range(1000):
        # counter += 1 is several bytecodes (load, add, store);
        # the GIL can switch threads between them, losing updates
        counter += 1

threads = []
for _ in range(10):
    t = threading.Thread(target=increment)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Final count: {counter}")  # May be less than 10 * 1000 = 10000

# Fix: take the lock around every read-modify-write
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(1000):
        with lock:
            counter += 1
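
An alternative that avoids locking entirely is to give each thread its own result slot and combine after join; writing to distinct list indices is safe because the slots never overlap. A sketch of the same count done this way:

import threading

def count_into(results, index):
    # Each thread accumulates locally, then writes only to its own slot
    local = 0
    for _ in range(1000):
        local += 1
    results[index] = local

results = [0] * 10
threads = [threading.Thread(target=count_into, args=(results, i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Final count: {sum(results)}")  # Always 10000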

2. Multiprocessing: Inter-Process Communication

from multiprocessing import Process, Queue

def worker(q):
    q.put("Inter-process message")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # Prints "Inter-process message"; drain the queue before join()
    p.join()
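
A Queue pickles and copies each item between processes. When all you share is a single number, multiprocessing.Value is lighter and carries its own lock; a sketch:

from multiprocessing import Process, Value

def worker(counter):
    for _ in range(1000):
        with counter.get_lock():  # Value's built-in lock guards the update
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)  # "i" = signed int, stored in shared memory
    procs = [Process(target=worker, args=(counter,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(f"Final count: {counter.value}")  # 4000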

3. Asynchronous IO: Avoid Blocking the Event Loop

import time

# Error: synchronous blocking call inside a coroutine
async def bad_practice():
    time.sleep(5)  # Blocks the entire event loop; no other task can run
    return "done"

# Fix: the asynchronous version yields control to the loop while waiting
async def good_practice():
    await asyncio.sleep(5)
    return "done"

4. Practical Scenarios: Model Selection Guide

Scenario 1: Web Server

  • Multithreading: Flask/Django development server
  • Asynchronous IO: FastAPI + Uvicorn (supports tens of thousands of concurrent connections; see the sketch after this list)
  • Multiprocessing: Gunicorn + synchronous framework (utilizes multi-core CPUs)
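
As a taste of the async option, here is a minimal, hypothetical FastAPI endpoint; assuming the file is named app.py, run it with `uvicorn app:app`:

from fastapi import FastAPI
import asyncio

app = FastAPI()

@app.get("/ping")
async def ping():
    await asyncio.sleep(0.1)  # Simulated async I/O; the worker is free meanwhile
    return {"status": "ok"}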

Scenario 2: Data Collection

  • Multithreading: A crawler downloading multiple web pages in parallel
  • Asynchronous IO: Real-time collection of sensor data
  • Multiprocessing: Processing large CSV files (CPU-intensive parsing)

Scenario 3: Real-time Systems

  • Multithreading: GUI programs remain responsive
  • Asynchronous IO: WebSocket server handling long connections
  • Multiprocessing: Isolating critical computation tasks