Python Concurrency: Multiprocessing, Threading, and Asyncio

Overview

Python offers three main approaches for concurrent execution: multiprocessing, threading, and asyncio. Each solves different problems and has distinct use cases.

The Fundamental Problem: The GIL

Python's Global Interpreter Lock (GIL) is a mutex in CPython (the standard interpreter) that protects access to Python objects, preventing multiple threads from executing Python bytecode simultaneously. Understanding the GIL is crucial to choosing among the three approaches.
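A minimal sketch of the GIL's effect: two CPU-bound calls run sequentially versus in two threads. The function name `count` and the iteration count are illustrative; exact timings depend on your machine.

```python
import threading
import time

def count(n):
    # Pure-Python busy loop: CPU-bound bytecode, no I/O
    while n > 0:
        n -= 1

N = 2_000_000

# Sequential: two calls back to back
t0 = time.perf_counter()
count(N)
count(N)
seq = time.perf_counter() - t0

# Threaded: two threads, but the GIL serializes the bytecode
t0 = time.perf_counter()
threads = [threading.Thread(target=count, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
thr = time.perf_counter() - t0

print(f"sequential: {seq:.2f}s  threaded: {thr:.2f}s")
# On CPython, the threaded version is typically no faster (often slower
# due to lock contention), despite using two threads.
```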

1. Multiprocessing

How It Works

  • Creates separate Python processes, each with its own Python interpreter and memory space
  • Each process has its own GIL
  • True parallel execution on multiple CPU cores
  • Processes communicate via IPC (Inter-Process Communication) mechanisms

Best For

  • CPU-bound tasks: Heavy computations, data processing, mathematical operations
  • Tasks that need true parallelism
  • When you need to bypass the GIL

Pros

  • True parallel execution
  • Each process is isolated (crash in one doesn't affect others)
  • Can utilize all CPU cores

Cons

  • High memory overhead (each process carries its own interpreter state and its own copy of the data it works on)
  • Slower startup time
  • Communication between processes is expensive (serialization/deserialization)
  • More complex to share data

Example Use Case

from multiprocessing import Pool

def heavy_computation(n):
    return sum(i * i for i in range(n))

# Process 4 large computations in parallel.
# The __main__ guard is required on platforms that spawn workers
# (Windows, macOS by default).
if __name__ == "__main__":
    with Pool(4) as p:
        results = p.map(heavy_computation, [10_000_000] * 4)

2. Threading

How It Works

  • Creates multiple threads within a single process
  • All threads share the same memory space
  • Only one thread executes Python bytecode at a time (due to GIL)
  • Concurrency, not parallelism

Best For

  • I/O-bound tasks: File operations, network requests, database queries
  • When tasks spend time waiting (not computing)
  • When you need shared memory access

Pros

  • Lower memory overhead than multiprocessing
  • Fast context switching
  • Easy data sharing between threads
  • Good for I/O-bound operations

Cons

  • GIL prevents true parallel CPU execution
  • Race conditions and deadlocks possible
  • Not suitable for CPU-bound tasks
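The race-condition risk can be seen with a shared counter. A minimal sketch where a `threading.Lock` makes concurrent increments safe (`increment` is an illustrative name):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # Without the lock, read-modify-write can interleave
        # between threads and updates get lost
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 with the lock; often less without it
```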

Example Use Case

from threading import Thread
import requests

urls = ["https://example.com/a", "https://example.com/b"]  # example URLs

def download_file(url):
    response = requests.get(url)
    # Process response here

# Download multiple files concurrently
threads = [Thread(target=download_file, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

3. Asyncio

How It Works

  • Single-threaded cooperative multitasking
  • Uses an event loop to manage tasks
  • Tasks voluntarily yield control (using await)
  • Non-blocking I/O operations
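A minimal sketch of cooperative scheduling: both sleeps overlap because each await point yields control back to the event loop (`tick` is an illustrative name):

```python
import asyncio

async def tick(name, delay):
    await asyncio.sleep(delay)  # yields control to the event loop
    return name

async def main():
    # Both sleeps run concurrently, so total time is ~0.2s, not 0.3s
    return await asyncio.gather(tick("a", 0.1), tick("b", 0.2))

results = asyncio.run(main())
print(results)  # ['a', 'b'] — gather preserves argument order
```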

Best For

  • High-concurrency I/O-bound tasks: Web servers, API clients, websockets
  • When you have thousands of connections
  • Modern async libraries (aiohttp, asyncpg)

Pros

  • Extremely lightweight (can handle thousands of tasks)
  • Lower overhead than threads
  • Explicit control flow (you see where context switches with await)
  • Fewer race conditions (no preemptive switching; shared state can only change at await points)
  • Very efficient for I/O-bound work

Cons

  • Requires async-compatible libraries (aiohttp instead of requests, asyncpg instead of psycopg2, etc.)
  • A single blocking call stalls the entire event loop
  • Steeper learning curve
  • Regular blocking code must be offloaded to an executor rather than called directly

Example Use Case

import asyncio
import aiohttp

urls = ["https://example.com/a", "https://example.com/b"]  # example URLs

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# Handle thousands of concurrent requests efficiently
results = asyncio.run(main())

Quick Decision Guide

Task Type                         | Best Choice               | Why
----------------------------------|---------------------------|--------------------------------------
CPU-intensive calculations        | Multiprocessing           | Bypasses GIL, true parallelism
I/O operations (few connections)  | Threading                 | Simple, good for moderate concurrency
I/O operations (many connections) | Asyncio                   | Most efficient for high concurrency
Mixed CPU + I/O                   | Multiprocessing + Asyncio | Process pool for CPU, async for I/O

Performance Tips

1. Measure First

Always profile before optimizing. Use cProfile, line_profiler, or py-spy.
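As a sketch, cProfile can be driven programmatically to profile just the code you care about (`work` is a stand-in for your own function):

```python
import cProfile
import io
import pstats

def work():
    # Stand-in for the code under investigation
    return sum(i * i for i in range(100_000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# Print the 5 most expensive entries by cumulative time
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
output = s.getvalue()
print(output)
```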

2. Choose Based on Bottleneck

  • CPU-bound? → Multiprocessing
  • I/O-bound with moderate concurrency? → Threading
  • I/O-bound with high concurrency? → Asyncio

3. Avoid Common Pitfalls

  • Don't use multiprocessing for I/O-bound tasks (overhead outweighs benefits)
  • Don't use threading for CPU-bound tasks (GIL makes it slower)
  • Don't mix blocking code into asyncio (offload it with asyncio.to_thread() or loop.run_in_executor() if needed)
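A minimal sketch of offloading blocking code so the event loop stays responsive (`blocking_io` is a stand-in; `asyncio.to_thread` requires Python 3.9+):

```python
import asyncio
import time

def blocking_io():
    time.sleep(0.1)  # stand-in for a blocking call (e.g. requests.get)
    return "done"

async def main():
    # Runs the blocking function in a worker thread, awaiting its result
    # without stalling the event loop
    return await asyncio.to_thread(blocking_io)

result = asyncio.run(main())
print(result)  # done
```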

4. Optimal Worker Counts

  • Multiprocessing: Usually os.cpu_count() or os.cpu_count() - 1
  • Threading: Depends on wait time; often 2-10x cpu_count for I/O
  • Asyncio: Can handle thousands of tasks on one thread
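A sketch of sizing an I/O thread pool from os.cpu_count(); the 4x multiplier is just a common starting point within the 2-10x range above, not a rule:

```python
import os
from concurrent.futures import ThreadPoolExecutor

n_cores = os.cpu_count() or 1  # cpu_count() can return None

# I/O-bound work: oversubscribe, since threads spend most time waiting
with ThreadPoolExecutor(max_workers=n_cores * 4) as pool:
    # Trivial stand-in workload to show the API
    results = list(pool.map(len, ["a", "bb", "ccc"]))

print(results)  # [1, 2, 3]
```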

5. Hybrid Approaches

For complex applications, combine approaches:

# Process pool for CPU work, async for coordination
from concurrent.futures import ProcessPoolExecutor
import asyncio

async def process_data(data):
    loop = asyncio.get_running_loop()  # preferred over get_event_loop() inside a coroutine
    # In real code, create the pool once and reuse it across calls
    with ProcessPoolExecutor() as pool:
        # cpu_intensive_func is a placeholder for your CPU-bound function
        return await loop.run_in_executor(pool, cpu_intensive_func, data)

Memory and Overhead Comparison

Approach        | Memory per Task | Startup Time  | Context Switch Cost
----------------|-----------------|---------------|--------------------
Multiprocessing | ~10-50 MB       | Slow (100ms+) | High
Threading       | ~8 MB           | Medium (~1ms) | Medium
Asyncio         | ~1-5 KB         | Fast (<0.1ms) | Very Low

Real-World Example: Web Scraping

# Scraping 100 URLs (I/O-bound fetching plus light parsing).
# Illustrative timings; actual numbers depend on network and hardware:
#
# Asyncio:         ~5 seconds  (optimal for high-concurrency I/O)
# Threading:       ~30 seconds (I/O waits overlapped, but heavier per task)
# Multiprocessing: ~60 seconds (process overhead dominates for I/O work)

Key Takeaway

There's no "best" approach—only the right tool for your specific problem. Understanding your workload (CPU vs I/O bound) and concurrency needs (10 vs 10,000 tasks) is essential for making the right choice.