# Musings on Python Concurrency

Concurrency in Python often feels unintuitive because it does not behave the same way as concurrency in many other programming languages. We usually reach for concurrency to make programs faster, yet in Python the story is, let's say, a bit more interesting. It is the culmination of design choices made by the people who created Python, plus concurrency libraries trying their best to work within those constraints. Python 3.14 also lets you run a free-threaded build, and we will explore the pros and cons of using that as well.

### What does concurrency mean

**Concurrency** refers to a program’s ability to execute multiple computations within a given time frame. These computations may:

- Run at the same time
- Overlap in execution
- Or interleave over time

You can achieve this either by spawning multiple threads within a single process or by spawning multiple processes that run in parallel with separate memory spaces. Before diving into multiprocessing and multithreading in Python, it helps to clarify what threads and processes actually are.

**Thread**: A **thread** is the smallest unit of execution inside a process.

**Process**: A **process** is an independent execution environment with its own memory space.

#### Multithreading and Multiprocessing

Multithreading is like having many people work on the same computer, whereas multiprocessing is like having many computers, each with its own worker.

In most programming languages:

- Multithreading can be used for both I/O-bound and CPU-bound tasks.
- Multiprocessing is typically used when you want stronger isolation or better CPU scaling.

### Why Python feels different

Python provides libraries for both threading and multiprocessing, but they behave differently from what you might expect if you come from other languages. In most programming languages, the relationship between threads and CPU cores is straightforward:

- More threads → More cores used → More parallel CPU execution.
If you create 8 threads on an 8-core machine, you generally expect all 8 cores to be busy at the same time. But in Python, that expectation quietly breaks: threads do not execute Python code in parallel across cores. The reason is the Global Interpreter Lock (GIL).

#### The GIL

The key difference is that Python is not just “running on your CPU”. It runs inside something called the **Python interpreter** (specifically, CPython in most environments), and that interpreter enforces a rule:

> *"Only one thread may execute Python bytecode at a time."*

Instead of each thread running on its own core simultaneously,

<img src="multi_thread_app.png" style="display: block; margin: 0 auto;">

threads take turns holding the lock and executing bytecode, creating the appearance of concurrency even though only one thread is actually executing Python instructions at any given moment.

<img src="multi_thread_py.png" style="display: block; margin: 0 auto;">

In other words, threads don’t provide true CPU parallelism for Python code. So what's the point?

Even though only one thread can execute Python bytecode at a time because of the GIL, the interpreter can temporarily release the GIL during blocking I/O operations. When a thread performs an I/O operation, such as waiting for a network response, reading from disk, or querying a database, it enters a waiting state. During this time the thread is not actively using the CPU; it is paused until the operating system signals that the I/O operation has completed. Without concurrency for these kinds of tasks, you would waste precious CPU cycles that could be put to better use.

But if we apply the same approach to CPU-heavy tasks, we just end up switching between chunks of bytecode and gain no performance improvement. So how do we run CPU-heavy workloads?

#### Multiprocessing in Python

Instead of running multiple threads inside one interpreter process, Python starts multiple interpreter processes.
Each process has its own memory space and its own GIL, which means they can truly run in parallel on different CPU cores.

<img src="multi_proc_py.png" style="display: block; margin: 0 auto;">

The tradeoff is that processes do not share memory by default. Data must be serialized and transferred between processes, typically using pickle. This introduces overhead and can become expensive if large objects need to be exchanged frequently.

Multiprocessing works best when tasks are computationally heavy and mostly independent. If each task can run on its own data and produce a result without constant communication with other tasks, multiprocessing scales very well.

#### How about doing both?

<img src="multi_proc_and_thread.png" style="display: block; margin: 0 auto;">

A very common real-world example is the Python web server Gunicorn. It uses a master process that spawns multiple worker processes, and each worker can also run multiple threads to handle concurrent requests.

#### What if you have too many threads on the CPU?

Spawning far more threads than the machine has cores does not make a program faster. Every extra thread adds scheduling and context-switching overhead for the operating system, and in Python the threads also contend for the GIL. Beyond a certain point, more threads just means more time spent switching between them instead of doing useful work.

### Free Threaded Python