I was playing around with the threading and multiprocessing libraries in Python recently when I came across a blog post about Python making the GIL optional in 3.13. Along with learning the tools of the trade for writing concurrent code, I was curious about the difference between the two libraries.

The following is the code:

import threading
import multiprocessing
import math
import time


def timer(func):
	# Simple decorator that reports how long the wrapped function takes
	def wrapper():
		start = time.perf_counter()
		func()
		end = time.perf_counter()
		print(f"Time taken for {func.__name__}\t: {end - start} s")
	return wrapper


def factorial(n):
	return math.factorial(n)


@timer
def single_threaded():
	# Compute every factorial one after another on the main thread
	for i in range(1000, 10000):
		factorial(i)


@timer
def multi_threaded():
	# One thread per input: start them all, then wait for all of them
	threads = []
	for i in range(1000, 10000):
		thread = threading.Thread(target=factorial, args=(i,))
		threads.append(thread)
		thread.start()

	for thread in threads:
		thread.join()


@timer
def multi_process():
	# One process per input: start them all, then wait for all of them
	processes = []
	for i in range(1000, 10000):
		process = multiprocessing.Process(target=factorial, args=(i,))
		processes.append(process)
		process.start()

	for process in processes:
		process.join()


if __name__ == "__main__":
	single_threaded()
	multi_threaded()
	multi_process()

For range(1, 1000), the times were:

Time taken for single_threaded  : 0.010678458000029423 s
Time taken for multi_threaded   : 0.05921614600003977 s
Time taken for multi_process    : 0.3115545729999667 s

For range(1000, 10000), the times were:

Time taken for single_threaded  : 7.120449294999958 s
Time taken for multi_threaded   : 11.34618227300001 s
Time taken for multi_process    : 6.9518359699999905 s

For smaller numbers, I found that the single-threaded code was faster than the other two, which was shocking at first. It made no sense that a single CPU thread could beat code running concurrently across multiple threads or processes. On reflection, though, it makes sense: each factorial is so cheap that the cost of creating and managing thousands of threads or processes dwarfs the computation itself. Multithreading and multiprocessing only take over with larger numbers, where each task is expensive enough to justify that overhead. In other words, optimize only when it's really required and can produce a net improvement.
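To make that overhead concrete, here is a minimal sketch (separate from the benchmark above, with numbers that will vary by machine) that times spawning a single do-nothing process against computing one small factorial directly:

import math
import multiprocessing
import time


def noop():
	pass


if __name__ == "__main__":
	# Cost of starting and joining one process that does no work at all
	start = time.perf_counter()
	p = multiprocessing.Process(target=noop)
	p.start()
	p.join()
	print(f"process spawn overhead: {time.perf_counter() - start} s")

	# Cost of the actual work for a small input
	start = time.perf_counter()
	math.factorial(1000)
	print(f"factorial(1000)       : {time.perf_counter() - start} s")

The spawn cost alone is typically on the order of milliseconds, while factorial(1000) takes microseconds, so paying that cost per input can never win for small numbers.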

What is interesting here is that even with larger numbers, the multi-threaded performance was still a lot slower than that of the single-threaded code. The multi-process code worked as per my expectations and bagged first place.

So in Python, threads are not truly parallel. Each thread has its own stack and such, but the GIL (Global Interpreter Lock) ensures that only one thread executes Python bytecode at a time, so all the threads effectively run on a single core while the interpreter merely switches between them. For CPU-bound code like this, multithreading is pure loss: the computing resources stay the same, but the overhead of creating and scheduling 9000 threads piles on top. A speed-up can be expected with multithreading only when the program is not CPU bound, because a thread that is blocked waiting on I/O releases the GIL and lets the other threads run.
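To see that flip side, here is a minimal sketch of an I/O-bound workload, with the waiting simulated by time.sleep rather than real file or network I/O. Because a sleeping (or I/O-blocked) thread releases the GIL, ten threads can all wait at the same time:

import threading
import time


def fake_io():
	# Simulates waiting on a file or socket; sleeping releases the GIL
	time.sleep(0.5)


def sequential():
	for _ in range(10):
		fake_io()


def threaded():
	threads = [threading.Thread(target=fake_io) for _ in range(10)]
	for t in threads:
		t.start()
	for t in threads:
		t.join()


if __name__ == "__main__":
	start = time.perf_counter()
	sequential()
	print(f"sequential: {time.perf_counter() - start} s")  # roughly 5 s

	start = time.perf_counter()
	threaded()
	print(f"threaded  : {time.perf_counter() - start} s")  # roughly 0.5 s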

Multiprocessing sidesteps the GIL by giving each task its own interpreter, so it can speed up computation here, but it still won't speed up file I/O: there the bottleneck is the disk, not the CPU.
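As a closing note, if I were parallelizing this CPU-bound workload for real, a worker pool would be more idiomatic than one process per input, since it reuses a fixed set of processes instead of paying the startup cost 9000 times. A minimal sketch of that variant (not the code I benchmarked above):

import math
import multiprocessing


def factorial(n):
	return math.factorial(n)


if __name__ == "__main__":
	# One worker per CPU core by default; each worker handles many inputs
	with multiprocessing.Pool() as pool:
		pool.map(factorial, range(1000, 10000))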