Multiprocessing in Python

If the machine running the code has multiple cores, multiprocessing (running code in multiple child processes) lets us do tasks in parallel. Doing tasks in parallel reduces the overall execution time.
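As a quick check before reaching for multiprocessing, the following sketch prints how many CPU cores Python can see on the current machine (the printed count depends entirely on your hardware):

import multiprocessing as mp

# Number of CPU cores visible to Python on this machine.
# If this is greater than 1, child processes can truly run in parallel.
print(f'Available CPU cores: {mp.cpu_count()}')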

Sequential execution

testFunction is an example function that sleeps for a given number of seconds. In the following snippet, we call this function twice, sequentially. Each call takes 2 seconds to complete; thus, the total execution time is around 4 seconds.

from time import time, sleep

def testFunction(sleepTime):
   print(f'Sleeping ... Waking up in {sleepTime} second(s)')
   sleep(sleepTime)
   print(f'Woke up from sleep of {sleepTime} second(s).')

startTimestamp = time()

testFunction(2)

testFunction(2)

endTimestamp = time()
executionTimeInSeconds = endTimestamp - startTimestamp
print(f'Total execution time in seconds: {executionTimeInSeconds}')

Following are the logs from a run of the above snippet. We can see that the execution time is around 4 seconds.

> python ./test_for_blog.py
Sleeping ... Waking up in 2 second(s)
Woke up from sleep of 2 second(s).
Sleeping ... Waking up in 2 second(s)
Woke up from sleep of 2 second(s).
Total execution time in seconds: 4.004408836364746

multiprocessing.Process class

The Process class of the multiprocessing module helps in creating processes and running code in them. The constructor of this class accepts the following keyword arguments.

target (optional; default: None): callable object to be invoked in the process.
name (optional; default: 'Process-N' for the Nth child): name of the process, used for identification.
args (optional; default: empty tuple): tuple of positional arguments for calling the target.
kwargs (optional; default: empty dictionary): dictionary of keyword arguments for the target.
daemon (optional; default: inherited from the parent process): boolean flag indicating whether the process is a daemon.

The following snippet rewrites the earlier example using the Process class, running testFunction once in each of two child processes.

from time import time, sleep
import multiprocessing as mp

def testFunction(sleepTime):
   print(f'Sleeping ... Waking up in {sleepTime} second(s)')
   sleep(sleepTime)
   print(f'Woke up from sleep of {sleepTime} second(s).')

if __name__ == '__main__':
   startTimestamp = time()

   p1 = mp.Process(target=testFunction, args=(2,), name='Child process 1')
   p2 = mp.Process(target=testFunction, kwargs={'sleepTime': 2}, name='Child process 2')

   p1.start()
   p2.start()

   p1.join()
   p2.join()

   endTimestamp = time()
   executionTimeInSeconds = endTimestamp - startTimestamp
   print(f'Total execution time in seconds: {executionTimeInSeconds}')

In the above snippet, we create 2 processes and call our target function once in each of them. Let’s discuss the snippet in more detail.

The if __name__ == '__main__' condition wraps all our multiprocessing code. With the spawn start method, every child process re-imports the main module, and this guard keeps the process-creation code from running again inside the children. If we do not use this condition, we get a RuntimeError telling us that a new process was started before the current process finished its bootstrapping phase.

The start() method launches the child process, which then calls the target with the supplied arguments.

The join() method blocks the parent process until the child process terminates. Note: If you are creating new processes in a loop, do not call join inside the same loop, because each iteration would then wait for its child to finish and the code would behave sequentially. Start all the processes first and join them afterwards, as shown in the sketch below.
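To illustrate that note, here is a minimal sketch of the loop pattern (the count of 3 child processes and the reuse of testFunction are assumptions made just for this example): start all the processes in one loop, then join them in a second loop.

from time import sleep
import multiprocessing as mp

def testFunction(sleepTime):
    print(f'Sleeping ... Waking up in {sleepTime} second(s)')
    sleep(sleepTime)
    print(f'Woke up from sleep of {sleepTime} second(s).')

if __name__ == '__main__':
    processes = []

    # First loop: only start the processes. Calling join() here would make
    # each iteration wait for its child, turning the run sequential.
    for i in range(3):
        p = mp.Process(target=testFunction, args=(2,), name=f'Child process {i + 1}')
        p.start()
        processes.append(p)

    # Second loop: join all started processes so the parent waits for
    # every child to finish.
    for p in processes:
        p.join()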

Following are the logs from a run of the two-process snippet above.

> python ./test_for_blog.py
Sleeping ... Waking up in 2 second(s)
Sleeping ... Waking up in 2 second(s)
Woke up from sleep of 2 second(s).
Woke up from sleep of 2 second(s).
Total execution time in seconds: 2.070202112197876

With the work split across two processes, the total execution time comes to around 2 seconds, not the 4 seconds we saw with sequential execution. Parallel processing saved TIME.

Python 3 NEWS for multiprocessing

Starting with Python 3.8, the spawn method is the default process start method on macOS, taking preference over the fork method. The change was made because fork is considered unsafe on macOS and can lead to crashes in subprocesses.
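If you want to be explicit instead of relying on the platform default, the start method can be selected once per program with set_start_method. A minimal sketch follows (the greet function is a made-up example for illustration):

import multiprocessing as mp

def greet(name):
    print(f'Hello from a child process, {name}!')

if __name__ == '__main__':
    # Select the 'spawn' start method explicitly. This can be called at most
    # once, before any child processes are created.
    mp.set_start_method('spawn')
    print(f'Start method in use: {mp.get_start_method()}')

    p = mp.Process(target=greet, args=('reader',))
    p.start()
    p.join()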

Stay tuned

This was the first blog in a series on Python multiprocessing. Stay tuned for the next blog, which will explore the concept of Process Pools.

Kedar Chandrayan


I focus on understanding the WHY of each requirement. Once this is clear, then HOW becomes easy. In my blogs too, I try to take the same approach.