Parallelizing a Numpy Vector Operation

Python Server Side Programming Programming

Numpy is a powerful Python library that serves to store and manipulate large, multi-dimensional arrays. Although it is fast and more efficient than other similar collections like lists, we can further enhance its performance by using the parallelizing mechanism. Parallelizing means splitting the tasks into multiple processes to achieve one single goal. Python provides several ways to parallelize a numpy vector operation including multiprocessing and numexpr module.

Python Programs for Parallelizing a NumPy Vector Operation

Let's discuss the ways to parallelize a numpy vector:

Using multiprocessing

Every Python program is considered as one single process and sometimes it might require to run multiple processes concurrently. For this, Python provides a module named multiprocessing that has a built-in method named 'Pool()' which allows the creation and execution of multiple tasks at the same time.

Example

The following example illustrates how to square each element of the vector using parallelization with the help of multiprocessing.

Approach

The first step is to import the numpy library with reference name 'np' and multiprocessing with 'mp'.

Next, create a user-defined method along with a parameter.

Inside this method, determine the number of available CPU processes using cpu_count(). This value will be further used to create a pool of worker processes for parallel computation.
Then, create a pool of processes using Pool that takes 'num_processes' as an argument that specifies the number of CPU processes available.
Now, use the map method to apply the square() method to each element of the input vector. This method will divide the input vector into chunks and assign each chunk to a worker process for computation. The map function automatically distributes the workload across the available processes and returns the results in the same order as the input vector.
Once the mapping is done, we will close the pool using close() method and wait for all the worker processes to finish using the join() method.
In the end, return the result which is a list of squared values obtained through parallel computation.

Now, create a numpy vector and pass it as an argument to the method and display the result.

# importing required packages
import numpy as np
import multiprocessing as mp
# user-defined method to print square of vector
def square_vector_parallel(vector):
   num_processes = mp.cpu_count()
   pool = mp.Pool(processes = num_processes)
   result = pool.map(np.square, vector)
   pool.close()
   pool.join()
   return result
# creating a numpy vector
vec_tr = np.array([1, 2, 3, 4, 5])
# calling the method
result = square_vector_parallel(vec_tr)
# printing the result
print(result)

Output

[1, 4, 9, 16, 25]

Using numexpr

This package of Python has the capability to parallelize the computation and utilize multiple cores or SIMD instructions that results in fast and efficient performance of NumPy vector.

Example 1

In the following example, we will take two numpy vectors and perform a parallel addition operation on them using 'numexpr' library.

# importing required packages
import numpy as np
import numexpr as nex
# creating two numpy vectors  
a1 = np.array([5, 2, 7, 4, 5])
a2 = np.array([4, 8, 3, 9, 5])
# printing the result
print(nex.evaluate('a1 + a2'))

Output

[ 9 10 10 13 10]

Example 2

This is another example that demonstrates the use of numexpr. We will create a user-defined method that takes a vector as a parameter. Inside this method define an expression expr as 'vector**2', which squares each element of the input vector and passes it to the 'evaluate()' method of numexpr to evaluate the expression in a parallelized manner.

import numpy as np
import numexpr as nex
# user-defined method to print square of vector
def square_vector_parallel(vector):
   expr = 'vector**2'
   result = nex.evaluate(expr)
   return result
# creating a numpy vector
vec_tr = np.array([4, 8, 6, 9, 5])
# calling the method
result = square_vector_parallel(vec_tr)
# printing the result
print(result)

Output

[16 64 36 81 25]

Example 3

In the same code of the previous example, we explicitly set the number of threads to 4 using 'set_num_threads()' method. This allows us to perform thread-level parallelism in the evaluation of the expression.

import numpy as np
import numexpr as nex
# user-defined method to print square of vector
def square_vector_parallel(vector):
# Set the number of threads to utilize
   nex.set_num_threads(4)  
   expr = 'vector**2'
   result = nex.evaluate(expr)
   return result
# creating a numpy vector
vec_tr = np.array([4, 8, 6, 9, 5])
# calling the method
result = square_vector_parallel(vec_tr)
# printing the result
print(result)

Output

[16 64 36 81 25]

Conclusion

In this article, we have discussed several example programs to demonstrate how to parallelize a numpy vector operation. We have used two most popular and widely used Python libraries for concurrent operations named multiprocessing and numexpr.

Shriansh Kumar

Updated on: 21-Jul-2023

162 Views

Kickstart Your Career

Get certified by completing the course

Get Started