How to normalize a NumPy array so the values range exactly between 0 and 1?


NumPy is a powerful library in Python for numerical computing that provides an array object for the efficient handling of large datasets. Often, it is necessary to normalize the values of a NumPy array to ensure they fall within a specific range. One common normalization technique is to scale the values between 0 and 1.

In this article, we will learn how to normalize a NumPy array so the values range exactly between 0 and 1. We will see the different approaches that can be used to achieve this using NumPy, along with syntax and complete examples.

Approaches

There are various approaches or methods through which we can easily normalize a NumPy array so the values range exactly between 0 and 1. Let’s see some of the commonly used approaches along with their syntax and examples −

Approach 1: Using Min-Max Normalization

The first approach is Min-Max normalization, a form of feature scaling that rescales the values to the range 0 to 1 using the minimum and maximum values in the array. The minimum maps to exactly 0 and the maximum to exactly 1, provided the array contains at least two distinct values. This method is widely used and straightforward to implement.

Syntax

Below is the syntax for using min-max normalization to normalize an array to range exactly between 0 and 1 −

normalized_arr = (arr - min_val) / (max_val - min_val)

Example

In the given example, we use the min-max normalization to rescale the values in a range from 0 to 1 based on the minimum and maximum values in the array. For example, if we have an array [10, 4, 5, 6, 2, 8, 11, 20], the minimum value is 2 and the maximum value is 20. By subtracting the minimum value from each element and dividing it by the range (max - min), we can obtain normalized values between 0 and 1.

#import numpy module
import numpy as np

#define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Find the minimum and maximum values in the array
my_min_val = np.min(my_arr)
my_max_val = np.max(my_arr)

# Perform min-max normalization
my_normalized_arr = (my_arr - my_min_val) / (my_max_val - my_min_val)
print(my_normalized_arr)

Output

[0.44444444 0.11111111 0.16666667 0.22222222 0.         0.33333333 0.5        1.        ]
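
The formula above divides by (max - min), which is zero when every element has the same value. The following is a minimal sketch of one way to guard against that case. The helper name min_max_normalize and the choice to return zeros for a constant array are assumptions made for illustration, not part of NumPy.

```python
import numpy as np

def min_max_normalize(arr):
    """Scale arr to [0, 1]. Hypothetical helper, not a NumPy function."""
    arr = np.asarray(arr, dtype=float)
    min_val = arr.min()
    value_range = arr.max() - min_val
    if value_range == 0:
        # All elements are equal, so the formula would divide by zero.
        # Returning zeros is one possible convention (an assumption).
        return np.zeros_like(arr)
    return (arr - min_val) / value_range

scaled = min_max_normalize([10, 4, 5, 6, 2, 8, 11, 20])
constant = min_max_normalize([7, 7, 7])

print(scaled.min(), scaled.max())
print(constant)
```

For the sample array, the smallest element (2) maps to 0 and the largest (20) maps to 1, while the constant input is returned as all zeros instead of producing NaN values.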

Approach 2: Using Z-score Normalization

The second approach is Z-score normalization, also known as standardization. It transforms the values to have a mean of 0 and a standard deviation of 1. Unlike Min-Max normalization, it does not confine the values to the range 0 to 1: the results are centered on 0 and can be negative or greater than 1.

Syntax

Below is the syntax for standardizing an array with Z-score normalization:

mean_val = np.mean(arr)
std_val = np.std(arr)
normalized_arr = (arr - mean_val) / std_val

Example

In the given example, we use Z-score normalization to standardize the array [10, 4, 5, 6, 2, 8, 11, 20] by subtracting the mean and dividing by the standard deviation. Although it does not guarantee values between 0 and 1, it is commonly used for statistical analysis.

#import numpy module
import numpy as np

#define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Calculate the mean and standard deviation of the array
my_mean_val = np.mean(my_arr)
my_std_val = np.std(my_arr)

# Perform z-score normalization
my_normalized_arr = (my_arr - my_mean_val) / my_std_val
print(my_normalized_arr)

Output

[ 0.33258004 -0.80769437 -0.61764864 -0.4276029  -1.18778585 -0.04751143  0.52262577  2.23303739]
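
Since Z-scores are not guaranteed to fall between 0 and 1, it can help to check the result directly. The sketch below confirms that the standardized values have a mean of about 0 and a standard deviation of about 1. It then applies the min-max formula from Approach 1 to the Z-scores, which is one possible way to bring them into the range 0 to 1. The variable names z_scores and z_rescaled are illustrative.

```python
import numpy as np

my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Standardize: subtract the mean, divide by the (population) standard deviation
z_scores = (my_arr - my_arr.mean()) / my_arr.std()

# The result is centered on 0 with unit spread, not confined to [0, 1]
print(np.isclose(z_scores.mean(), 0.0), np.isclose(z_scores.std(), 1.0))
print(z_scores.min() < 0, z_scores.max() > 1)

# Applying min-max scaling afterwards maps the z-scores onto [0, 1]
z_rescaled = (z_scores - z_scores.min()) / (z_scores.max() - z_scores.min())
print(z_rescaled.min(), z_rescaled.max())
```

Because both steps are linear transformations, rescaling the Z-scores this way gives the same values as applying Min-Max normalization to the original array.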

Approach 3: Using Rescaling Division

The third approach is rescaling by division. It is useful when we already have a specific maximum value in mind: dividing each element of the array by that value scales the data relative to it. The results lie between 0 and 1 only if all values are non-negative and none exceeds the chosen maximum, and the smallest element maps to 0 only if it is 0 itself.

Syntax

Below is the syntax for rescaling an array by dividing it by a maximum value:

normalized_arr = arr / max_val

Example

In the below example, we have used the rescaling by division approach which allows for direct scaling of the array's values using a specific maximum value. For an array [10, 4, 5, 6, 2, 8, 11, 20] and a chosen maximum value of 20, dividing each element by 20 yields the normalized array. This method can be useful when a specific maximum value is desired.

#import numpy module
import numpy as np

#define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

#define the max value
my_max_val = 20

# Perform rescaling by dividing each element by the maximum value
my_normalized_arr = my_arr / my_max_val
print(my_normalized_arr)

Output

[0.5  0.2  0.25 0.3  0.1  0.4  0.55 1.  ]
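
Dividing by a value smaller than the largest element produces results above 1, so the divisor has to be at least as large as every element. The sketch below uses the array's own maximum as the divisor, which is an assumption about how the maximum is chosen, and shows that the smallest element does not map to 0 with this method.

```python
import numpy as np

my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Divide by the array's own maximum (an assumption; any upper bound
# at least as large as every element also keeps results at or below 1)
scaled = my_arr / my_arr.max()

print(scaled.max())  # 1.0
print(scaled.min())  # 0.1, not 0, because the smallest element is 2
```

If the minimum must map to exactly 0, the Min-Max formula from Approach 1 is the better fit.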

Approach 4: Using Sklearn MinMaxScaler

The fourth and last approach is Scikit-learn's MinMaxScaler. This class scales data to a specified range, in this case 0 to 1, using the same min-max formula as Approach 1. It preserves the relative spacing of the values while ensuring they fall within the desired range.

Syntax

Below is the syntax for using sklearn MinMaxScaler to normalize an array to range exactly between 0 and 1 −

scaler = MinMaxScaler(feature_range=(0, 1))
normalized_arr = scaler.fit_transform(arr.reshape(-1, 1)).flatten()

Example

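In the given example, we use Scikit-learn's MinMaxScaler, which provides a convenient way to normalize an array to a desired range, such as 0 to 1. We fit the MinMaxScaler to the array [10, 4, 5, 6, 2, 8, 11, 20] and apply the transformation in a single step with fit_transform. Because the scaler expects a 2-D input, the array is first reshaped into a single column and the result is flattened back to 1-D.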

#import numpy module
import numpy as np
from sklearn.preprocessing import MinMaxScaler

#define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Create an instance of MinMaxScaler
my_minmax_scaler = MinMaxScaler(feature_range=(0, 1))

# Reshape the array to be a column vector and fit-transform the data
my_normalized_arr = my_minmax_scaler.fit_transform(my_arr.reshape(-1, 1)).flatten()
print(my_normalized_arr)

Output

[0.44444444 0.11111111 0.16666667 0.22222222 0.         0.33333333 0.5        1.        ]
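
MinMaxScaler treats each column of its input as a separate feature, which is why a 1-D array has to be reshaped into a single column first. The sketch below applies it to a small 2-D array and compares the result with the manual min-max formula computed per column. The sample values and the names data and manual are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two columns with very different scales (illustrative values)
data = np.array([[10.0, 200.0],
                 [4.0, 800.0],
                 [20.0, 500.0]])

# Each column is scaled independently to [0, 1]
scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(data)

# Manual equivalent: min-max formula applied along axis 0 (per column)
col_min = data.min(axis=0)
col_max = data.max(axis=0)
manual = (data - col_min) / (col_max - col_min)

print(np.allclose(scaled, manual))
```

The two results agree up to floating-point rounding, so for a one-off normalization the plain NumPy formula is sufficient, while the scaler object is convenient when the same fitted minimum and maximum need to be reused on new data.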

Conclusion

Normalizing a NumPy array to the range 0 to 1 is a common requirement in data preprocessing tasks. In this article, we looked at four scaling approaches: Min-Max normalization, Z-score normalization, rescaling by division, and Scikit-learn's MinMaxScaler. Of these, Min-Max normalization and MinMaxScaler map the values exactly onto the range 0 to 1.

Min-Max normalization subtracts the minimum and divides by the range of the array, so the results span [0, 1]. Z-score normalization standardizes the values by subtracting the mean and dividing by the standard deviation, which gives a mean of 0 and a standard deviation of 1 rather than a fixed range. Rescaling by division divides each element by a specified maximum value and stays within [0, 1] only for non-negative data that does not exceed that maximum. Scikit-learn's MinMaxScaler applies the min-max formula through a reusable scaler object with a configurable feature_range.

Updated on: 10-Aug-2023
