How to normalize a NumPy array so the values range exactly between 0 and 1?
NumPy is a powerful library in Python for numerical computing that provides an array object for the efficient handling of large datasets. Often, it is necessary to normalize the values of a NumPy array to ensure they fall within a specific range. One common normalization technique is to scale the values between 0 and 1.
In this article, we will learn how to normalize a NumPy array so the values range exactly between 0 and 1. We will see the different approaches that can be used to achieve this using NumPy, along with syntax and complete examples.
Approaches
Several techniques are commonly used to rescale a NumPy array. Only min-max scaling (Approaches 1 and 4) guarantees that the values span exactly 0 to 1; the other two are included for comparison and have caveats noted in their sections. Let’s see each approach along with its syntax and an example −
Approach 1: Using Min-Max Normalization
The first approach to normalize an array to range exactly between 0 and 1 is Min-Max normalization, a common form of feature scaling. It rescales the values linearly using the minimum and maximum values in the array, so the minimum maps to exactly 0 and the maximum maps to exactly 1. This method is widely used and straightforward to implement.
Syntax
Below is the syntax for using min-max normalization to normalize an array to range exactly between 0 and 1 −
normalized_arr = (arr - min_val) / (max_val - min_val)
Example
In the given example, we use the min-max normalization to rescale the values in a range from 0 to 1 based on the minimum and maximum values in the array. For example, if we have an array [10, 4, 5, 6, 2, 8, 11, 20], the minimum value is 2 and the maximum value is 20. By subtracting the minimum value from each element and dividing it by the range (max - min), we can obtain normalized values between 0 and 1.
# import numpy module
import numpy as np

# define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Find the minimum and maximum values in the array
my_min_val = np.min(my_arr)
my_max_val = np.max(my_arr)

# Perform min-max normalization
my_normalized_arr = (my_arr - my_min_val) / (my_max_val - my_min_val)
print(my_normalized_arr)
Output
[0.44444444 0.11111111 0.16666667 0.22222222 0.         0.33333333 0.5        1.        ]
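The formula above divides by (max - min), which is zero when every element has the same value. The following is a minimal sketch of a reusable helper that guards against that case. The function name `min_max_normalize` and the choice to return zeros for a constant array are assumptions made for illustration, not part of NumPy.

```python
import numpy as np

def min_max_normalize(arr):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    arr = np.asarray(arr, dtype=float)
    min_val = arr.min()
    value_range = arr.max() - min_val
    if value_range == 0:
        # All elements are identical, so the formula would divide by zero.
        # Returning zeros is an arbitrary convention chosen for this sketch.
        return np.zeros_like(arr)
    return (arr - min_val) / value_range

scaled = min_max_normalize([10, 4, 5, 6, 2, 8, 11, 20])
print(scaled.min(), scaled.max())    # 0.0 1.0
print(min_max_normalize([7, 7, 7]))  # [0. 0. 0.]
```

Converting to float first keeps the helper usable with plain Python lists as well as integer arrays.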
Approach 2: Using Z-score Normalization
The second approach is Z-score normalization, also known as standardization. It transforms the values to have a mean of 0 and a standard deviation of 1. Note that, unlike min-max normalization, it does not confine the values to the range 0 to 1: the results are typically both negative and positive.
Syntax
Below is the syntax for using Z-score normalization to standardize an array −
mean_val = np.mean(arr)
std_val = np.std(arr)
normalized_arr = (arr - mean_val) / std_val
Example
In the given example, we use Z-score normalization to standardize the values by subtracting the mean and dividing by the standard deviation. Although it does not guarantee values between 0 and 1, it is commonly used for statistical analysis. Here we apply it to the array [10, 4, 5, 6, 2, 8, 11, 20], which has a mean of 8.25 and a population standard deviation (the default computed by np.std) of about 5.26.
# import numpy module
import numpy as np

# define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Calculate the mean and standard deviation of the array
my_mean_val = np.mean(my_arr)
my_std_val = np.std(my_arr)

# Perform z-score normalization
my_normalized_arr = (my_arr - my_mean_val) / my_std_val
print(my_normalized_arr)
Output
[ 0.33258004 -0.80769437 -0.61764864 -0.4276029  -1.18778585 -0.04751143  0.52262577  2.23303739]
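To make the difference from min-max scaling concrete, the sketch below checks the properties that standardization does provide (mean close to 0, standard deviation close to 1) and shows that the values fall outside [0, 1]. The final step, applying min-max scaling to the z-scores, is one possible way to bring them into that range and is shown as an illustration only.

```python
import numpy as np

my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])
z_scores = (my_arr - my_arr.mean()) / my_arr.std()

# Standardized data has a mean of about 0 and a standard deviation of about 1
print(np.isclose(z_scores.mean(), 0.0), np.isclose(z_scores.std(), 1.0))

# ... but its values are not confined to the interval [0, 1]
print(z_scores.min() < 0, z_scores.max() > 1)

# Min-max scaling applied to the z-scores maps them onto [0, 1]
rescaled = (z_scores - z_scores.min()) / (z_scores.max() - z_scores.min())
print(rescaled.min(), rescaled.max())
```

Because both transformations are linear, the rescaled values match those obtained by applying min-max normalization to the original array directly.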
Approach 3: Using Rescaling Division
The third approach is rescaling by division. It is useful when we have a specific maximum value in mind: we divide each element of the array by that value. The result lies between 0 and 1 only if the array has no negative values and the chosen maximum is at least as large as the largest element. The smallest result is exactly 0 only if the array itself contains 0.
Syntax
Below is the syntax for using rescaling division to normalize an array to range exactly between 0 and 1 −
normalized_arr = arr / max_val
Example
In the below example, we have used the rescaling by division approach which allows for direct scaling of the array's values using a specific maximum value. For an array [10, 4, 5, 6, 2, 8, 11, 20] and a chosen maximum value of 20, dividing each element by 20 yields the normalized array. This method can be useful when a specific maximum value is desired.
# import numpy module
import numpy as np

# define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# define the max value
my_max_val = 20

# Perform rescaling by dividing each element by the maximum value
my_normalized_arr = my_arr / my_max_val
print(my_normalized_arr)
Output
[0.5  0.2  0.25 0.3  0.1  0.4  0.55 1.  ]
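The choice of divisor determines whether the result stays within 0 and 1. The short sketch below, which reuses the same array, contrasts dividing by the largest value found in the data with dividing by a value that is too small. The variable names `by_own_max` and `too_small` are illustrative.

```python
import numpy as np

my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Divide by the largest value found in the data instead of a hand-picked one
by_own_max = my_arr / my_arr.max()
print(by_own_max.max())  # 1.0
print(by_own_max.min())  # 0.1, not 0, because the smallest element is 2

# A divisor smaller than the data maximum lets values exceed 1
too_small = my_arr / 10
print(too_small.max())   # 2.0
```

If the values must start at exactly 0 as well as end at exactly 1, min-max normalization is the safer choice.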
Approach 4: Using Sklearn MinMaxScaler
The fourth and last approach to normalize an array to range exactly between 0 and 1 is using scikit-learn's MinMaxScaler class. It applies the same min-max formula as Approach 1 and scales the data to a range given by its feature_range parameter, in this case 0 to 1. Because the transformation is linear, it preserves the relative spacing of the data while ensuring it falls within the desired range. MinMaxScaler expects a 2-D input, so a 1-D array has to be reshaped into a single column first.
Syntax
Below is the syntax for using sklearn MinMaxScaler to normalize an array to range exactly between 0 and 1 −
scaler = MinMaxScaler(feature_range=(0, 1))
normalized_arr = scaler.fit_transform(arr.reshape(-1, 1)).flatten()
Example
In the given example, we use scikit-learn's MinMaxScaler, which provides a convenient way to normalize an array to a desired range, such as 0 to 1. We reshape the array [10, 4, 5, 6, 2, 8, 11, 20] into a column, fit the scaler to it and transform it in a single fit_transform call, and then flatten the result back to one dimension.
# import numpy module and the scaler class
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])

# Create an instance of MinMaxScaler
my_minmax_scaler = MinMaxScaler(feature_range=(0, 1))

# Reshape the array to be a column vector and fit-transform the data
my_normalized_arr = my_minmax_scaler.fit_transform(my_arr.reshape(-1, 1)).flatten()
print(my_normalized_arr)
Output
[0.44444444 0.11111111 0.16666667 0.22222222 0.         0.33333333 0.5        1.        ]
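MinMaxScaler scales each column (feature) of a 2-D input independently. A rough NumPy-only equivalent can be sketched by taking the minimum and maximum along axis 0. The `data` array below is hypothetical, and the sketch assumes no column is constant (which would cause a division by zero).

```python
import numpy as np

# Hypothetical 2-D data: each column is treated as a separate feature
data = np.array([[10.0, 200.0],
                 [4.0, 100.0],
                 [20.0, 400.0]])

# Per-column minimum and maximum (axis=0 reduces over the rows)
col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Broadcasting applies each column's own min and range to that column
np_scaled = (data - col_min) / (col_max - col_min)
print(np_scaled)
```

Each column of the result runs from 0 to 1 regardless of the original scale of that column.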
Conclusion
Normalizing a NumPy array to range exactly between 0 and 1 is a common requirement in data preprocessing tasks. In this article, we looked at four scaling approaches: Min-Max normalization, Z-score normalization, rescaling by division, and scikit-learn's MinMaxScaler. Of these, Min-Max normalization and MinMaxScaler always produce values that span exactly 0 to 1. Rescaling by division does so only under certain conditions, and Z-score normalization does not target that range at all.
Min-Max normalization calculates the range of values in the array and rescales them to the range [0, 1]. Z-score normalization standardizes the values by subtracting the mean and dividing by the standard deviation. Rescaling by division directly divides each element by a specified maximum value. Scikit-learn's MinMaxScaler provides a convenient way to normalize the array using a specific range.