How to normalize a NumPy array so the values range exactly between 0 and 1?
NumPy is a powerful library in Python for numerical computing that provides an array object for the efficient handling of large datasets. Often, it is necessary to normalize the values of a NumPy array to ensure they fall within a specific range. One common normalization technique is to scale the values between 0 and 1.
In this article, we will learn how to normalize a NumPy array so the values range exactly between 0 and 1. We will explore different approaches that can be used to achieve this using NumPy and scikit-learn, along with syntax and complete examples.
Method 1: Using Min-Max Normalization
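Min-Max normalization (a common form of feature scaling) rescales values to the range 0 to 1 using the minimum and maximum values in the array. The minimum maps to exactly 0 and the maximum to exactly 1, provided the array contains at least two distinct values (otherwise the range max - min is zero and the division is undefined) −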
Syntax
normalized_arr = (arr - min_val) / (max_val - min_val)
Example
Here we use min-max normalization to rescale values. For an array [10, 4, 5, 6, 2, 8, 11, 20], the minimum value is 2 and maximum is 20. By subtracting the minimum from each element and dividing by the range (max - min), we get normalized values between 0 and 1 −
import numpy as np
# Define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])
# Find the minimum and maximum values in the array
my_min_val = np.min(my_arr)
my_max_val = np.max(my_arr)
# Perform min-max normalization
my_normalized_arr = (my_arr - my_min_val) / (my_max_val - my_min_val)
print("Original array:", my_arr)
print("Normalized array:", my_normalized_arr)
Original array: [10 4 5 6 2 8 11 20]
Normalized array: [0.44444444 0.11111111 0.16666667 0.22222222 0. 0.33333333 0.5 1. ]
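One edge case the formula above does not handle is an array whose values are all equal: the range is then zero, so the division produces NaN values. The following is a minimal sketch of a guard. The helper name `min_max_normalize` and the decision to return zeros for a constant array are assumptions made for illustration, not part of NumPy −

```python
import numpy as np

def min_max_normalize(arr):
    """Rescale arr to [0, 1]. Hypothetical helper, not a NumPy function."""
    arr = np.asarray(arr, dtype=float)
    min_val = arr.min()
    value_range = arr.max() - min_val
    if value_range == 0:
        # All values are equal, so there is no spread to rescale.
        # Returning zeros is an arbitrary choice made for this sketch.
        return np.zeros_like(arr)
    return (arr - min_val) / value_range

print(min_max_normalize([10, 4, 5, 6, 2, 8, 11, 20]))
print(min_max_normalize([7, 7, 7]))
```

Other conventions for the constant case (for example, returning 0.5 everywhere or raising an error) may suit a given application better.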
Method 2: Using Rescaling Division
This approach divides each element by a specified maximum value. It's useful when you have a predetermined maximum value in mind −
Syntax
normalized_arr = arr / max_val
Example
Here we rescale by dividing each element by a chosen maximum value. The results fall between 0 and 1 only if all array elements are non-negative and less than or equal to the chosen maximum, and the smallest element maps to 0 only if it is itself 0 −
import numpy as np
# Define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])
# Define the max value (using array's actual max for 0-1 range)
my_max_val = np.max(my_arr) # Use 20 to ensure 0-1 range
# Perform rescaling by dividing each element by the maximum value
my_normalized_arr = my_arr / my_max_val
print("Original array:", my_arr)
print("Normalized array:", my_normalized_arr)
Original array: [10 4 5 6 2 8 11 20]
Normalized array: [0.5 0.2 0.25 0.3 0.1 0.4 0.55 1. ]
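Division by a fixed maximum is most natural when the data has a known upper bound that does not depend on the sample. As an illustration, the sketch below assumes 8-bit intensity values with a theoretical maximum of 255; both the data and the bound are assumptions chosen for this example −

```python
import numpy as np

# Hypothetical 8-bit intensity values; 255 is the assumed theoretical maximum
pixels = np.array([0, 64, 128, 255])
theoretical_max = 255

scaled = pixels / theoretical_max
print(scaled)

# With a minimum above 0, the result stays inside [0, 1] but does not start at 0
readings = np.array([51, 102, 204])
print(readings / theoretical_max)
```

The second array shows why this method does not stretch the data over the full range: its smallest value, 51, maps to 51 / 255 rather than to 0.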
Method 3: Using Sklearn MinMaxScaler
Scikit-learn's MinMaxScaler provides a convenient way to normalize data by scaling it to a specific range. Because the transformation is linear, it preserves the relative spacing of the values while mapping them into the desired range −
Syntax
scaler = MinMaxScaler(feature_range=(0, 1))
normalized_arr = scaler.fit_transform(arr.reshape(-1, 1)).flatten()
Example
The MinMaxScaler requires reshaping the 1D array to a column vector, then flattening the result −
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# Define array with some values
my_arr = np.array([10, 4, 5, 6, 2, 8, 11, 20])
# Create an instance of MinMaxScaler
my_minmax_scaler = MinMaxScaler(feature_range=(0, 1))
# Reshape the array to be a column vector and fit-transform the data
my_normalized_arr = my_minmax_scaler.fit_transform(my_arr.reshape(-1, 1)).flatten()
print("Original array:", my_arr)
print("Normalized array:", my_normalized_arr)
Original array: [10 4 5 6 2 8 11 20]
Normalized array: [0.44444444 0.11111111 0.16666667 0.22222222 0. 0.33333333 0.5 1. ]
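For two-dimensional input, MinMaxScaler scales each column (feature) independently. The same per-column behavior can be sketched in plain NumPy by taking the minimum and maximum along axis 0. The sample matrix below is made up for illustration, and the sketch assumes no column is constant −

```python
import numpy as np

# Hypothetical data: 3 samples (rows) x 2 features (columns)
data = np.array([[1.0, 200.0],
                 [2.0, 400.0],
                 [3.0, 600.0]])

col_min = data.min(axis=0)  # minimum of each column
col_max = data.max(axis=0)  # maximum of each column

# Broadcasting applies each column's own minimum and range to that column
normalized = (data - col_min) / (col_max - col_min)
print(normalized)
```

Each column ends up spanning 0 to 1 on its own, regardless of how different the original scales of the two columns were.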
Comparison
| Method | Guarantees 0-1 Range | Best For | Dependencies |
|---|---|---|---|
| Min-Max Normalization | Yes | Simple arrays, custom implementation | NumPy only |
| Rescaling Division | Only if values are non-negative and max_val ≥ array max | When you know the theoretical maximum | NumPy only |
| MinMaxScaler | Yes | Professional data preprocessing pipelines | Scikit-learn |
Conclusion
Min-Max normalization and MinMaxScaler are the most reliable methods to ensure values range exactly between 0 and 1. Use Min-Max normalization for simple cases with NumPy only, or MinMaxScaler for professional data science workflows with consistent preprocessing pipelines.
