Normalization is the process of converting a range of values into a standardized range, typically -1 to +1 or 0 to 1. Data can be normalized using simple subtraction and division.
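As an illustration of normalizing with subtraction and division, the following is a minimal sketch of min-max scaling. The helper name `min_max_scale`, its `low` and `high` parameters, and the sample values are assumptions made for this example, not part of any library.

```python
import numpy as np

def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale a 1-D array into [low, high] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are identical, so map everything to the lower bound.
        return np.full_like(values, low)
    # Subtract the minimum, then divide by the span: result lies in [0, 1].
    unit = (values - values.min()) / span
    # Stretch and shift the [0, 1] result into the requested range.
    return low + unit * (high - low)

ages = np.array([18.0, 30.0, 42.0, 66.0])  # made-up sample values
print(min_max_scale(ages))                 # values in the range 0 to 1
print(min_max_scale(ages, -1.0, 1.0))      # values in the range -1 to +1
```

The same subtraction-and-division step produces either target range; only the final stretch and shift differ.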
Data fed to a learning algorithm as input should be consistent and structured. All features of the input data should be on a single scale for the algorithm to predict values effectively. In the real world, however, data is often unstructured and, most of the time, not on the same scale.
This is where normalization comes into the picture. It is one of the most important data-preparation processes.
It changes the values of the columns of the input dataset so that they fall on the same scale.
Normalization ensures that differences in the ranges of values are not distorted.
Note − Not every input dataset fed to a machine learning algorithm has to be normalized. Normalization is required only when the features in a dataset have completely different scales of values.
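One rough way to judge whether features are on completely different scales is to compare the range of each column. The sketch below uses a made-up dataset; the column meanings are assumptions, and how large a ratio counts as "completely different" is a judgment call rather than a fixed rule.

```python
import numpy as np

# Made-up dataset: column 0 is an age in years, column 1 an income in dollars.
features = np.array([
    [25.0, 32000.0],
    [41.0, 58000.0],
    [58.0, 120000.0],
])

# Range (max - min) of every column.
col_ranges = features.max(axis=0) - features.min(axis=0)

# Ratio between the widest and the narrowest column range.
# Note: this divides by zero if a column is constant.
scale_ratio = col_ranges.max() / col_ranges.min()

print("Range of each column:", col_ranges)
print("Largest range / smallest range:", scale_ratio)
```

Here the income column spans a range thousands of times wider than the age column, which is the kind of mismatch that normalization is meant to address.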
There are different kinds of normalization −
Let us understand how L1 normalization works.
Also known as Least Absolute Deviations, it rescales the data so that the absolute values in every row sum to 1.
Let us see how L1 normalization can be implemented using scikit-learn in Python −
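Written as a formula, and assuming a row is denoted x = (x_1, ..., x_n) (the notation is chosen here for illustration), each element is divided by the sum of the absolute values in its row:

```latex
x'_j = \frac{x_j}{\sum_{k=1}^{n} |x_k|},
\qquad
\sum_{j=1}^{n} |x'_j| = 1
```

For example, for the row (34.78, 31.9, -65.5) the sum of absolute values is 34.78 + 31.9 + 65.5 = 132.18, so the normalized row is approximately (0.2631, 0.2413, -0.4955). The signs are preserved; only the magnitudes are rescaled.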
import numpy as np
from sklearn import preprocessing

# Sample input: 4 rows (samples), 3 columns (features)
input_data = np.array(
   [[34.78, 31.9, -65.5],
   [-16.5, 2.45, -83.5],
   [0.5, -87.98, 45.62],
   [5.9, 2.38, -55.82]]
)

# Scale each row so that its absolute values sum to 1
data_normalized_l1 = preprocessing.normalize(input_data, norm='l1')
print("\nL1 normalized data is \n", data_normalized_l1)
L1 normalized data is
[[ 0.26312604  0.24133757 -0.49553639]
 [-0.16105417  0.0239141  -0.81503172]
 [ 0.00372856 -0.65607755  0.34019389]
 [ 0.09204368  0.03712949 -0.87082683]]
The required packages are imported.
The input data is defined as a NumPy array.
The ‘normalize’ function from the ‘preprocessing’ module is used to normalize the data.
The type of normalization is specified as ‘l1’.
This way, every row of the array is normalized so that the absolute values in that row sum to 1.
This normalized data is displayed on the console.
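To see what the normalization does to each row, the same result can be reproduced with plain NumPy. This is a minimal sketch for illustration, not scikit-learn's actual implementation; the variable names are chosen here, and it does not handle rows made up entirely of zeros.

```python
import numpy as np

input_data = np.array([
    [34.78, 31.9, -65.5],
    [-16.5, 2.45, -83.5],
    [0.5, -87.98, 45.62],
    [5.9, 2.38, -55.82],
])

# Sum of absolute values per row; keepdims=True keeps shape (4, 1)
# so that the division below broadcasts across each row.
row_l1_norms = np.abs(input_data).sum(axis=1, keepdims=True)
manual_l1 = input_data / row_l1_norms

# The absolute values in each row sum to 1 ...
abs_row_sums = np.abs(manual_l1).sum(axis=1)
# ... but the plain (signed) row sums generally do not.
plain_row_sums = manual_l1.sum(axis=1)

print(manual_l1)
print("Sum of absolute values per row:", abs_row_sums)
print("Plain sum per row:", plain_row_sums)
```

Because the input contains negative numbers, it is the sum of the absolute values in each row that equals 1, not the signed sum.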