# How can data be scaled using scikit-learn library in Python?


Feature scaling is an important data pre-processing step when building machine learning models. It rescales the data so that the values of every feature fall within a specific range.

It can also speed up computation: for example, iterative training procedures such as gradient descent often converge in fewer steps when the features are on comparable scales.
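As a reference for the range-based scaling described above, min-max scaling is one common choice, and it is the one used later in this article. For a feature value $x$ in a column with minimum $x_{\min}$ and maximum $x_{\max}$, and a target range $[a, b]$ (the symbols $a$ and $b$ are notation introduced here, not taken from the original text), the scaled value is:

$$
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \,(b - a) + a
$$

With $a = 0$ and $b = 1$ this reduces to $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$, so the smallest value in each column maps to 0 and the largest maps to 1.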

## Why is it needed?

Data fed to a learning algorithm as input should be consistent and structured. All features of the input data should be on a single scale so that the algorithm can predict values effectively. In the real world, however, data is often unstructured and, most of the time, its features are not on the same scale.

This is where normalization comes into the picture. It is one of the most important data-preparation steps. It changes the values in the columns of the input dataset so that they all fall on the same scale.
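To make the idea concrete, here is a minimal sketch that rescales two columns with very different magnitudes by hand, using only NumPy. The data (ages and incomes) and the variable names are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical data: column 0 is age in years, column 1 is income in dollars.
samples = np.array([
    [25.0, 30000.0],
    [40.0, 90000.0],
    [55.0, 60000.0],
])

# Per-column minimum and maximum (axis=0 reduces over the rows).
col_min = samples.min(axis=0)
col_max = samples.max(axis=0)

# Min-max scaling: each column is mapped onto [0, 1] independently.
scaled = (samples - col_min) / (col_max - col_min)
print(scaled)
```

After scaling, both columns span 0 to 1, so neither feature dominates simply because its raw numbers are larger.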

Let us see how the scikit-learn library can be used to perform feature scaling in Python.

## Example

```python
import numpy as np
from sklearn import preprocessing

# Four samples (rows) with three features (columns)
input_data = np.array(
    [[34.78, 31.9, -65.5],
     [-16.5, 2.45, -83.5],
     [0.5, -87.98, 45.62],
     [5.9, 2.38, -55.82]])

# Rescale every column independently to the range [0, 1]
data_scaler_minmax = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled_minmax = data_scaler_minmax.fit_transform(input_data)
print("\nThe scaled data is \n", data_scaled_minmax)
```

## Output

```
The scaled data is
 [[1.         1.         0.1394052 ]
 [0.         0.75433767 0.        ]
 [0.33151326 0.         1.        ]
 [0.43681747 0.75375375 0.21437423]]
```
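As a sanity check on these numbers, the sketch below recomputes the scaling by hand with `(x - min) / (max - min)` per column and compares it with what `MinMaxScaler` returns. The names `scaled_by_sklearn` and `scaled_by_hand` are introduced here for illustration only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

input_data = np.array(
    [[34.78, 31.9, -65.5],
     [-16.5, 2.45, -83.5],
     [0.5, -87.98, 45.62],
     [5.9, 2.38, -55.82]])

# Result from scikit-learn
scaled_by_sklearn = MinMaxScaler(feature_range=(0, 1)).fit_transform(input_data)

# Same computation done manually, column by column
col_min = input_data.min(axis=0)
col_max = input_data.max(axis=0)
scaled_by_hand = (input_data - col_min) / (col_max - col_min)

# True when both approaches agree up to floating-point tolerance
print(np.allclose(scaled_by_sklearn, scaled_by_hand))
```

For instance, the first column has a minimum of -16.5 and a maximum of 34.78, so 0.5 becomes (0.5 + 16.5) / 51.28 ≈ 0.3315, which matches the third row of the output.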

## Explanation

• The required packages are imported.

• The input data is defined as a NumPy array with four rows (samples) and three columns (features).

• The `MinMaxScaler` class from the `preprocessing` module is used to scale the data to the range 0 to 1, as set by `feature_range=(0, 1)`.

• Each column is scaled independently: its smallest value becomes 0, its largest value becomes 1, and every other value falls in between.

• The scaled data is displayed on the console.
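The steps above use `fit_transform`, which learns the column minimums and maximums and applies them in one call. A possible follow-up, sketched below, is to fit the scaler once and reuse it on other data, or to map scaled values back with `inverse_transform`. The `new_sample` row is hypothetical and not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

input_data = np.array(
    [[34.78, 31.9, -65.5],
     [-16.5, 2.45, -83.5],
     [0.5, -87.98, 45.62],
     [5.9, 2.38, -55.82]])

# Learn the per-column minimum and maximum from the data
scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(input_data)
print("Column minimums:", scaler.data_min_)
print("Column maximums:", scaler.data_max_)

# Hypothetical new sample, scaled with the statistics learned above
new_sample = np.array([[10.0, 0.0, 100.0]])
new_scaled = scaler.transform(new_sample)
print("Scaled new sample:", new_scaled)

# inverse_transform maps scaled values back to the original units
restored = scaler.inverse_transform(scaler.transform(input_data))
print("Round trip matches:", np.allclose(restored, input_data))
```

Note that the third value of `new_sample` (100.0) is larger than the maximum seen during fitting (45.62), so with the default settings its scaled value comes out greater than 1. Values are only guaranteed to lie in the target range for the data the scaler was fitted on.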

Published on 11-Dec-2020 10:26:58