How can L2 normalization be implemented using the scikit-learn library in Python?

Normalization is the process of rescaling a range of values into a standard range. L2 normalization, also known as "Euclidean normalization", scales each row of a dataset so that the sum of its squared values equals 1. This technique is commonly used in machine learning for feature scaling and in text processing.

What is L2 Normalization?

L2 normalization transforms data by dividing each value by the Euclidean norm (L2 norm) of its row. For a vector [a, b, c], the L2 norm is √(a² + b² + c²). After normalization, each row has unit length.
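Before turning to scikit-learn, the formula can be checked with plain NumPy. This short sketch uses an illustrative vector chosen so the norm works out cleanly; it computes the L2 norm and divides by it:

```python
import numpy as np

# Illustrative vector [a, b, c]
v = np.array([3.0, 4.0, 12.0])

# L2 norm: sqrt(a^2 + b^2 + c^2) = sqrt(9 + 16 + 144) = 13
l2_norm = np.sqrt(np.sum(v ** 2))

# Dividing by the norm yields a unit-length vector
unit = v / l2_norm

print(l2_norm)               # 13.0
print(np.linalg.norm(unit))  # 1.0 (up to floating-point rounding)
```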

Basic L2 Normalization Example

Here's how to implement L2 normalization using scikit-learn's preprocessing.normalize() function:

import numpy as np
from sklearn import preprocessing

# Create sample input data
input_data = np.array([
    [34.78, 31.9, -65.5],
    [-16.5, 2.45, -83.5],
    [0.5, -87.98, 45.62],
    [5.9, 2.38, -55.82]
])

print("Original data:")
print(input_data)

# Apply L2 normalization
normalized_data_l2 = preprocessing.normalize(input_data, norm='l2')

print("\nL2 normalized data:")
print(normalized_data_l2)
Original data:
[[ 34.78  31.9  -65.5 ]
 [-16.5    2.45 -83.5 ]
 [  0.5  -87.98  45.62]
 [  5.9    2.38 -55.82]]

L2 normalized data:
[[ 0.43081298  0.39513899 -0.81133554]
 [-0.19377596  0.02877279 -0.98062378]
 [ 0.00504512 -0.88774018  0.4603172 ]
 [ 0.10501701  0.04236279 -0.99356772]]
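scikit-learn also offers the same row-wise scaling as a transformer class, preprocessing.Normalizer, which can be dropped into a Pipeline. A minimal sketch reusing two of the sample rows:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

input_data = np.array([
    [34.78, 31.9, -65.5],
    [-16.5, 2.45, -83.5],
])

# Normalizer(norm='l2') performs the same row-wise scaling as
# preprocessing.normalize(..., norm='l2')
normalizer = Normalizer(norm='l2')
normalized = normalizer.fit_transform(input_data)

print(normalized)
```

Because Normalizer is stateless (it learns nothing from the training data), fit_transform and transform produce identical results.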

Verifying the Normalization

Let's verify that each row now has unit norm (sum of squares equal to 1):

import numpy as np
from sklearn import preprocessing

input_data = np.array([
    [34.78, 31.9, -65.5],
    [-16.5, 2.45, -83.5],
    [0.5, -87.98, 45.62]
])

normalized_data = preprocessing.normalize(input_data, norm='l2')

# Calculate sum of squares for each row
for i, row in enumerate(normalized_data):
    sum_of_squares = np.sum(row**2)
    print(f"Row {i+1} sum of squares: {sum_of_squares:.6f}")
Row 1 sum of squares: 1.000000
Row 2 sum of squares: 1.000000
Row 3 sum of squares: 1.000000
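By default, normalize() scales each row (axis=1); passing axis=0 scales each column to unit norm instead. A small sketch with numbers chosen so the column norm is easy to check:

```python
import numpy as np
from sklearn import preprocessing

data = np.array([
    [3.0, 1.0],
    [4.0, 1.0],
])

# axis=0 normalizes each column instead of each row
cols = preprocessing.normalize(data, norm='l2', axis=0)

# First column [3, 4] has norm 5, so it becomes [0.6, 0.8]
print(cols[:, 0])
```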

Using Different Normalization Methods

Scikit-learn supports multiple normalization methods. Here's a comparison:

import numpy as np
from sklearn import preprocessing

data = np.array([[4, 3], [2, 6], [1, 1]])

# L1 normalization (Manhattan norm)
l1_norm = preprocessing.normalize(data, norm='l1')

# L2 normalization (Euclidean norm)  
l2_norm = preprocessing.normalize(data, norm='l2')

# Max normalization
max_norm = preprocessing.normalize(data, norm='max')

print("Original data:")
print(data)
print("\nL1 normalized:")
print(l1_norm)
print("\nL2 normalized:")
print(l2_norm)
print("\nMax normalized:")
print(max_norm)
Original data:
[[4 3]
 [2 6]
 [1 1]]

L1 normalized:
[[0.57142857 0.42857143]
 [0.25       0.75      ]
 [0.5        0.5       ]]

L2 normalized:
[[0.8        0.6       ]
 [0.31622777 0.9487227 ]
 [0.70710678 0.70710678]]

Max normalized:
[[1.         0.75      ]
 [0.33333333 1.        ]
 [1.         1.        ]]

Comparison of Normalization Methods

Method   Formula                            Use Case
L1       Sum of absolute values = 1         Sparse data, feature selection
L2       Sum of squares = 1                 Machine learning, neural networks
Max      Divide by maximum absolute value   Scale to [-1, 1] range
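Each formula in the table can be reproduced with plain NumPy, which is a useful sanity check on what the norm options compute. A sketch over the same small matrix:

```python
import numpy as np

data = np.array([[4.0, 3.0], [2.0, 6.0], [1.0, 1.0]])

# L1: divide each row by the sum of absolute values
l1 = data / np.abs(data).sum(axis=1, keepdims=True)

# L2: divide each row by the Euclidean norm
l2 = data / np.linalg.norm(data, axis=1, keepdims=True)

# Max: divide each row by the largest absolute value
mx = data / np.abs(data).max(axis=1, keepdims=True)

print(l2[0])  # [0.8 0.6], matching the scikit-learn output above
```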

Conclusion

L2 normalization using scikit-learn is essential for machine learning preprocessing. Use preprocessing.normalize(data, norm='l2') to scale rows to unit length, which helps improve model performance and convergence in many algorithms.

Updated on: 2026-03-25T13:22:34+05:30
