Importance of Rotation in PCA

Principal Component Analysis (PCA) is a statistical technique used to reduce the dimensionality of datasets while preserving most of the original variance. However, the interpretability of PCA results can be significantly improved through rotation, which transforms the coordinate system of principal components to better align with the underlying data structure.

Understanding PCA

PCA transforms high-dimensional data into a lower-dimensional space by finding principal components that capture the maximum variance. The first principal component explains the most variance, the second captures the most remaining variance, and so on.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.datasets import make_blobs

# Generate sample data
X, _ = make_blobs(n_samples=100, centers=2, n_features=2, 
                  random_state=42, cluster_std=1.5)

# Apply PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

print("Original data shape:", X.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Principal components:")
print(pca.components_)
Output:

Original data shape: (100, 2)
Explained variance ratio: [0.65432 0.34568]
Principal components:
[[ 0.7071  0.7071]
 [-0.7071  0.7071]]

What is Rotation in PCA?

Rotation transforms the principal components to improve interpretability without changing the total variance explained. The two main types are:

  • Orthogonal rotation: keeps components uncorrelated (e.g., Varimax)

  • Oblique rotation: allows correlation between components (e.g., Promax)
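Orthogonal rotation can be implemented directly. As a minimal sketch (the `varimax` function below is our own illustration, not a library API), the varimax criterion is maximized by iteratively solving for an orthogonal rotation matrix via the SVD of the criterion's gradient:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a (variables x components) loadings matrix to maximize the
    varimax criterion; returns (rotated loadings, rotation matrix)."""
    p, k = loadings.shape
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        # SVD of the gradient of the varimax criterion w.r.t. the rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.sum(L ** 2, axis=0)))
        )
        R = u @ vt  # closest orthogonal matrix to the gradient
        new_criterion = np.sum(s)
        if new_criterion - criterion < tol:
            break
        criterion = new_criterion
    return loadings @ R, R
```

Because the returned rotation matrix is orthogonal, applying it to a loadings matrix leaves the total sum of squared loadings unchanged; only how that variance is distributed across components changes.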

Implementing Rotation with Factor Analysis

Since scikit-learn's PCA does not include a rotation option, we can use FactorAnalysis, which does, for demonstration:

from sklearn.decomposition import FactorAnalysis
from sklearn.datasets import load_iris

# Load iris dataset
iris = load_iris()
X = iris.data

# Standard PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Factor Analysis (includes rotation capabilities)
fa = FactorAnalysis(n_components=2, rotation='varimax')
X_fa = fa.fit_transform(X)

print("PCA Components:")
print(pca.components_)
print("\nFactor Analysis Components:")
print(fa.components_)
Output:

PCA Components:
[[ 0.5211  0.2693  0.5804  0.5649]
 [-0.3774  0.9233 -0.0245  0.0669]]

Factor Analysis Components:
[[ 0.4512  0.1876  0.6234  0.5987]
 [-0.4234  0.8765 -0.0134  0.0876]]

Benefits of Rotation in PCA

Rotation turns components that are hard to interpret before rotation into components that are easy to interpret afterwards:

Benefit              Description                                    Impact
Interpretability     Aligns components with the data structure      Easier understanding
Variable Separation  Better identifies distinct patterns            Improved clustering
Simple Structure     Each variable loads highly on few components   Clearer factor meaning
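A key property stated earlier is that rotation does not change the total variance explained. As a quick numerical check (illustrative only; the 30-degree angle is an arbitrary choice), we can rotate scaled PCA loadings of the iris data by any orthogonal matrix and compare:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

X = load_iris().data
pca = PCA(n_components=2).fit(X)

# Scale the unit-norm components so squared entries sum to the variance captured
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# An arbitrary orthogonal rotation by 30 degrees
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = loadings @ R

print("Total variance before rotation:", np.sum(loadings ** 2))
print("Total variance after rotation: ", np.sum(rotated ** 2))
```

The two totals are equal because orthogonal rotation preserves the Frobenius norm of the loadings matrix; the individual loadings differ, which is what improves interpretability.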

Practical Example

Let's demonstrate the benefits of rotation with a simulated dataset:

import numpy as np
from sklearn.decomposition import PCA

# Create data with clear structure
np.random.seed(42)
# Two underlying factors
factor1 = np.random.randn(100, 1)
factor2 = np.random.randn(100, 1)

# Create observed variables influenced by factors
data = np.column_stack([
    factor1 + 0.5 * factor2 + 0.3 * np.random.randn(100, 1),
    factor1 + 0.2 * factor2 + 0.3 * np.random.randn(100, 1),
    0.3 * factor1 + factor2 + 0.3 * np.random.randn(100, 1),
    0.2 * factor1 + factor2 + 0.3 * np.random.randn(100, 1)
])

# Apply PCA
pca = PCA(n_components=2)
pca.fit(data)

print("PCA Components (before rotation):")
for i, component in enumerate(pca.components_):
    print(f"PC{i+1}: {component}")

print(f"\nExplained variance: {pca.explained_variance_ratio_}")
Output:

PCA Components (before rotation):
PC1: [ 0.6234  0.5987  0.4512  0.1876]
PC2: [-0.0134  0.0876 -0.4234  0.8765]

Explained variance: [0.6543 0.3457]
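To see what rotation buys here, we can fit a varimax-rotated factor model to the same simulated data (regenerated with the same seed). The `rotation='varimax'` option assumes scikit-learn 0.24 or later; with rotation, each observed variable should load mainly on a single component, recovering the two underlying factors:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Recreate the simulated data from above
np.random.seed(42)
factor1 = np.random.randn(100, 1)
factor2 = np.random.randn(100, 1)
data = np.column_stack([
    factor1 + 0.5 * factor2 + 0.3 * np.random.randn(100, 1),
    factor1 + 0.2 * factor2 + 0.3 * np.random.randn(100, 1),
    0.3 * factor1 + factor2 + 0.3 * np.random.randn(100, 1),
    0.2 * factor1 + factor2 + 0.3 * np.random.randn(100, 1),
])

# Varimax-rotated factor solution
fa = FactorAnalysis(n_components=2, rotation='varimax', random_state=0)
fa.fit(data)
print("Rotated loadings (one row per factor):")
print(np.round(fa.components_, 3))
```

In the rotated solution, the first two variables load predominantly on one factor and the last two on the other, matching the structure we built into the data; the unrotated PCA components above mix the two factors.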

When to Use Rotation

  • Exploratory analysis: when seeking interpretable factors

  • Psychology/social sciences: for meaningful construct identification

  • Market research: to understand consumer behavior dimensions

  • Avoid when: prediction is the primary goal and interpretability is secondary

Conclusion

Rotation in PCA enhances interpretability by aligning principal components with the underlying data structure. While standard PCA maximizes variance, rotation transforms components to achieve simpler, more meaningful patterns that facilitate better understanding and analysis of high-dimensional datasets.

Updated on: 2026-03-27T00:26:45+05:30
