Locally Weighted Linear Regression in Python

Locally Weighted Linear Regression (LOESS) is a non-parametric algorithm that adapts to local patterns in data. Unlike standard linear regression, which assumes global linearity, LOESS gives more weight to nearby points when making predictions, making it suitable for non-linear data distributions.

Parametric vs Non-Parametric Models

Parametric Models

Parametric models assume a specific functional form and have a fixed number of parameters. For example, linear regression uses the equation:

$$y = b_0 + b_1x_1 + b_2x_2$$

Here, b₀, b₁, and b₂ are fixed coefficients that define the line's intercept and slopes.
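As a quick sketch (with made-up data), a parametric fit estimates this fixed coefficient set once; after fitting, predictions need only the coefficients, not the training data:

```python
import numpy as np

# Noisy samples from y = 2x + 1 (illustrative data)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.1, 50)

# The fit compresses all 50 points into exactly two numbers
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # close to 2 and 1
```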

Non-Parametric Models

Non-parametric algorithms make no assumptions about the mapping function's form. They derive the relationship directly from training data, requiring more data points but offering greater flexibility for complex patterns.
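For contrast, a nearest-neighbour average is one of the simplest non-parametric predictors: it has no fixed coefficients and consults the raw training data at every query (an illustrative sketch, not LOESS itself):

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k=3):
    """Average the targets of the k training points nearest to the query."""
    idx = np.argsort(np.abs(x_train - x_query))[:k]
    return y_train[idx].mean()

x_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_train = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
print(knn_predict(x_train, y_train, 3.2))  # nearest x are 3, 4, 2 -> mean of 6, 8, 4 = 6.0
```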

Mathematical Foundation

The standard linear regression cost function is:

$$\displaystyle\sum_{i=1}^m (y^{(i)} - \theta^T x^{(i)})^2$$

LOESS modifies this by adding weights:

$$\displaystyle\sum_{i=1}^m w^{(i)}(y^{(i)} - \theta^T x^{(i)})^2$$
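A small numeric sketch (values invented for illustration) makes the difference concrete: the weighted cost simply scales each squared residual by its w⁽ⁱ⁾, so down-weighted points barely contribute:

```python
import numpy as np

X = np.array([[1.0, 2.0], [1.0, 5.0], [1.0, 10.0]])  # rows: [bias, x]
y = np.array([4.0, 11.0, 19.0])
theta = np.array([1.0, 2.0])                         # candidate parameters

residuals = y - X @ theta                            # [-1, 0, -2]
unweighted_cost = np.sum(residuals ** 2)             # 1 + 0 + 4 = 5

# Down-weight the third point (illustrative weights)
w = np.array([1.0, 1.0, 0.1])
weighted_cost = np.sum(w * residuals ** 2)           # 1 + 0 + 0.4 = 1.4

print(unweighted_cost, weighted_cost)
```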

The weighting function uses a Gaussian kernel:

$$w^{(i)} = \exp\left(-\frac{(x^{(i)} - x)^2}{2\tau^2}\right)$$

Where x is the query point, x⁽ⁱ⁾ is the i-th training point, and τ (tau) controls the bandwidth of the weighting function.

Weight Calculation Example

Given data points [2, 5, 10, 17, 26, 37, 50, 65, 82], query point x = 7, and τ = 0.5:

For x⁽¹⁾ = 5:  w⁽¹⁾ = exp(-(5-7)²/(2×0.5²)) = exp(-8) ≈ 3.35×10⁻⁴
For x⁽²⁾ = 10: w⁽²⁾ = exp(-(10-7)²/(2×0.5²)) = exp(-18) ≈ 1.52×10⁻⁸
For x⁽³⁾ = 26: w⁽³⁾ = exp(-(26-7)²/(2×0.5²)) = exp(-722) ≈ 2.8×10⁻³¹⁴

Notice how weights decrease exponentially as distance from the query point increases.
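The hand calculation can be verified in a few lines (same query point and τ as above):

```python
import numpy as np

x_query, tau = 7.0, 0.5
points = np.array([5.0, 10.0, 26.0])

weights = np.exp(-(points - x_query) ** 2 / (2 * tau ** 2))
print(weights)  # approximately [3.35e-04, 1.52e-08, 2.8e-314]
```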

Python Implementation

Here's a complete implementation using synthetic data modeled on the tips dataset:

import numpy as np
import matplotlib.pyplot as plt

# Create sample data similar to tips dataset
np.random.seed(42)
total_bill = np.random.uniform(10, 50, 100)
tip = 0.15 * total_bill + np.random.normal(0, 1, 100)

def kernel(point, x_data, tau):
    """Build the diagonal weight matrix for a query point using a Gaussian kernel"""
    diff = x_data - point                       # difference to every training row
    sq_dist = np.sum(diff ** 2, axis=1)         # squared distance per row
    return np.diag(np.exp(-sq_dist / (2.0 * tau ** 2)))

def local_weight_regression(x_data, y_data, point, tau):
    """Solve the weighted normal equations for a single query point"""
    weights = kernel(point, x_data, tau)
    theta = np.linalg.pinv(x_data.T @ weights @ x_data) @ (x_data.T @ weights @ y_data)
    return theta

def predict_all(x_data, y_data, tau):
    """Fit a local model at each training point and predict there"""
    m = x_data.shape[0]
    predictions = np.zeros(m)

    for i in range(m):
        theta = local_weight_regression(x_data, y_data, x_data[i], tau)
        predictions[i] = x_data[i] @ theta

    return predictions

# Prepare data
features = np.column_stack([np.ones(len(total_bill)), total_bill])
predictions = predict_all(features, tip, tau=2.0)

# Plot results
plt.figure(figsize=(10, 6))
plt.scatter(total_bill, tip, alpha=0.6, label='Actual data')

# Sort for smooth line plotting
sort_idx = np.argsort(total_bill)
plt.plot(total_bill[sort_idx], predictions[sort_idx], 'r-', linewidth=2, label='LOESS fit')

plt.xlabel('Total Bill ($)')
plt.ylabel('Tip ($)')
plt.title('Locally Weighted Linear Regression')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
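The bandwidth τ is the main tuning knob: a small τ hugs local structure, while a large τ averages over most of the data and approaches a single global line. The following self-contained sketch (a compact restatement of the functions above, on synthetic sine data) compares two bandwidths by their error against the noise-free curve:

```python
import numpy as np

def loess_predict(X, y, point, tau):
    """Predict at one query point via locally weighted least squares."""
    w = np.exp(-np.sum((X - point) ** 2, axis=1) / (2 * tau ** 2))
    W = np.diag(w)
    theta = np.linalg.pinv(X.T @ W @ X) @ (X.T @ W @ y)
    return point @ theta

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 80)
y = np.sin(x) + rng.normal(0, 0.1, 80)
X = np.column_stack([np.ones_like(x), x])

for tau in (0.1, 5.0):
    preds = np.array([loess_predict(X, y, X[i], tau) for i in range(len(x))])
    rmse = np.sqrt(np.mean((preds - np.sin(x)) ** 2))
    print(f"tau={tau}: RMSE vs true curve = {rmse:.3f}")
```

On this data the small bandwidth tracks the sine curve far more closely than the near-global fit; in practice τ is usually chosen by cross-validation.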

When to Use LOESS

| Use Case                 | Suitable? | Reason                            |
| ------------------------ | --------- | --------------------------------- |
| Small number of features | Yes       | Computationally manageable        |
| Non-linear relationships | Yes       | Adapts to local patterns          |
| Large datasets           | No        | Computationally expensive         |
| Linear relationships     | No        | Simple linear regression suffices |

Advantages and Disadvantages

Advantages

  • Adapts to local data patterns automatically
  • No assumptions about global function form
  • Reduces prediction errors for non-linear data
  • Multiple local functions handle variations better than one global function

Disadvantages

  • Computationally expensive for large datasets
  • Cannot handle high-dimensional data efficiently
  • Requires prediction-time computation (no pre-trained model)
  • Memory intensive for storing all training data

Conclusion

Locally Weighted Linear Regression is ideal for non-linear data with small feature sets. It provides flexible curve fitting by weighting nearby points more heavily, but comes with computational trade-offs that limit its use on large datasets.

Updated on: 2026-03-26T23:30:00+05:30
