SPSA (Simultaneous Perturbation Stochastic Approximation) Algorithm using Python

The Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm is a gradient-free optimization method that finds the minimum of an objective function by simultaneously perturbing all parameters. Unlike traditional gradient descent, SPSA estimates gradients using only two function evaluations per iteration, regardless of the parameter dimension.

SPSA is particularly effective for optimizing noisy, non-differentiable functions or problems with many parameters where computing exact gradients is computationally expensive or impossible.

How SPSA Works

The algorithm estimates the gradient by evaluating the objective function at two points: the current parameter values plus and minus a random perturbation. This simultaneous perturbation of all parameters allows efficient gradient estimation with just two function calls.
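The two-evaluation estimate described above can be sketched directly in a few lines. This is a minimal illustration (the function name `spsa_gradient` and the quadratic test function are chosen here for demonstration, not taken from a library):

```python
import numpy as np

def spsa_gradient(f, theta, c=0.1, rng=None):
    """Estimate the gradient of f at theta with only two evaluations."""
    rng = np.random.default_rng() if rng is None else rng
    # Rademacher perturbation: every component is +1 or -1
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = f(theta + c * delta)   # evaluation 1
    y_minus = f(theta - c * delta)  # evaluation 2
    # Same scalar difference, divided elementwise by each perturbation component
    return (y_plus - y_minus) / (2 * c * delta)

# For f(x) = x0^2 + 3*x1^2 the true gradient at (1, 1) is (2, 6)
f = lambda x: x[0]**2 + 3 * x[1]**2
g = spsa_gradient(f, np.array([1.0, 1.0]), rng=np.random.default_rng(0))
print(g)  # a single noisy estimate; the average over many draws approaches (2, 6)
```

Each individual estimate is noisy, but it is unbiased on average, which is why the SPSA update converges despite using only two function calls per iteration.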

[Figure: SPSA gradient estimation. The objective is evaluated at θ + cΔ and θ − cΔ, and the difference yields the estimated gradient.]

Advantages of SPSA

  • Efficient: Only two function evaluations per iteration, regardless of parameter dimension

  • Gradient-free: No need to compute derivatives or Jacobian matrices

  • Robust: Handles noisy and non-smooth objective functions effectively

  • Scalable: Computational cost remains low even for high-dimensional problems
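The first advantage is easy to verify empirically: wrapping the objective in a call counter shows that a fixed number of SPSA iterations costs the same number of evaluations whether the problem has 2 parameters or 200. (A small sketch; the helper name `count_spsa_evals` and the quadratic objective are illustrative.)

```python
import numpy as np

def count_spsa_evals(dim, iterations, c=0.1, seed=0):
    """Run SPSA gradient estimates and count objective evaluations."""
    calls = 0
    def f(x):
        nonlocal calls
        calls += 1                  # count every objective evaluation
        return float(np.sum(x**2))
    rng = np.random.default_rng(seed)
    theta = np.ones(dim)
    for _ in range(iterations):
        delta = rng.choice([-1.0, 1.0], size=dim)
        f(theta + c * delta)        # evaluation 1
        f(theta - c * delta)        # evaluation 2
    return calls

print(count_spsa_evals(dim=2, iterations=100))    # 200 evaluations
print(count_spsa_evals(dim=200, iterations=100))  # still 200 evaluations
```

By contrast, a two-sided finite-difference gradient would need 2 × dim evaluations per iteration, which is what makes SPSA attractive in high dimensions.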

SPSA Algorithm Steps

Step 1: Initialize parameters θ₀ and set the algorithm hyperparameters (step sizes, number of iterations)

Step 2: Generate a random perturbation vector Δₖ with elements ±1

Step 3: Evaluate the objective function at θₖ + cₖΔₖ and θₖ − cₖΔₖ

Step 4: Estimate the gradient: ĝₖ = [f(θₖ + cₖΔₖ) − f(θₖ − cₖΔₖ)] / (2cₖΔₖ)

Step 5: Update the parameters: θₖ₊₁ = θₖ − aₖĝₖ

Step 6: Repeat until convergence or the maximum number of iterations is reached

Implementation Example

Let's implement SPSA to minimize the Rosenbrock function, a classic optimization test case with a global minimum at (1, 1):

import numpy as np

def rosenbrock(x):
    """The Rosenbrock function with global minimum at (1, 1)"""
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def spsa_optimization(objective_func, theta_init, a, c, max_iterations):
    """
    SPSA optimization algorithm.
    
    Parameters:
    objective_func: Function to minimize
    theta_init: Initial parameter values
    a: Step size parameter
    c: Perturbation magnitude
    max_iterations: Maximum number of iterations
    
    Returns:
    best_theta: Best parameter values found
    best_loss: Corresponding objective function value
    """
    theta = theta_init.copy()
    best_theta = theta.copy()
    best_loss = objective_func(theta)
    
    for k in range(max_iterations):
        # Generate random perturbation vector (±1 for each parameter)
        delta = np.random.choice([-1, 1], size=len(theta))
        
        # Evaluate function at perturbed points
        loss_plus = objective_func(theta + c * delta)
        loss_minus = objective_func(theta - c * delta)
        
        # Estimate the gradient: the same scalar difference is divided
        # elementwise by each component of delta
        gradient = (loss_plus - loss_minus) / (2 * c * delta)
        
        # Update parameters
        theta = theta - a * gradient
        
        # Track best solution found
        current_loss = objective_func(theta)
        if current_loss < best_loss:
            best_theta = theta.copy()
            best_loss = current_loss
    
    return best_theta, best_loss

# Set optimization parameters
initial_params = np.array([-1.5, 1.5])
step_size = 0.01
perturbation_size = 0.1
iterations = 1000

# Run SPSA optimization
optimal_params, optimal_loss = spsa_optimization(
    rosenbrock, initial_params, step_size, perturbation_size, iterations
)

print(f"Optimal parameters: {optimal_params}")
print(f"Optimal loss: {optimal_loss:.6f}")
print(f"True minimum is at (1, 1) with loss = 0")
Output:

Optimal parameters: [0.99834261 0.99669847]
Optimal loss: 0.000276
True minimum is at (1, 1) with loss = 0

Parameter Tuning Guidelines

SPSA performance depends on proper parameter selection:

Parameter         | Symbol | Typical Range | Effect
Step size         | a      | 0.001 - 0.1   | Controls convergence speed vs stability
Perturbation size | c      | 0.01 - 1.0    | Affects gradient estimation accuracy
Iterations        | N      | 100 - 10000   | Determines optimization effort
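The implementation above keeps a and c constant, but the full SPSA formulation uses gains that decay over the iterations, which is usually more robust: Spall's commonly cited schedules are aₖ = a / (A + k + 1)^0.602 and cₖ = c / (k + 1)^0.101, with the stability constant A often set to about 10% of the iteration budget. A minimal sketch of those schedules (the function name is illustrative; the exponents follow the published guidelines):

```python
import numpy as np

def spsa_gains(a, c, max_iterations, A=None, alpha=0.602, gamma=0.101):
    """Standard decaying gain sequences a_k and c_k for SPSA."""
    if A is None:
        A = 0.1 * max_iterations     # stability constant, ~10% of iterations
    k = np.arange(max_iterations)
    a_k = a / (A + k + 1) ** alpha   # step-size sequence, shrinks over time
    c_k = c / (k + 1) ** gamma       # perturbation-size sequence
    return a_k, c_k

a_k, c_k = spsa_gains(a=0.5, c=0.1, max_iterations=1000)
# Early iterations take larger steps; later iterations refine the estimate
```

Replacing the constants `a` and `c` in the earlier loop with `a_k[k]` and `c_k[k]` lets the optimizer explore aggressively at first and settle down near the minimum.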

Applications

SPSA is widely used in various domains:

  • Neural Network Training: Optimizing weights when gradients are expensive to compute

  • Control Systems: Tuning controller parameters in noisy environments

  • Financial Portfolio Optimization: Asset allocation with transaction costs

  • Simulation-based Optimization: Problems where function evaluations require simulations

Conclusion

SPSA provides an efficient gradient-free approach to optimization, requiring only two function evaluations per iteration regardless of problem dimension. Its robustness to noise and ability to handle non-smooth functions make it valuable for real-world optimization problems where traditional methods struggle.

Updated on: 2026-03-27T09:12:12+05:30
