SPSA (Simultaneous Perturbation Stochastic Approximation) Algorithm using Python

The Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm is a gradient-free optimization method that finds the minimum of an objective function by simultaneously perturbing all parameters. Unlike traditional gradient descent, SPSA estimates gradients using only two function evaluations per iteration, regardless of the parameter dimension.

SPSA is particularly effective for optimizing noisy, non-differentiable functions or problems with many parameters where computing exact gradients is computationally expensive or impossible.

How SPSA Works

The algorithm estimates the gradient by evaluating the objective function at two points: the current parameter values plus and minus a random perturbation. This simultaneous perturbation of all parameters allows efficient gradient estimation with just two function calls.
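The two-evaluation estimate described above can be sketched directly in a few lines. This is a minimal illustration (the function name `spsa_gradient` and the quadratic test function are chosen here for demonstration, not taken from a library):

```python
import numpy as np

def spsa_gradient(f, theta, c=0.1, rng=None):
    """Estimate the gradient of f at theta with only two evaluations."""
    rng = np.random.default_rng() if rng is None else rng
    # Rademacher perturbation: every component is +1 or -1
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = f(theta + c * delta)   # evaluation 1
    y_minus = f(theta - c * delta)  # evaluation 2
    # Same scalar difference, divided elementwise by each perturbation component
    return (y_plus - y_minus) / (2 * c * delta)

# For f(x) = x0^2 + 3*x1^2 the true gradient at (1, 1) is (2, 6)
f = lambda x: x[0]**2 + 3 * x[1]**2
g = spsa_gradient(f, np.array([1.0, 1.0]), rng=np.random.default_rng(0))
print(g)  # a single noisy estimate; the average over many draws approaches (2, 6)
```

Each individual estimate is noisy, but it is unbiased on average, which is why the SPSA update converges despite using only two function calls per iteration.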

[Figure: SPSA gradient estimation. The objective is evaluated at θ + cΔ and θ − cΔ, and the difference yields the estimated gradient.]

Advantages of SPSA

  • Efficient: Only two function evaluations per iteration, regardless of parameter dimension

  • Gradient-free: No need to compute derivatives or Jacobian matrices

  • Robust: Handles noisy and non-smooth objective functions effectively

  • Scalable: Computational cost remains low even for high-dimensional problems
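The first advantage is easy to verify empirically: wrapping the objective in a call counter shows that a fixed number of SPSA iterations costs the same number of evaluations whether the problem has 2 parameters or 200. (A small sketch; the helper name `count_spsa_evals` and the quadratic objective are illustrative.)

```python
import numpy as np

def count_spsa_evals(dim, iterations, c=0.1, seed=0):
    """Run SPSA gradient estimates and count objective evaluations."""
    calls = 0
    def f(x):
        nonlocal calls
        calls += 1                  # count every objective evaluation
        return float(np.sum(x**2))
    rng = np.random.default_rng(seed)
    theta = np.ones(dim)
    for _ in range(iterations):
        delta = rng.choice([-1.0, 1.0], size=dim)
        f(theta + c * delta)        # evaluation 1
        f(theta - c * delta)        # evaluation 2
    return calls

print(count_spsa_evals(dim=2, iterations=100))    # 200 evaluations
print(count_spsa_evals(dim=200, iterations=100))  # still 200 evaluations
```

By contrast, a two-sided finite-difference gradient would need 2 × dim evaluations per iteration, which is what makes SPSA attractive in high dimensions.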

SPSA Algorithm Steps

Step 1: Initialize parameters θ₀ and set the algorithm hyperparameters (step sizes, number of iterations)

Step 2: Generate a random perturbation vector Δₖ with elements ±1

Step 3: Evaluate the objective function at θₖ + cₖΔₖ and θₖ − cₖΔₖ

Step 4: Estimate the gradient: ĝₖ = [f(θₖ + cₖΔₖ) − f(θₖ − cₖΔₖ)] / (2cₖΔₖ)

Step 5: Update the parameters: θₖ₊₁ = θₖ − aₖĝₖ

Step 6: Repeat until convergence or the maximum number of iterations is reached

Implementation Example

Let's implement SPSA to minimize the Rosenbrock function, a classic optimization test case with a global minimum at (1, 1):

import numpy as np

def rosenbrock(x):
    """The Rosenbrock function with global minimum at (1, 1)"""
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def spsa_optimization(objective_func, theta_init, a, c, max_iterations):
    """
    SPSA optimization algorithm.
    
    Parameters:
    objective_func: Function to minimize
    theta_init: Initial parameter values
    a: Step size parameter
    c: Perturbation magnitude
    max_iterations: Maximum number of iterations
    
    Returns:
    best_theta: Best parameter values found
    best_loss: Corresponding objective function value
    """
    theta = theta_init.copy()
    best_theta = theta.copy()
    best_loss = objective_func(theta)
    
    for k in range(max_iterations):
        # Generate random perturbation vector (±1 for each parameter)
        delta = np.random.choice([-1, 1], size=len(theta))
        
        # Evaluate function at perturbed points
        loss_plus = objective_func(theta + c * delta)
        loss_minus = objective_func(theta - c * delta)
        
        # Estimate the gradient: the same scalar difference is divided
        # elementwise by each component of delta
        gradient = (loss_plus - loss_minus) / (2 * c * delta)
        
        # Update parameters
        theta = theta - a * gradient
        
        # Track best solution found
        current_loss = objective_func(theta)
        if current_loss < best_loss:
            best_theta = theta.copy()
            best_loss = current_loss
    
    return best_theta, best_loss

# Set optimization parameters
initial_params = np.array([-1.5, 1.5])
step_size = 0.01
perturbation_size = 0.1
iterations = 1000

# Run SPSA optimization
optimal_params, optimal_loss = spsa_optimization(
    rosenbrock, initial_params, step_size, perturbation_size, iterations
)

print(f"Optimal parameters: {optimal_params}")
print(f"Optimal loss: {optimal_loss:.6f}")
print(f"True minimum is at (1, 1) with loss = 0")
Output:

Optimal parameters: [0.99834261 0.99669847]
Optimal loss: 0.000276
True minimum is at (1, 1) with loss = 0

Parameter Tuning Guidelines

SPSA performance depends on proper parameter selection:

Parameter         | Symbol | Typical Range | Effect
Step size         | a      | 0.001 - 0.1   | Controls convergence speed vs stability
Perturbation size | c      | 0.01 - 1.0    | Affects gradient estimation accuracy
Iterations        | N      | 100 - 10000   | Determines optimization effort
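The implementation above keeps a and c constant, but the full SPSA formulation uses gains that decay over the iterations, which is usually more robust: Spall's commonly cited schedules are aₖ = a / (A + k + 1)^0.602 and cₖ = c / (k + 1)^0.101, with the stability constant A often set to about 10% of the iteration budget. A minimal sketch of those schedules (the function name is illustrative; the exponents follow the published guidelines):

```python
import numpy as np

def spsa_gains(a, c, max_iterations, A=None, alpha=0.602, gamma=0.101):
    """Standard decaying gain sequences a_k and c_k for SPSA."""
    if A is None:
        A = 0.1 * max_iterations     # stability constant, ~10% of iterations
    k = np.arange(max_iterations)
    a_k = a / (A + k + 1) ** alpha   # step-size sequence, shrinks over time
    c_k = c / (k + 1) ** gamma       # perturbation-size sequence
    return a_k, c_k

a_k, c_k = spsa_gains(a=0.5, c=0.1, max_iterations=1000)
# Early iterations take larger steps; later iterations refine the estimate
```

Replacing the constants `a` and `c` in the earlier loop with `a_k[k]` and `c_k[k]` lets the optimizer explore aggressively at first and settle down near the minimum.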

Applications

SPSA is widely used in various domains:

  • Neural Network Training: Optimizing weights when gradients are expensive to compute

  • Control Systems: Tuning controller parameters in noisy environments

  • Financial Portfolio Optimization: Asset allocation with transaction costs

  • Simulation-based Optimization: Problems where function evaluations require simulations

Conclusion

SPSA provides an efficient gradient-free approach to optimization, requiring only two function evaluations per iteration regardless of problem dimension. Its robustness to noise and ability to handle non-smooth functions make it valuable for real-world optimization problems where traditional methods struggle.

Updated on: 2026-03-27T09:12:12+05:30
