How to compute gradients in PyTorch?

To compute gradients with respect to a tensor, the tensor must be created with requires_grad = True. The gradients are the partial derivatives of the output with respect to that tensor.

For example, in the function y = 2*x + 1, x is a tensor with requires_grad = True. We can compute the gradient by calling y.backward() and then access it through x.grad.

Here, the value of x.grad is the partial derivative of y with respect to x. If the tensor x is created without requires_grad = True, then x.grad is None. We can also define a function of multiple variables, where the variables are PyTorch tensors.
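
As a minimal sketch of the function y = 2*x + 1 described above (the value 3.0 is chosen here only for illustration), the gradient with respect to x should be 2 whatever the value of x is:

```python
import torch

# x tracks operations so that autograd can differentiate through them
x = torch.tensor(3.0, requires_grad=True)

# y = 2*x + 1, so dy/dx = 2 for any x
y = 2 * x + 1

# backpropagate from the scalar y; this fills in x.grad
y.backward()

print("x.grad :", x.grad)  # x.grad : tensor(2.)
```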

Steps

We can use the following steps to compute the gradients −

Import the torch library. Make sure you have it already installed.

import torch

Create a PyTorch tensor with requires_grad = True and print it.

x = torch.tensor(2.0, requires_grad = True)
print("x:", x)

Define a function y for the above tensor, x.

y = x**2 + 1

Compute the gradients using the backward function for y.

y.backward()

Access and print the gradients with respect to the above-created tensor x using x.grad.

dx = x.grad
print("x.grad :", dx)
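
The steps above use a single-element tensor. As a hedged sketch of the same steps for a tensor with several elements (the values are illustrative): backward() called without arguments expects a scalar output, so this example sums y before calling it.

```python
import torch

# a tensor with three elements (illustrative values)
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y also has three elements, so reduce it to a scalar before backward()
y = x**2 + 1
y.sum().backward()

# each entry is d(sum(y))/dx_i = 2 * x_i
print("x.grad :", x.grad)  # x.grad : tensor([2., 4., 6.])
```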

Example 1

The following example shows the detailed process to compute the gradients in PyTorch.

# import torch library
import torch

# create a tensor with requires_grad = True
x = torch.tensor(2.0, requires_grad = True)

# print the tensor
print("x:", x)

# define a function y for the tensor, x
y = x**2 + 1
print("y:", y)

# Compute gradients using backward function for y
y.backward()

# Access the gradients using x.grad
dx = x.grad
print("x.grad :", dx)

Output

x: tensor(2., requires_grad=True)
y: tensor(5., grad_fn=<AddBackward0>)
x.grad : tensor(4.)
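
The value 4 agrees with the analytic derivative dy/dx = 2*x at x = 2. One possible way to cross-check this is torch.autograd.grad, which returns the gradients directly instead of storing them in x.grad. The variable names dy_dx and expected below are illustrative.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x**2 + 1

# torch.autograd.grad returns a tuple with one gradient per input
(dy_dx,) = torch.autograd.grad(y, x)

# analytic derivative of x**2 + 1 is 2*x
expected = 2 * x.detach()

print("autograd :", dy_dx)     # autograd : tensor(4.)
print("analytic :", expected)  # analytic : tensor(4.)
```

Because backward() was not called in this sketch, x.grad stays None.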

Example 2

In the following Python program, we use three tensors x, w, and b as variables of the function y. Tensor x is created without requires_grad, while w and b are created with requires_grad = True.

# import torch library
import torch

# create a tensor without requires_grad = True
x = torch.tensor(3)

# create tensors with requires_grad = True
w = torch.tensor(2.0, requires_grad = True)
b = torch.tensor(5.0, requires_grad = True)

# print the tensors
print("x:", x)
print("w:", w)
print("b:", b)

# define a function y for the above tensors
y = w*x + b
print("y:", y)

# Compute gradients by calling backward function for y
y.backward()

# Access and print the gradients w.r.t x, w, and b
dx = x.grad
dw = w.grad
db = b.grad
print("x.grad :", dx)
print("w.grad :", dw)
print("b.grad :", db)

Output

x: tensor(3)
w: tensor(2., requires_grad=True)
b: tensor(5., requires_grad=True)
y: tensor(11., grad_fn=<AddBackward0>)
x.grad : None
w.grad : tensor(3.)
b.grad : tensor(1.)

Notice that x.grad is None. This is because x is defined without requires_grad = True.
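
As a sketch of the contrasting case, assuming we also want the gradient with respect to x, x can be created as a floating-point tensor with requires_grad = True. An integer tensor such as torch.tensor(3) cannot require gradients, so 3.0 is used here.

```python
import torch

# x is now a float tensor that tracks gradients (an assumption of this sketch)
x = torch.tensor(3.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(5.0, requires_grad=True)

y = w * x + b
y.backward()

# dy/dx = w, dy/dw = x, dy/db = 1
print("x.grad :", x.grad)  # x.grad : tensor(2.)
print("w.grad :", w.grad)  # w.grad : tensor(3.)
print("b.grad :", b.grad)  # b.grad : tensor(1.)
```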

Example 3

In the following Python program, z = x**y is a function of two tensors x and y, both created with requires_grad = True.

# import torch library
import torch

# create tensors with requires_grad = True
x = torch.tensor(3.0, requires_grad = True)
y = torch.tensor(4.0, requires_grad = True)

# print the tensors
print("x:", x)
print("y:", y)

# define a function z of above created tensors
z = x**y
print("z:", z)

# call backward function for z to compute the gradients
z.backward()

# Access and print the gradients w.r.t x, and y
dx = x.grad
dy = y.grad
print("x.grad :", dx)
print("y.grad :", dy)

Output

x: tensor(3., requires_grad=True)
y: tensor(4., requires_grad=True)
z: tensor(81., grad_fn=<PowBackward1>)
x.grad : tensor(108.)
y.grad : tensor(88.9876)
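
These values match the partial derivatives of z = x**y: the derivative with respect to x is y * x**(y-1) = 4 * 27 = 108, and the derivative with respect to y is x**y * ln(x) = 81 * ln(3), about 88.9876. A small sketch that checks this numerically (the names dz_dx and dz_dy are illustrative):

```python
import math
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

z = x**y
z.backward()

# analytic partial derivatives of z = x**y at x = 3, y = 4
dz_dx = 4.0 * 3.0**3.0            # y * x**(y-1)
dz_dy = 3.0**4.0 * math.log(3.0)  # x**y * ln(x)

# compare autograd results with the analytic values (float32 tolerance)
print(math.isclose(x.grad.item(), dz_dx, rel_tol=1e-5))
print(math.isclose(y.grad.item(), dz_dy, rel_tol=1e-5))
```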

Shahid Akhtar Khan

Updated on: 06-Dec-2021
