How to compute gradients in PyTorch?
To compute gradients with respect to a tensor, the tensor must be created with requires_grad = True. The gradients PyTorch computes are the partial derivatives of the output with respect to that tensor.
For example, in the function y = 2*x + 1, x is a tensor with requires_grad = True. Calling y.backward() computes the gradient, which can then be accessed through x.grad.
Here, the value of x.grad is the partial derivative of y with respect to x. If the tensor x is created without requires_grad = True, then x.grad is None. We can also define a function of multiple variables, where each variable is a PyTorch tensor.
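As a minimal sketch of the y = 2*x + 1 case described above, the snippet below computes the gradient with autograd. The input value 3.0 is an arbitrary choice for illustration and does not come from the original text. Since y is linear in x, the gradient should be 2 whatever value x takes.

```python
import torch

# x = 3.0 is an arbitrary illustrative value (an assumption, not from the text)
x = torch.tensor(3.0, requires_grad=True)

# y = 2*x + 1, so dy/dx = 2 for every x
y = 2 * x + 1

# backward() populates x.grad with dy/dx
y.backward()
print("x.grad :", x.grad)
```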
Steps
We can use the following steps to compute the gradients:
Import the torch library. Make sure you have it already installed.
import torch
Create PyTorch tensors with requires_grad = True and print the tensor.
x = torch.tensor(2.0, requires_grad = True)
print("x:", x)
Define a function y for the above tensor, x.
y = x**2 + 1
Compute the gradients using the backward function for y.
y.backward()
Access and print the gradients with respect to the above-created tensor x using x.grad.
dx = x.grad
print("x.grad :", dx)
Example 1
The following example shows the detailed process to compute the gradients in PyTorch.
# import torch library
import torch
# create a tensor with requires_grad = True
x = torch.tensor(2.0, requires_grad = True)
# print the tensor
print("x:", x)
# define a function y for the tensor, x
y = x**2 + 1
print("y:", y)
# Compute gradients using backward function for y
y.backward()
# Access the gradients using x.grad
dx = x.grad
print("x.grad :", dx)
Output
x: tensor(2., requires_grad=True)
y: tensor(5., grad_fn=<AddBackward0>)
x.grad : tensor(4.)
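The result in Example 1 can be checked by hand: the derivative of x**2 + 1 is 2*x, which equals 4 at x = 2. The sketch below compares the autograd result with that closed form. The variable name analytic is introduced here for illustration only.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x**2 + 1
y.backward()

# closed-form derivative of x**2 + 1 is 2*x;
# detach() returns a tensor that is not tracked by autograd
analytic = 2 * x.detach()

print("autograd :", x.grad)
print("analytic :", analytic)
print("match    :", torch.allclose(x.grad, analytic))
```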
Example 2
In the following Python program, we use three tensors x, w, and b as variables for the function y. Tensor x is created without requires_grad, while w and b are created with requires_grad = True.
# import torch library
import torch
# create a tensor without requires_grad = True
x = torch.tensor(3)
# create tensors with requires_grad = True
w = torch.tensor(2.0, requires_grad = True)
b = torch.tensor(5.0, requires_grad = True)
# print the tensors
print("x:", x)
print("w:", w)
print("b:", b)
# define a function y for the above tensors
y = w*x + b
print("y:", y)
# Compute gradients by calling backward function for y
y.backward()
# Access and print the gradients w.r.t x, w, and b
dx = x.grad
dw = w.grad
db = b.grad
print("x.grad :", dx)
print("w.grad :", dw)
print("b.grad :", db)
Output
x: tensor(3)
w: tensor(2., requires_grad=True)
b: tensor(5., requires_grad=True)
y: tensor(11., grad_fn=<AddBackward0>)
x.grad : None
w.grad : tensor(3.)
b.grad : tensor(1.)
Notice that x.grad is None. This is because x is defined without requires_grad = True.
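If the gradient with respect to x is also needed, one option is to create x with requires_grad = True as well. The sketch below is a variation on Example 2, not part of the original program. PyTorch only allows floating-point (and complex) tensors to require gradients, so the integer 3 is written as 3.0 here.

```python
import torch

# x is now a float tensor that tracks gradients (a change from Example 2)
x = torch.tensor(3.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(5.0, requires_grad=True)

y = w * x + b
y.backward()

# for y = w*x + b: dy/dx = w, dy/dw = x, dy/db = 1
print("x.grad :", x.grad)
print("w.grad :", w.grad)
print("b.grad :", b.grad)
```

With these values, x.grad should equal w, w.grad should equal x, and b.grad should be 1.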
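Example 3
In the following Python program, z = x**y is a function of two tensors, x and y, both created with requires_grad = True, so gradients are computed with respect to both.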
# import torch library
import torch
# create tensors with requires_grad = True
x = torch.tensor(3.0, requires_grad = True)
y = torch.tensor(4.0, requires_grad = True)
# print the tensors
print("x:", x)
print("y:", y)
# define a function z of above created tensors
z = x**y
print("z:", z)
# call backward function for z to compute the gradients
z.backward()
# Access and print the gradients w.r.t x, and y
dx = x.grad
dy = y.grad
print("x.grad :", dx)
print("y.grad :", dy)
Output
x: tensor(3., requires_grad=True)
y: tensor(4., requires_grad=True)
z: tensor(81., grad_fn=<PowBackward1>)
x.grad : tensor(108.)
y.grad : tensor(88.9876)
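The two gradients in Example 3 follow from the partial derivatives of z = x**y: dz/dx = y * x**(y - 1) and dz/dy = x**y * ln(x). The sketch below evaluates both closed forms with the standard math module and prints them next to the autograd results. The names dz_dx and dz_dy are introduced here for illustration.

```python
import math
import torch

x = torch.tensor(3.0, requires_grad=True)
y = torch.tensor(4.0, requires_grad=True)

z = x**y
z.backward()

# closed-form partial derivatives of z = x**y at x = 3, y = 4
dz_dx = 4.0 * 3.0**3              # y * x**(y - 1)
dz_dy = 3.0**4 * math.log(3.0)    # x**y * ln(x)

print("x.grad :", x.grad.item(), "closed form:", dz_dx)
print("y.grad :", y.grad.item(), "closed form:", dz_dy)
```

The autograd values are computed in 32-bit floating point, so they may differ from the closed forms in the last few digits.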