Activation Functions in PyTorch


PyTorch is an open-source machine learning framework that provides various activation functions for building neural networks. An activation function determines the output of a node in a neural network given its input. It introduces non-linearity, which is essential for solving complex machine learning problems.

What is an Activation Function?

Neural networks consist of input layers, hidden layers, and output layers. The activation function is applied to the weighted sum of inputs at each node, transforming the linear combination into a non-linear output. This non-linearity enables neural networks to learn complex patterns and relationships in data.
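To see why the non-linearity matters, the following minimal sketch stacks two linear layers with and without an activation in between. The layer sizes, the random input batch, and the variable names are assumptions chosen only for illustration. Without an activation, the two layers reduce to a single linear map; with ReLU in between, they do not.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Two stacked linear layers; the sizes (3 -> 4 -> 2) are arbitrary
layer1 = nn.Linear(3, 4)
layer2 = nn.Linear(4, 2)

x = torch.randn(5, 3)  # a batch of 5 hypothetical input vectors

# Without an activation, the stack collapses into one linear map:
# W = W2 @ W1 and b = W2 @ b1 + b2
stacked = layer2(layer1(x))
W = layer2.weight @ layer1.weight
b = layer2.weight @ layer1.bias + layer2.bias
collapsed = x @ W.T + b
is_linear = torch.allclose(stacked, collapsed, atol=1e-5)
print("Stack equals a single linear map:", is_linear)

# Applying ReLU between the layers breaks that equivalence
with_relu = layer2(torch.relu(layer1(x)))
differs = not torch.allclose(with_relu, collapsed, atol=1e-5)
print("ReLU version differs:", differs)
```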

Types of Activation Functions in PyTorch

PyTorch provides several built-in activation functions through the torch.nn module:

ReLU: Rectified Linear Unit
Leaky ReLU: Modified ReLU with a small slope for negative inputs
Sigmoid: S-shaped curve with outputs between 0 and 1
Tanh: Hyperbolic tangent with outputs between -1 and 1
Softmax: Probability distribution for multi-class outputs

ReLU Activation Function

The Rectified Linear Unit (ReLU) is defined as f(x) = max(0, x). It outputs zero for negative inputs and passes positive inputs unchanged. ReLU is computationally efficient, and because its gradient is 1 for positive inputs, it helps mitigate the vanishing gradient problem.

Example

Here's how to implement and use ReLU activation:

import torch
import torch.nn as nn
import numpy as np

# Using PyTorch's built-in ReLU
relu = nn.ReLU()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = relu(x)
print("PyTorch ReLU:", y)

# Custom ReLU implementation
def custom_relu(x):
    return np.maximum(0, x)

x_np = np.array([-1, 2, -3, 4, 0])
y_custom = custom_relu(x_np)
print("Custom ReLU:", y_custom)

PyTorch ReLU: tensor([0., 2., 0., 4., 0.])
Custom ReLU: [0 2 0 4 0]

Leaky ReLU Activation Function

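Leaky ReLU addresses the "dying ReLU" problem, in which a unit that only receives negative inputs outputs zero and stops learning, by applying a small non-zero slope to negative inputs. It's defined as f(x) = max(αx, x), where α is typically 0.01 (the default negative_slope of nn.LeakyReLU). The example below uses 0.1 to make the effect easier to see.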

Example

import torch
import torch.nn as nn

# Using PyTorch's Leaky ReLU
leaky_relu = nn.LeakyReLU(negative_slope=0.1)
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = leaky_relu(x)
print("Leaky ReLU:", y)

Leaky ReLU: tensor([-0.1000,  2.0000, -0.3000,  4.0000,  0.0000])
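The difference between the two functions is easiest to see in their gradients. The sketch below uses autograd to compare them on a few negative and positive inputs. The helper name input_gradient and the input values are illustrative assumptions, and 0 is left out of the inputs because the derivative at that point is a matter of convention.

```python
import torch
import torch.nn as nn

def input_gradient(activation, values):
    """Return d(sum(activation(x)))/dx for the given input values."""
    x = torch.tensor(values, requires_grad=True)
    activation(x).sum().backward()
    return x.grad

values = [-3.0, -1.0, 2.0, 4.0]
relu_grad = input_gradient(nn.ReLU(), values)
leaky_grad = input_gradient(nn.LeakyReLU(negative_slope=0.1), values)

# ReLU passes no gradient for negative inputs; Leaky ReLU passes
# a gradient equal to negative_slope
print("ReLU gradient:      ", relu_grad)   # tensor([0., 0., 1., 1.])
print("Leaky ReLU gradient:", leaky_grad)  # tensor([0.1000, 0.1000, 1.0000, 1.0000])
```

A unit whose inputs stay negative gets no gradient signal under ReLU, while Leaky ReLU still lets a scaled-down gradient through.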

Sigmoid Activation Function

The sigmoid function maps any input to values between 0 and 1, making it useful for binary classification. It's defined as f(x) = 1/(1+e^(-x)).

Example

import torch
import torch.nn as nn

# Using PyTorch's Sigmoid
sigmoid = nn.Sigmoid()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = sigmoid(x)
print("Sigmoid:", y)

Sigmoid: tensor([0.2689, 0.8808, 0.0474, 0.9820, 0.5000])
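As a quick check of the formula, this sketch computes f(x) = 1/(1+e^(-x)) by hand and compares it with torch.sigmoid. It then turns the outputs into binary class labels. The 0.5 cut-off is a common convention assumed here, not something the function itself prescribes.

```python
import torch

x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])

# Direct translation of f(x) = 1 / (1 + e^(-x))
manual = 1.0 / (1.0 + torch.exp(-x))
builtin = torch.sigmoid(x)

matches = torch.allclose(manual, builtin)
print("Manual matches built-in:", matches)

# Read each output as a probability and apply an assumed 0.5 threshold
predicted_class = (builtin >= 0.5).int()
print("Predicted classes:", predicted_class)
```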

Tanh Activation Function

The hyperbolic tangent function outputs values between -1 and 1. It's defined as f(x) = (e^x - e^(-x))/(e^x + e^(-x)). Because its output is zero-centered, it is often preferred over sigmoid in hidden layers.

Example

import torch
import torch.nn as nn

# Using PyTorch's Tanh
tanh = nn.Tanh()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = tanh(x)
print("Tanh:", y)

Tanh: tensor([-0.7616,  0.9640, -0.9951,  0.9993,  0.0000])
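The sketch below checks the formula against torch.tanh and illustrates what "zero-centered" means in practice. The symmetric test inputs are an assumption made for the illustration: on inputs balanced around 0, tanh outputs average to roughly 0, while sigmoid outputs average to roughly 0.5.

```python
import torch

x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])

# Direct translation of f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
manual = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))
formula_ok = torch.allclose(manual, torch.tanh(x), atol=1e-6)
print("Manual matches built-in:", formula_ok)

# Compare mean outputs on inputs that are symmetric around 0
symmetric = torch.linspace(-3.0, 3.0, steps=7)
tanh_mean = torch.tanh(symmetric).mean().item()
sigmoid_mean = torch.sigmoid(symmetric).mean().item()
print("Mean tanh output:   ", round(tanh_mean, 4))
print("Mean sigmoid output:", round(sigmoid_mean, 4))
```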

Softmax Activation Function

Softmax converts a vector of values into a probability distribution, commonly used in multi-class classification output layers. Each output represents the probability of belonging to a specific class.

Example

import torch
import torch.nn as nn

# Using PyTorch's Softmax
softmax = nn.Softmax(dim=0)
x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])
y = softmax(x)
print("Softmax:", y)
print("Sum of probabilities:", torch.sum(y))

Softmax: tensor([0.0117, 0.0317, 0.0861, 0.2341, 0.6364])
Sum of probabilities: tensor(1.0000)
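In practice, softmax is usually applied to a batch of score vectors rather than a single one. The sketch below assumes a hypothetical batch of 2 samples with 3 class scores each, and applies softmax along the class dimension (dim=1). It also compares the result with the standard definition softmax(x)_i = e^(x_i) / Σ_j e^(x_j), computed by hand.

```python
import torch

# Hypothetical raw scores (logits) for 2 samples and 3 classes
logits = torch.tensor([[1.0, 2.0, 3.0],
                       [2.0, 0.5, 0.5]])

# Manual softmax along the class axis: exp(x_i) / sum_j exp(x_j)
manual = torch.exp(logits) / torch.exp(logits).sum(dim=1, keepdim=True)
probs = torch.softmax(logits, dim=1)

print("Manual matches built-in:", torch.allclose(manual, probs))
print("Row sums:", probs.sum(dim=1))               # each row sums to 1
print("Predicted classes:", probs.argmax(dim=1))   # tensor([2, 0])
```

Choosing the wrong dim would normalize across samples instead of across classes, so it is worth confirming that each row sums to 1.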

Comparison

Function	Range	Use Case	Advantages
ReLU	[0, ∞)	Hidden layers	Fast, mitigates vanishing gradients
Leaky ReLU	(-∞, ∞)	Hidden layers	Addresses the dying ReLU problem
Sigmoid	(0, 1)	Binary classification	Outputs probabilities
Tanh	(-1, 1)	Hidden layers	Zero-centered output
Softmax	(0, 1), sums to 1	Multi-class output	Probability distribution

Conclusion

PyTorch provides a comprehensive set of activation functions, each suited for specific tasks. ReLU variants work well for hidden layers, while Sigmoid and Softmax are the usual choices for binary and multi-class classification outputs, respectively. Choose the activation function based on your network architecture and problem requirements.
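As one possible way to combine these pieces, the sketch below builds a small classifier with ReLU in the hidden layer and Softmax at the output. The layer sizes (4 input features, 8 hidden units, 3 classes) and the random input batch are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

# Hypothetical sizes: 4 input features, 8 hidden units, 3 classes
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),          # non-linearity in the hidden layer
    nn.Linear(8, 3),
    nn.Softmax(dim=1),  # class probabilities at the output
)

batch = torch.randn(2, 4)  # 2 random samples standing in for real data
probs = model(batch)
print("Output shape:", probs.shape)
print("Row sums:", probs.sum(dim=1))
```

When training with nn.CrossEntropyLoss, the final Softmax layer is normally left out, because that loss expects raw scores and applies log-softmax internally.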

Rohan Singh

Updated on: 2026-03-27T01:04:55+05:30
