Activation Functions in PyTorch
PyTorch is an open-source machine learning framework that provides various activation functions for building neural networks. An activation function determines the output of a node in a neural network given an input, introducing non-linearity which is essential for solving complex machine learning problems.
What is an Activation Function?
Neural networks consist of input layers, hidden layers, and output layers. The activation function is applied to the weighted sum of inputs at each node, transforming the linear combination into a non-linear output. This non-linearity enables neural networks to learn complex patterns and relationships in data.
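The idea above can be sketched for a single neuron: compute the weighted sum of the inputs, then pass it through an activation function. The weights, bias, and input values below are arbitrary illustrative choices, not taken from any real model.

```python
import torch

# One neuron: weighted sum of inputs followed by a non-linearity.
# Weights, bias, and inputs are hypothetical illustrative values.
x = torch.tensor([0.5, -1.0, 2.0])   # inputs
w = torch.tensor([0.8, 0.2, -0.5])   # weights
b = torch.tensor(0.1)                # bias

z = torch.dot(w, x) + b              # linear combination (pre-activation)
a = torch.relu(z)                    # non-linear activation

print("Pre-activation:", z.item())   # -0.7
print("Activation:", a.item())       # 0.0 (ReLU zeroes negative values)
```

Without the activation step, stacking such neurons would only ever produce another linear function of the inputs; the non-linearity is what lets the network model more complex relationships.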
Types of Activation Functions in PyTorch
PyTorch provides several built-in activation functions through the torch.nn module −

- ReLU − Rectified Linear Unit
- Leaky ReLU − Modified ReLU with a small slope for negative inputs
- Sigmoid − S-shaped curve between 0 and 1
- Tanh − Hyperbolic tangent between -1 and 1
- Softmax − Probability distribution for multi-class classification
ReLU Activation Function
The Rectified Linear Unit (ReLU) is defined as f(x) = max(0, x). It outputs zero for negative inputs and passes positive inputs unchanged. ReLU is computationally efficient and helps mitigate vanishing gradient problems.
Example
Here's how to implement and use the ReLU activation −
import torch
import torch.nn as nn
import numpy as np
# Using PyTorch's built-in ReLU
relu = nn.ReLU()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = relu(x)
print("PyTorch ReLU:", y)
# Custom ReLU implementation using NumPy
def custom_relu(x):
    return np.maximum(0, x)
x_np = np.array([-1, 2, -3, 4, 0])
y_custom = custom_relu(x_np)
print("Custom ReLU:", y_custom)
PyTorch ReLU: tensor([0., 2., 0., 4., 0.])
Custom ReLU: [0 2 0 4 0]
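In practice, ReLU is usually placed between the linear layers of a network rather than applied to a standalone tensor. Here is a minimal sketch using nn.Sequential; the layer sizes are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

# A small feed-forward network with ReLU between its linear layers.
# The layer sizes (4 -> 8 -> 2) are hypothetical.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),          # non-linearity applied to the hidden layer
    nn.Linear(8, 2),
)

x = torch.randn(1, 4)   # one sample with 4 features
out = model(x)
print(out.shape)        # torch.Size([1, 2])
```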
Leaky ReLU Activation Function
Leaky ReLU solves the "dying ReLU" problem by allowing small negative values instead of zeroing them out. It's defined as f(x) = max(αx, x), where the slope α is typically a small value such as 0.01.
Example
import torch
import torch.nn as nn
# Using PyTorch's Leaky ReLU
leaky_relu = nn.LeakyReLU(negative_slope=0.1)
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = leaky_relu(x)
print("Leaky ReLU:", y)
Leaky ReLU: tensor([-0.1000, 2.0000, -0.3000, 4.0000, 0.0000])
Sigmoid Activation Function
The sigmoid function maps any input to values between 0 and 1, making it useful for binary classification. It's defined as f(x) = 1/(1+e^(-x)).
Example
import torch
import torch.nn as nn
# Using PyTorch's Sigmoid
sigmoid = nn.Sigmoid()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = sigmoid(x)
print("Sigmoid:", y)
Sigmoid: tensor([0.2689, 0.8808, 0.0474, 0.9820, 0.5000])
Tanh Activation Function
The hyperbolic tangent function outputs values between -1 and 1. It's defined as f(x) = (e^x - e^(-x))/(e^x + e^(-x)) and is zero-centered, making it preferred over sigmoid in hidden layers.
Example
import torch
import torch.nn as nn
# Using PyTorch's Tanh
tanh = nn.Tanh()
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])
y = tanh(x)
print("Tanh:", y)
Tanh: tensor([-0.7616, 0.9640, -0.9951, 0.9993, 0.0000])
Softmax Activation Function
Softmax converts a vector of values into a probability distribution, commonly used in multi-class classification output layers. Each output represents the probability of belonging to a specific class.
Example
import torch
import torch.nn as nn
# Using PyTorch's Softmax
softmax = nn.Softmax(dim=0)
x = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])
y = softmax(x)
print("Softmax:", y)
print("Sum of probabilities:", torch.sum(y))
Softmax: tensor([0.0117, 0.0317, 0.0861, 0.2341, 0.6364])
Sum of probabilities: tensor(1.0000)
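For a batch of inputs, the dim argument chooses which dimension is normalized; with logits of shape (batch, classes), dim=1 normalizes each row across the classes. Note that nn.CrossEntropyLoss applies log-softmax internally, so during training you pass it raw logits rather than softmax outputs. The logit values below are illustrative.

```python
import torch
import torch.nn as nn

# Softmax over the class dimension of a batch of logits
# (2 samples, 3 classes; values are hypothetical).
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.2]])
probs = nn.Softmax(dim=1)(logits)   # normalize along dim=1 (classes)

print(probs)
print(probs.sum(dim=1))             # each row sums to 1
```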
Comparison
| Function | Range | Use Case | Advantages |
|---|---|---|---|
| ReLU | [0, ∞) | Hidden layers | Fast, avoids vanishing gradients |
| Leaky ReLU | (-∞, ∞) | Hidden layers | Solves dying ReLU problem |
| Sigmoid | (0, 1) | Binary classification | Outputs probabilities |
| Tanh | (-1, 1) | Hidden layers | Zero-centered output |
| Softmax | (0, 1) | Multi-class output | Probability distribution |
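The module forms shown above also have functional equivalents in torch.nn.functional, which are often more convenient inside a model's forward() method. A quick sketch applying all five to the same tensor:

```python
import torch
import torch.nn.functional as F

# Functional equivalents of the nn module activations
x = torch.tensor([-1.0, 2.0, -3.0, 4.0, 0.0])

print(F.relu(x))                            # same as nn.ReLU()
print(F.leaky_relu(x, negative_slope=0.1))  # same as nn.LeakyReLU(0.1)
print(torch.sigmoid(x))                     # same as nn.Sigmoid()
print(torch.tanh(x))                        # same as nn.Tanh()
print(F.softmax(x, dim=0))                  # same as nn.Softmax(dim=0)
```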
Conclusion
PyTorch provides a comprehensive set of activation functions, each suited for specific tasks. ReLU variants work well for hidden layers, while Sigmoid and Softmax are ideal for classification outputs. Choose the activation function based on your network architecture and problem requirements.
