Torch - Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are a specialized type of deep learning model designed for image processing tasks. They automatically learn spatial hierarchies of features from input images, which makes them particularly effective for tasks such as object detection, segmentation, and image classification.

Torch is an open-source machine learning library based on the Lua programming language that provides a robust framework for building and training CNNs. It is known for its speed and flexibility, which have made it popular among researchers and developers implementing complex neural network architectures.

The core building blocks of CNNs in Torch are convolution layers, pooling layers, activation functions, loss functions, and optimization algorithms. Spatial convolution layers apply convolution operations to the input data to extract features such as edges, patterns, and textures. These layers are implemented using nn.SpatialConvolution in Torch.
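As a minimal sketch of how these building blocks fit together, the snippet below chains a convolution, an activation, and a pooling layer and prints the resulting tensor shapes. It uses the PyTorch API, where nn.Conv2d plays the role of nn.SpatialConvolution in Lua Torch. The channel counts, batch size, and 28x28 input size are illustrative assumptions, not values from the text.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions): 1 input channel, 6 output channels, 5x5 kernel.
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
pool = nn.MaxPool2d(kernel_size=2)

# A dummy batch of 4 grayscale 28x28 images.
images = torch.randn(4, 1, 28, 28)

features = torch.relu(conv(images))  # convolution followed by a ReLU activation
pooled = pool(features)              # 2x2 max pooling halves height and width

print(features.shape)  # torch.Size([4, 6, 24, 24])
print(pooled.shape)    # torch.Size([4, 6, 12, 12])
```

Without padding, a 5x5 kernel trims 4 pixels from each spatial dimension (28 to 24), and the 2x2 pooling then halves it (24 to 12).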

Architectures

Convolutional neural networks have significantly advanced image recognition, with several key architectures contributing to that progress. LeNet-5, one of the earliest CNNs, was trained on the MNIST dataset and features a simple yet effective architecture of convolutional layers followed by fully connected layers.

The ImageNet challenge provided a large dataset and a competitive platform that led to the development of powerful models such as AlexNet, which substantially improved classification accuracy. GoogLeNet introduced the Inception architecture, which improved accuracy further.

Implementation of LeNet-5 using PyTorch, the Python successor to Torch −

import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet5(nn.Module):
   def __init__(self):
      super(LeNet5, self).__init__()
      # conv2's input channels must equal conv1's output channels (6)
      self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
      self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
      # 16*5*5 assumes a single-channel 32x32 input image
      self.fc1 = nn.Linear(16*5*5, 120)
      self.fc2 = nn.Linear(120, 84)
      self.fc3 = nn.Linear(84, 10)

   def forward(self, x):
      x = F.relu(self.conv1(x))
      x = F.max_pool2d(x, 2)
      x = F.relu(self.conv2(x))
      x = F.max_pool2d(x, 2)
      # Flatten the feature maps before the fully connected layers
      x = x.view(-1, 16*5*5)
      x = F.relu(self.fc1(x))
      x = F.relu(self.fc2(x))
      x = self.fc3(x)
      return x

model = LeNet5()
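The 5*5 factor in the first fully connected layer follows from the layer sizes. The sketch below traces the spatial size through the network, assuming a 32x32 input image, no padding, and stride-1 convolutions; the helper name conv_output_size is hypothetical and used only for illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32                                               # assumed input height and width
size = conv_output_size(size, kernel_size=5)            # conv1: 32 -> 28
size = conv_output_size(size, kernel_size=2, stride=2)  # pool:  28 -> 14
size = conv_output_size(size, kernel_size=5)            # conv2: 14 -> 10
size = conv_output_size(size, kernel_size=2, stride=2)  # pool:  10 -> 5
print(size)  # 5
```

Each feature map is therefore 5x5 when it reaches the first fully connected layer, so that layer's input size is the number of conv2 output channels multiplied by 5*5.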

Training

Training a Convolutional Neural Network in Torch involves defining the model architecture and preparing the dataset using data loaders. The training loop feeds batches of images through the network, computes the loss, and updates the weights using backpropagation. Optimizers such as SGD and Adam are used to minimize the loss function and improve the model's accuracy.
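The loop described above can be sketched as follows, using the PyTorch API. The random tensors stand in for a real image dataset, and the tiny model, batch size, learning rate, and epoch count are all assumptions chosen so the example runs quickly. It is not a tuned training setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for an image dataset (assumption):
# 64 grayscale 32x32 images with labels from 10 classes.
images = torch.randn(64, 1, 32, 32)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

# A deliberately tiny CNN; any nn.Module could be used here instead.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=5),  # 32x32 -> 28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                 # 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(4 * 14 * 14, 10),
)
criterion = nn.CrossEntropyLoss()
# torch.optim.Adam(model.parameters()) could be swapped in here.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(3):
    running_loss = 0.0
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()                    # clear gradients from the previous step
        outputs = model(batch_images)            # forward pass
        loss = criterion(outputs, batch_labels)  # compute the loss
        loss.backward()                          # backpropagation
        optimizer.step()                         # update the weights
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(loader))

print(epoch_losses)  # average loss per epoch
```

Calling optimizer.zero_grad() at the start of each step matters because PyTorch accumulates gradients across backward passes by default.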

Adding an extra layer with the same number of neurons per layer can improve the network's performance. Here, the cost function value decreased from 0.18 to 0.06, which indicates better convergence toward a minimum.

To run the script on a GPU, move the model, the loss function, and the data tensors to the device −

model = model.cuda()
criterion = criterion.cuda()
X = X.cuda()
Y = Y.cuda()
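Calling .cuda() raises an error on a machine without a CUDA-capable GPU. A device-agnostic variant, sketched below, selects the device once and moves everything with .to(device). The linear model and random tensors are placeholders (assumptions) standing in for the script's own model, criterion, X, and Y.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model, loss, and data (assumptions for illustration only).
model = nn.Linear(8, 2).to(device)
criterion = nn.CrossEntropyLoss().to(device)
X = torch.randn(4, 8).to(device)
Y = torch.randint(0, 2, (4,)).to(device)

# The model and its inputs must live on the same device.
loss = criterion(model(X), Y)
print(loss.device)
```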
