Implementation of a CNN-based Image Classifier using PyTorch

Introduction

Because they can learn spatial features in images, convolutional neural networks (CNNs) are widely used for image classification. PyTorch, a popular open-source machine learning library, provides tools for building and training neural networks, including CNNs. In this article, we'll go over how to use PyTorch to create a CNN-based image classifier.

Dataset

Before getting into the implementation, let's look at the dataset. We will use the CIFAR-10 dataset, which contains 60,000 32x32 color images divided into 10 classes of 6,000 images each. The classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. The dataset is split into 50,000 training images and 10,000 test images.
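As a small sketch of how the labels are organized: torchvision's CIFAR-10 loader encodes each class as an integer from 0 to 9, in alphabetical order of the class names. The names CIFAR10_CLASSES and label_name below are illustrative helpers, not part of any library.

```python
# CIFAR-10 class names in label-index order (0-9), as used by torchvision.
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)

TRAIN_IMAGES, TEST_IMAGES = 50_000, 10_000


def label_name(index: int) -> str:
    """Map an integer label to its class name (hypothetical helper)."""
    return CIFAR10_CLASSES[index]


# The classes are balanced, so each one has the same number of images per split.
train_per_class = TRAIN_IMAGES // len(CIFAR10_CLASSES)
test_per_class = TEST_IMAGES // len(CIFAR10_CLASSES)

print(label_name(3), train_per_class, test_per_class)
```

A mapping like this is handy when printing predictions, since the network itself only ever sees the integer labels.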

Data Preprocessing

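Preprocessing the data is the first step in creating a CNN-based image classifier. In the code, ToTensor scales the pixel values to the range 0 to 1, and Normalize, with a mean and standard deviation of 0.5 per channel, then maps them to the range -1 to 1. To reduce overfitting, we will also apply data augmentation to the training images, which effectively enlarges the dataset. Common augmentation techniques include random rotations, random cropping, and random horizontal flips; the code in this article uses random cropping and random horizontal flips.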
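The arithmetic behind normalization can be checked by hand. The sketch below assumes a random tensor in place of a real image and reproduces the per-channel computation (x - mean) / std manually rather than calling torchvision; the variable names are illustrative. It also shows that a horizontal flip is simply a reversal of the width axis.

```python
import torch

# A stand-in for a 3-channel 32x32 image whose values are already in [0, 1],
# the range that ToTensor produces from an 8-bit image.
torch.manual_seed(0)
image = torch.rand(3, 32, 32)

# Per-channel mean and standard deviation of 0.5, shaped to broadcast over H and W.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Normalization: 0 maps to -1 and 1 maps to +1.
normalized = (image - mean) / std

# A horizontal flip reverses the width axis (the last dimension).
flipped = torch.flip(normalized, dims=[-1])

print(normalized.min().item(), normalized.max().item())
```

Centering the inputs around zero in this way is a common convention; the specific mean and standard deviation of 0.5 are a convenient approximation rather than the dataset's measured statistics.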

Building the CNN

Once the data has been preprocessed, we can build the CNN. Its architecture consists of two convolutional layers, each followed by a max pooling layer, and three fully connected layers.
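To see where the input size of the first fully connected layer comes from, it helps to trace the image size through the layers. The sketch below uses the standard output-size formula for convolution and pooling, with the kernel sizes and channel counts of the network in the Code section (5x5 convolutions without padding, 2x2 pooling with stride 2, 16 channels after the second convolution). The helper conv_out is a hypothetical name.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                  # CIFAR-10 images are 32x32
size = conv_out(size, kernel=5)            # first 5x5 convolution  -> 28
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pooling        -> 14
size = conv_out(size, kernel=5)            # second 5x5 convolution -> 10
size = conv_out(size, kernel=2, stride=2)  # 2x2 max pooling        -> 5

flattened = 16 * size * size               # 16 channels of 5x5 feature maps
print(size, flattened)
```

The result, 16 * 5 * 5 = 400, is the number of input features the first fully connected layer must accept.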

Training the CNN

With the network defined, we can train it on the CIFAR-10 dataset. To do this, we use PyTorch's training building blocks: a loss function, an optimizer, and a loop that runs the forward pass, backpropagates the loss, and updates the weights.

We will use the cross-entropy loss function, which is the standard choice for classification tasks. The optimizer is stochastic gradient descent (SGD) with a learning rate of 0.001 and momentum of 0.9.
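As a rough illustration of what these two components compute, the sketch below compares each library call with a manual version. The logits, the two-element parameter w, and the toy loss (the sum of squared weights) are assumptions chosen for brevity; they are not taken from the classifier itself.

```python
import torch
import torch.nn as nn

# Cross-entropy: library call vs. manual computation.
logits = torch.tensor([[2.0, 1.0, 0.1]])   # raw scores for 3 classes
target = torch.tensor([0])                  # index of the correct class

library_loss = nn.CrossEntropyLoss()(logits, target)
# Manually: negative log of the softmax probability of the correct class.
manual_loss = -torch.log_softmax(logits, dim=1)[0, target.item()]

# SGD with momentum: optimizer vs. manual update rule.
lr, momentum = 0.001, 0.9
w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=lr, momentum=momentum)

w_manual = w.detach().clone()
velocity = torch.zeros_like(w_manual)

for _ in range(3):
    optimizer.zero_grad()
    (w ** 2).sum().backward()               # toy loss; its gradient is 2 * w
    optimizer.step()

    # Manual version: accumulate a velocity, then step against it.
    velocity = momentum * velocity + 2 * w_manual
    w_manual = w_manual - lr * velocity

print(library_loss.item(), manual_loss.item())
print(w.detach(), w_manual)
```

Both pairs of values should agree up to floating-point error. Momentum keeps a running sum of past gradients, so consecutive steps in the same direction build up speed.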

Evaluating the CNN

After the CNN has been trained, we can measure its performance on the test dataset. To do this, we iterate over the test set, take the class with the highest score as the prediction for each image, and compute the accuracy as the fraction of predictions that match the true labels.
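The accuracy computation can be sketched on a tiny hand-made batch. The logits and labels below are made-up values for three images and two classes, used only to show the mechanics.

```python
import torch

# Made-up network outputs for 3 images and 2 classes, with their true labels.
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 3.0],
                       [1.2, 0.3]])
labels = torch.tensor([0, 1, 1])

# The predicted class is the index of the highest score in each row.
predicted = logits.argmax(dim=1)

correct = (predicted == labels).sum().item()
total = labels.size(0)
accuracy = 100 * correct / total

print(predicted.tolist(), correct, total, accuracy)
```

Here the third image is misclassified, so two of the three predictions are correct. On the real test set, the same counts are accumulated batch by batch.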

Code

Here's an example of a simple CNN-based image classifier implemented using PyTorch:

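import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Define the CNN architecture
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)        # 3 input channels -> 6 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)         # 2x2 max pooling, halves height and width
        self.conv2 = nn.Conv2d(6, 16, 5)       # 6 -> 16 feature maps, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16 feature maps of size 5x5 after the second pooling
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)           # one output score per CIFAR-10 class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # 3x32x32 -> 6x28x28 -> 6x14x14
        x = self.pool(F.relu(self.conv2(x)))   # 6x14x14 -> 16x10x10 -> 16x5x5
        x = torch.flatten(x, 1)                # flatten everything except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                        # raw logits; CrossEntropyLoss applies log-softmax itself
        return x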

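# Preprocess the data
# Augmentation (random flip and crop) is applied to the training set only;
# the test set gets just the deterministic steps so that evaluation is repeatable.
normalize = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # maps [0, 1] to [-1, 1]

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    normalize,
])
test_transform = transforms.Compose([transforms.ToTensor(), normalize])

# num_workers=2 loads batches in background worker processes. On platforms that start
# workers with "spawn" (Windows, macOS), run this script under an
# if __name__ == '__main__': guard or set num_workers=0.
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=train_transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=test_transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=2)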

# Train the network
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

log_every = 500  # with batch_size=32, one epoch is only 1,563 mini-batches

for epoch in range(10):
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights
        running_loss += loss.item()
        if i % log_every == log_every - 1:
            # Report the average loss over the last log_every mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / log_every))
            running_loss = 0.0

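# Evaluate the network
net.eval()  # switch to evaluation mode (good practice; this network has no dropout or batch norm)
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in testloader:
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score is the predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total))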

Output

[1,  2000] loss: 2.168
[1,  4000] loss: 1.828
[1,  6000] loss: 1.639
[1,  8000] loss: 1.552
[1, 10000] loss: 1.490
[2,  2000] loss: 1.407
[2,  4000] loss: 1.378
[2,  6000] loss: 1.324
[2,  8000] loss: 1.297
[2, 10000] loss: 1.269
[3,  2000] loss: 1.190
[3,  4000] loss: 1.170
[3,  6000] loss: 1.142
[3,  8000] loss: 1.126
[3, 10000] loss: 1.108
[4,  2000] loss: 1.031
[4,  4000] loss: 1.037
[4,  6000] loss: 1.016
[4,  8000] loss: 1.005
[4, 10000] loss: 1.002
[5,  2000] loss: 0.932
[5,  4000] loss: 0.944
[5,  6000] loss: 0.932
[5,  8000] loss: 0.913
[5, 10000] loss: 0.913
[6,  2000] loss: 0.831
[6,  4000] loss: 0.835
[6,  6000] loss: 0.846
[6,  8000] loss: 0.835
[6, 10000] loss: 0.829
[7,  2000] loss: 0.745
[7,  4000] loss: 0.759
[7,  6000] loss: 0.762
[7,  8000] loss: 0.763
[7, 10000] loss: 0.753
[8,  2000] loss: 0.670
[8,  4000] loss: 0.696
[8,  6000] loss: 0.690
[8,  8000] loss: 0.701
[8, 10000] loss: 0.690
[9,  2000] loss: 0.614
[9,  4000] loss: 0.635
[9,  6000] loss: 0.646
[9,  8000] loss: 0.641
[9, 10000] loss: 0.631
[10,  2000] loss: 0.562
[10,  4000] loss: 0.577
[10,  6000] loss: 0.597
[10,  8000] loss: 0.592
[10, 10000] loss: 0.586
Accuracy of the network on the 10000 test images: 68 %

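This output shows the running average of the training loss, printed every 2,000 mini-batches, followed by the accuracy of the network on the test set after training. The loss declines steadily over the 10 training epochs, and the final test accuracy is 68%. Note that the log shows at least 10,000 mini-batches per epoch, which implies a smaller batch size than the 32 used in the code above (at batch size 32, one epoch of 50,000 images is 1,563 mini-batches), so the iteration counts in your own run will differ. This result is far from state of the art, but it does show the fundamental steps involved in using PyTorch to train a CNN for image classification.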

Conclusion

In conclusion, this article has shown how to build a CNN-based image classifier using PyTorch. The process starts with preprocessing the data, including normalization and data augmentation. The CNN architecture is made up of convolutional layers, max pooling layers, and fully connected layers. The network is trained on the CIFAR-10 dataset with cross-entropy loss and a stochastic gradient descent optimizer, and its accuracy is then assessed on the test dataset. With this knowledge, you can begin developing your own image classification models and learn more about computer vision.

Sohail Tabrez

Updated on: 12-Jul-2023
