Building Deep Learning Models Using the PyTorch Library

PyTorch is a widely used open-source machine learning framework that was developed by Facebook's AI research team. It is known for its flexibility, speed, and ability to build complex models easily. PyTorch is based on the Torch library, which was originally developed in Lua, and it provides Python bindings.

PyTorch is widely used in academia and industry for various machine learning tasks such as computer vision, natural language processing, and speech recognition. In this tutorial, we will learn how to use the PyTorch library to build a deep learning model.

Getting Started

PyTorch does not ship with Python, so before we can use the torch library we need to install it. This can be done using the pip package manager.

To install the torch library, open your terminal and type the following command:

pip install torch torchvision

This will download and install the torch library and its dependencies. Once installed, we can start working with torch and its modules.
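A quick way to confirm that the installation worked is to import the library and run a trivial tensor operation. The snippet below is a minimal sketch, not part of the original tutorial; the `device` and `x` names are illustrative, and the code falls back to the CPU when no CUDA-capable GPU is visible.

```python
import torch

# Report the installed version and whether a CUDA-capable GPU is visible.
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

# Pick a device and create a small tensor on it as a smoke test.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.ones(2, 3, device=device)

# Six ones summed together should give 6.0.
print(x.sum().item())
```

If the import fails, the package was most likely installed into a different Python environment than the one running the script.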

In this tutorial, we will build a Convolutional Neural Network (CNN) using PyTorch to classify images. CNNs are a type of deep learning model that is widely used for image classification tasks.

Step 1: Importing the Required Libraries

The first step is to import the required libraries. We will be using the torch, torch.nn, and torchvision libraries.

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

Step 2: Loading and Preprocessing the Dataset

We will be using the CIFAR-10 dataset, which is a widely used dataset for image classification tasks. The dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class.

# Convert each image to a tensor, then normalize every channel to [-1, 1]
transform = transforms.Compose(
   [transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
   download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
   shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
   download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
   shuffle=False, num_workers=2)

We use the torchvision.transforms library to preprocess the images. We first convert the images to tensors and then normalize them. We then load the dataset and create a data loader for both the training and test sets.
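To see what the normalization and batching steps do without downloading CIFAR-10, the following sketch applies the same (x - mean) / std arithmetic by hand to random tensors and wraps them in a data loader. The `fake_images` and `fake_labels` tensors are hypothetical stand-ins for the real dataset, and the manual arithmetic is assumed to mirror what a per-channel normalization with mean 0.5 and standard deviation 0.5 computes.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Per-channel mean and std, shaped so they broadcast over (N, 3, H, W).
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Hypothetical stand-in for CIFAR-10: 8 random "images" with pixels in [0, 1).
fake_images = torch.rand(8, 3, 32, 32)
fake_labels = torch.randint(0, 10, (8,))

# (x - 0.5) / 0.5 rescales values from [0, 1) to [-1, 1).
normalized = (fake_images - mean) / std

# Wrap the tensors in a dataset and draw one mini-batch of 4 samples.
loader = DataLoader(TensorDataset(normalized, fake_labels),
                    batch_size=4, shuffle=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

Each batch holds 4 images of shape 3 x 32 x 32 together with 4 integer labels, which is the layout the model in the next step expects.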

Step 3: Defining the CNN Model

After preparing the data, the next step is to define the structure of our CNN model using PyTorch. Our model will consist of two convolutional layers, each followed by max pooling, and then three fully connected layers.

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
   def __init__(self):
      super(Net, self).__init__()
      # input image channel, 3 for RGB images
      # output channel, 6 for 6 filters
      # kernel size = 5
      self.conv1 = nn.Conv2d(3, 6, 5)
      # input channel, 6 from previous layer
      # output channel, 16 for 16 filters
      # kernel size = 5
      self.conv2 = nn.Conv2d(6, 16, 5)
      # an affine operation: y = Wx + b
      # 16 * 5 * 5 is the size of the image after convolutional layers
      # 120 is the output size of the first fully connected layer
      self.fc1 = nn.Linear(16 * 5 * 5, 120)
      # 84 is the output size of the second fully connected layer
      self.fc2 = nn.Linear(120, 84)
      # 10 is the output size of the last fully connected layer
      self.fc3 = nn.Linear(84, 10)

   def forward(self, x):
      # max pooling over a (2, 2) window
      x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
      # if the window is square, you can specify just a single number
      x = F.max_pool2d(F.relu(self.conv2(x)), 2)
      # flatten the input for fully connected layers
      x = x.view(-1, self.num_flat_features(x))
      x = F.relu(self.fc1(x))
      x = F.relu(self.fc2(x))
      x = self.fc3(x)
      return x

   def num_flat_features(self, x):
      size = x.size()[1:]  # all dimensions except the batch dimension
      num_features = 1
      for s in size:
         num_features *= s
      return num_features

net = Net()
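
The 16 * 5 * 5 input size of the first fully connected layer follows from how each layer shrinks a 32x32 image. As a rough check, the sketch below pushes a random dummy tensor through standalone copies of the two convolution and pooling stages and records the shape after each one. The variable names are illustrative and the layers here are freshly created, not the ones inside `Net`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Standalone layers with the same settings as in the model definition.
conv1 = nn.Conv2d(3, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

# One dummy RGB image of size 32x32 (batch dimension first).
x = torch.randn(1, 3, 32, 32)

x = conv1(x)                      # 5x5 kernel, no padding: 32 -> 28
shape_after_conv1 = tuple(x.shape)
x = F.max_pool2d(F.relu(x), 2)    # 2x2 pooling halves it: 28 -> 14
shape_after_pool1 = tuple(x.shape)
x = conv2(x)                      # 5x5 kernel, no padding: 14 -> 10
shape_after_conv2 = tuple(x.shape)
x = F.max_pool2d(F.relu(x), 2)    # 2x2 pooling halves it: 10 -> 5
shape_after_pool2 = tuple(x.shape)

# Flatten everything except the batch dimension: 16 * 5 * 5 = 400 features.
flat = x.view(x.size(0), -1)
print(shape_after_pool2, tuple(flat.shape))
```

If the input resolution or kernel sizes change, the same trace shows what the new input size of the first linear layer needs to be.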

Step 4: Train the Model

Now that we have defined our CNN model, it's time to train it on our dataset. To do this, we will use the PyTorch DataLoader class to load our data in batches and feed it into the model for training. We will also define our loss function and optimizer.

Here is the code to train our model:

import torch.optim as optim

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the model
num_epochs = 10
for epoch in range(num_epochs):
   running_loss = 0.0
   for i, data in enumerate(trainloader, 0):
      inputs, labels = data

      # Reset the gradients left over from the previous step
      optimizer.zero_grad()

      # Forward pass
      outputs = net(inputs)
      loss = criterion(outputs, labels)

      # Backward and optimize
      loss.backward()
      optimizer.step()

      # Print statistics
      running_loss += loss.item()
      if i % 2000 == 1999:    # Print every 2000 mini-batches
         print('[%d, %5d] loss: %.3f' %
            (epoch + 1, i + 1, running_loss / 2000))
         running_loss = 0.0
print('Finished Training')

We loop over our dataset for 10 epochs and train the model using the training data. In each epoch, we reset the running loss to 0 and loop over the batches of data.

For each batch, we perform a forward pass through the model, compute the loss, perform backpropagation, and optimize the model using the optimizer. Finally, we print the training loss after every 2000 mini-batches.
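
The forward pass, loss computation, backpropagation, and optimizer update described above can be seen in isolation on a toy problem. The sketch below runs a single step on synthetic data; `tiny_model` is a hypothetical one-layer stand-in for the CNN, and the learning rate of 0.1 is chosen only to make the update easy to observe.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-in model: 8 input features, 3 output classes.
tiny_model = nn.Linear(8, 3)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(tiny_model.parameters(), lr=0.1)

# One synthetic mini-batch of 4 samples with integer class labels.
inputs = torch.randn(4, 8)
labels = torch.tensor([0, 1, 2, 1])

# Keep a copy of the weights to compare against after the update.
weights_before = tiny_model.weight.detach().clone()

optimizer.zero_grad()                # clear gradients from any earlier step
outputs = tiny_model(inputs)         # forward pass
loss = criterion(outputs, labels)    # scalar loss for the batch
loss.backward()                      # backpropagation fills .grad
optimizer.step()                     # update the parameters in place

weights_changed = not torch.equal(weights_before, tiny_model.weight.detach())
print("loss:", loss.item(), "weights changed:", weights_changed)
```

Calling `optimizer.zero_grad()` on every iteration matters because PyTorch accumulates gradients across `backward()` calls by default.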

Step 5: Evaluate the Model

Now that we have trained our model, it's time to evaluate its performance on our test dataset. We will use the PyTorch DataLoader class to load our test data in batches and feed it into the model for evaluation.

Here is the code to evaluate our model:

# Evaluate the model
correct = 0
total = 0
# Gradients are not needed for evaluation, so disable tracking
with torch.no_grad():
   for data in testloader:
      images, labels = data
      outputs = net(images)
      # The index of the largest score along dim 1 is the predicted class
      _, predicted = torch.max(outputs, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
   100 * correct / total))

In this code, we first initialize the correct and total variables to 0. We then loop over our test dataset using the PyTorch DataLoader class and feed the test data into the model. We use the torch.max() function to get the index of the highest output value, which represents the predicted class. We then compare the predicted class to the true class and update the correct and total variables accordingly.
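
To make the role of torch.max() concrete, here is a small sketch on hand-written scores rather than real model outputs. The `outputs` and `labels` tensors are made up for illustration: four samples, three classes.

```python
import torch

# Made-up scores for 4 samples over 3 classes (one row per sample).
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3],
                        [0.0, 0.1, 3.0],
                        [0.9, 0.8, 0.7]])
labels = torch.tensor([1, 0, 2, 1])

# torch.max along dim 1 returns (largest values, their column indices).
max_scores, predicted = torch.max(outputs, 1)

# Count matches between predicted and true classes.
correct = (predicted == labels).sum().item()
total = labels.size(0)
accuracy = 100 * correct / total

print(predicted.tolist(), accuracy)  # [1, 0, 2, 0] 75.0
```

The last sample is misclassified because its highest score sits in column 0 while its label is 1, so three of the four predictions count as correct.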

Finally, we print the accuracy of the model on the test dataset.


In conclusion, PyTorch is a robust deep learning package with an easy-to-use interface for creating and training neural networks. In this tutorial, we covered the fundamentals of using PyTorch to construct a convolutional neural network for image classification.

PyTorch's flexibility and ease of use make it an excellent choice for both researchers and practitioners interested in experimenting with deep learning. The dynamic computational graph and automatic differentiation engine of the library make it simple to create complex models and optimise them efficiently. Furthermore, PyTorch has a large and active community, which means there are a lot of resources for learning and getting help when you need it.

Overall, PyTorch is a strong choice for anyone interested in getting started with deep learning, whether they are a beginner or a seasoned practitioner. With its simple API and powerful capabilities, PyTorch can help you quickly design and train deep learning models for a variety of applications.

Updated on: 31-Aug-2023

