How to apply a 2D transposed convolution operation in PyTorch?


We can apply a 2D transposed convolution operation over an input image composed of several input planes using the torch.nn.ConvTranspose2d() module. This module can be seen as the gradient of Conv2d with respect to its input.
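To make the "gradient of Conv2d" relationship concrete, here is a minimal sketch showing that ConvTranspose2d reverses the shape mapping of a Conv2d with the same kernel size (the channel counts and sizes below are illustrative assumptions, not part of the original example):

import torch
import torch.nn as nn

# Conv2d with kernel_size=2 shrinks an 8x8 input to 7x7 ...
conv = nn.Conv2d(3, 6, 2)
x = torch.randn(1, 3, 8, 8)
y = conv(x)
print(y.size()) # torch.Size([1, 6, 7, 7])

# ... and ConvTranspose2d with the same kernel_size maps 7x7 back to 8x8
convt = nn.ConvTranspose2d(6, 3, 2)
z = convt(y)
print(z.size()) # torch.Size([1, 3, 8, 8])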

The input to a 2D transpose convolution layer must be of size [N,C,H,W], where N is the batch size, C is the number of channels, and H and W are the height and width of the input image, respectively.
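For example, a batch containing a single RGB image of 64×64 pixels would be a tensor of shape [1, 3, 64, 64]; a minimal sketch (the sizes are illustrative):

import torch

# One RGB image, 64x64 pixels: [N, C, H, W] = [1, 3, 64, 64]
input = torch.randn(1, 3, 64, 64)
print(input.size()) # torch.Size([1, 3, 64, 64])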

A 2D transposed convolution operation is generally applied to image tensors. For an RGB image, the number of channels is 3. The main parameters of a transposed convolution operation are the kernel (filter) size and the stride. This module supports TensorFloat32.

Syntax

torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size)

Parameters

  • in_channels – Number of channels in input image.

  • out_channels – Number of channels produced by transpose convolution operation.

  • kernel_size – Size of the convolving kernel.

Along with the above three parameters, there are optional parameters such as stride, padding, and dilation. We demonstrate these parameters in detail in the following Python examples.
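As a quick reference for how these optional parameters affect the result, the spatial output size of ConvTranspose2d follows the formula from the PyTorch documentation: H_out = (H_in − 1) × stride − 2 × padding + dilation × (kernel_size − 1) + output_padding + 1 (and likewise for W_out). The small helper below is our own sketch of this formula, not part of PyTorch:

# Expected output size of ConvTranspose2d along one spatial dimension,
# following the formula in the PyTorch documentation
def convt_out_size(in_size, kernel_size, stride=1, padding=0, dilation=1, output_padding=0):
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

# For example, a 4x4 input with kernel_size=3 and stride=2 yields a 9x9 output
print(convt_out_size(4, kernel_size=3, stride=2)) # 9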

Steps

You could use the following steps to apply a 2D transpose convolution operation −

  • Import the required library. In all the following examples, the required Python library is torch. Make sure you have already installed it. To apply the 2D transpose convolution operation on images, we need torchvision and Pillow as well.

import torch
import torchvision
from PIL import Image
  • Define the input tensor or read the input image. If the input is an image, we first convert it to a torch tensor.

  • Define in_channels, out_channels, kernel_size, and other parameters.

  • Next, define a transpose convolution operation convt by passing the above-defined parameters to torch.nn.ConvTranspose2d().

convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
  • Apply the transpose convolution operation convt on the input tensor or the image tensor.

output = convt(input)
  • Next, print the tensor after the transpose convolution operation. If the input was an image tensor, we first convert the output tensor to a PIL image and then display that image.

Let's have a look at some examples for a clearer understanding.

Input Image

We will use the following image (saved as car.jpg) as the input file in Example 2.

Example 1

In the following Python example, we perform a 2D transpose convolution operation on an input tensor. We apply different combinations of kernel_size, stride, padding, and dilation.

# Python 3 program to perform 2D transpose convolution operation
import torch
import torch.nn as nn

'''torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0)

'''

in_channels = 2
out_channels = 3
kernel_size = 2

convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)

# conv = nn.ConvTranspose2d(3, 6, 2)

'''input of size [N,C,H, W]
N==>batch size,
C==> number of channels,
H==> height of input planes in pixels,
W==> width in pixels.
'''

# define the input with below info
N=1
C=2
H=4
W=4
input = torch.empty(N,C,H,W).random_(256)
# input = torch.randn(2,3,32,64)
print("Input Tensor:
", input) print("Input Size:",input.size()) # Perform transpose convolution operation output = convt(input) print("Output Tensor:
", output) print("Output Size:",output.size()) # With square kernels (3,3) and equal stride convt = nn.ConvTranspose2d(2, 3, 3, stride=2) output = convt(input) print("Output Size:",output.size()) # non-square kernels and unequal stride and with padding convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2)) output = convt(input) print("Output Size:",output.size()) # non-square kernels and unequal stride and with padding and dilation convt = nn.ConvTranspose2d(2, 3, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1)) output = convt(input) print("Output Size:",output.size())

Output

Input Tensor:
   tensor([[[[115., 76., 102., 6.],
      [221., 173., 23., 205.],
      [123., 23., 112., 18.],
      [189., 178., 167., 143.]],

      [[239., 180., 226., 88.],
      [224., 30., 196., 224.],
      [ 57., 222., 47., 84.],
      [ 25., 255., 201., 114.]]]])
Input Size: torch.Size([1, 2, 4, 4])
Output Tensor:
   tensor([[[[ 48.1156, 64.6112, 64.9630, 47.2604, 3.9925],
      [74.9169, 80.7055, 138.8992, 82.8471, 54.3722],
      [20.0938, 49.5610, 30.2914, 93.3563, 3.1597],
      [-27.1410, 118.8138, 92.8670, 50.6170, 37.5564],
      [-27.7676, 6.5762, 33.6408, 6.7176, -8.8372]],
      [[ -18.2188, -56.5362, -49.8063, -43.3336, -16.8645],
      [ -23.4012, -6.1607, 40.5064, -17.4547, -25.1738],
      [ -5.7752, 53.6838, -27.9412, 36.7660, 44.0866],
      [ -23.5205, 1.1443, -29.0826, -34.7213, -4.1535],
      [ 5.6746, 38.4026, 72.8414, 59.2990, 34.9241]],
      [[ -35.0380, -31.4031, -38.0059, -19.3247, -5.6272],
      [-109.2401, -12.9763, -62.2776, -31.0825, 19.2766],
      [ -93.6596, -18.5403, -67.5457, -61.8533, 32.3005],
      [ -27.7020, -71.3938, -18.9532, -26.8304, 20.0184],
      [ -29.2334, -85.8179, -35.4292, -16.4065, 19.0788]]]],
   grad_fn=<SlowConvTranspose2DBackward>)
Output Size: torch.Size([1, 3, 5, 5])
Output Size: torch.Size([1, 3, 9, 9])
Output Size: torch.Size([1, 3, 1, 4])
Output Size: torch.Size([1, 3, 5, 4])
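Each of these output sizes agrees with the output-size formula sketched earlier; for example, with kernel_size=3 and stride=2 on the 4×4 input, (4 − 1)×2 + (3 − 1) + 1 = 9.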

Example 2

In the following Python example, we perform a 2D transpose convolution operation on an input image. To apply it, we first convert the image to a torch tensor, and after the transpose convolution we convert the output back to a PIL image for visualization.

# Python program to perform 2D transpose convolution operation
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T

# Read input image
img = Image.open('car.jpg')

# convert the input image to torch tensor
img = T.ToTensor()(img)
print("Input image size:", img.size()) # size = [3, 464, 700]

# unsqueeze the image to make it 4D tensor
img = img.unsqueeze(0) # image size = [1, 3, 464, 700]

# define transpose convolution layer
# convt = nn.ConvTranspose2d(in_channels, out_channels, kernel_size)
convt = torch.nn.ConvTranspose2d(3, 3, 2)

# apply transpose convolution operation on image
img = convt(img)
# squeeze image to make it 3D
img = img.squeeze(0) # now image is again 3D
print("Output image size:",img.size())

# convert image to PIL image
img = T.ToPILImage()(img)

# display the image after convolution
img.show()

'''
Note: You may get different output image after the convolution operation
because the weights initialized may be different at different runs.
'''

Output

Input image size: torch.Size([3, 464, 700])
Output image size: torch.Size([3, 465, 701])
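Note that the output is one pixel larger in each dimension, as the formula sketched earlier predicts for kernel_size=2 with the default stride and padding: (464 − 1)×1 + (2 − 1) + 1 = 465, and similarly 701 for the width.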

Note that the image obtained may differ from run to run because the weights and biases are initialized randomly.
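If you need the result to be reproducible across runs, you can fix PyTorch's random seed before creating the layer; a minimal sketch:

import torch

# Fixing the seed makes the randomly initialized weights (and hence the
# output image) identical on every run
torch.manual_seed(0)
convt = torch.nn.ConvTranspose2d(3, 3, 2)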
