How to apply a 2D Average Pooling in PyTorch?

We can apply 2D Average Pooling over an input image composed of several input planes using the torch.nn.AvgPool2d() module. The input to a 2D Average Pooling layer must be of size [N,C,H,W] or [C,H,W], where N is the batch size, C is the number of channels, and H and W are the height and width of the input image.

An Average Pooling operation is mainly characterized by its filter (kernel) size and its stride.




  • kernel_size – The size of the window to take an average over.

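Along with this parameter, there are also some optional parameters such as stride, padding, ceil_mode, and count_include_pad. Unlike max pooling, AvgPool2d has no dilation parameter. If stride is not given, it defaults to kernel_size. The following Python examples use kernel_size and stride.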
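As a rough illustration of how these parameters affect the result, the sketch below applies a few AvgPool2d configurations to a small tensor. The 4x4 input and all variable names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import torch
import torch.nn as nn

# Hypothetical input: one image, one channel, 4x4 pixels with values 0..15
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# stride defaults to kernel_size, so the 2x2 windows do not overlap
default_stride = nn.AvgPool2d(kernel_size=2)(x)

# stride=1 slides the 2x2 window one pixel at a time
unit_stride = nn.AvgPool2d(kernel_size=2, stride=1)(x)

# padding=1 adds an implicit border of zeros; by default the zeros
# are counted when the average is computed
padded = nn.AvgPool2d(kernel_size=2, stride=2, padding=1)(x)

# count_include_pad=False averages only over real input values
padded_no_count = nn.AvgPool2d(
    kernel_size=2, stride=2, padding=1, count_include_pad=False
)(x)

print(default_stride.shape)  # torch.Size([1, 1, 2, 2])
print(unit_stride.shape)     # torch.Size([1, 1, 3, 3])
print(padded.shape)          # torch.Size([1, 1, 3, 3])

# Bottom-right window covers the value 15 plus three padded zeros
print(padded[0, 0, -1, -1].item())           # 3.75
print(padded_no_count[0, 0, -1, -1].item())  # 15.0
```

The last two lines show why count_include_pad matters at the borders: the same window gives 15/4 when the padded zeros are counted and 15 when they are not.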


You could use the following steps to apply a 2D Average Pooling −

  • Import the required library. In all the following examples, the required Python library is torch. Make sure you have already installed it. To apply 2D Average Pooling on images we need torchvision and Pillow as well.

import torch
import torchvision
from PIL import Image
  • Define input tensor or read the input image. If an input is an image, then we first convert it into a torch tensor.

  • Define kernel_size, stride and other parameters.

  • Next, define an Average Pooling layer named pooling by passing the above-defined parameters to torch.nn.AvgPool2d().

pooling = torch.nn.AvgPool2d(kernel_size)
  • Apply the Average Pooling layer to the input tensor or image tensor.

output = pooling(input)
  • Next print the tensor after Average Pooling. If the input was an image tensor, then to visualize the image, we first convert the tensor obtained after Average Pooling to PIL image and then visualize the image.
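The steps above can be put together roughly as follows. This is a minimal sketch that uses a random tensor as a stand-in for an image, so it needs only torch. The 3x8x8 shape and the parameter values are assumptions made for illustration.

```python
import torch

# Step 1: a hypothetical stand-in for an image tensor (3 channels, 8x8 pixels)
image = torch.rand(3, 8, 8)

# Step 2: add a batch dimension so the shape becomes [N, C, H, W]
batch = image.unsqueeze(0)

# Step 3: choose the pooling parameters
kernel_size = 2
stride = 2

# Step 4: define the Average Pooling layer
pooling = torch.nn.AvgPool2d(kernel_size, stride=stride)

# Step 5: apply the layer and drop the batch dimension again
output = pooling(batch).squeeze(0)

print("Input size:", image.size())    # torch.Size([3, 8, 8])
print("Output size:", output.size())  # torch.Size([3, 4, 4])
```

With a 2x2 window and stride 2, each output pixel is the mean of one non-overlapping 2x2 block, so the height and width are halved.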

Let's take a couple of examples to better understand how it works.

Input Image

We will use the following image as the input file in the second example.

Example 1

In the following Python example, we perform 2D Average Pooling on an input tensor. We apply two combinations of kernel_size and stride: a square window and a non-square window.

# Python 3 program to perform 2D Avg Pooling
# Import the required libraries
import torch
import torch.nn as nn

'''input of size = [N,C,H, W] or [C,H, W]
N==>batch size,
C==> number of channels,
H==> height of input planes in pixels,
W==> width in pixels.
'''
input = torch.empty(3, 4, 4).random_(256)
print("Input Tensor:\n", input)
print("Input Size:", input.size())

# pool of square window of size=3, stride=1
pooling1 = nn.AvgPool2d(3, stride=1)

# Perform Average Pooling
output = pooling1(input)
print("Output Tensor:\n", output)
print("Output Size:", output.size())

# pool of non-square window
pooling2 = nn.AvgPool2d((2, 1), stride=(1, 2))

# Perform Average Pooling
output = pooling2(input)
print("Output Tensor:\n", output)
print("Output Size:", output.size())


Input Tensor:
   tensor([[[194., 159., 7., 90.],
      [128., 173., 28., 211.],
      [252., 123., 248., 147.],
      [144., 107., 28., 17.]],

      [[122., 140., 117., 52.],
      [252., 118., 216., 101.],
      [ 88., 121., 25., 210.],
      [223., 162., 39., 125.]],

      [[168., 113., 53., 246.],
      [199., 23., 54., 74.],
      [ 95., 246., 245., 48.],
      [222., 175., 144., 127.]]])
Input Size: torch.Size([3, 4, 4])
Output Tensor:
   tensor([[[145.7778, 131.7778],
      [136.7778, 120.2222]],

      [[133.2222, 122.2222],
      [138.2222, 124.1111]],

      [[132.8889, 122.4444],
      [155.8889, 126.2222]]])
Output Size: torch.Size([3, 2, 2])
Output Tensor:
   tensor([[[161.0000, 17.5000],
      [190.0000, 138.0000],
      [198.0000, 138.0000]],

      [[187.0000, 166.5000],
      [170.0000, 120.5000],
      [155.5000, 32.0000]],

      [[183.5000, 53.5000],
      [147.0000, 149.5000],
      [158.5000, 194.5000]]])
Output Size: torch.Size([3, 3, 2])
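To see where the numbers in the first output tensor come from, the following sketch recomputes one value by hand. It copies the first channel of the input tensor printed above. The variable names are only illustrative.

```python
import torch
import torch.nn as nn

# First channel of the input tensor printed in Example 1
channel = torch.tensor([[194., 159.,   7.,  90.],
                        [128., 173.,  28., 211.],
                        [252., 123., 248., 147.],
                        [144., 107.,  28.,  17.]])

# Average of the top-left 3x3 window, computed by hand
manual = channel[0:3, 0:3].mean()

# The same window as seen by AvgPool2d (input reshaped to [N, C, H, W])
pooled = nn.AvgPool2d(3, stride=1)(channel.reshape(1, 1, 4, 4))

print(manual.item())              # approximately 145.7778
print(pooled[0, 0, 0, 0].item())  # approximately 145.7778
```

The nine values in the window sum to 1312, and 1312 / 9 is approximately 145.7778, which matches the first entry of the first output tensor.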

Example 2

In the following Python example, we perform 2D Average Pooling on an input image. To apply the pooling, we first convert the image to a torch tensor and, after pooling, convert the result back to a PIL image for visualization.

# Python 3 program to perform 2D Average Pooling on image
# Import the required libraries
import torch
import torchvision
from PIL import Image
import torchvision.transforms as T
import torch.nn.functional as F

# read the input image
img = Image.open('panda.jpg')

# convert the image to torch tensor
img = T.ToTensor()(img)
print("Original size of Image:", img.size()) #Size([3, 466, 700])

# unsqueeze to make 4D
img = img.unsqueeze(0)

# define avg pool with square window of size=4, stride=1
pool = torch.nn.AvgPool2d(4, 1)
img = pool(img)
img = img.squeeze(0)
print("Size after AvgPool:",img.size())
img = T.ToPILImage()(img)

# display the image after Average Pooling
img.show()


Original size of Image: torch.Size([3, 466, 700])
Size after AvgPool: torch.Size([3, 463, 697])

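Note that Average Pooling has no learnable weights or biases, so the output is the same on every run for a given input. In Example 1, the input tensor is filled with random values, so the printed numbers will differ from run to run.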
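The sizes printed in Example 2 can be checked against the usual output-size rule for pooling without ceil_mode: floor((size + 2 * padding - kernel_size) / stride) + 1. The sketch below applies that rule with a hypothetical helper named pooled_size and cross-checks it against AvgPool2d on a zero tensor of the same shape as the image, so the image file itself is not needed.

```python
import torch

def pooled_size(size, kernel_size, stride, padding=0):
    """Hypothetical helper: output length along one dimension (ceil_mode=False)."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Image size from Example 2: height 466, width 700, window 4, stride 1
height = pooled_size(466, kernel_size=4, stride=1)
width = pooled_size(700, kernel_size=4, stride=1)
print(height, width)  # 463 697

# Cross-check with AvgPool2d on a zero tensor of shape [N, C, H, W]
dummy = torch.zeros(1, 3, 466, 700)
out = torch.nn.AvgPool2d(4, 1)(dummy)
print(out.size())  # torch.Size([1, 3, 463, 697])
```

With stride 1 and no padding, each dimension shrinks by kernel_size - 1, which is why 466 x 700 becomes 463 x 697.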