How can autoencoder be generated using an encoder and decoder using Python?
An autoencoder is a neural network architecture used for unsupervised learning that compresses input data into a lower-dimensional representation and then reconstructs it. It consists of two main components: an encoder that compresses the input and a decoder that reconstructs the original data from the compressed representation.
TensorFlow is a machine learning framework provided by Google. The 'tensorflow' package can be installed using pip:
pip install tensorflow
Keras is a high-level deep learning API that runs on top of TensorFlow. It provides essential abstractions for building neural networks efficiently. The required modules can be imported as follows:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Understanding Autoencoder Architecture
An autoencoder has two main parts:
- Encoder: Compresses input data into a latent representation
- Decoder: Reconstructs the original data from the compressed representation
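Before the convolutional version below, the two parts can be illustrated with a minimal fully-connected sketch (the layer sizes here are hypothetical, chosen only to show the encoder/decoder split, and are not the article's convolutional model):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: compress a flattened 28x28 image (784 values) into 16 latent values
inp = keras.Input(shape=(784,))
latent = layers.Dense(16, activation="relu")(inp)

# Decoder: reconstruct all 784 pixel values from the 16-dimensional code
recon = layers.Dense(784, activation="sigmoid")(latent)

autoencoder = keras.Model(inp, recon)
```

The same pattern, compression followed by reconstruction, is what the convolutional layers implement in the sections that follow.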
Building the Encoder
The encoder uses convolutional layers to progressively reduce the spatial dimensions while extracting features:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Define the input layer
encoder_input = keras.Input(shape=(28, 28, 1), name="img")
# Build the encoder layers
print("Adding layers to the encoder")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
print("Performing global max pooling")
encoder_output = layers.GlobalMaxPooling2D()(x)
# Create the encoder model
print("Creating the encoder model")
encoder = keras.Model(encoder_input, encoder_output, name="encoder")
print("Encoder model summary:")
encoder.summary()
Adding layers to the encoder
Performing global max pooling
Creating the encoder model
Encoder model summary:
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]      0
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)       160
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)       4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)         9248
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)         4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16)               0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Building the Complete Autoencoder
The decoder reconstructs the original image from the encoded representation using transpose convolutions:
# Build the decoder starting from encoder output
print("Building the decoder")
x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)
# Create the complete autoencoder
autoencoder = keras.Model(encoder_input, decoder_output, name="autoencoder")
print("Complete autoencoder summary:")
autoencoder.summary()
Building the decoder
Complete autoencoder summary:
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]      0
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)       160
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)       4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)         9248
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)         4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16)               0
_________________________________________________________________
reshape (Reshape)            (None, 4, 4, 1)          0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16)         160
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32)         4640
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32)       0
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16)       4624
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1)        145
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
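Once built, the autoencoder can be trained to reproduce its input. The sketch below rebuilds the same architecture compactly and trains it with mean squared error; the training data here is random noise standing in for real 28×28 grayscale images (illustrative only, a real run would use a dataset such as MNIST):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Rebuild the same encoder-decoder pipeline end to end
inp = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, 3, activation="relu")(inp)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
x = layers.GlobalMaxPooling2D()(x)          # 16-dimensional bottleneck
x = layers.Reshape((4, 4, 1))(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
out = layers.Conv2DTranspose(1, 3, activation="relu")(x)
autoencoder = keras.Model(inp, out)

# MSE compares each reconstructed pixel to the corresponding input pixel
autoencoder.compile(optimizer="adam", loss="mse")

# Dummy images: the model is fit with the input as its own target
x_train = np.random.rand(64, 28, 28, 1).astype("float32")
autoencoder.fit(x_train, x_train, epochs=1, batch_size=16, verbose=0)
```

Note that the input is passed as both features and labels, which is what makes the training unsupervised: no separate target data is needed.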
Key Components Explained
- Conv2D layers: Extract features and reduce spatial dimensions
- MaxPooling2D: Downsamples the feature maps
- GlobalMaxPooling2D: Creates the bottleneck latent representation
- Conv2DTranspose: Upsamples and reconstructs spatial dimensions
- UpSampling2D: Increases spatial resolution
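The shape arithmetic behind the upsampling layers above can be checked directly. With default stride 1 and "valid" padding, a 3×3 Conv2DTranspose grows each spatial dimension by kernel size minus one, while UpSampling2D(3) multiplies each spatial dimension by 3 (the tensor shapes here are arbitrary, chosen only for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((1, 6, 6, 8))                # a dummy 6x6 feature map with 8 channels
y = layers.Conv2DTranspose(16, 3)(x)      # 6 -> 6 + 3 - 1 = 8 per spatial dim
z = layers.UpSampling2D(3)(y)             # 8 -> 8 * 3 = 24 per spatial dim

print(tuple(y.shape))  # (1, 8, 8, 16)
print(tuple(z.shape))  # (1, 24, 24, 16)
```

This is why the decoder's sequence of two transpose convolutions, a 3× upsampling, and two more transpose convolutions exactly reverses the encoder's path from 28×28 back up from the 4×4 reshaped bottleneck.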
Conclusion
This autoencoder compresses 28×28 images into 16-dimensional vectors and reconstructs them back. The encoder-decoder architecture is useful for dimensionality reduction, denoising, and feature learning tasks.
