How can autoencoder be generated using an encoder and decoder using Python?
An autoencoder is a neural network architecture used for unsupervised learning that compresses input data into a lower-dimensional representation and then reconstructs it. It consists of two main components: an encoder that compresses the input and a decoder that reconstructs the original data from the compressed representation.
TensorFlow is a machine learning framework provided by Google. The 'tensorflow' package can be installed using pip:
pip install tensorflow
Keras is a high-level deep learning API that runs on top of TensorFlow. It provides essential abstractions for building neural networks efficiently. The required modules can be imported as follows:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Understanding Autoencoder Architecture
An autoencoder has two main parts:
- Encoder: Compresses input data into a latent representation
- Decoder: Reconstructs the original data from the compressed representation
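Before the convolutional version below, the two parts can be illustrated with a minimal fully-connected sketch (the layer sizes here are hypothetical, chosen only to show the encoder/decoder split, and are not the article's convolutional model):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: compress a flattened 28x28 image (784 values) into 16 latent values
inp = keras.Input(shape=(784,))
latent = layers.Dense(16, activation="relu")(inp)

# Decoder: reconstruct all 784 pixel values from the 16-dimensional code
recon = layers.Dense(784, activation="sigmoid")(latent)

autoencoder = keras.Model(inp, recon)
```

The same pattern, compression followed by reconstruction, is what the convolutional layers implement in the sections that follow.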
Building the Encoder
The encoder uses convolutional layers to progressively reduce the spatial dimensions while extracting features:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Define the input layer
encoder_input = keras.Input(shape=(28, 28, 1), name="img")
# Build the encoder layers
print("Adding layers to the encoder")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
print("Performing global max pooling")
encoder_output = layers.GlobalMaxPooling2D()(x)
# Create the encoder model
print("Creating the encoder model")
encoder = keras.Model(encoder_input, encoder_output, name="encoder")
print("Encoder model summary:")
encoder.summary()
Adding layers to the encoder
Performing global max pooling
Creating the encoder model
Encoder model summary:
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]      0
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)       160
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)       4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)         9248
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)         4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16)               0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Building the Complete Autoencoder
The decoder reconstructs the original image from the encoded representation using transpose convolutions:
# Build the decoder starting from encoder output
print("Building the decoder")
x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)
# Create the complete autoencoder
autoencoder = keras.Model(encoder_input, decoder_output, name="autoencoder")
print("Complete autoencoder summary:")
autoencoder.summary()
Building the decoder
Complete autoencoder summary:
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]      0
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)       160
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)       4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)         0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)         9248
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)         4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16)               0
_________________________________________________________________
reshape (Reshape)            (None, 4, 4, 1)          0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16)         160
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32)         4640
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32)       0
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16)       4624
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1)        145
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
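Once built, the autoencoder can be trained to reproduce its input. The sketch below rebuilds the same architecture compactly and trains it with mean squared error; the training data here is random noise standing in for real 28×28 grayscale images (illustrative only, a real run would use a dataset such as MNIST):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Rebuild the same encoder-decoder pipeline end to end
inp = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, 3, activation="relu")(inp)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
x = layers.GlobalMaxPooling2D()(x)          # 16-dimensional bottleneck
x = layers.Reshape((4, 4, 1))(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
out = layers.Conv2DTranspose(1, 3, activation="relu")(x)
autoencoder = keras.Model(inp, out)

# MSE compares each reconstructed pixel to the corresponding input pixel
autoencoder.compile(optimizer="adam", loss="mse")

# Dummy images: the model is fit with the input as its own target
x_train = np.random.rand(64, 28, 28, 1).astype("float32")
autoencoder.fit(x_train, x_train, epochs=1, batch_size=16, verbose=0)
```

Note that the input is passed as both features and labels, which is what makes the training unsupervised: no separate target data is needed.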
Key Components Explained
- Conv2D layers: Extract features and reduce spatial dimensions
- MaxPooling2D: Downsamples the feature maps
- GlobalMaxPooling2D: Creates the bottleneck latent representation
- Conv2DTranspose: Upsamples and reconstructs spatial dimensions
- UpSampling2D: Increases spatial resolution
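The shape arithmetic behind the upsampling layers above can be checked directly. With default stride 1 and "valid" padding, a 3×3 Conv2DTranspose grows each spatial dimension by kernel size minus one, while UpSampling2D(3) multiplies each spatial dimension by 3 (the tensor shapes here are arbitrary, chosen only for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((1, 6, 6, 8))                # a dummy 6x6 feature map with 8 channels
y = layers.Conv2DTranspose(16, 3)(x)      # 6 -> 6 + 3 - 1 = 8 per spatial dim
z = layers.UpSampling2D(3)(y)             # 8 -> 8 * 3 = 24 per spatial dim

print(tuple(y.shape))  # (1, 8, 8, 16)
print(tuple(z.shape))  # (1, 24, 24, 16)
```

This is why the decoder's sequence of two transpose convolutions, a 3× upsampling, and two more transpose convolutions exactly reverses the encoder's path from 28×28 back up from the 4×4 reshaped bottleneck.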
Conclusion
This autoencoder compresses 28×28 images into 16-dimensional vectors and reconstructs them back. The encoder-decoder architecture is useful for dimensionality reduction, denoising, and feature learning tasks.
