How can TensorFlow be used to preprocess Fashion MNIST data in Python?
TensorFlow is an open-source machine learning framework developed by Google. It is commonly used with Python to build and train machine learning and deep learning models, both in research and in production.
The Fashion MNIST dataset contains grayscale images of clothing items from 10 different categories (T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots). Each image is 28×28 pixels with pixel values ranging from 0 to 255.
Installing TensorFlow
The 'tensorflow' package can be installed on Windows using the command below:
pip install tensorflow
Loading and Exploring the Dataset
First, let's load the Fashion MNIST dataset and explore its structure:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
# Load the Fashion MNIST dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# Class names for Fashion MNIST
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print(f"Training images shape: {train_images.shape}")
print(f"Training labels shape: {train_labels.shape}")
print(f"Test images shape: {test_images.shape}")
print(f"Test labels shape: {test_labels.shape}")
Training images shape: (60000, 28, 28)
Training labels shape: (60000,)
Test images shape: (10000, 28, 28)
Test labels shape: (10000,)
Visualizing Sample Images
Let's display a sample image to understand the data format:
import tensorflow as tf
import matplotlib.pyplot as plt
# Load dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Display first image with colorbar
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.title(f"Sample Image: {class_names[train_labels[0]]}")
plt.show()
print(f"Pixel value range: {train_images[0].min()} to {train_images[0].max()}")
Data Preprocessing
Before training a neural network, we need to preprocess the data by scaling the pixel values from the 0-255 range to the 0-1 range:
import tensorflow as tf
import matplotlib.pyplot as plt
# Load dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Normalize pixel values to 0-1 range
train_images = train_images / 255.0
test_images = test_images / 255.0
print(f"After normalization - pixel range: {train_images[0].min()} to {train_images[0].max()}")
After normalization - pixel range: 0.0 to 1.0
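One detail worth checking after the division above is the resulting data type. The sketch below uses a small synthetic array as a stand-in for the real images (an assumption, so that it runs without downloading the dataset); the names `raw_images`, `scaled_default`, and `scaled_float32` are illustrative only. It shows that dividing a uint8 array by 255.0 produces float64, and that casting to float32 first is one way to halve the memory used.

```python
import numpy as np

# Synthetic stand-in for a batch of Fashion MNIST images (assumption:
# uint8 values in 0-255 with shape (N, 28, 28), like the real data).
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

# Dividing a uint8 array by a Python float promotes the result to float64.
scaled_default = raw_images / 255.0

# Casting to float32 before dividing keeps the result in float32.
scaled_float32 = raw_images.astype(np.float32) / 255.0

print(f"Original dtype: {raw_images.dtype}")
print(f"After / 255.0: {scaled_default.dtype}, {scaled_default.nbytes} bytes")
print(f"After float32 cast: {scaled_float32.dtype}, {scaled_float32.nbytes} bytes")
```

Whether float32 is preferable depends on the model and hardware; the article's plain division is also valid.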
Visualizing Preprocessed Data
Let's display multiple preprocessed images to verify the data format:
import tensorflow as tf
import matplotlib.pyplot as plt
# Load and preprocess dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Normalize pixel values
train_images = train_images / 255.0
test_images = test_images / 255.0
# Display first 15 images with labels
plt.figure(figsize=(12, 12))
for i in range(15):
    plt.subplot(5, 3, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.tight_layout()
plt.show()
Key Preprocessing Steps
Normalization: Scale pixel values from the 0-255 range to the 0-1 range by dividing by 255
Consistent preprocessing: Apply the same normalization to both the training and test datasets
Data verification: Display sample images to confirm the preprocessing is correct
Shape preservation: Maintain the original image dimensions (28×28) after preprocessing
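The steps above can be bundled into one small helper so that the training and test sets always go through identical code. This is a minimal sketch, not part of TensorFlow: `normalize_images` is a hypothetical name, and the arrays are synthetic stand-ins (an assumption) for the real `train_images` and `test_images`.

```python
import numpy as np

def normalize_images(images):
    """Scale uint8 pixel values (0-255) to float32 values in the 0-1 range."""
    return images.astype(np.float32) / 255.0

# Synthetic stand-ins for the real training and test arrays (assumption).
rng = np.random.default_rng(42)
fake_train = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Apply the same function to both splits.
train_scaled = normalize_images(fake_train)
test_scaled = normalize_images(fake_test)

# Verify that the shape is preserved and the values fall within 0-1.
for name, before, after in [("train", fake_train, train_scaled),
                            ("test", fake_test, test_scaled)]:
    assert after.shape == before.shape
    assert 0.0 <= after.min() and after.max() <= 1.0
    print(f"{name}: shape {after.shape}, range {after.min():.3f} to {after.max():.3f}")
```

With the real dataset, the same helper would be called on `train_images` and `test_images` returned by `load_data()`.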
Why Normalize the Data?
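Normalization matters for neural network training because:
Faster convergence: Inputs on a small, uniform scale typically let gradient-based optimizers converge in fewer steps
Stable gradients: Keeping inputs in the 0-1 range reduces the risk of exploding or vanishing gradients, though it does not eliminate it
Consistent scale: Every pixel shares the same range, so no input dominates learning purely because of its magnitude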
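To make the gradient point concrete, here is a toy illustration under strong simplifying assumptions: a single linear unit with zero initial weights and a squared-error loss, not a full network. The function `grad_squared_error` and the synthetic input are hypothetical and only serve the demonstration. For this setup the gradient with respect to the weights is proportional to the input, so raw 0-255 pixels produce a gradient 255 times larger than the scaled ones.

```python
import numpy as np

def grad_squared_error(w, x, y):
    """Gradient of 0.5 * (w.x - y)**2 with respect to w, i.e. (w.x - y) * x."""
    return (w @ x - y) * x

# Synthetic flattened 28x28 "image" with raw pixel values (assumption).
rng = np.random.default_rng(0)
x_raw = rng.integers(0, 256, size=784).astype(np.float64)
x_scaled = x_raw / 255.0

w = np.zeros(784)   # zero initial weights keep the comparison simple
target = 1.0

g_raw = grad_squared_error(w, x_raw, target)
g_scaled = grad_squared_error(w, x_scaled, target)

ratio = np.linalg.norm(g_raw) / np.linalg.norm(g_scaled)
print(f"Gradient norm (raw inputs):    {np.linalg.norm(g_raw):.2f}")
print(f"Gradient norm (scaled inputs): {np.linalg.norm(g_scaled):.2f}")
print(f"Ratio: {ratio:.1f}")
```

In a real network the effect also depends on the initialization, the optimizer, and the learning rate, so this shows only the direction of the effect, not its size in practice.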
Conclusion
Preprocessing Fashion MNIST data involves loading the dataset, normalizing pixel values to the 0-1 range, and applying the same preprocessing to both the training and test sets. This helps the network train more reliably and keeps the data format consistent between training and evaluation.
