How can TensorFlow and Python be used to verify the CIFAR dataset?

The CIFAR dataset can be verified by plotting a sample of its images together with their class names. Because each CIFAR label is itself a one-element array, an extra index is needed to get the class number. The imshow method from the matplotlib library is used to display each image.


We are using Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

Loading and Preparing the CIFAR Dataset

First, let's load the CIFAR-10 dataset and prepare it for visualization:

import tensorflow as tf
import matplotlib.pyplot as plt

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()

# Define class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 
               'dog', 'frog', 'horse', 'ship', 'truck']

print(f"Training images shape: {train_images.shape}")
print(f"Training labels shape: {train_labels.shape}")
print(f"Number of classes: {len(class_names)}")

Output:

Training images shape: (50000, 32, 32, 3)
Training labels shape: (50000, 1)
Number of classes: 10
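
Beyond printing the shapes, the same properties can be checked programmatically. The following is a minimal sketch, not part of the original walkthrough: the helper name check_cifar_arrays is hypothetical, and the small random arrays are a synthetic stand-in that only mimics the CIFAR-10 layout, so the example runs without downloading anything.

```python
import numpy as np

def check_cifar_arrays(images, labels, num_classes=10):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Images are expected as (N, 32, 32, 3) arrays of 8-bit pixel values
    if images.ndim != 4 or images.shape[1:] != (32, 32, 3):
        problems.append(f"unexpected image shape {images.shape}")
    if images.dtype != np.uint8:
        problems.append(f"unexpected image dtype {images.dtype}")
    # Labels are expected as an (N, 1) array, one row per image
    if labels.shape != (images.shape[0], 1):
        problems.append(f"unexpected label shape {labels.shape}")
    # Every label must be a valid index into class_names
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        problems.append("label value outside 0..num_classes-1")
    return problems

# Synthetic stand-in with the same layout as CIFAR-10 (assumption: not real data)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(8, 1))

print(check_cifar_arrays(fake_images, fake_labels))  # prints []
```

With the real data, the call would be check_cifar_arrays(train_images, train_labels). An empty list suggests the arrays have the expected layout, although it says nothing about whether each label matches its image, which is what the plot is for.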

Verifying the Dataset by Plotting Images

Now let's verify the data by visualizing the first 15 images from the training set:

import tensorflow as tf
import matplotlib.pyplot as plt

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()

# Define class names for CIFAR-10
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 
               'dog', 'frog', 'horse', 'ship', 'truck']

print("Verifying the data")
plt.figure(figsize=(10, 10))
print("Plot the first 15 images")
print("An extra index is needed since CIFAR labels are arrays")

for i in range(15):
    plt.subplot(5, 3, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    plt.xlabel(class_names[train_labels[i][0]])

plt.tight_layout()
plt.show()

Output:

Verifying the data
Plot the first 15 images
An extra index is needed since CIFAR labels are arrays
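
The first 15 images may not cover all ten classes. One possible variation, sketched here under assumptions, is to pick one example per class and plot those instead. The helper first_index_per_class is a hypothetical name, and the demo labels are just the first five training labels (6, 9, 9, 4, 1) so that the snippet runs on its own.

```python
import numpy as np

def first_index_per_class(labels, num_classes=10):
    """Map each class id to the index of its first example, or None if absent."""
    flat = np.asarray(labels).reshape(-1)  # (N, 1) -> (N,)
    result = {}
    for class_id in range(num_classes):
        matches = np.flatnonzero(flat == class_id)
        result[class_id] = int(matches[0]) if matches.size else None
    return result

# Demo input shaped like CIFAR labels: one single-element row per image
demo_labels = np.array([[6], [9], [9], [4], [1]])
picks = first_index_per_class(demo_labels)
print(picks[6], picks[9], picks[0])  # prints: 0 1 None
```

With the real arrays, each index returned by first_index_per_class(train_labels) could be passed to plt.imshow(train_images[idx]) in a 2 x 5 grid of subplots, giving one labelled image for every entry in class_names.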

Understanding the CIFAR Labels Structure

The CIFAR labels are stored as a 2D array of shape (50000, 1), so each individual label is a 1D array of shape (1,). This is why we need the extra index [0] to access the actual class number:

import tensorflow as tf

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()

# Examine the label structure
print(f"First label shape: {train_labels[0].shape}")
print(f"First label value: {train_labels[0]}")
print(f"Accessing the class number: {train_labels[0][0]}")

# Show first 5 labels
print("\nFirst 5 labels:")
for i in range(5):
    print(f"Label {i}: {train_labels[i]} -> Class: {train_labels[i][0]}")

Output:

First label shape: (1,)
First label value: [6]
Accessing the class number: 6

First 5 labels:
Label 0: [6] -> Class: 6
Label 1: [9] -> Class: 9
Label 2: [9] -> Class: 9
Label 3: [4] -> Class: 4
Label 4: [1] -> Class: 1
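
An alternative to indexing with [0] every time is to flatten the labels once. The sketch below assumes NumPy's reshape and bincount; the helper name class_counts is hypothetical, and the demo array again reuses the first five training labels rather than the full dataset.

```python
import numpy as np

def class_counts(labels, num_classes=10):
    """Count how many examples fall into each class."""
    flat = np.asarray(labels).reshape(-1)  # (N, 1) -> (N,)
    return np.bincount(flat, minlength=num_classes)

# Same layout as train_labels: shape (N, 1), dtype uint8
demo_labels = np.array([[6], [9], [9], [4], [1]], dtype=np.uint8)

flat_labels = demo_labels.reshape(-1)
print(flat_labels.shape)          # prints (5,)
print(class_counts(demo_labels))  # prints [0 1 0 0 1 0 1 0 0 2]
```

After flattening, class_names[flat_labels[i]] works without the extra index. Running class_counts(train_labels) on the real training set is also a quick balance check, since CIFAR-10 is documented as having 5,000 training images per class.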

Key Points

  • The CIFAR-10 dataset contains 50,000 training images of size 32x32 pixels with 3 color channels (RGB)
  • Labels are stored as a 2D array of shape (50000, 1); each individual label has shape (1,), requiring an extra index [0] to access the class number
  • The matplotlib.pyplot.imshow() function renders RGB images in color directly, so the cmap=plt.cm.binary parameter used for grayscale images is not needed
  • Using plt.tight_layout() improves the spacing between subplots

Conclusion

Verifying the CIFAR dataset involves loading the data, plotting sample images, and understanding the label structure. The visualization confirms that images and their corresponding class labels are correctly loaded and ready for machine learning tasks.

Updated on: 2026-03-25T16:09:37+05:30
