How can TensorFlow be used to pre-process the flower training dataset?

The flower dataset can be pre-processed using the Keras preprocessing API. It has a method named image_dataset_from_directory that takes the directory where the data is stored, the fraction of data to hold out for validation, and other parameters, and returns a dataset of image batches.


We will use the Keras Sequential API, which builds a model as a plain stack of layers, where every layer has exactly one input tensor and one output tensor. The image classifier is created using a keras.Sequential model, and data is loaded using preprocessing.image_dataset_from_directory.
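As a rough illustration of a "plain stack of layers", the sketch below builds a small keras.Sequential classifier for five classes. The 180x180 image size, the layer choices, and the filter counts are assumptions made for illustration only; they are not taken from the original code.

```python
import tensorflow as tf

# Assumed values for illustration; the original article does not state them.
img_height, img_width = 180, 180
num_classes = 5  # daisy, dandelion, roses, sunflowers, tulips

# Each layer receives exactly one tensor and returns exactly one tensor.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(img_height, img_width, 3)),
    tf.keras.layers.Rescaling(1.0 / 255),          # scale pixels to [0, 1]
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes),            # one logit per class
])

print(model.output_shape)
```

The final Dense layer emits one logit per flower class, so the model output has one column for each of the five sub directories.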

Data is loaded efficiently off disk. Overfitting is identified, and techniques such as data augmentation and dropout are applied to mitigate it. The dataset contains about 3,700 flower images (3,670 files) in 5 sub directories, one per class: daisy, dandelion, roses, sunflowers, and tulips.
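The two mitigation techniques named above can be sketched with standard Keras layers. This is a minimal example under assumptions: the specific augmentation layers, their factors, the dropout rate, and the random stand-in batch are all hypothetical choices, not the original author's settings.

```python
import tensorflow as tf

# Hypothetical augmentation pipeline; layers and factors are assumptions.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Dropout randomly zeroes a fraction of its inputs, but only during training.
dropout = tf.keras.layers.Dropout(0.2)

# Stand-in batch of 2 random "images" instead of real flower photos.
images = tf.random.uniform((2, 32, 32, 3))

augmented = data_augmentation(images, training=True)  # randomly transformed copies
kept = dropout(images, training=False)                # inference: passes through unchanged

print(augmented.shape)
```

Both techniques are active only when training=True, so predictions at inference time are not randomly perturbed.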

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

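print("Pre-processing the dataset using keras.preprocessing")
# Assumes tf (tensorflow), data_dir, img_height, img_width and train_ds are defined in earlier steps.
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
   data_dir,
   validation_split=0.2, subset="validation",  # hold out 20%, matching the 734 of 3670 files reported
   seed=123,  # assumed value; must match the seed used for the training split
   image_size=(img_height, img_width))
class_names = train_ds.class_names
print("The class names are:")
print(class_names)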

Code credit:

Output:

Pre-processing the dataset using keras.preprocessing
Found 3670 files belonging to 5 classes.
Using 734 files for validation.
The class names are:
['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
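The class names and file counts in the output above can be reproduced by hand. The sketch below uses only the standard library and rests on two assumptions about how the loader behaves: class names are the sub directory names in sorted order, and the validation subset holds int(validation_split * total) files. The temporary directory is a stand-in for the real dataset folder and contains no images.

```python
import os
import tempfile

# Sub directories created in arbitrary order to show that sorting decides the result.
class_dirs = ["tulips", "daisy", "roses", "sunflowers", "dandelion"]

with tempfile.TemporaryDirectory() as data_dir:
    for name in class_dirs:
        os.makedirs(os.path.join(data_dir, name))
    # Assumption: class names are the sorted sub directory names.
    class_names = sorted(
        entry for entry in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, entry))
    )

total_files = 3670        # "Found 3670 files belonging to 5 classes."
validation_split = 0.2    # inferred from 734 / 3670

# Assumption: the validation count is the truncated product.
num_val = int(validation_split * total_files)
num_train = total_files - num_val

print(class_names)
print(num_val, num_train)
```

With a 0.2 split, 734 files go to validation and the remaining 2936 are left for training.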


  • The validation subset is loaded and processed using the image_dataset_from_directory method of keras.preprocessing.
  • The class names, taken from the sub directory names, are then displayed on the console.