How can TensorFlow be used to split the flower dataset into training and validation sets?

The flower dataset can be split into training and validation sets with the Keras preprocessing API. The ‘image_dataset_from_directory’ function takes a validation_split argument, the fraction of the data to reserve for validation, and a subset argument that selects which part of the split to return.


An image classifier is created using a keras.Sequential model, and data is loaded using preprocessing.image_dataset_from_directory, which loads data efficiently off disk. Overfitting is identified, and techniques such as data augmentation and dropout are applied to mitigate it. The dataset contains about 3,700 images of flowers in 5 sub directories, one per class: daisy, dandelion, roses, sunflowers, and tulips.
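To make the one-sub-directory-per-class layout concrete, here is a minimal sketch that counts the image files under each class folder using only the standard library. The helper name count_images_per_class, the list of file suffixes, and the "flower_photos" path in the usage comment are illustrative assumptions, not part of the original tutorial.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to match the files on disk.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images_per_class(data_dir):
    """Return {class_name: image_count} for each sub directory of data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        # Only sub directories are treated as classes; loose files are skipped.
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Hypothetical usage, assuming the dataset was extracted to "flower_photos":
# print(count_images_per_class("flower_photos"))
```

Checking these counts before training is a quick way to confirm that all five classes were found and that none of them is empty.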

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

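import tensorflow as tf
batch_size = 32
img_height = 180
img_width = 180
print("The data is being split into training and validation set")
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
   data_dir,  # assumed: path to the downloaded flower images, defined earlier
   validation_split=0.2,  # 20% held out, consistent with the output below
   subset="training",
   seed=123,  # assumed value; any fixed seed works
   image_size=(img_height, img_width),
   batch_size=batch_size)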

Code credit:


The data is being split into training and validation set
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
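The counts in this output are consistent with a 20% validation split. The sketch below reproduces the arithmetic, assuming the validation count is the truncated product of the split fraction and the file count. The helper split_counts is hypothetical and is not a Keras function.

```python
def split_counts(total_files, validation_split):
    """Return (training_count, validation_count) for a fractional split."""
    if not 0 < validation_split < 1:
        raise ValueError("validation_split must be between 0 and 1")
    # Assumption: the validation share is truncated to a whole number of files.
    num_val = int(validation_split * total_files)
    return total_files - num_val, num_val


train_count, val_count = split_counts(3670, 0.2)
print(f"Using {train_count} files for training, {val_count} for validation.")
# prints: Using 2936 files for training, 734 for validation.
```

The remaining 734 files are the ones a matching call with subset="validation" would be expected to return.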


  • The images are loaded off disk using the image_dataset_from_directory utility.
  • The utility goes from a directory of images on disk to a tf.data.Dataset.
  • Once the data has been downloaded, parameters such as the batch size and image dimensions are defined for the loader.
  • The data is split into training and validation sets.
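The code above builds only the training subset. A minimal sketch of loading both halves of the split might look like the following. The wrapper load_train_and_val and its default values are assumptions made for illustration. The key point is that both calls share the same validation_split and seed, so the two subsets do not overlap.

```python
import tensorflow as tf


def load_train_and_val(data_dir, img_height=180, img_width=180,
                       batch_size=32, validation_split=0.2, seed=123):
    """Load non-overlapping training and validation datasets from data_dir.

    Hypothetical helper: data_dir must contain one sub directory per class.
    """
    # Both calls must use the same split fraction and seed.
    common = dict(
        validation_split=validation_split,
        seed=seed,
        image_size=(img_height, img_width),
        batch_size=batch_size,
    )
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir, subset="training", **common)
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir, subset="validation", **common)
    return train_ds, val_ds


# Hypothetical usage, assuming data_dir points at the flower images:
# train_ds, val_ds = load_train_and_val(data_dir)
# print(train_ds.class_names)
```

Keeping the shared arguments in one dictionary makes it harder to change the seed or the split fraction in one call and forget the other.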

Updated on: 20-Feb-2021

