The flower dataset can be split into training and validation sets using the Keras preprocessing API. The ‘image_dataset_from_directory’ function takes a ‘validation_split’ argument, which is the fraction of the data to hold back for validation.
An image classifier is created using a keras.Sequential model, and data is loaded using preprocessing.image_dataset_from_directory, which reads the images efficiently off disk. Overfitting is identified, and techniques are applied to mitigate it. These techniques include data augmentation and dropout. The dataset contains about 3,700 photos of flowers. It has 5 sub directories, one per class: daisy, dandelion, roses, sunflowers, and tulips.
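The "one sub directory per class" layout is what lets the loader assign labels without a separate label file. The sketch below only imitates that idea with the standard library: it builds a temporary stand-in folder (not the real dataset) and reads the class names from the sorted sub directory names. The helper list_class_names and the placeholder file are hypothetical names used for illustration; this is not the Keras implementation.

```python
import tempfile
from pathlib import Path

# Class names taken from the text above, deliberately listed out of order.
FLOWER_CLASSES = ["tulips", "daisy", "sunflowers", "roses", "dandelion"]


def list_class_names(data_dir):
    """Return the sorted names of the immediate sub directories of data_dir."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())


# Build a throwaway directory tree that stands in for the real dataset folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in FLOWER_CLASSES:
        (Path(tmp) / name).mkdir()
    # A top-level file (hypothetical placeholder) is not a class and is skipped.
    (Path(tmp) / "placeholder.txt").write_text("not a class")
    class_names = list_class_names(tmp)

print(class_names)
```

Sorting the names gives a stable mapping from class name to integer label, so the same folder always produces the same label order.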
We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and offers free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
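import tensorflow as tf

# data_dir is assumed to point to the extracted flower dataset directory
batch_size = 32
img_height = 180
img_width = 180

print("The data is being split into training and validation set")
# Hold back 20% of the images for validation and load the rest for training
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)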
Code credit: https://www.tensorflow.org/tutorials/images/classification
The data is being split into training and validation set
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
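The file counts in the output follow directly from the validation_split value. The sketch below reproduces the arithmetic, assuming the validation count is the requested fraction of the total rounded down and the training set receives the remainder. The function split_counts is a hypothetical helper written for this illustration, not part of Keras.

```python
def split_counts(total_files, validation_split):
    """Return (training_count, validation_count) for a fractional split.

    Assumption: the validation count is the fraction of the total rounded
    down, and every remaining file goes to the training set.
    """
    if not 0 < validation_split < 1:
        raise ValueError("validation_split must be between 0 and 1")
    num_val = int(validation_split * total_files)
    return total_files - num_val, num_val


# Values taken from the code and output above: 3670 files, 20% validation.
train_count, val_count = split_counts(3670, 0.2)
print("Using", train_count, "files for training.")
print("Leaving", val_count, "files for validation.")
```

With 3670 files and a split of 0.2, this gives 2936 training files, which matches the logged output, and leaves 734 files for validation.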