How can TensorFlow be used to split the flower dataset into training and validation sets?
The flower dataset can be split into training and validation sets using the Keras preprocessing API. The ‘image_dataset_from_directory’ utility takes a ‘validation_split’ argument, the fraction of the data to reserve for validation, and a ‘subset’ argument that selects which part of the split to return.
An image classifier is created using a keras.Sequential model, and data is loaded using preprocessing.image_dataset_from_directory, which loads the data efficiently off disk. Overfitting is identified, and techniques such as data augmentation and dropout are applied to mitigate it. The dataset contains about 3,700 photos of flowers (3670 files, according to the loader output). It has 5 sub directories, one per class: daisy, dandelion, roses, sunflowers, and tulips.
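The one-sub-directory-per-class layout is what allows labels to be inferred from folder names. As a rough illustration only (plain Python, not the Keras implementation), the sketch below recreates that layout as empty folders in a temporary directory and maps each sub directory name, sorted alphanumerically, to an integer label. The helper name `infer_class_indices` is hypothetical.

```python
import tempfile
from pathlib import Path

FLOWER_CLASSES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]


def infer_class_indices(data_dir):
    """Map each sub directory name (sorted alphanumerically) to an integer label."""
    class_names = sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(class_names)}


with tempfile.TemporaryDirectory() as tmp:
    # Recreate only the folder layout; no images are needed for this sketch.
    for name in FLOWER_CLASSES:
        (Path(tmp) / name).mkdir()
    class_indices = infer_class_indices(tmp)

print(class_indices)
```

Each image would then receive the label of the folder it sits in.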
We are using Google Colaboratory to run the code below. Google Colab (Colaboratory) runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
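import tensorflow as tf

# data_dir is assumed to already point at the directory of downloaded flower images.
batch_size = 32
img_height = 180
img_width = 180

print("The data is being split into training and validation set")
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,  # reserve 20% of the images for validation
    subset="training",     # return the remaining 80%
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)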
Code credit: https://www.tensorflow.org/tutorials/images/classification
Output
The data is being split into training and validation set
Found 3670 files belonging to 5 classes.
Using 2936 files for training.
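The counts in this output follow from the 20% split. The sketch below reproduces the arithmetic in plain Python. It assumes the number of validation files is the integer part of the split fraction times the file count, which is consistent with the 2936 training files reported above but is not taken from the library source. `split_counts` is a hypothetical helper.

```python
def split_counts(num_files, validation_split):
    """Return (training, validation) file counts for a fractional split."""
    # Assumption: the validation share is truncated to a whole number of files.
    num_validation = int(validation_split * num_files)
    return num_files - num_validation, num_validation


train_count, val_count = split_counts(3670, 0.2)
print(train_count, val_count)  # 2936 734
```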
Explanation
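- The images are loaded off disk using the image_dataset_from_directory utility.
- This utility turns a directory of images on disk into a tf.data.Dataset.
- Once the data has been downloaded, parameters for the loader are defined: a batch size of 32 and an image size of 180 x 180 pixels.
- With validation_split=0.2 and subset="training", 80% of the 3670 files (2936) are used for training, and the remaining 20% are held out for validation.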
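The call shown earlier builds only the training subset. A minimal sketch of the matching validation call follows, assuming a recent TensorFlow 2.x release in which the same utility is also exposed as tf.keras.utils.image_dataset_from_directory. To stay self-contained it writes a toy folder of random PNG images instead of using the flower photos. The helper `make_toy_image_dir`, the two class names, and the small sizes are illustrative assumptions, not part of the original tutorial.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_toy_image_dir(root, class_names=("daisy", "tulips"), per_class=10, size=32):
    """Write `per_class` random PNG images into one sub directory per class."""
    rng = np.random.default_rng(0)
    for name in class_names:
        class_dir = os.path.join(root, name)
        os.makedirs(class_dir, exist_ok=True)
        for i in range(per_class):
            pixels = rng.integers(0, 256, size=(size, size, 3), dtype=np.uint8)
            tf.io.write_file(os.path.join(class_dir, f"{i}.png"),
                             tf.io.encode_png(pixels))


# Toy stand-in for the flower directory (hypothetical data, 20 images in total).
data_dir = tempfile.mkdtemp()
make_toy_image_dir(data_dir)

# Both calls share the split fraction and the seed; only `subset` differs.
common = dict(validation_split=0.2, seed=123, image_size=(180, 180), batch_size=4)
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, subset="training", **common)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, subset="validation", **common)

print(train_ds.class_names)

# Count the images in each subset by summing the batch sizes.
num_train = sum(int(images.shape[0]) for images, _ in train_ds)
num_val = sum(int(images.shape[0]) for images, _ in val_ds)
print(num_train, num_val)
```

Passing the same seed and validation_split to both calls is what keeps the training and validation subsets from overlapping when the files are shuffled.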