How can TensorFlow be used to configure a dataset for performance?


The flower dataset can be configured for performance using buffered prefetching, shuffling, and caching. Buffered prefetching lets data be read from disk without I/O blocking the training loop. Dataset.cache() keeps the images in memory after they are loaded from disk during the first epoch. Dataset.prefetch() overlaps data preprocessing with model execution during training.

The Keras Sequential API is used to build the model. It is suited to a plain stack of layers, where every layer has exactly one input tensor and one output tensor.

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.

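import tensorflow as tf

# train_ds and val_ds are the tf.data.Dataset objects created in the earlier steps of the tutorial
print("Configuring the dataset for better performance")
AUTOTUNE = tf.data.AUTOTUNE  # let tf.data choose the prefetch buffer size at runtime
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)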

Code credit: https://www.tensorflow.org/tutorials/images/classification

Output

Configuring the dataset for better performance
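
The snippet above relies on train_ds and val_ds having been created earlier. As a rough, self-contained sketch of the same configuration, the example below uses small random tensors as a hypothetical stand-in for the flower images. The shapes, the split sizes, and the batch size are assumptions chosen only for illustration and are not taken from the original tutorial.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

# Hypothetical stand-in data: 10 random 8x8 RGB "images" with labels from 5 classes.
images = tf.random.uniform((10, 8, 8, 3))
labels = tf.random.uniform((10,), maxval=5, dtype=tf.int32)

# Split into 8 training and 2 validation examples, then batch them.
full_ds = tf.data.Dataset.from_tensor_slices((images, labels))
train_ds = full_ds.take(8).batch(4)
val_ds = full_ds.skip(8).batch(4)

# Same configuration as in the article: cache, shuffle (training only), prefetch.
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

# The configured datasets still yield the same number of batches.
train_batches = sum(1 for _ in train_ds)
val_batches = sum(1 for _ in val_ds)
print(train_batches, val_batches)
```

Only the training dataset is shuffled here, which mirrors the original code. The validation dataset is cached and prefetched but keeps its order.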

Explanation

Buffered prefetching lets data be read from disk without I/O blocking the training loop.
Two methods are important when loading data:
- cache() keeps the images in memory after they are loaded from disk during the first epoch. This ensures that the dataset does not become a bottleneck while the model is being trained. If the dataset is too large to fit into memory, cache() can instead be given a file path to create a performant on-disk cache.
- prefetch() overlaps data preprocessing with model execution during training.
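
One way to observe the caching behaviour described above is to count how often the underlying data source is actually read. The sketch below is only an illustration: load_items is a hypothetical generator standing in for reading images from disk, and the cache file prefix demo_cache is an arbitrary name chosen for this example.

```python
import os
import tempfile

import tensorflow as tf

load_calls = {"count": 0}

def load_items():
    # Hypothetical stand-in for reading images from disk; counts every read.
    for i in range(5):
        load_calls["count"] += 1
        yield i

ds = tf.data.Dataset.from_generator(
    load_items, output_signature=tf.TensorSpec(shape=(), dtype=tf.int32)
)

# In-memory cache: the generator is only consumed during the first epoch.
cached_ds = ds.cache().prefetch(buffer_size=tf.data.AUTOTUNE)
for epoch in range(3):
    values = [int(x) for x in cached_ds]
reads_with_memory_cache = load_calls["count"]

# On-disk cache: passing a file path to cache() stores the elements in files
# with that prefix instead of keeping them in memory.
cache_dir = tempfile.mkdtemp()
disk_ds = ds.cache(os.path.join(cache_dir, "demo_cache")).prefetch(
    buffer_size=tf.data.AUTOTUNE
)
for epoch in range(2):
    disk_values = [int(x) for x in disk_ds]
cache_files = os.listdir(cache_dir)
```

After three epochs over cached_ds, the source has been read only once per element, because later epochs are served from the cache. The on-disk variant behaves the same way but writes its cache files into cache_dir, which can help when the dataset does not fit into memory.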

AmitDiwan

Updated on: 20-Feb-2021
