How can TensorFlow be used to configure the IMDB dataset to give good performance and create a model?


TensorFlow is an open-source machine learning framework provided by Google. It is used in conjunction with Python to implement algorithms, deep learning applications and much more, and it is employed both in research and in production.

The ‘tensorflow’ package can be installed on Windows using the line of code below −

pip install tensorflow

A tensor is the basic data structure used in TensorFlow. Tensors flow along the edges of a dataflow graph, which describes how the computations are connected. A tensor is essentially a multidimensional array (or list).
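As a quick, minimal illustration (assuming the ‘tensorflow’ package is installed), the following snippet creates a small rank-2 tensor and inspects its shape and data type −

import tensorflow as tf

# A rank-2 tensor (a 2 x 3 matrix) built from a nested Python list
t = tf.constant([[1, 2, 3], [4, 5, 6]])
print(t.shape)   # (2, 3)
print(t.dtype)   # <dtype: 'int32'>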

The ‘IMDB’ dataset contains 50,000 movie reviews. It is generally used for Natural Language Processing tasks such as sentiment analysis.
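A pre-tokenized copy of the dataset also ships with Keras and can be loaded in a single call; the sketch below assumes a vocabulary of the 10,000 most frequent words, which matches the vocabulary size used by the model further down −

import tensorflow as tf

# Load the Keras-bundled IMDB reviews, keeping only the 10,000 most common words.
# The split is 25,000 reviews for training and 25,000 for testing.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)
print(len(x_train), len(x_test))   # 25000 25000

The example code in this article, however, works with text datasets (train_ds, val_ds, test_ds) that are prepared from the raw reviews earlier in the linked tutorial.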

We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and offers free access to GPUs (Graphics Processing Units). It is built on top of Jupyter Notebook.

Following is the code to configure the IMDB dataset to give good performance and create a model −

Example

import tensorflow as tf
from tensorflow.keras import layers

# train_ds, val_ds and test_ds are the text datasets prepared earlier in the tutorial.
# max_features is the vocabulary size used when vectorizing the text; 10,000 here,
# consistent with the embedding parameter count shown in the output below.
max_features = 10000

AUTOTUNE = tf.data.experimental.AUTOTUNE

# Cache the datasets after the first epoch and overlap data preprocessing
# with model execution so that input loading does not become a bottleneck.
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)

embedding_dim = 16

# A simple sequential model: embedding -> dropout -> pooling -> dropout -> dense
model = tf.keras.Sequential([
  layers.Embedding(max_features + 1, embedding_dim),
  layers.Dropout(0.2),
  layers.GlobalAveragePooling1D(),
  layers.Dropout(0.2),
  layers.Dense(1)])

model.summary()

Code credit − https://www.tensorflow.org/tutorials/keras/text_classification

Output

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, None, 16)          160016
_________________________________________________________________
dropout_2 (Dropout)          (None, None, 16)          0
_________________________________________________________________
global_average_pooling1d_1 ( (None, 16)                0
_________________________________________________________________
dropout_3 (Dropout)          (None, 16)                0
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 17
=================================================================
Total params: 160,033
Trainable params: 160,033
Non-trainable params: 0
_________________________________________________________________

Explanation

  • AUTOTUNE lets tf.data choose the prefetch buffer size dynamically at runtime, while ‘cache’ keeps the dataset in memory after it has been loaded once.

  • The model is built using ‘Keras’. It is a sequential model consisting of an embedding layer, a global average pooling layer, two dropout layers, and a single dense output layer.

  • The summary (metadata) of the model that was built is displayed on the console using ‘model.summary’.

