CNTK - Training the Neural Network

This chapter explains how to train the neural network in CNTK.

Training a model in CNTK

In the previous section, we defined all the components of the deep learning model. Now it is time to train it. As discussed earlier, we can train a NN model in CNTK using a combination of a learner and a trainer.

Choosing a learner and setting up training

In this section, we define the learner. CNTK provides several learners to choose from. For the model defined in the previous sections, we will use the Stochastic Gradient Descent (SGD) learner.

In order to train the neural network, let us configure the learner and trainer with the help of the following steps −

Step 1 − First, we need to import the sgd function from the cntk.learners package.

from cntk.learners import sgd

Step 2 − Next, we need to import the Trainer class from the cntk.train.trainer module.

from cntk.train.trainer import Trainer

Step 3 − Now, we need to create a learner. It can be created by invoking the sgd function with the model’s parameters and a value for the learning rate.

learner = sgd(z.parameters, 0.01)

Step 4 − At last, we need to initialize the trainer. It must be given the network, the combination of the loss and the metric, and the learner.

trainer = Trainer(z, (loss, error_rate), [learner])

The learning rate, which controls the speed of optimisation, should be a small number, typically between 0.001 and 0.1.
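To make the effect of the learning rate concrete, here is a minimal NumPy sketch of the plain SGD update rule (parameter minus learning rate times gradient) applied to a toy quadratic objective. It is an illustration only and does not use CNTK; the helper sgd_step and the toy objective are hypothetical names introduced for this example.

```python
import numpy as np

def sgd_step(params, grads, lr):
    """One plain SGD update: move each parameter against its gradient."""
    return params - lr * grads

# Toy objective f(w) = sum(w ** 2), whose gradient is 2 * w.
w_start = np.array([1.0, -2.0])
final_loss = {}

for lr in (0.001, 0.01, 0.1):
    w = w_start.copy()
    for _ in range(100):
        w = sgd_step(w, 2 * w, lr)
    final_loss[lr] = float(np.sum(w ** 2))
    print("lr={}: loss after 100 steps = {:.6f}".format(lr, final_loss[lr]))
```

On this simple objective, larger learning rates within the range reach a lower loss in the same number of steps. On a real network, a rate that is too large can make training unstable, which is why small values are the usual starting point.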

Choosing a learner and setting up the training - Complete example

from cntk.learners import sgd
from cntk.train.trainer import Trainer
learner = sgd(z.parameters, 0.01)
trainer = Trainer(z, (loss, error_rate), [learner])

Feeding data into the trainer

Once we have chosen and configured the trainer, it is time to load the dataset. We have saved the iris dataset as a .CSV file and will use the data wrangling package pandas to load it.

Steps to load the dataset from .CSV file

Step 1 − First, we need to import the pandas package.

import pandas as pd

Step 2 − Now, we need to invoke the read_csv function to load the .csv file from the disk. The file has no header row, so we pass the column names explicitly.

df_source = pd.read_csv('iris.csv', names = ['sepal_length', 'sepal_width',
   'petal_length', 'petal_width', 'species'], index_col=False)

Once we load the dataset, we need to split it into a set of features and a label.

Steps to split the dataset into features and label

Step 1 − First, we need to select all rows and the first four columns from the dataset. This can be done by using the iloc indexer.

X = df_source.iloc[:, :4].values

Step 2 − Next, we need to select the species column from the iris dataset. We will be using the values property to access the underlying numpy array.

y = df_source['species'].values
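The loading and splitting steps above can be tried without the real file. The sketch below reads two made-up rows from an in-memory string instead of iris.csv; the sample rows and the names csv_text, df_demo, X_demo and y_demo are assumptions made for illustration.

```python
import io
import pandas as pd

# Two hypothetical rows standing in for the contents of iris.csv (no header row).
csv_text = "5.1,3.5,1.4,0.2,Iris-setosa\n7.0,3.2,4.7,1.4,Iris-versicolor\n"

columns = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# names= supplies the column names because the data has no header line.
df_demo = pd.read_csv(io.StringIO(csv_text), names=columns, index_col=False)

# Features: all rows, first four columns. Labels: the species column.
X_demo = df_demo.iloc[:, :4].values
y_demo = df_demo["species"].values

print(X_demo.shape)
print(y_demo)
```

Checking the shapes right after loading is a cheap way to catch a missing column name before the data reaches the trainer.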

Steps to encode the species column to a numeric vector representation

As we discussed earlier, our model performs classification and requires numeric values. Hence, we need to encode the species column as a numeric vector representation. Let’s see the steps to do it −

Step 1 − First, we define a label_mapping dictionary that maps each species name to a numeric index. The keys must match the species strings in the file exactly. A list comprehension will later iterate over all elements in the array and look up each value in this dictionary.

label_mapping = {'Iris-setosa' : 0, 'Iris-versicolor' : 1, 'Iris-virginica' : 2}

Step 2 − Next, convert each numeric value to a one-hot encoded vector. We will be using the one_hot function as follows −

import numpy as np

def one_hot(index, length):
   # Vector of zeros with a single 1 at the given index
   result = np.zeros(length)
   result[index] = 1
   return result

Step 3 − At last, we need to turn this converted list into a numpy array.

y = np.array([one_hot(label_mapping[v], 3) for v in y])
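As a quick check of the encoding steps above, the following sketch applies them to three sample labels. The species array is made up for illustration, and the np.eye variant is an alternative added here, not part of the original steps.

```python
import numpy as np

label_mapping = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

def one_hot(index, length):
    """Return a vector of zeros with a single 1 at position index."""
    result = np.zeros(length)
    result[index] = 1
    return result

# Hypothetical labels, in the same form as the species column.
species = np.array(["Iris-virginica", "Iris-setosa", "Iris-versicolor"])
encoded = np.array([one_hot(label_mapping[v], 3) for v in species])
print(encoded)

# Equivalent vectorised form: pick rows of a 3x3 identity matrix.
indices = np.array([label_mapping[v] for v in species])
encoded_fast = np.eye(3)[indices]
```

Both forms give one row per sample and one column per class, which is the shape a three-class output layer is compared against.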

Steps to detect overfitting

Overfitting is the situation in which a model memorises the training samples but fails to deduce general rules from them. To detect it, we hold out part of the data and later evaluate the model on it. The following steps set this up −

Step 1 − First, import the train_test_split function from the model_selection module of the sklearn package.

from sklearn.model_selection import train_test_split

Step 2 − Next, we need to invoke the train_test_split function with features X and labels y as follows −

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
   stratify=y)

We specified a test_size of 0.2 to set aside 20% of the total data. The stratify=y argument keeps the class proportions the same in the training and test sets.
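To illustrate what a stratified split does, here is a NumPy-only sketch that holds out the same fraction of every class. It is not how sklearn implements train_test_split; the function stratified_split and the toy arrays are hypothetical, and integer class indices are used instead of one-hot vectors for brevity.

```python
import numpy as np

def stratified_split(X, y_idx, test_size=0.2, seed=0):
    """Hold out test_size of the samples of every class."""
    rng = np.random.default_rng(seed)
    test_idx = []
    for cls in np.unique(y_idx):
        cls_idx = np.flatnonzero(y_idx == cls)  # positions of this class
        rng.shuffle(cls_idx)
        n_test = int(round(test_size * len(cls_idx)))
        test_idx.extend(cls_idx[:n_test])
    test_mask = np.zeros(len(y_idx), dtype=bool)
    test_mask[test_idx] = True
    return X[~test_mask], X[test_mask], y_idx[~test_mask], y_idx[test_mask]

# Toy data: 15 samples, 3 classes with 5 samples each.
X_toy = np.arange(30, dtype=float).reshape(15, 2)
y_toy = np.repeat([0, 1, 2], 5)

X_tr, X_te, y_tr, y_te = stratified_split(X_toy, y_toy, test_size=0.2)
print(np.bincount(y_tr), np.bincount(y_te))
```

Each class contributes equally to both sets, so a metric measured on the held-out part is not skewed by a class that happens to be missing from it.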

Steps to feed training set and validation set to our model

Step 1 − In order to train our model, we first invoke the train_minibatch method, passing it a dictionary that maps the input data to the input variables that we used to define the NN and its associated loss function.

trainer.train_minibatch({ features: X_train, label: y_train})

Step 2 − Next, call train_minibatch repeatedly by using the following for loop. Because the trainer was configured with error_rate as its metric, the second printed value is an error, not an accuracy −

for _epoch in range(10):
   trainer.train_minibatch({ features: X_train, label: y_train})
   print('Loss: {}, Error: {}'.format(
      trainer.previous_minibatch_loss_average,
      trainer.previous_minibatch_evaluation_average))
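The loop above passes the whole training set to the trainer as a single minibatch in every epoch, which is fine for a dataset as small as iris. For larger datasets, the data is usually cut into smaller minibatches. The sketch below shows one way to do the slicing with NumPy; the generator iterate_minibatches and the toy arrays are hypothetical and not part of CNTK.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, seed=0):
    """Yield shuffled (features, labels) slices of at most batch_size rows."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        yield X[batch], y[batch]

# Toy data: 10 samples with 2 features and one-hot labels for 2 classes.
X_toy = np.arange(20, dtype=np.float32).reshape(10, 2)
y_toy = np.eye(2, dtype=np.float32)[np.arange(10) % 2]

batch_sizes = []
for xb, yb in iterate_minibatches(X_toy, y_toy, batch_size=4):
    # In the training loop of this chapter, this is where one would call
    # trainer.train_minibatch({ features: xb, label: yb})
    batch_sizes.append(len(xb))

print(batch_sizes)
```

The last minibatch may be smaller than batch_size when the number of samples is not an exact multiple of it.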

Feeding data into the trainer - Complete example

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the data; the file has no header row, so column names are given explicitly
df_source = pd.read_csv('iris.csv', names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)
# Split into features (first four columns) and labels (species column)
X = df_source.iloc[:, :4].values
y = df_source['species'].values
# Encode the species names as one-hot vectors
label_mapping = {'Iris-setosa' : 0, 'Iris-versicolor' : 1, 'Iris-virginica' : 2}
def one_hot(index, length):
   result = np.zeros(length)
   result[index] = 1
   return result
y = np.array([one_hot(label_mapping[v], 3) for v in y])
# Hold out 20% of the data for testing, keeping class proportions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
# Train; the metric configured for the trainer is error_rate
trainer.train_minibatch({ features: X_train, label: y_train})
for _epoch in range(10):
   trainer.train_minibatch({ features: X_train, label: y_train})
   print('Loss: {}, Error: {}'.format(
      trainer.previous_minibatch_loss_average,
      trainer.previous_minibatch_evaluation_average))

Measuring the performance of NN

Whenever we pass data through the trainer to optimise the NN model, the trainer measures the model’s performance using the metric that we configured for it. This measurement is made on the training data. For a full analysis of the model’s performance, we need to use the test data as well.

So, to measure the performance of the model using the test data, we can invoke the test_minibatch method on the trainer as follows −

trainer.test_minibatch({ features: X_test, label: y_test})
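To show what such a metric looks like, here is a NumPy sketch of a classification error averaged over a batch. It assumes that error_rate is a classification error, which this chapter does not state explicitly; the function classification_error and the sample arrays are hypothetical.

```python
import numpy as np

def classification_error(outputs, one_hot_labels):
    """Fraction of samples whose highest output is not the labelled class."""
    predicted = np.argmax(outputs, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted != actual))

# Hypothetical network outputs for four test samples and three classes.
outputs = np.array([[2.0, 0.1, 0.3],
                    [0.2, 1.5, 0.9],
                    [0.4, 1.1, 0.8],
                    [0.1, 0.2, 3.0]])
labels = np.eye(3)[[0, 1, 2, 2]]

error = classification_error(outputs, labels)
print("error: {:.2f}, accuracy: {:.2f}".format(error, 1 - error))
```

A test error that is much higher than the error reported during training is the sign of overfitting that the held-out data is meant to reveal.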

Making prediction with NN

Once you have trained a deep learning model, the most important thing is to make predictions with it. To make a prediction with the NN trained above, we can follow the given steps −

Step 1 − First, we need to pick the index of a random item from the test set using the following function −

np.random.choice

Step 2 − Next, we need to select the sample data from the test set by using sample_index.

Step 3 − Now, in order to convert the numeric output of the NN to an actual label, create an inverted mapping.

Step 4 − Now, use the selected sample data. Make a prediction by invoking the NN z as a function.

Step 5 − Once you have the predicted output, take the index of the neuron that has the highest value as the predicted class. This can be done by using the np.argmax function from the numpy package.

Step 6 − At last, convert the index value into the real label by using inverted_mapping.
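Steps 3, 5 and 6 can be sketched without a trained network. In the example below the inverted mapping is derived from label_mapping so that the two dictionaries cannot drift apart, and raw_output is a made-up vector standing in for the output of z.

```python
import numpy as np

label_mapping = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

# Invert the mapping: numeric index -> species name.
inverted_mapping = {index: name for name, index in label_mapping.items()}

# Hypothetical network output: one value per class.
raw_output = np.array([0.3, 2.1, -0.5])

# The index of the largest value is the predicted class.
predicted_index = int(np.argmax(raw_output))
predicted_label = inverted_mapping[predicted_index]
print(predicted_label)
```

Since np.argmax returns a zero-based index, the keys of the inverted mapping must start at 0 to line up with it.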

Making prediction with NN - Complete example

sample_index = np.random.choice(X_test.shape[0])
sample = X_test[sample_index]
# Keys start at 0 to match label_mapping and the index returned by np.argmax
inverted_mapping = {
   0:'Iris-setosa',
   1:'Iris-versicolor',
   2:'Iris-virginica'
}
prediction = z(sample)
predicted_label = inverted_mapping[np.argmax(prediction)]
print(predicted_label)

Output

After training the above deep learning model and running it, you will get output similar to the following. The sample is chosen at random, so the predicted species may differ −

Iris-versicolor