Chainer - Creating Neural Networks



Creating neural networks with Chainer is a flexible and intuitive process thanks to its Define-by-Run approach, which lets developers construct and modify computational graphs dynamically as data flows through the network. Chainer supports a wide range of neural network architectures, from simple feed-forward networks to more complex structures such as recurrent and convolutional neural networks.

By enabling dynamic graph construction, Chainer makes it easier to experiment with different network designs, debug issues and implement advanced models tailored to specific tasks. This flexibility is particularly valuable in research and development, where rapid prototyping and iteration are essential.
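
Because the graph is defined by the code that runs, ordinary Python control flow can change the network's structure from one call to the next. Below is a minimal sketch of this idea; DynamicNN and its n_steps parameter are hypothetical illustrations, not part of Chainer's API −

import chainer.functions as F
import chainer.links as L
from chainer import Chain
import numpy as np

class DynamicNN(Chain):
   def __init__(self):
      super(DynamicNN, self).__init__()
      with self.init_scope():
         self.l_in = L.Linear(None, 10)   # input size inferred on first call
         self.l_rec = L.Linear(10, 10)    # reused a variable number of times

   def forward(self, x, n_steps):
      h = F.relu(self.l_in(x))
      # A plain Python loop: the computational graph is built step by
      # step as the loop executes, so the depth can differ per call.
      for _ in range(n_steps):
         h = F.relu(self.l_rec(h))
      return h

net = DynamicNN()
x = np.random.rand(4, 5).astype(np.float32)
print(net.forward(x, n_steps=2).shape)  # (4, 10)
print(net.forward(x, n_steps=5).shape)  # (4, 10), built from a deeper graph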

Steps to Creating a Neural Network in Chainer

Let's walk through all the steps to build, train and test a simple feedforward neural network using Chainer. These steps highlight the flexibility and simplicity of Chainer's Define-by-Run approach, which makes it easy to experiment with different network architectures and training methods.

Install Chainer

Before we start creating the neural network with Chainer, we should make sure that Chainer is installed in our working environment. We can install it using pip with the following command −

pip install chainer
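
To confirm the installation succeeded, we can print the installed version (the exact version string will vary with your environment) −

import chainer
print(chainer.__version__)
chainer.print_runtime_info()  # also reports the NumPy/CuPy setup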

Import Required Libraries

After installing Chainer in our working environment, we need to import all the necessary components from Chainer, such as Chain, Variable, optimizers, and the functions module used for activations and loss computation.

import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain, optimizers, Variable
import numpy as np

Define the Neural Network

In this step, we will define a simple neural network with two hidden layers. Each hidden layer uses the ReLU activation function. Since this is a binary classification task, the output layer produces a single raw score (logit) per sample; the sigmoid is applied internally by the loss function during training, and explicitly with F.sigmoid when we want probabilities at prediction time.

Here is the code to define the Neural Network −

class SimpleNN(Chain):
   def __init__(self):
      super(SimpleNN, self).__init__()
      with self.init_scope():
         self.l1 = L.Linear(None, 10)  # Input to hidden layer 1 (input size inferred)
         self.l2 = L.Linear(10, 10)    # Hidden layer 1 to hidden layer 2
         self.l3 = L.Linear(10, 1)     # Hidden layer 2 to output layer

   def forward(self, x):
      h1 = F.relu(self.l1(x))
      h2 = F.relu(self.l2(h1))
      y = self.l3(h2)  # Raw logit; F.sigmoid_cross_entropy applies the sigmoid internally
      return y
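
Because l1 is declared as L.Linear(None, 10), its input size is inferred from the first batch it sees. A quick smoke test (the dummy array here is purely illustrative) confirms the shapes −

net = SimpleNN()
dummy = np.random.rand(3, 5).astype(np.float32)  # 3 samples, 5 features
print(net.forward(dummy).shape)  # (3, 1) - one logit per sample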

Create the Model and Optimizer

Next, we have to instantiate the model and select an optimizer. Here we are using the Adam optimizer. Below is the code −

# Instantiate the model and optimizer
model = SimpleNN()
optimizer = optimizers.Adam()
optimizer.setup(model)
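
Optionally, we can attach hooks to the optimizer, for example for L2 regularization or gradient clipping. The rates below are illustrative assumptions, not recommended values −

# Optional hooks (values chosen arbitrarily for this sketch)
optimizer.add_hook(chainer.optimizer_hooks.WeightDecay(1e-4))
optimizer.add_hook(chainer.optimizer_hooks.GradientClipping(5.0))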

Prepare the Data

For illustration purposes we will create some dummy data. Normally we would load our real dataset here.

# Generate example data
X_train = np.random.rand(100, 5).astype(np.float32)  # 100 samples, 5 features
y_train = np.random.randint(0, 2, size=(100, 1)).astype(np.int32)  # 100 binary labels (integers)
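
As an aside, instead of slicing the arrays manually as in the training loop below, Chainer's dataset utilities can handle batching for us; a minimal sketch −

from chainer.datasets import TupleDataset
from chainer.iterators import SerialIterator
from chainer.dataset import concat_examples

train_data = TupleDataset(X_train, y_train)
train_iter = SerialIterator(train_data, batch_size=10, repeat=False)
for batch in train_iter:
   x_batch, y_batch = concat_examples(batch)  # stack samples back into arrays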

Training the Network

In this step we run a manual training loop in Chainer. It iterates through the dataset in minibatches, performs a forward pass to make predictions, computes the loss, and then updates the model's weights using backpropagation. The loop runs for a specified number of epochs, and the loss of each epoch's last batch is printed to track training progress.

We will train the network using a simple loop. For each epoch, we will perform the following steps −

  • Forward Pass
  • Compute loss
  • Backward pass (gradient computation)
  • Update weights

Below is the code for a simple manual training loop in Chainer −

n_epochs = 10
batch_size = 10

for epoch in range(n_epochs):
   for i in range(0, len(X_train), batch_size):
      x_batch = Variable(X_train[i:i+batch_size])
      y_batch = Variable(y_train[i:i+batch_size])

      # Forward pass
      y_pred = model.forward(x_batch)

      # Debugging: Print shapes and types
      print(f"x_batch shape: {x_batch.shape}, type: {x_batch.dtype}")
      print(f"y_batch shape: {y_batch.shape}, type: {y_batch.dtype}")
      print(f"y_pred shape: {y_pred.shape}, type: {y_pred.dtype}")

      # Ensure y_pred and y_batch have the same shape
      if y_pred.shape != y_batch.shape:
         y_pred = F.reshape(y_pred, y_batch.shape)

      # Compute loss
      loss = F.sigmoid_cross_entropy(y_pred, y_batch)

      # Backward pass and weight update
      model.cleargrads()
      loss.backward()
      optimizer.update()

   print(f'Epoch {epoch+1}, Loss: {loss.array}')

Testing the Model

After training we need to test the model on new data. Here is how we might do that in Chainer −

# Test the model
X_test = np.random.rand(10, 5).astype(np.float32)  # 10 samples, 5 features
y_test = F.sigmoid(model.forward(Variable(X_test)))  # Convert logits to probabilities
print("Predictions:", y_test.data)

Saving and Loading the Model

We can save the trained model to a file and load it later for inference, as shown below −

# Save the model
chainer.serializers.save_npz('simple_nn.model', model)

# Load the model
chainer.serializers.load_npz('simple_nn.model', model)
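
Note that load_npz fills the parameters of an already-constructed model, so in a fresh session we would first build an instance with the same architecture; a brief sketch −

# In a new session: rebuild the architecture, then restore the weights
restored = SimpleNN()
chainer.serializers.load_npz('simple_nn.model', restored)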

Now let's combine all the above steps into a single script and see the result of the neural network created in Chainer −

import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain, optimizers, Variable
import numpy as np

class SimpleNN(Chain):
   def __init__(self):
      super(SimpleNN, self).__init__()
      with self.init_scope():
         self.l1 = L.Linear(None, 10)  # Input to hidden layer 1
         self.l2 = L.Linear(10, 10)   # Hidden layer 1 to hidden layer 2
         self.l3 = L.Linear(10, 1)    # Hidden layer 2 to output layer

   def forward(self, x):
      h1 = F.relu(self.l1(x))
      h2 = F.relu(self.l2(h1))
      y = self.l3(h2)  # Raw logit; F.sigmoid_cross_entropy applies the sigmoid internally
      return y

# Instantiate the model and optimizer
model = SimpleNN()
optimizer = optimizers.Adam()
optimizer.setup(model)

# Generate example data
X_train = np.random.rand(100, 5).astype(np.float32)  # 100 samples, 5 features
y_train = np.random.randint(0, 2, size=(100, 1)).astype(np.int32)  # 100 binary labels (integers)

n_epochs = 10
batch_size = 10

for epoch in range(n_epochs):
   for i in range(0, len(X_train), batch_size):
      x_batch = Variable(X_train[i:i+batch_size])
      y_batch = Variable(y_train[i:i+batch_size])

      # Forward pass
      y_pred = model.forward(x_batch)

      # Debugging: Print shapes and types
      print(f"x_batch shape: {x_batch.shape}, type: {x_batch.dtype}")
      print(f"y_batch shape: {y_batch.shape}, type: {y_batch.dtype}")
      print(f"y_pred shape: {y_pred.shape}, type: {y_pred.dtype}")

      # Ensure y_pred and y_batch have the same shape
      if y_pred.shape != y_batch.shape:
         y_pred = F.reshape(y_pred, y_batch.shape)

      # Compute loss
      loss = F.sigmoid_cross_entropy(y_pred, y_batch)

      # Backward pass and weight update
      model.cleargrads()
      loss.backward()
      optimizer.update()

   print(f'Epoch {epoch+1}, Loss: {loss.array}')

# Test the model
X_test = np.random.rand(10, 5).astype(np.float32)  # 10 samples, 5 features
y_test = F.sigmoid(model.forward(Variable(X_test)))  # Convert logits to probabilities
print("Predictions:", y_test.data)

# Save the model
chainer.serializers.save_npz('simple_nn.model', model)

# Load the model
chainer.serializers.load_npz('simple_nn.model', model)

Following is the output of the simple neural network created with the Chainer framework −

x_batch shape: (10, 5), type: float32
y_batch shape: (10, 1), type: int32
y_pred shape: (10, 1), type: float32
x_batch shape: (10, 5), type: float32
y_batch shape: (10, 1), type: int32
y_pred shape: (10, 1), type: float32
x_batch shape: (10, 5), type: float32
y_batch shape: (10, 1), type: int32
y_pred shape: (10, 1), type: float32
x_batch shape: (10, 5), type: float32
y_batch shape: (10, 1), type: int32
------------------------------------
------------------------------------
------------------------------------
y_pred shape: (10, 1), type: float32
Epoch 10, Loss: 0.6381329298019409
Predictions: [[0.380848  ]
 [0.40808532]
 [0.35226226]
 [0.42560062]
 [0.3757095 ]
 [0.35753834]
 [0.38465175]
 [0.35967904]
 [0.37653774]
 [0.4149222 ]]

In this chapter we demonstrated the basic workflow for creating and training a neural network using Chainer. We can experiment with different architectures, optimizers and hyperparameters to see how they affect performance.
