What are Radial Basis Function Networks?

A popular type of feed-forward network is the radial basis function (RBF) network. It has two layers, not counting the input layer, and differs from a multilayer perceptron in how the hidden units perform their computations.

Each hidden unit essentially represents a particular point in input space, and its output, or activation, for a given instance depends on the distance between its point and the instance, which is itself just another point. The closer these two points, the stronger the activation.

This is implemented by using a nonlinear transformation to convert the distance into a similarity measure. A bell-shaped Gaussian activation function, whose width may differ for each hidden unit, is commonly used for this purpose. The hidden units are called RBFs because the points in instance space for which a given hidden unit produces the same activation form a hypersphere or hyperellipsoid.
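As a minimal sketch of this idea, the Gaussian activation below maps the squared distance between an instance and a hidden unit's center to a similarity in (0, 1]; the function name and parameters are illustrative, not from the original text:

```python
import numpy as np

def rbf_activation(x, center, width):
    """Gaussian RBF: activation decays smoothly with the distance
    between the instance x and the hidden unit's center."""
    sq_dist = np.sum((x - center) ** 2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

c = np.array([0.0, 0.0])
# An instance at the center activates fully; a distant one barely at all.
print(rbf_activation(np.array([0.0, 0.0]), c, 1.0))  # 1.0
print(rbf_activation(np.array([3.0, 3.0]), c, 1.0))  # close to 0
```

Increasing `width` flattens the bell, so instances farther from the center still receive a substantial activation.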

The output layer of an RBF network is the same as that of a multilayer perceptron: it takes a linear combination of the outputs of the hidden units and, in classification problems, passes it through the sigmoid function.
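Putting the two layers together, a full forward pass might look like the following sketch (the function names, and the choice of a single shared bias term, are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rbf_forward(X, centers, widths, weights, bias):
    """Hidden layer: Gaussian activations of each instance w.r.t. each
    center; output layer: linear combination passed through the sigmoid."""
    # Squared distance from every instance to every center, shape (n, k).
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    H = np.exp(-sq / (2.0 * widths ** 2))   # hidden activations, (n, k)
    return sigmoid(H @ weights + bias)       # class probabilities, (n,)
```

For regression problems the sigmoid would simply be omitted, leaving the raw linear combination as the output.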

The parameters that such a network learns are the centers and widths of the RBFs and the weights used to form the linear combination of the outputs obtained from the hidden layer. A significant advantage over multilayer perceptrons is that the first set of parameters can be determined independently of the second set and still produce accurate classifiers.

One way to determine the first set of parameters is to use clustering. The simple k-means clustering algorithm can be applied, clustering each class independently to obtain k basis functions for each class.
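A minimal sketch of this step, assuming a plain k-means implementation (the helper names `kmeans` and `class_centers` are hypothetical):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: alternate assigning points to their nearest center
    and moving each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None] - centers[None]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def class_centers(X, y, k):
    """Cluster each class separately: k basis functions per class."""
    return np.vstack([kmeans(X[y == c], k) for c in np.unique(y)])
```

The resulting cluster centers become the RBF centers; widths can then be set, for example, from the spread of each cluster.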

The second set of parameters is then learned with the first set held fixed. This amounts to learning a simple linear classifier using a technique such as linear or logistic regression. If there are far fewer hidden units than training instances, this can be done very quickly.
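With the centers and widths frozen, the hidden activations are just a fixed feature matrix, so the output weights can be found in closed form. The sketch below uses ordinary least squares for simplicity (logistic regression would be the more conventional choice for classification); the function name and the single shared width are assumptions:

```python
import numpy as np

def fit_output_weights(X, y, centers, width):
    """With centers and widths held fixed, learning the output weights
    is a linear problem, solved here by least squares."""
    # Fixed hidden-layer activations, shape (n, k).
    sq = ((X[:, None] - centers[None]) ** 2).sum(axis=2)
    H = np.exp(-sq / (2.0 * width ** 2))
    H = np.hstack([H, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return w
```

Because `H` has only as many columns as there are hidden units (plus one), solving this system is cheap whenever the hidden layer is much smaller than the training set, which is exactly the situation described above.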

A drawback of RBF networks is that they give every attribute the same weight, because all attributes are treated equally in the distance computation, unless attribute weight parameters are included in the overall optimization process.

Hence, they cannot deal effectively with irrelevant attributes, unlike multilayer perceptrons. Support vector machines share the same problem. In fact, support vector machines with Gaussian kernels (i.e., "RBF kernels") are a particular kind of RBF network, in which one basis function is centered on each training instance, all basis functions have the same width, and the outputs are combined linearly by computing the maximum-margin hyperplane. As a result, only some of the RBFs have a nonzero weight: the ones that represent the support vectors.