What Are Self-Organizing Maps (Kohonen Maps)?


Introduction

The self-organizing map (SOM) was first proposed by Teuvo Kohonen. It is an unsupervised neural network, trained with unsupervised learning methods, that produces a low-dimensional, discretized representation of the input space of the training samples; it is therefore also a way to reduce the dimensionality of data. This representation is commonly called a map.

This article is a beginner's guide to the Kohonen Map, a well-known self-organizing map. To begin, let's define what self-organizing maps are.

Self-Organizing Maps

Self-organizing maps, also known as Kohonen maps or SOMs after their inventor Teuvo Kohonen, are a type of artificial neural network inspired by biological models of neural systems from the 1970s. A SOM trains its network with a competitive learning algorithm using an unsupervised learning approach. It is used for mapping and clustering (or dimensionality reduction), projecting multidimensional data onto a lower-dimensional space so that complex data sets become easier to comprehend, visualize, and understand. The SOM is made up of two layers: the input layer and the output layer.

The basic structure of a SOM is a two-dimensional grid of nodes, where each node represents a point in the lower-dimensional space. Data points are mapped to nodes in the grid, with similar data points mapped to nearby nodes. The SOM algorithm uses a competitive learning process in which the nodes compete to be the best match for a given data point. This competition causes the winning node and its neighbors to adjust their weights, and over time the nodes self-organize to form a map of the data.
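The competition described above can be sketched in a few lines of NumPy. This is a minimal, illustrative example (the 5x5 grid size, the random weights, and the function name `best_matching_unit` are assumptions, not part of any standard API): each node on the grid holds a weight vector, and the node whose weights lie closest to a data point wins.

```python
import numpy as np

# Hypothetical 5x5 SOM grid: each node holds a weight vector with
# the same dimensionality as the input data (3 features here).
rng = np.random.default_rng(0)
grid_rows, grid_cols, n_features = 5, 5, 3
weights = rng.random((grid_rows, grid_cols, n_features))

def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is
    closest to the data point x (Euclidean distance)."""
    dists = np.linalg.norm(weights - x, axis=2)  # distance at every node
    return np.unravel_index(np.argmin(dists), dists.shape)

x = np.array([0.2, 0.7, 0.4])
bmu = best_matching_unit(weights, x)
print(bmu)  # grid coordinates of the winning node
```

In a full SOM, this winning node and its grid neighbors would then pull their weights toward `x`, which is what gradually organizes the map.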

One of the key advantages of SOMs is their ability to preserve the topological structure of the data. This means that similar data points will be mapped to nearby nodes, while dissimilar data points will be mapped to distant nodes. This makes SOMs well suited for visualizing data, as the resulting map will be easy to interpret. SOMs are also useful for dimensionality reduction, as they can be used to map high-dimensional data to a lower-dimensional space.

SOMs are also used for clustering, as the nodes in the grid can be grouped together based on their similarity to the data points. This allows for the discovery of patterns and structure in the data that may not be immediately apparent. SOMs can also be used for anomaly detection, as data points that are dissimilar to the rest of the data will be mapped to distant nodes.

SOMs have a wide range of applications, including image processing, natural language processing, and bioinformatics. In image processing, SOMs can be used to classify images based on their features. In natural language processing, SOMs can be used to classify text documents based on their content. In bioinformatics, SOMs can be used to cluster and visualize gene expression data.

There are a few variations of SOMs, such as Growing SOM and Adaptive SOM. Growing SOMs can add or remove nodes from the grid as needed, while Adaptive SOMs can adjust the size of the grid to better match the data.

SOMs have a few limitations as well, such as the need for many data points to produce accurate results and the difficulty of updating the map once it has been trained. SOMs also require significant computational resources and can be sensitive to the initial conditions.

Working of SOM

Imagine an input collection with the dimensions (m, n), where m denotes the number of features each sample has and n is the total number of training examples. The first step is to initialise a weight matrix of size (m, C), where C is the number of clusters. The algorithm then iterates over the input data; for each training example, the winning vector (the weight vector with the shortest distance from the training example, for instance by Euclidean distance) is updated. The weight update rule is given by −

w_ij(new) = w_ij(old) + alpha(t) * (x_ik − w_ij(old))

Here, i denotes the ith feature of the training example, j the winning vector, alpha(t) the learning rate at time t, and k the kth training example in the input data. After the SOM network has been trained, the learned weights are used to cluster new examples: a new example is assigned to the cluster of its winning vector.
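The update rule above moves the winning weight vector a fraction alpha of the way toward the training example. A minimal sketch, with illustrative names (`update_winner`, `w_j`, `x_k` are not from any standard library):

```python
import numpy as np

# w_ij(new) = w_ij(old) + alpha(t) * (x_ik - w_ij(old)),
# applied here to the whole winning weight vector at once.
def update_winner(w_j, x_k, alpha):
    """Move the winning weight vector w_j a fraction alpha
    toward the training example x_k."""
    return w_j + alpha * (x_k - w_j)

w_j = np.array([0.0, 0.0])
x_k = np.array([1.0, 2.0])
w_new = update_winner(w_j, x_k, alpha=0.5)
print(w_new)  # halfway between w_j and x_k: [0.5 1. ]
```

With alpha = 0.5 the weights move exactly halfway toward the example; in practice alpha decays over time so the map stabilizes.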

Algorithm

  • Step 1 − Initialize each node's weights w_ij to small random values.

  • Step 2 − Randomly choose an input vector x_k.

  • Step 3 − Repeat steps 4 and 5 for each node on the map.

  • Step 4 − Compute the Euclidean distance between the node's weight vector w_ij and the input vector x_k.

  • Step 5 − Keep track of the node that yields the smallest distance.

  • Step 6 − Identify the overall Best Matching Unit (BMU), i.e. the node with the smallest distance among all nodes on the map.

  • Step 7 − Determine the topological neighborhood of the BMU on the Kohonen Map and its radius.

  • Step 8 − Update the weights of the BMU and the nodes in its neighborhood using the update rule above, then repeat from step 2 until the map converges.
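The steps above can be sketched end to end as follows. This is an illustrative implementation, not a reference one: the grid size, the linearly decaying learning rate and radius schedules, and the Gaussian neighborhood function are common choices but are assumptions here, and `train_som` is a name invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

def train_som(data, grid=(8, 8), epochs=50, alpha0=0.5, radius0=3.0):
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))      # Step 1: random init
    # Precompute each node's (row, col) coordinates on the map.
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=2).astype(float)
    for t in range(epochs):
        alpha = alpha0 * (1 - t / epochs)                  # decaying learning rate
        radius = max(radius0 * (1 - t / epochs), 0.5)      # shrinking neighborhood
        for x in data[rng.permutation(len(data))]:         # Step 2: pick inputs
            dists = np.linalg.norm(weights - x, axis=2)    # Steps 3-5: all distances
            bmu = np.unravel_index(np.argmin(dists), dists.shape)  # Step 6: BMU
            # Step 7: Gaussian falloff over grid distance to the BMU.
            grid_d2 = np.sum((coords - np.array(bmu, dtype=float)) ** 2, axis=2)
            h = np.exp(-grid_d2 / (2 * radius ** 2))
            # Step 8: pull the BMU and its neighbors toward x.
            weights += alpha * h[:, :, None] * (x - weights)
    return weights

data = rng.random((100, 3))     # toy data set: 100 samples, 3 features
weights = train_som(data)
```

After training, each data point can be assigned to its BMU, giving both a clustering and a 2-D layout of the data in one pass.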

Application of SOM

Self-Organizing Maps have the benefit of preserving the structure of the training data even when the data are not linear. When applied to high-dimensional data, Principal Component Analysis can simply lose information as the dimensionality is reduced to two. When the data has many dimensions and each dimension carries significant information, self-organizing maps can be a wonderful alternative to PCA for dimensionality reduction. Seismic facies analysis, for example, groups features based on the identification of numerous individual features; by locating feature organisations in the dataset, this method produces organised relational clusters.

Conclusion

In conclusion, Self-Organizing Maps (SOMs) are a powerful tool for unsupervised learning, which can be used to visualize, understand, and extract meaningful information from high-dimensional data. SOMs preserve the topological structure of the data, making them easy to interpret, and are useful for clustering, dimensionality reduction, anomaly detection, and more. As with most machine learning techniques, SOMs have their limitations, but with the right data and implementation, they can be an asset in any data scientist's toolbox.

Updated on: 28-Mar-2023
