What is CLIQUE?

CLIQUE was the first algorithm proposed for dimension-growth subspace clustering in high-dimensional space. In dimension-growth subspace clustering, the clustering process starts at single-dimensional subspaces and grows upward to higher-dimensional ones.

Because CLIQUE partitions each dimension like a grid structure and determines whether a cell is dense based on the number of points it contains, it can be viewed as an integration of density-based and grid-based clustering approaches.

The ideas of the CLIQUE clustering algorithm are as follows −

  • Given a large set of multidimensional data points, the data space is usually not uniformly occupied by the data points. CLIQUE’s clustering identifies the sparse and the “crowded” areas in space (or units), thereby discovering the overall distribution patterns of the data set.

  • A unit is dense if the fraction of total data points contained in it exceeds an input model parameter. In CLIQUE, a cluster is defined as a maximal set of connected dense units.

CLIQUE performs multidimensional clustering in two steps. In the first step, CLIQUE partitions the d-dimensional data space into non-overlapping rectangular units and identifies the dense units among them. This is done (in 1-D) for each dimension.
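The 1-D part of this first step can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes equal-width intervals, and the function name `dense_units_1d` and the parameter names `xi` (intervals per dimension) and `tau` (density threshold, as a fraction of all points) are chosen here for illustration only.

```python
from collections import Counter

def dense_units_1d(points, mins, maxs, xi, tau):
    """Return the dense 1-D units as (dimension, interval_index) pairs.

    points     : list of equal-length tuples of floats
    mins, maxs : per-dimension lower and upper bounds of the data space
    xi         : number of equal-width intervals per dimension (assumed name)
    tau        : density threshold as a fraction of all points (assumed name)
    """
    counts = Counter()
    for p in points:
        for dim, (x, lo, hi) in enumerate(zip(p, mins, maxs)):
            # Map the coordinate to an interval index in [0, xi - 1];
            # the min() keeps x == hi inside the last interval.
            idx = min(int((x - lo) / (hi - lo) * xi), xi - 1)
            counts[(dim, idx)] += 1
    n = len(points)
    # A unit is dense if its share of all points exceeds tau.
    return {unit for unit, c in counts.items() if c / n > tau}

points = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.2), (0.9, 0.9), (0.5, 0.4)]
dense = dense_units_1d(points, mins=(0.0, 0.0), maxs=(1.0, 1.0), xi=4, tau=0.4)
print(sorted(dense))  # [(0, 0), (1, 0)]
```

Here three of the five points fall into the first interval of each dimension, so only those two 1-D units exceed the 40% threshold.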

The identification of the candidate search space is based on the Apriori property used in association rule mining. In general, the property uses prior knowledge of items in the search space so that portions of the space can be pruned.

The property, adapted for CLIQUE, is as follows: if a k-dimensional unit is dense, then so are its projections in (k−1)-dimensional space. That is, given a k-dimensional candidate dense unit, if we check its (k−1)-dimensional projection units and find any that are not dense, then we know that the k-dimensional unit cannot be dense either.

Thus, potential or candidate dense units in k-dimensional space can be generated from the dense units found in (k−1)-dimensional space. In general, the resulting space searched is much smaller than the original space. The dense units are then examined to determine the clusters.
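The candidate generation and pruning described above can be sketched as a join step followed by a prune step. This is only an illustration under stated assumptions: each unit is represented as a frozenset of (dimension, interval_index) pairs, and the function name `generate_candidates` is hypothetical. The candidates it returns would still have to be counted against the data to confirm that they are actually dense.

```python
from itertools import combinations

def generate_candidates(dense_prev):
    """Build k-dimensional candidate units from (k-1)-dimensional dense units.

    Each unit is a frozenset of (dimension, interval_index) pairs, and all
    units in dense_prev are assumed to have the same dimensionality.
    """
    dense_prev = set(dense_prev)
    candidates = set()
    for a, b in combinations(dense_prev, 2):
        merged = a | b
        # Join step: the two units must differ in exactly one pair ...
        if len(merged) != len(a) + 1:
            continue
        # ... and the merged unit must not use any dimension twice.
        if len({dim for dim, _ in merged}) != len(merged):
            continue
        # Prune step (Apriori property): every (k-1)-dimensional
        # projection of the candidate must itself be dense.
        if all(merged - {pair} in dense_prev for pair in merged):
            candidates.add(merged)
    return candidates

# Dense 1-D units: two intervals in dimension 0, one in dimension 1.
dense_1d = {frozenset({(0, 1)}), frozenset({(0, 2)}), frozenset({(1, 3)})}
candidates_2d = generate_candidates(dense_1d)

# Two dense 2-D units whose third projection {(1, 3), (2, 0)} is not dense,
# so the 3-D unit they would form is pruned.
dense_2d = {frozenset({(0, 1), (1, 3)}), frozenset({(0, 1), (2, 0)})}
candidates_3d = generate_candidates(dense_2d)
```

In this example `candidates_2d` contains the two units that combine an interval of dimension 0 with the interval of dimension 1, while `candidates_3d` is empty because one projection is missing from the dense set.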

In the second step, CLIQUE generates a minimal description for each cluster as follows. For each cluster, it determines the maximal region that covers the cluster of connected dense units. It then determines a minimal cover (logic description) for each cluster.
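The second step operates on clusters of connected dense units, so those units first have to be grouped. The sketch below shows one way this grouping might be done within a single subspace; it does not attempt the maximal-region or minimal-cover computation. It assumes that each unit is a tuple of interval indices and that two units are connected when they share a face (indices differ by exactly 1 in one dimension and match in all others). The function name `group_connected_units` is hypothetical.

```python
def group_connected_units(dense_units):
    """Group dense units of one subspace into sets of connected units.

    dense_units : iterable of equal-length tuples of interval indices
    Returns a list of sets, one per cluster.
    """
    remaining = set(dense_units)
    clusters = []
    while remaining:
        start = remaining.pop()
        cluster = {start}
        stack = [start]
        # Depth-first search over face-adjacent dense units.
        while stack:
            unit = stack.pop()
            for d in range(len(unit)):
                for step in (-1, 1):
                    neighbor = unit[:d] + (unit[d] + step,) + unit[d + 1:]
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        cluster.add(neighbor)
                        stack.append(neighbor)
        clusters.append(cluster)
    return clusters

dense = {(0, 0), (0, 1), (1, 1), (3, 3)}
clusters = group_connected_units(dense)
# Two clusters: {(0, 0), (0, 1), (1, 1)} and {(3, 3)} (list order may vary).
```

Each returned set is a maximal group of connected dense units, which is the input the cover computation would then describe.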

CLIQUE automatically finds subspaces of the highest dimensionality such that high-density clusters exist in those subspaces. It is insensitive to the order of input objects and does not presume any canonical data distribution. It scales linearly with the size of the input and has good scalability as the number of dimensions in the data increases.