How can we find subspace clusters from high-dimensional data?
Methods for finding subspace clusters in high-dimensional data fall into three major groups: subspace search techniques, correlation-based clustering techniques, and biclustering techniques.
Subspace Search Technique − A subspace search method searches various subspaces for clusters. Here, a cluster is a subset of objects that are similar to each other in a subspace. The similarity is measured by conventional measures such as distance or density.
For instance, the CLIQUE algorithm is a subspace clustering technique. It enumerates subspaces and the clusters in those subspaces in order of increasing dimensionality, and uses antimonotonicity to prune subspaces in which no cluster can exist. A major challenge that subspace search techniques face is how to search a series of subspaces effectively and efficiently.
There are two general strategies, as follows −
Bottom-up methods begin from low-dimensional subspaces and search higher-dimensional subspaces only when there may be clusters in those higher-dimensional subspaces. Various pruning techniques are used to reduce the number of higher-dimensional subspaces that need to be searched. CLIQUE is an example of a bottom-up approach.
Top-down methods begin from the full space and search smaller and smaller subspaces recursively. Top-down methods are effective only if the locality assumption holds, which requires that the subspace of a cluster can be determined by the local neighborhood.
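The bottom-up strategy can be illustrated with a small grid-based sketch in the spirit of CLIQUE. It is a simplified illustration, not the full algorithm: it only reports dense grid cells per subspace (it does not merge adjacent cells into clusters), and it prunes at the level of whole subspaces. The function name `dense_units`, the parameters `n_bins` and `min_pts`, and the toy data are assumptions made for this example.

```python
from itertools import combinations

def dense_units(points, n_bins=4, min_pts=3, lo=0.0, hi=1.0):
    """Return {subspace (tuple of dims): set of dense grid cells}, searched bottom-up."""
    n_dims = len(points[0])
    width = (hi - lo) / n_bins

    def cell(p, dims):
        # Grid cell of point p, restricted to the given dimensions.
        return tuple(min(int((p[d] - lo) / width), n_bins - 1) for d in dims)

    def dense_cells(dims):
        # Count points per cell and keep cells holding at least min_pts points.
        counts = {}
        for p in points:
            c = cell(p, dims)
            counts[c] = counts.get(c, 0) + 1
        return {c for c, n in counts.items() if n >= min_pts}

    result = {}
    level = []
    # Level 1: every one-dimensional subspace.
    for d in range(n_dims):
        cells = dense_cells((d,))
        if cells:
            result[(d,)] = cells
            level.append((d,))

    k = 2
    while level:
        prev = set(level)
        candidates = set()
        for a in level:
            for b in level:
                merged = tuple(sorted(set(a) | set(b)))
                # Antimonotonicity: a k-dim subspace can hold a dense cell only
                # if all of its (k-1)-dim projections hold one as well.
                if len(merged) == k and all(
                    sub in prev for sub in combinations(merged, k - 1)
                ):
                    candidates.add(merged)
        level = []
        for dims in sorted(candidates):
            cells = dense_cells(dims)
            if cells:
                result[dims] = cells
                level.append(dims)
        k += 1
    return result

# Toy data (hypothetical): four points agree on dimensions 0 and 1 but are
# spread out along dimension 2; three further points are noise.
points = [
    (0.1, 0.1, 0.05), (0.1, 0.1, 0.30), (0.1, 0.1, 0.55), (0.1, 0.1, 0.80),
    (0.9, 0.4, 0.10), (0.4, 0.9, 0.60), (0.6, 0.6, 0.90),
]
found = dense_units(points)
print(found)
```

On this toy data, dimension 2 has no dense cell, so no subspace containing it is ever examined; the only two-dimensional subspace searched is (0, 1).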
Correlation-Based Clustering Methods − While subspace search methods search for clusters with a similarity that is computed using conventional metrics such as distance or density, correlation-based methods can find clusters that are represented by advanced correlation models.
A PCA-based approach first applies PCA (Principal Components Analysis) to derive a set of new, uncorrelated dimensions, and then mines clusters in the new space or its subspaces. Besides PCA, other space transformations can be used, including the Hough transform or fractal dimensions.
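As a rough illustration of the PCA-based idea, the sketch below handles only 2-D data, where the principal axes have a closed form and no library is needed. Two groups of points lie on parallel lines, so they are strongly correlated but hard to separate with a plain distance measure in the original space. After projecting onto the second principal component (a one-dimensional subspace of the new space), the groups become two tight clumps that a simple gap split can separate. The helper names `pca_minor_projection` and `split_at_largest_gap`, the gap-based split, and the data are assumptions for this example, not a method taken from the text.

```python
import math

def pca_minor_projection(points):
    """Project 2-D points onto the second principal component (least variance)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form angle of the first principal axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    # Unit vector orthogonal to the first axis: the second principal component.
    ux, uy = -math.sin(theta), math.cos(theta)
    return [(p[0] - mx) * ux + (p[1] - my) * uy for p in points]

def split_at_largest_gap(values):
    """Label 1-D values 0 or 1 by cutting at the largest gap (illustrative only)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [(values[order[i + 1]] - values[order[i]], i)
            for i in range(len(order) - 1)]
    _, cut = max(gaps)
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = 0 if rank <= cut else 1
    return labels

# Hypothetical data: two parallel lines, y = x and y = x + 5.
line_a = [(float(x), float(x)) for x in range(20)]
line_b = [(float(x), float(x) + 5.0) for x in range(20)]

projected = pca_minor_projection(line_a + line_b)
labels = split_at_largest_gap(projected)
print(labels)
```

For higher-dimensional data, the same steps would normally rely on a linear algebra library to compute the eigenvectors of the covariance matrix.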
Biclustering Methods − In some applications, it is necessary to cluster both objects and attributes at the same time. The resulting clusters are called biclusters and meet four requirements, as follows −
Only a small set of objects participates in a cluster.
A cluster involves only a small number of attributes.
An object may participate in several clusters, or in no cluster at all.
An attribute may be involved in several clusters, or in no cluster at all.
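To make the notion of a bicluster concrete, the sketch below scores a candidate bicluster, given as a subset of rows (objects) and a subset of columns (attributes), using the mean squared residue. This measure comes from the biclustering literature (Cheng and Church) and is not mentioned in the text above; it is used here only as one possible coherence score. A value near zero means the selected rows follow the same pattern across the selected columns up to a constant shift. The function name and the toy matrix are assumptions for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols."""
    rows, cols = list(rows), list(cols)
    # Mean of each selected row over the selected columns, and vice versa.
    row_mean = {i: sum(matrix[i][j] for j in cols) / len(cols) for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / len(rows) for j in cols}
    all_mean = sum(row_mean.values()) / len(rows)
    total = sum(
        (matrix[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in rows
        for j in cols
    )
    return total / (len(rows) * len(cols))

# Hypothetical object-by-attribute matrix (for example, genes by conditions).
data = [
    [1, 2, 9, 4],
    [3, 4, 1, 6],
    [7, 0, 5, 2],
    [5, 6, 8, 8],
]

# Rows 0, 1, 3 restricted to columns 0, 1, 3 differ only by a constant shift.
coherent = mean_squared_residue(data, [0, 1, 3], [0, 1, 3])
whole = mean_squared_residue(data, range(4), range(4))
print(coherent, whole)
```

Here the candidate bicluster uses only some of the objects and some of the attributes, in line with the first two requirements; row 2 and column 2 take part in no bicluster.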
Biclustering techniques were first proposed to address the need to analyze gene expression data. A gene is a unit of heredity, passing traits from a living organism to its offspring. Generally, a gene consists of a segment of DNA.
Genes are critical for all living things because they specify proteins and functional RNA chains. They hold the information needed to build and maintain a living organism’s cells and pass genetic traits to offspring.
A genotype is the genetic makeup of a cell, an organism, or an individual. Phenotypes are the observable characteristics of an organism. Gene expression is the most fundamental level in genetics, in that it is where genotypes give rise to phenotypes.