Database Articles


What is Conceptual Clustering?

Ginni
Updated on 24-Nov-2021 3K+ Views

Conceptual clustering is a form of clustering in machine learning that, given a set of unlabeled objects, produces a classification scheme over the objects. Unlike conventional clustering, which primarily identifies groups of like objects, conceptual clustering goes one step further by also finding a characteristic description for each group, where each group represents a concept or class. Conceptual clustering is therefore a two-step process − clustering is performed first, followed by characterization. As a result, clustering quality is not solely a function of the individual objects. Most conceptual clustering methods adopt a statistical approach that uses probability measurements to determine the concepts or clusters. Probabilistic ...


What are the types of Constraint-Based Cluster Analysis?

Ginni
Updated on 24-Nov-2021 4K+ Views

Constraint-based clustering finds clusters that satisfy user-specified preferences or constraints. Depending on the nature of the constraints, constraint-based clustering may adopt rather different approaches. There are several categories of constraints, which are as follows −

Constraints on individual objects − Constraints can be specified on the objects to be clustered. In a real estate application, for instance, one may want to spatially cluster only those luxury mansions worth over a million dollars. This constraint confines the set of objects to be clustered. It can easily be handled by preprocessing (e.g., performing selection using an SQL query), after which ...
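
The preprocessing step described above can be sketched as a selection query that runs before any clustering takes place. This is only an illustration: the table name `houses`, its columns, and the sample rows are hypothetical and were chosen to mirror the luxury-mansion example.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (id INTEGER, x REAL, y REAL, price REAL)")
conn.executemany(
    "INSERT INTO houses VALUES (?, ?, ?, ?)",
    [
        (1, 0.5, 1.0, 2_500_000),
        (2, 0.6, 1.1, 350_000),
        (3, 4.0, 4.2, 1_200_000),
        (4, 4.1, 3.9, 980_000),
    ],
)

# Constraint on individual objects: keep only homes worth over a million dollars.
rows = conn.execute(
    "SELECT id, x, y FROM houses WHERE price > 1000000 ORDER BY id"
).fetchall()
conn.close()

# Only these locations would be passed on to the clustering step.
selected_ids = [row[0] for row in rows]
points = [(x, y) for _, x, y in rows]
print(selected_ids, points)
```

Because the constraint only narrows the input, the clustering algorithm itself needs no modification in this case.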


What is Expectation-Maximization?

Ginni
Updated on 24-Nov-2021 993 Views

The EM (Expectation-Maximization) algorithm is a popular iterative refinement algorithm that can be used to find parameter estimates. It can be viewed as an extension of the k-means paradigm, which assigns an object to the cluster with which it is most similar, based on the cluster mean. Instead, EM assigns each object to a cluster according to a weight representing the probability of membership. In other words, there are no strict boundaries between clusters. New means are therefore computed based on weighted measures. EM begins with an initial estimate or “guess” of the parameters of the mixture model (collectively defined as the parameter ...
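
The soft assignment and weighted mean update described above can be sketched for a one-dimensional mixture. This is a minimal illustration under simplifying assumptions that are not in the excerpt: Gaussian components, a fixed shared standard deviation `sigma`, equal mixing proportions, and a fixed number of iterations. The function name `em_means` and the sample data are hypothetical.

```python
import math

def em_means(data, means, sigma=1.0, iterations=20):
    """Soft-assignment EM for a 1-D Gaussian mixture; only the means are updated."""
    means = list(means)
    weights = []
    for _ in range(iterations):
        # E-step: weight of each object in each cluster (probability of membership).
        weights = []
        for x in data:
            dens = [math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) for m in means]
            total = sum(dens)
            weights.append([d / total for d in dens])
        # M-step: each new mean is a weighted average of all objects.
        for k in range(len(means)):
            wk = sum(w[k] for w in weights)
            means[k] = sum(w[k] * x for w, x in zip(weights, data)) / wk
    return means, weights

data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
means, weights = em_means(data, means=[0.0, 6.0])
print(means)
```

Every object contributes to every mean, but in proportion to its membership weight, which is the difference from the hard assignment used by k-means.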


Why is wavelet transformation useful for clustering?

Ginni
Updated on 24-Nov-2021 1K+ Views

WaveCluster is a multiresolution clustering algorithm that first summarizes the data by imposing a multidimensional grid structure onto the data space. It then uses a wavelet transformation to transform the original feature space, finding dense regions in the transformed space. In this method, each grid cell summarizes the information of a group of points that map into the cell. This summary information generally fits into main memory for use by the multiresolution wavelet transform and the subsequent cluster analysis. A wavelet transform is a signal processing technique that decomposes a signal into multiple frequency subbands. The wavelet model can be used ...
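
The decomposition into frequency subbands can be illustrated with a single level of a Haar transform applied to a row of grid cell counts. This is a one-dimensional sketch, not WaveCluster itself: the choice of the Haar wavelet, the unnormalized averaging form, the sample counts, and the density threshold of 5 are all assumptions made for illustration.

```python
def haar_step(signal):
    """One level of an unnormalized Haar transform on an even-length sequence."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / 2 for a, b in pairs]  # low-frequency subband (smoothed)
    detail = [(a - b) / 2 for a, b in pairs]  # high-frequency subband (local change)
    return approx, detail

# Hypothetical number of points falling into each of eight grid cells.
cell_counts = [0, 1, 9, 11, 10, 8, 1, 0]
approx, detail = haar_step(cell_counts)

# Dense regions are looked for in the smoothed (transformed) space.
dense = [i for i, value in enumerate(approx) if value >= 5]
print(approx, detail, dense)
```

Applying `haar_step` again to `approx` would give a coarser resolution, which is where the multiresolution behavior comes from.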


What is Grid Based Methods?

Ginni
Updated on 24-Nov-2021 21K+ Views

Grid-based clustering methods use a multiresolution grid data structure. They quantize the object space into a finite number of cells that form a grid structure on which all of the clustering operations are performed. The main benefit of the approach is its fast processing time, which is generally independent of the number of data objects and depends only on the number of cells in each dimension of the quantized space. Examples of the grid-based approach include STING, which explores statistical information stored in the grid cells, WaveCluster, which clusters objects using a wavelet transform method, and CLIQUE, which defines a ...
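
The quantization step can be sketched as follows. The sketch maps two-dimensional points to integer cell indices and counts the points per cell; the cell size, the density threshold of 2, and the sample points are hypothetical, and the sketch does not correspond to any one of the named algorithms.

```python
from collections import Counter

def quantize(points, cell_size):
    """Count how many 2-D points fall into each grid cell of the given size."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)

points = [(0.2, 0.3), (0.7, 0.9), (0.4, 0.1), (3.1, 3.6), (3.8, 3.2), (9.5, 0.5)]
cells = quantize(points, cell_size=1.0)

# Later steps operate on the cells, not on the individual points.
dense_cells = sorted(cell for cell, count in cells.items() if count >= 2)
print(cells, dense_cells)
```

After the single pass that fills `cells`, the remaining work depends on the number of occupied cells rather than on the number of points.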


What is a Chameleon?

Ginni
Updated on 24-Nov-2021 6K+ Views

Chameleon is a hierarchical clustering algorithm that uses dynamic modeling to determine the similarity between pairs of clusters. It was developed in response to the observed weaknesses of two hierarchical clustering algorithms, ROCK and CURE. ROCK and related schemes emphasize cluster interconnectivity while ignoring information about cluster proximity. CURE and related schemes consider cluster proximity yet ignore cluster interconnectivity. In Chameleon, cluster similarity is assessed based on how well connected objects are within a cluster and on the proximity of clusters. Specifically, two clusters are merged if their interconnectivity is high and they are close together. It does not depend on a ...


How efficient is the k-medoids algorithm on large data sets?

Ginni
Updated on 24-Nov-2021 593 Views

A classic k-medoids partitioning algorithm such as PAM works efficiently for small data sets but does not scale well to large data sets. To deal with larger data sets, a sampling-based method known as CLARA (Clustering Large Applications) can be used. The idea behind CLARA is as follows: if the sample is selected in a fairly random manner, it should closely represent the original data set. The representative objects (medoids) chosen will then be similar to those that would have been selected from the entire data set. CLARA draws several samples of the data set, applies PAM on each sample, and returns ...
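
The sampling idea can be sketched in a few lines. Several parts of this sketch are assumptions rather than details from the excerpt: a brute-force search over the sample stands in for PAM, the data are one-dimensional, and the medoid set kept at the end is the one with the lowest total distance over the entire data set. The function names and sample values are hypothetical.

```python
import random
from itertools import combinations

def total_cost(data, medoids):
    """Sum of distances from each object to its nearest medoid."""
    return sum(min(abs(x - m) for m in medoids) for x in data)

def best_medoids(sample, k):
    """Brute-force stand-in for PAM: try every k-subset of the sample."""
    return min(combinations(sample, k), key=lambda ms: total_cost(sample, ms))

def clara(data, k, sample_size, n_samples, seed=0):
    """Draw several samples, find medoids on each, keep the best overall set."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        sample = rng.sample(data, sample_size)
        medoids = best_medoids(sample, k)
        cost = total_cost(data, medoids)  # scored against the entire data set
        if best is None or cost < best[0]:
            best = (cost, medoids)
    return best

data = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2, 9.0, 9.1]
cost, medoids = clara(data, k=2, sample_size=5, n_samples=4)
print(cost, medoids)
```

The expensive medoid search only ever sees `sample_size` objects, while the full data set is touched only by the much cheaper cost evaluation.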


What are the requirements of clustering in data mining?

Ginni
Updated on 24-Nov-2021 9K+ Views

The requirements of clustering in data mining are as follows −

Scalability − Some clustering algorithms work well on small data sets containing fewer than several hundred data objects. A large database, however, can contain millions of objects. Clustering on a sample of a given large data set can lead to biased results. Highly scalable clustering algorithms are required.

Ability to deal with different types of attributes − Some algorithms are designed to cluster interval-based (numerical) data. However, applications can require clustering other types of data, including binary, categorical (nominal), and ordinal data, or a combination of these data ...


How can we further improve the efficiency of Apriori-based mining?

Ginni
Updated on 24-Nov-2021 14K+ Views

Several variations of the Apriori algorithm have been proposed that aim to improve the efficiency of the original algorithm. They are as follows −

The hash-based technique (hashing itemsets into corresponding buckets) − A hash-based technique can be used to reduce the size of the candidate k-itemsets, Ck, for k > 1. For instance, when scanning each transaction in the database to generate the frequent 1-itemsets, L1, from the candidate 1-itemsets in C1, we can generate the 2-itemsets for each transaction, hash (i.e., map) them into the different buckets of a hash table structure, and increase the corresponding bucket ...
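
The bucket-counting scan described above can be sketched as follows. The hash function, the choice of 7 buckets, the integer item IDs, the sample transactions, and the minimum support of 3 are all assumptions for illustration. The final check, which rules out any 2-itemset whose bucket count is below the minimum support, is the usual reason for keeping such counts and is likewise an assumption here, because the excerpt ends before describing it.

```python
from collections import Counter
from itertools import combinations

def bucket(x, y, n_buckets):
    """Hypothetical hash function for a 2-itemset of integer item IDs (x < y)."""
    return (x * 10 + y) % n_buckets

def scan(transactions, n_buckets):
    """Single pass: count 1-itemsets and hash every 2-itemset into a bucket."""
    item_counts = Counter()
    bucket_counts = [0] * n_buckets
    for t in transactions:
        item_counts.update(t)
        for x, y in combinations(sorted(t), 2):
            bucket_counts[bucket(x, y, n_buckets)] += 1
    return item_counts, bucket_counts

transactions = [
    {1, 2, 5}, {2, 4}, {2, 3}, {1, 2, 4}, {1, 3},
    {2, 3}, {1, 3}, {1, 2, 3, 5}, {1, 2, 3},
]
item_counts, bucket_counts = scan(transactions, n_buckets=7)

def may_be_frequent(x, y, min_support=3):
    """A pair in a bucket with a low count cannot reach the minimum support."""
    return bucket_counts[bucket(x, y, len(bucket_counts))] >= min_support

print(bucket_counts, may_be_frequent(1, 2), may_be_frequent(2, 4))
```

A bucket count is an upper bound on the support of every itemset hashed into it, so a pair that passes this check still has to be counted exactly before it can be declared frequent.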


What are the Web-based tools in OLAP?

Ginni
Updated on 24-Nov-2021 2K+ Views

The web-based tools in OLAP are as follows −

Arbor Essbase Web − This tool provides features such as drilling up, down, and across; slice and dice; and powerful reporting, all for OLAP. It also provides data entry, including full multi-user concurrent write capabilities. Arbor Essbase is only a server product; no client package exists, which protects the market for its own desktop client version. The Web product does not replace the administrative and development modules; it replaces only user access for queries and updates.

Information Advantage Web OLAP − This product uses a server-centric messaging architecture, composed of a powerful analytic engine ...
