# What are the methodologies of data stream clustering?

Data stream clustering is the clustering of data that arrive continuously, such as telephone records, multimedia data, and financial transactions. It is generally treated as a streaming algorithm: given a sequence of points, the objective is to construct a good clustering of the stream using a small amount of memory and time.

Some applications require automated clustering of such data into groups based on their similarities. Examples include Web intrusion detection, analysis of Web clickstreams, and stock market analysis.

Although there are several powerful methods for clustering static data sets, clustering data streams places additional constraints on such algorithms. The data stream model of computation requires algorithms to make a single pass over the data, with bounded memory and limited processing time, even though the stream may be highly dynamic and evolve over time.

The main methodologies for data stream clustering are as follows −

**Compute and store summaries of past data** − Because of limited memory space and fast response requirements, compute summaries of the previously seen data, store the relevant results, and use these summaries to compute important statistics when needed.
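As a rough illustration of such a summary, the sketch below keeps only a count, a per-dimension linear sum, and a per-dimension sum of squares, from which a centroid and a spread can be derived without storing the points. The class name `StreamSummary` and its methods are hypothetical, chosen for this example only.

```python
import math

class StreamSummary:
    """Additive summary of the points seen so far (hypothetical helper)."""

    def __init__(self, dim):
        self.n = 0                       # number of points absorbed
        self.linear_sum = [0.0] * dim    # per-dimension sum of values
        self.squared_sum = [0.0] * dim   # per-dimension sum of squared values

    def add(self, point):
        """Absorb one point in O(dim) time; the point itself is not stored."""
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.squared_sum[i] += x * x

    def centroid(self):
        """Mean of all absorbed points (requires at least one point)."""
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of the absorbed points from the centroid."""
        variance = sum(
            ss / self.n - (ls / self.n) ** 2
            for ls, ss in zip(self.linear_sum, self.squared_sum)
        )
        return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


summary = StreamSummary(dim=2)
for point in [(1.0, 2.0), (3.0, 4.0)]:
    summary.add(point)
print(summary.centroid())  # [2.0, 3.0]
```

Memory use stays constant in the number of points, which is the property the methodology relies on.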

**Apply a divide-and-conquer strategy** − Divide the data stream into chunks based on order of arrival, compute summaries for these chunks, and then merge the summaries. In this way, larger models can be built out of smaller building blocks.
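The chunk-then-merge idea can be sketched as follows for a one-dimensional stream. The summary here is a simple `(count, sum, sum of squares)` tuple, which is an assumption made for brevity; in a clustering setting the per-chunk summary could instead be a set of weighted cluster centers. All function names are hypothetical.

```python
from itertools import islice

def chunks(stream, size):
    """Yield successive lists of at most `size` items, in arrival order."""
    iterator = iter(stream)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def summarize(chunk):
    """Summarize one chunk as (count, sum, sum of squares)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))

def merge(a, b):
    """Merge two summaries; every component is additive."""
    return tuple(x + y for x, y in zip(a, b))

def summarize_stream(stream, chunk_size):
    """Single pass over the stream, holding one chunk in memory at a time."""
    total = (0, 0.0, 0.0)
    for chunk in chunks(stream, chunk_size):
        total = merge(total, summarize(chunk))
    return total


count, total, total_sq = summarize_stream(range(1, 11), chunk_size=3)
print(count, total / count)  # 10 5.5
```

Because the components are additive, merging chunk summaries gives the same result as summarizing the whole stream at once.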

**Incremental clustering of incoming data streams** − Because stream data enter the system continuously and incrementally, the derived clusters must be incrementally refined.
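One common way to refine clusters incrementally is a sequential (online) k-means update, shown below as a minimal sketch. This is one possible technique, not necessarily the one the text has in mind. The class name `OnlineKMeans` is hypothetical, and the sketch assumes the seed centroids are taken from the first points of the stream, so each starts with a count of 1.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class OnlineKMeans:
    """Sequential k-means: each arriving point nudges its nearest centroid."""

    def __init__(self, seeds):
        # Assumption: seeds are real points from the stream, counted once each.
        self.centroids = [list(p) for p in seeds]
        self.counts = [1] * len(self.centroids)

    def update(self, point):
        """Assign `point` to its nearest centroid and move that centroid."""
        idx = min(
            range(len(self.centroids)),
            key=lambda i: squared_distance(self.centroids[i], point),
        )
        self.counts[idx] += 1
        rate = 1.0 / self.counts[idx]       # keeps the centroid a running mean
        centroid = self.centroids[idx]
        for d, x in enumerate(point):
            centroid[d] += rate * (x - centroid[d])
        return idx


model = OnlineKMeans(seeds=[[0.0], [10.0]])
for point in ([2.0], [12.0], [1.0]):
    model.update(point)
print(model.centroids)  # [[1.0], [11.0]]
```

Each update touches a single centroid and costs time proportional to the number of centroids, so no past points need to be revisited.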

**Perform microclustering as well as macroclustering analysis** − Stream clusters can be computed in two steps, as follows −

First, compute and store summaries at the microcluster level, where microclusters are formed by applying a hierarchical bottom-up clustering algorithm.

Second, compute macroclusters (for example, by using another clustering algorithm to group the microclusters) at the user-specified level. This two-step computation effectively compresses the data and often results in a smaller margin of error.
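The two steps can be sketched as below. To keep the example short, the microcluster step uses a simple distance-threshold rule rather than the hierarchical bottom-up algorithm mentioned above, and the macrocluster step uses a weighted k-means over the microcluster centroids. Both are stand-ins chosen for illustration, and all names (`MicroCluster`, `build_microclusters`, `macrocluster`, `max_dist`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class MicroCluster:
    """Compact summary of nearby points: a count and a linear sum."""

    def __init__(self, point):
        self.n = 1
        self.linear_sum = list(point)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x

def build_microclusters(stream, max_dist):
    """Step 1: absorb each point into the nearest microcluster, or start a new one."""
    micro = []
    for point in stream:
        best = min(micro, key=lambda m: sq_dist(m.centroid(), point), default=None)
        if best is not None and sq_dist(best.centroid(), point) <= max_dist ** 2:
            best.absorb(point)
        else:
            micro.append(MicroCluster(point))
    return micro

def macrocluster(micro, k, iterations=10):
    """Step 2: weighted k-means over microclusters (assumes len(micro) >= k)."""
    centers = [m.centroid() for m in micro[:k]]   # deterministic seeding
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for m in micro:
            idx = min(range(len(centers)), key=lambda i: sq_dist(centers[i], m.centroid()))
            groups[idx].append(m)
        for i, group in enumerate(groups):
            if group:  # leave a center unchanged if nothing was assigned to it
                total = sum(m.n for m in group)
                dim = len(centers[i])
                centers[i] = [sum(m.linear_sum[d] for m in group) / total for d in range(dim)]
    return centers


stream = [(0.0,), (0.2,), (5.0,), (5.2,), (0.1,), (9.0,), (9.1,)]
micro = build_microclusters(stream, max_dist=0.5)
centers = macrocluster(micro, k=2)
print(len(micro), centers)
```

The macro step only ever sees the microcluster summaries, so its cost depends on the number of microclusters rather than on the length of the stream.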

**Explore multiple time granularities for the analysis of cluster evolution** − Because more recent data often play a different role from older data in stream data analysis, use a tilted time frame model to store snapshots of summarized data at different points in time.
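A tilted time frame can be approximated with a small multi-level snapshot store, sketched below. The scheme used here (a snapshot taken at tick `t` goes to the highest level `i` for which `t` is divisible by `2**i`, and each level keeps only its most recent entries) is one possible layout assumed for illustration. The class name `TiltedSnapshotStore` and its parameters are hypothetical.

```python
from collections import deque

class TiltedSnapshotStore:
    """Keeps recent snapshots densely and older snapshots sparsely."""

    def __init__(self, levels=4, per_level=2):
        # Each level is a bounded queue; appending beyond capacity drops the oldest.
        self.levels = [deque(maxlen=per_level) for _ in range(levels)]

    def add(self, tick, summary):
        """Store `summary` at the highest level whose period divides `tick`."""
        level = 0
        while level + 1 < len(self.levels) and tick % (2 ** (level + 1)) == 0:
            level += 1
        self.levels[level].append((tick, summary))

    def ticks(self):
        """Ticks of all retained snapshots, oldest first."""
        return sorted(t for level in self.levels for t, _ in level)


store = TiltedSnapshotStore(levels=4, per_level=2)
for tick in range(1, 17):
    store.add(tick, summary={"tick": tick})  # placeholder summary object
print(store.ticks())  # [4, 8, 10, 12, 13, 14, 15, 16]
```

After 16 ticks only 8 snapshots remain, spaced one tick apart near the present and four ticks apart further back.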

**Divide stream clustering into on-line and off-line processes** − While data are streaming in, basic summaries of data snapshots should be computed, stored, and incrementally updated. Therefore, an on-line process is needed to maintain such dynamically changing clusters. Meanwhile, a user may pose queries about past, current, or evolving clusters. Such analysis can be performed off-line or as a process independent of on-line cluster maintenance.
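The split between the two processes can be sketched with a one-dimensional example: an on-line component updates a running summary and periodically stores a snapshot, while a separate off-line function answers a query about a past time window using only the stored snapshots. The names `OnlineSummarizer` and `window_mean` are hypothetical, and the summary is reduced to a count and a sum for brevity.

```python
class OnlineSummarizer:
    """On-line side: constant-time update per point, periodic snapshots."""

    def __init__(self, snapshot_every):
        self.snapshot_every = snapshot_every
        self.tick = 0
        self.n = 0
        self.total = 0.0
        self.snapshots = {0: (0, 0.0)}   # tick -> (count, sum) up to that tick

    def observe(self, value):
        self.tick += 1
        self.n += 1
        self.total += value
        if self.tick % self.snapshot_every == 0:
            self.snapshots[self.tick] = (self.n, self.total)

def window_mean(snapshots, start_tick, end_tick):
    """Off-line side: mean over (start_tick, end_tick] by subtracting snapshots.

    Both ticks must be snapshot boundaries with start_tick < end_tick.
    """
    n0, s0 = snapshots[start_tick]
    n1, s1 = snapshots[end_tick]
    return (s1 - s0) / (n1 - n0)


online = OnlineSummarizer(snapshot_every=5)
for value in range(1, 11):
    online.observe(float(value))
print(window_mean(online.snapshots, 5, 10))  # 8.0
```

The query never touches the raw stream, so it can run at any time without slowing down the on-line updates.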
