What is the architecture of data mining?

Data mining is the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques. It is the analysis of observational datasets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner.

It is the process of selecting, exploring, and modeling large quantities of data to find regularities or relations that are initially unknown, in order to obtain clear and useful results for the owner of the database. Put another way, data mining is the exploration and analysis, by automatic or semi-automatic means, of large quantities of data to discover meaningful patterns and rules.

Data mining is an important method in which previously unknown and potentially useful information is extracted from a large amount of data. The data mining process involves several components, and together these components make up the architecture of a data mining system. The major components are as follows −

  • Information repository − This is one or a set of databases, data warehouses, spreadsheets, or other kinds of data repositories. Data cleaning and data integration techniques can be performed on the data.

  • Database or data warehouse server − The database or data warehouse server is responsible for fetching the relevant data, based on the user’s data mining request.

  • Knowledge base − This is the domain knowledge that is used to guide the search or to evaluate the interestingness of the resulting patterns.

  • Data mining engine − This is essential to the data mining system. It consists of a set of functional modules for tasks such as characterization, association and correlation analysis, classification, prediction, cluster analysis, outlier analysis, and evolution analysis.

  • Pattern evaluation module − This component generally employs interestingness measures and interacts with the data mining modules to focus the search toward interesting patterns. It can use an interestingness threshold to filter out discovered patterns.

Alternatively, the pattern evaluation module can be integrated with the mining module, depending on the implementation of the data mining technique used. For efficient data mining, it is recommended to push the evaluation of pattern interestingness as deep as possible into the mining process, so that the search is confined to only the interesting patterns.

  • User interface − This module connects users and the data mining system, enabling the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results.

Furthermore, this component allows the user to browse database and data warehouse schemas or data structures, evaluate mined patterns, and visualize the patterns in different forms.
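
To make the division of responsibilities concrete, the components listed above can be sketched as a handful of small Python functions. This is only an illustrative sketch, not a reference implementation: the toy transactions, the names REPOSITORY, KNOWLEDGE_BASE, fetch_relevant, mine_pairs, evaluate, and run_query, and the choice of pair support as the interestingness measure are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Information repository: hypothetical toy transactions standing in for a database.
REPOSITORY = [
    {"region": "north", "items": ["bread", "milk"]},
    {"region": "north", "items": ["bread", "butter", "milk"]},
    {"region": "south", "items": ["bread", "milk"]},
    {"region": "north", "items": ["butter", "jam"]},
]

# Knowledge base: domain knowledge, reduced here to one interestingness threshold (assumed value).
KNOWLEDGE_BASE = {"min_support": 0.5}


def fetch_relevant(repository, region):
    """Database server: fetch only the data relevant to the user's request."""
    return [row["items"] for row in repository if row["region"] == region]


def mine_pairs(transactions):
    """Data mining engine: count co-occurring item pairs (a very simple association analysis)."""
    counts = Counter()
    for items in transactions:
        # Sort and deduplicate so each pair has one canonical form per transaction.
        counts.update(combinations(sorted(set(items)), 2))
    return counts


def evaluate(counts, n_transactions, min_support):
    """Pattern evaluation module: keep pairs whose support meets the threshold."""
    supports = {pair: count / n_transactions for pair, count in counts.items()}
    return {pair: s for pair, s in supports.items() if s >= min_support}


def run_query(region):
    """User interface: a single entry point that takes the user's mining request."""
    transactions = fetch_relevant(REPOSITORY, region)
    if not transactions:
        return {}
    counts = mine_pairs(transactions)
    return evaluate(counts, len(transactions), KNOWLEDGE_BASE["min_support"])


if __name__ == "__main__":
    for pair, support in run_query("north").items():
        print(pair, round(support, 2))
```

In this sketch the threshold is applied only after all pairs have been counted. A system that follows the recommendation to push interestingness evaluation into the mining process would instead use the threshold while generating candidates, so that uninteresting patterns are never enumerated in the first place.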