What is mining multidimensional association rules from relational databases and data warehouses?

Association rule learning is an unsupervised learning technique that checks for the dependency of one data item on another and maps these dependencies so that they can be exploited profitably. It tries to discover interesting relations or associations among the variables of a dataset, using a set of rules to find those relations in the database.

Association rule learning is an important concept in machine learning, and it is employed in market basket analysis, web usage mining, continuous production, etc. Market basket analysis, for example, is an approach used by many large retailers to find associations between items.

In market basket analysis, customer buying habits are analyzed by finding associations between the different items that customers place in their shopping baskets.

By discovering such associations, retailers can develop marketing strategies based on which items customers frequently purchase together. These associations can lead to increased sales by helping retailers do selective marketing and plan their shelf space.
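The idea of finding items that are frequently purchased together can be sketched in a few lines of Python. This is only a minimal illustration, not a full mining algorithm: the baskets are invented, and the names `pair_counts`, `frequent_pairs`, and `min_support` are assumptions introduced for this example. Support here means the fraction of baskets that contain a given pair.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives every pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(baskets, min_support):
    """Return the pairs whose support (fraction of baskets) is >= min_support."""
    n = len(baskets)
    return {
        pair: count / n
        for pair, count in pair_counts(baskets).items()
        if count / n >= min_support
    }


print(frequent_pairs(baskets, min_support=0.5))
```

With these sample baskets, only the pairs (bread, milk) and (butter, milk) appear in at least half of the baskets, so a retailer might consider placing or promoting those items together.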

A popular application area for multilevel association is market basket analysis, which studies the buying habits of customers by searching for sets of items that are frequently purchased together, with the items described at the levels of a concept hierarchy.

Association rules with two or more dimensions or predicates can be referred to as multidimensional association rules. For example,

age(X, "20...29") ^ occupation(X, "Student") => buys(X, "Laptop")

This rule contains three predicates (age, occupation, and buys), each of which occurs only once in the rule. Such rules are called interdimensional association rules. Rules that contain multiple occurrences of some predicate are called hybrid-dimension association rules.

For example,

age(X, "20...29") ^ buys(X, "Laptop") => buys(X, "Printer")
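The distinction between the two kinds of rules can be made concrete with a small Python sketch. It represents each rule as lists of (predicate, value) conditions, classifies it by counting repeated predicates, and evaluates it against a tiny relational table. The table rows, the function names (`rule_kind`, `holds`, `support_confidence`), and the "single-dimensional" label for rules with one predicate are assumptions made for this illustration. Support and confidence are the usual measures: the fraction of rows satisfying the whole rule, and the fraction of rows satisfying the left-hand side that also satisfy the right-hand side.

```python
from collections import Counter

# A rule is (antecedent, consequent); each side is a list of
# (predicate, value) conditions on the same customer X.
interdimensional = (
    [("age", "20...29"), ("occupation", "Student")],
    [("buys", "Laptop")],
)
hybrid = (
    [("age", "20...29"), ("buys", "Laptop")],
    [("buys", "Printer")],
)

# Hypothetical customer table, invented for illustration only.
# "buys" is multi-valued, so it is stored as a set of items.
rows = [
    {"age": "20...29", "occupation": "Student", "buys": {"Laptop", "Printer"}},
    {"age": "20...29", "occupation": "Student", "buys": {"Laptop"}},
    {"age": "20...29", "occupation": "Engineer", "buys": {"Printer"}},
    {"age": "30...39", "occupation": "Student", "buys": {"Laptop"}},
]


def rule_kind(rule):
    """Classify a rule by how often each predicate occurs in it."""
    antecedent, consequent = rule
    counts = Counter(pred for pred, _ in antecedent + consequent)
    if len(counts) < 2:
        return "single-dimensional"
    if any(c > 1 for c in counts.values()):
        return "hybrid-dimension"
    return "interdimensional"


def holds(row, conditions):
    """True if the row satisfies every (predicate, value) condition."""
    for pred, value in conditions:
        cell = row[pred]
        if isinstance(cell, (set, frozenset)):
            if value not in cell:  # multi-valued attribute
                return False
        elif cell != value:  # single-valued attribute
            return False
    return True


def support_confidence(rows, rule):
    """Return (support, confidence) of the rule over the given rows."""
    antecedent, consequent = rule
    n_antecedent = sum(holds(r, antecedent) for r in rows)
    n_both = sum(holds(r, antecedent + consequent) for r in rows)
    support = n_both / len(rows)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


for rule in (interdimensional, hybrid):
    print(rule_kind(rule), support_confidence(rows, rule))
```

On this sample table, the first rule is classified as interdimensional and the second as hybrid-dimension, because buys appears on both sides of the second rule.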

Database attributes can be categorical or quantitative.

Categorical attributes have a finite number of possible values with no ordering among the values; they are also called nominal attributes.

Quantitative attributes are numeric and have an implicit ordering among values. The three basic approaches to treating quantitative attributes are as follows:

  • In the first approach, quantitative attributes are discretized using predefined concept hierarchies, and this discretization occurs before mining. The discretized numeric attributes, with their range values, can then be treated as categorical attributes.

  • In the second approach, quantitative attributes are discretized into bins based on the distribution of the data. These bins can be further combined during the mining process, so the discretization is dynamic and is established to satisfy some mining criterion, such as maximizing the confidence of the rules mined.

  • In the third approach, quantitative attributes are discretized so as to capture the semantic meaning of such interval data. This dynamic discretization procedure considers the distance between data points.
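The three approaches can be contrasted with a rough Python sketch on a single quantitative attribute. All three functions are simplified stand-ins chosen for illustration and are assumptions of this example: `static_bin` uses fixed ten-year ranges as a stand-in for a predefined concept hierarchy, `equal_frequency_bins` shows only the initial distribution-based binning (without the later merging of bins during mining), and `distance_based_intervals` groups values by the gap between neighbours rather than using a full clustering method. The `ages` list is invented sample data.

```python
# Hypothetical ages, invented for illustration only.
ages = [21, 22, 23, 25, 34, 35, 36, 58, 60]


def static_bin(age, width=10):
    """Approach 1 (sketch): map a value to a predefined fixed-width range."""
    low = (age // width) * width
    return f"{low}...{low + width - 1}"


def equal_frequency_bins(values, n_bins):
    """Approach 2 (sketch): bins holding roughly the same number of values."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        # The first `extra` bins take one additional value each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            bins.append((ordered[start], ordered[end - 1]))
        start = end
    return bins


def distance_based_intervals(values, max_gap):
    """Approach 3 (sketch): start a new interval when neighbours are far apart."""
    ordered = sorted(values)
    if not ordered:
        return []
    intervals = []
    low = prev = ordered[0]
    for v in ordered[1:]:
        if v - prev > max_gap:
            intervals.append((low, prev))
            low = v
        prev = v
    intervals.append((low, prev))
    return intervals


print(static_bin(23))
print(equal_frequency_bins(ages, n_bins=3))
print(distance_based_intervals(ages, max_gap=5))
```

On this sample, equal-frequency binning puts 25 and 35 into the same bin even though they are ten years apart, while the gap-based grouping keeps close values together. This is the kind of difference the third approach is meant to capture.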

Updated on 15-Feb-2022 10:18:15