How are association patterns evaluated?

Association analysis algorithms have the potential to generate a huge number of patterns. For instance, even a data set that contains only six items can produce up to hundreds of association rules at certain support and confidence thresholds. Because real commercial databases can be very large and high-dimensional, mining them can easily yield thousands or even millions of patterns, many of which may not be interesting.
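To make the six-item figure concrete, the sketch below counts every possible rule X→Y over d items, where X and Y are non-empty and share no items. It compares a brute-force enumeration with the closed form 3^d − 2^(d+1) + 1. The function names are illustrative, and the count is an upper bound before any support or confidence threshold is applied.

```python
from itertools import product


def count_rules_by_enumeration(d):
    """Count rules X -> Y over d items with X, Y non-empty and disjoint.

    Each item is assigned to the antecedent (0), the consequent (1),
    or left out of the rule (2); a valid rule uses both 0 and 1.
    """
    total = 0
    for assignment in product(range(3), repeat=d):
        if 0 in assignment and 1 in assignment:
            total += 1
    return total


def count_rules_closed_form(d):
    """Closed-form count of possible rules over d items."""
    return 3 ** d - 2 ** (d + 1) + 1


if __name__ == "__main__":
    print(count_rules_by_enumeration(6))  # 602
    print(count_rules_closed_form(6))     # 602
```

The count grows roughly as 3^d, which is why even modest item catalogs can lead to an unmanageable number of candidate rules.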

Sifting through the patterns to identify the most interesting ones is not a trivial task, because one person's trash can be another person's treasure. It is therefore essential to establish a set of well-accepted criteria for evaluating the quality of association patterns.

The first set of criteria can be established through statistical arguments. Patterns that involve a set of mutually independent items, or that cover very few transactions, are considered uninteresting because they may capture spurious relationships in the data.

Such patterns can be eliminated by applying an objective interestingness measure, which uses statistics derived from the data to decide whether a pattern is interesting. Examples of objective interestingness measures include support, confidence, and correlation.
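As a rough illustration of objective measures, the sketch below computes support and confidence for a rule, plus lift, which is used here as a simple stand-in for a correlation-style measure (a lift near 1 suggests the two sides are statistically independent). The transaction list is made up for the example, and the function names are illustrative.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support.

    Assumes the consequent occurs in at least one transaction.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


if __name__ == "__main__":
    rule = ({"Diapers"}, {"Beer"})
    print(support(rule[0] | rule[1], transactions))
    print(confidence(*rule, transactions))
    print(lift(*rule, transactions))
```

On this toy data the rule {Diapers}→{Beer} has support 3/5, confidence 3/4, and a lift of about 1.25, so the two items occur together somewhat more often than independence would predict.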

The second set of criteria can be established through subjective arguments. A pattern is considered subjectively uninteresting unless it reveals unexpected information about the data or provides useful knowledge that can lead to profitable actions.

For instance, the rule {Butter}→{Bread} may not be interesting, despite having high support and confidence values, because the relationship it represents seems rather obvious.

On the other hand, the rule {Diapers}→{Beer} is interesting because the relationship is unexpected and may suggest a new cross-selling opportunity for retailers. Incorporating subjective knowledge into pattern evaluation is a difficult task because it requires a considerable amount of prior information from domain experts.

The following are some approaches for incorporating subjective knowledge into the pattern discovery task −

Visualization − This approach requires a user-friendly environment that keeps the human user in the loop. It also allows domain experts to interact with the data mining system by interpreting and verifying the discovered patterns.

Template-based approach − This approach allows users to constrain the type of patterns extracted by the mining algorithm. Instead of reporting all the extracted rules, only the rules that satisfy a user-specified template are returned to the users.

Subjective interestingness measure − A subjective measure can be defined based on domain information such as a concept hierarchy or the profit margin of items. The measure can then be used to filter out patterns that are obvious and non-actionable.
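The template-based approach and the subjective measure above can be sketched as simple filters over a list of already-mined rules. Everything here is an assumption made for illustration: the rule list, the template format (items the consequent must contain, items of which the antecedent must contain at least one), the profit margins, and the choice of average consequent margin as the score. A real system would define these with input from domain experts.

```python
# Hypothetical mined rules, each stored as (antecedent, consequent).
rules = [
    (frozenset({"Butter"}), frozenset({"Bread"})),
    (frozenset({"Diapers"}), frozenset({"Beer"})),
    (frozenset({"Milk", "Diapers"}), frozenset({"Beer"})),
    (frozenset({"Bread"}), frozenset({"Milk"})),
]

# Hypothetical profit margins supplied by a domain expert.
profit_margin = {
    "Butter": 0.05,
    "Bread": 0.04,
    "Milk": 0.06,
    "Diapers": 0.20,
    "Beer": 0.25,
}


def matches_template(rule, antecedent_any=None, consequent_must=None):
    """Check a rule against an illustrative user-specified template.

    antecedent_any:  the antecedent must contain at least one of these items.
    consequent_must: the consequent must contain all of these items.
    A constraint left as None is not checked.
    """
    antecedent, consequent = rule
    if antecedent_any is not None and not (antecedent & set(antecedent_any)):
        return False
    if consequent_must is not None and not set(consequent_must) <= consequent:
        return False
    return True


def filter_by_template(rules, **template):
    """Return only the rules that satisfy the template."""
    return [r for r in rules if matches_template(r, **template)]


def consequent_margin(rule, margins):
    """Average profit margin of the items in the rule's consequent."""
    _, consequent = rule
    return sum(margins[item] for item in consequent) / len(consequent)


def filter_by_margin(rules, margins, threshold):
    """Keep rules whose consequent margin reaches the threshold."""
    return [r for r in rules if consequent_margin(r, margins) >= threshold]


if __name__ == "__main__":
    for rule in filter_by_template(rules, consequent_must={"Beer"}):
        print("template match:", sorted(rule[0]), "->", sorted(rule[1]))
    for rule in filter_by_margin(rules, profit_margin, threshold=0.10):
        print("high margin:", sorted(rule[0]), "->", sorted(rule[1]))
```

With these made-up inputs, both filters keep only the two rules whose consequent is {Beer}, while rules such as {Butter}→{Bread} are dropped.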

Updated on 11-Feb-2022 13:36:08