What are the techniques based on Support Expectations?

There are two approaches for determining the expected support of a pattern: a concept hierarchy and a neighborhood-based approach called indirect association.

Support Expectation Based on Concept Hierarchy

Objective measures alone may not be sufficient to eliminate uninteresting infrequent patterns. For instance, suppose bread and laptop computer are both frequent items. Even though the itemset {bread, laptop computer} is infrequent and possibly negatively correlated, it is not interesting because its lack of support is obvious to domain experts. Hence, a subjective approach for determining expected support is needed to avoid generating such infrequent patterns.

Support Expectation Based on Indirect Association

Consider a pair of items, (a, b), that are rarely bought together by customers. If a and b are unrelated items, such as bread and a DVD player, then their support is expected to be low. On the other hand, if a and b are related items, then their support is expected to be high. In the previous approach, the expected support was computed using a concept hierarchy. This approach instead determines the expected support of a pair of items by looking at other items that are commonly bought together with these two items.

For instance, customers who purchase a sleeping bag also tend to purchase other camping supplies, whereas those who purchase a desktop computer also tend to purchase computer accessories such as an optical mouse or a printer. Assuming there is no other item frequently purchased together with both a sleeping bag and a desktop computer, the support for these unrelated items is expected to be low.

On the other hand, suppose that diet soda and regular soda are each frequently purchased together with chips and cookies. Even without using a concept hierarchy, the two items are expected to be somewhat related, and their support should be high. Because their actual support is low, diet and regular soda form an interesting infrequent pattern. Such patterns are called indirect association patterns.
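The idea above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not a full indirect association algorithm: mediators are restricted to single items, no dependence measure between an item and its mediator is checked, and the function names, the thresholds (pair_max_sup, mediator_min_sup), and the toy transactions are all hypothetical and chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def indirect_mediators(a, b, transactions, pair_max_sup, mediator_min_sup):
    """Return the single-item mediators linking a and b.

    The pair (a, b) must itself be infrequent (support below pair_max_sup).
    An item y counts as a mediator when both {a, y} and {b, y} reach
    mediator_min_sup. An empty set means no indirect association was found.
    Both thresholds are illustrative assumptions.
    """
    if support({a, b}, transactions) >= pair_max_sup:
        return set()  # the pair is bought together often enough; not indirect
    candidates = set().union(*transactions) - {a, b}
    return {
        y for y in candidates
        if support({a, y}, transactions) >= mediator_min_sup
        and support({b, y}, transactions) >= mediator_min_sup
    }


# Toy market-basket data (invented for this example).
transactions = [
    {"diet soda", "chips", "cookies"},
    {"diet soda", "chips"},
    {"diet soda", "cookies"},
    {"regular soda", "chips", "cookies"},
    {"regular soda", "chips"},
    {"regular soda", "cookies"},
    {"sleeping bag", "tent"},
    {"desktop computer", "printer"},
]

soda = indirect_mediators("diet soda", "regular soda", transactions, 0.1, 0.25)
camping = indirect_mediators("sleeping bag", "desktop computer", transactions, 0.1, 0.25)
print(sorted(soda))     # mediators shared by the two sodas
print(sorted(camping))  # no shared mediators for the unrelated pair
```

In this toy data, diet and regular soda never appear in the same transaction, yet both are bought with chips and cookies, so the pair is reported with those two mediators. The sleeping bag and the desktop computer share no frequently co-purchased item, so their low support is simply what one would expect.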

Indirect association has several applications. In the market basket domain, a and b can represent competing items such as desktop and laptop computers. In text mining, indirect association can be used to identify synonyms, antonyms, or words that are used in different contexts. For example, given a collection of documents, the word data may be indirectly associated with the word gold via the mediator mining.