How is class comparison performed?

Class discrimination, or comparison, mines descriptions that distinguish a target class from its contrasting classes. The target and contrasting classes must be comparable, meaning that they share the same dimensions and attributes. For instance, the three classes person, address, and elements are not comparable. But the sales in the last three years are comparable classes, and so are computer science candidates versus physics candidates.
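As a rough illustration of the comparability requirement, the sketch below treats each class as a list of records and calls two classes comparable when every record carries the same attribute names. The function name `are_comparable` and the sample records are hypothetical, and matching attribute names is only a simplified stand-in for sharing dimensions and attributes.

```python
def are_comparable(class_a, class_b):
    """Treat two classes as comparable when all their records share one attribute set."""
    # frozenset(record) collects the attribute names (dict keys) of a record.
    attrs_a = {frozenset(record) for record in class_a}
    attrs_b = {frozenset(record) for record in class_b}
    # Each class must be non-empty, internally uniform, and match the other class.
    return len(attrs_a) == 1 and attrs_a == attrs_b


# Made-up records, for illustration only.
sales_2003 = [{"item": "TV", "location": "Vancouver", "amount": 400}]
sales_2004 = [{"item": "laptop", "location": "Seattle", "amount": 900}]
persons = [{"name": "Ann", "age": 31}]

same_schema = are_comparable(sales_2003, sales_2004)
different_schema = are_comparable(sales_2003, persons)
```

Here the two sales classes pass the check, while sales and persons do not, because their attributes differ.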

The techniques developed for class characterization can be extended to handle class comparison across multiple comparable classes. For instance, the attribute generalization process defined for class characterization can be modified so that the generalization is performed synchronously among all the classes compared. This allows the attributes in all of the classes to be generalized to the same levels of abstraction.

Suppose, for example, that we are given the AllElectronics data for sales in 2003 and sales in 2004 and want to compare these two classes. Consider the location dimension, with abstractions at the city, province or state, and country levels. Each class of data must be generalized to the same location level.

That is, they are all synchronously generalized to the city level, or to the province or state level, or to the country level. This is more useful than comparing, say, the sales in Vancouver in 2003 with the sales in the United States in 2004 (i.e., where each set of sales data is generalized to a different level).
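The idea of generalizing both classes to the same location level can be sketched as follows. This is a minimal example under stated assumptions: the hierarchy `LOCATION_HIERARCHY`, the helper names, and the sales rows are all invented for illustration and are not taken from the AllElectronics data.

```python
from collections import Counter

# Hypothetical concept hierarchy: city -> (province_or_state, country).
LOCATION_HIERARCHY = {
    "Vancouver": ("British Columbia", "Canada"),
    "Victoria": ("British Columbia", "Canada"),
    "Seattle": ("Washington", "USA"),
    "Chicago": ("Illinois", "USA"),
}

LEVELS = ("city", "province_or_state", "country")


def generalize_location(city, level):
    """Map a city to the requested level of the location hierarchy."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if level == "city":
        return city
    province_or_state, country = LOCATION_HIERARCHY[city]
    return province_or_state if level == "province_or_state" else country


def generalize_classes(classes, level):
    """Generalize every class to the same level and count sales per location."""
    return {
        name: Counter(generalize_location(city, level) for city in cities)
        for name, cities in classes.items()
    }


# Illustrative data: one entry per sale, recording only the city.
sales = {
    "sales_2003": ["Vancouver", "Vancouver", "Seattle", "Chicago"],
    "sales_2004": ["Victoria", "Seattle", "Seattle", "Chicago"],
}

by_country = generalize_classes(sales, "country")
```

Because a single `level` argument is applied to every class, the sketch cannot produce the mismatched Vancouver-versus-United-States comparison described above.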

Users should have the option to override such automated, synchronous comparison with their own choices when preferred. The procedure consists of the following steps −

  • Data collection − The set of relevant records in the database is collected by query processing and is partitioned into a target class and one or more contrasting classes.

  • Dimension relevance analysis − If there are many dimensions, dimension relevance analysis should be performed on these classes to select only the highly relevant dimensions for further analysis.

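  • Synchronous generalization − Generalization is performed on the target class to the level controlled by a user- or expert-specified dimension threshold, which results in a prime target class relation. The contrasting classes are generalized to the same level, so that all classes are described at matching levels of abstraction.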

  • Presentation of the derived comparison − The resulting class comparison description can be presented in the form of tables, graphs, and rules. This presentation usually includes a "contrasting" measure, such as count% (percentage count), that reflects the comparison between the target and contrasting classes.
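The steps above can be sketched in a few lines, assuming a small in-memory set of records rather than a real database. The record layout, the generalization rules in `generalize`, and all figures are hypothetical. Dimension relevance analysis is omitted, since the example has only two dimensions.

```python
from collections import Counter

# Hypothetical records: (major, birth_country, gpa). Illustrative only.
RECORDS = [
    ("computer science", "Canada", 3.8),
    ("computer science", "Canada", 3.4),
    ("computer science", "India", 3.9),
    ("computer science", "China", 3.1),
    ("physics", "Canada", 3.6),
    ("physics", "Germany", 3.2),
    ("physics", "Canada", 2.9),
]


def collect(records, target, contrasting):
    """Data collection: split the relevant records into the two classes."""
    target_rows = [r for r in records if r[0] == target]
    contrast_rows = [r for r in records if r[0] == contrasting]
    return target_rows, contrast_rows


def generalize(row):
    """Generalize one record; the same rule is applied to every class."""
    _, country, gpa = row
    region = "Canada" if country == "Canada" else "foreign"
    if gpa >= 3.5:
        gpa_band = "excellent"
    elif gpa >= 3.0:
        gpa_band = "good"
    else:
        gpa_band = "fair"
    return region, gpa_band


def count_percent(rows):
    """Return count% per generalized tuple within a single class."""
    counts = Counter(generalize(r) for r in rows)
    total = sum(counts.values())
    return {key: 100.0 * n / total for key, n in counts.items()}


def compare(records, target, contrasting):
    """Build a comparison table: generalized tuple -> (target %, contrast %)."""
    target_rows, contrast_rows = collect(records, target, contrasting)
    target_pct = count_percent(target_rows)
    contrast_pct = count_percent(contrast_rows)
    keys = sorted(set(target_pct) | set(contrast_pct))
    return {k: (target_pct.get(k, 0.0), contrast_pct.get(k, 0.0)) for k in keys}


table = compare(RECORDS, "computer science", "physics")
for (region, gpa_band), (t_pct, c_pct) in table.items():
    print(f"{region:8} {gpa_band:10} target={t_pct:5.1f}%  contrast={c_pct:5.1f}%")
```

Each row of the printed table pairs one generalized tuple with its count% in the target class and in the contrasting class. A tuple that is frequent in one class and rare in the other is what distinguishes the two.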

The user can adjust the comparison description by applying drill-down, roll-up, and other OLAP operations to the target and contrasting classes, as desired.

Updated on 16-Feb-2022 11:22:26