How can generalization be performed on such data?


A set-valued attribute can be of homogeneous or heterogeneous type. Generally, set-valued data can be generalized by:

  • Generalization of every value in the set to its equivalent higher-level concept

  • Derivation of the general behavior of the set, such as the number of elements in the set, the types or value ranges in the set, the weighted average for numerical data, or the major clusters formed by the set.

  • Application of different generalization operators to explore alternative generalization paths. In this case, the result of generalization is a heterogeneous set.

Example − Suppose that the hobby of a person is a set-valued attribute containing the set of values {tennis, hockey, soccer, violin, SimCity}. This set can be generalized to a set of high-level concepts, such as {sports, music, computer games}, or to the number 5 (i.e., the number of hobbies in the set).

Furthermore, a count can be associated with each generalized value to indicate how many elements are generalized to that value, as in {sports (3), music (1), computer games (1)}, where sports (3) indicates three kinds of sports, and so on.
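The hobby example can be sketched in a few lines of Python. This is only an illustration: the concept hierarchy is a hand-written dictionary (`HOBBY_HIERARCHY`), and the function names are hypothetical rather than part of any library.

```python
from collections import Counter

# Hypothetical one-level concept hierarchy: low-level value -> higher-level concept.
HOBBY_HIERARCHY = {
    "tennis": "sports",
    "hockey": "sports",
    "soccer": "sports",
    "violin": "music",
    "SimCity": "computer games",
}


def generalize_hobbies(values, hierarchy):
    """Map every value to its higher-level concept and return the resulting set."""
    return {hierarchy[v] for v in values}


def generalize_hobbies_with_counts(values, hierarchy):
    """Count how many original values were generalized to each concept."""
    return Counter(hierarchy[v] for v in values)


hobbies = {"tennis", "hockey", "soccer", "violin", "SimCity"}

concepts = generalize_hobbies(hobbies, HOBBY_HIERARCHY)
counts = generalize_hobbies_with_counts(hobbies, HOBBY_HIERARCHY)
size = len(hobbies)  # generalizing the whole set to a single number

print(concepts)      # the set of high-level concepts
print(dict(counts))  # each concept with its count
print(size)
```

Here `concepts` plays the role of {sports, music, computer games}, `counts` carries the per-concept counts, and `size` is the single-number generalization.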

A set-valued attribute can be generalized to a set-valued or an individual-valued attribute; an individual-valued attribute can be generalized to a set-valued attribute if the values form a lattice or "hierarchy" or if the generalization follows multiple paths. Further generalizations on such a generalized set-valued attribute must follow the generalization path of every value in the set.
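As a rough sketch of how a single value can turn into a set, the Python snippet below uses a small lattice in which a value may have more than one parent. The lattice contents (`PARENTS`) and the helper names are assumptions made up for illustration.

```python
# Hypothetical lattice: a value may have more than one parent concept.
PARENTS = {
    "chess": {"board games", "mind sports"},
    "board games": {"games"},
    "mind sports": {"sports"},
    "games": {"leisure"},
    "sports": {"leisure"},
}


def climb_value(value, parents):
    """Generalize one value one level up; several parents yield a set."""
    return set(parents.get(value, {value}))


def climb_set(values, parents):
    """Generalize a set one level up by following the path of every member."""
    result = set()
    for v in values:
        result |= climb_value(v, parents)
    return result


level1 = climb_value("chess", PARENTS)  # single value -> set of two concepts
level2 = climb_set(level1, PARENTS)     # each member follows its own path
level3 = climb_set(level2, PARENTS)     # both paths meet at a common ancestor

print(level1, level2, level3)
```

In this toy lattice the two paths eventually converge, so the set-valued result collapses back to a single concept at the top level.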

List-valued attributes and sequence-valued attributes can be generalized in a manner similar to that for set-valued attributes except that the order of the elements in the list or sequence should be preserved in the generalization.

Moreover, a list can be generalized according to its general behavior, such as the length of the list, the type of the list elements, the value range, the weighted average value for numerical data, or by dropping unimportant elements in the list. A list can be generalized into a list, a set, or an individual value.
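The options for lists can be sketched as follows. The subject hierarchy, the sample data, and the function names are hypothetical; the point is only that element-wise generalization keeps the original order, while a summary reduces the list to a few descriptive values.

```python
# Hypothetical hierarchy used only for this illustration.
SUBJECT = {
    "algebra": "mathematics",
    "calculus": "mathematics",
    "mechanics": "physics",
}


def generalize_list(items, hierarchy, unimportant=()):
    """Generalize element by element, keeping the original order.

    Elements listed in `unimportant` are dropped before generalization.
    """
    return [hierarchy.get(x, x) for x in items if x not in unimportant]


def summarize_numeric_list(values, weights):
    """Describe a numeric list by its length, value range, and weighted average."""
    weighted_avg = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    return {
        "length": len(values),
        "range": (min(values), max(values)),
        "weighted_average": weighted_avg,
    }


schedule = ["algebra", "lunch break", "mechanics", "calculus"]
ordered = generalize_list(schedule, SUBJECT, unimportant={"lunch break"})
as_set = set(ordered)  # list generalized into a set (order is given up)
summary = summarize_numeric_list([70, 80, 90], weights=[1, 1, 2])

print(ordered)
print(as_set)
print(summary)
```

The three results correspond to generalizing a list into a list, into a set, and into a handful of individual values.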

A complex structure-valued attribute may contain sets, tuples, lists, trees, records, and their combinations, where one structure may be nested in another at any level.

In general, a structure-valued attribute can be generalized in several ways, such as:

  • Generalizing each attribute in the structure while maintaining the shape of the structure.

  • Flattening the structure and generalizing the flattened structure.

  • Summarizing the low-level structures by high-level concepts or aggregation.

  • Returning the type or an overview of the structure.
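A minimal Python sketch of these options for structure-valued data is shown below, assuming the structure is built from plain dicts, lists, tuples, and sets. The hierarchy, the sample record, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical concept hierarchy for the leaf values.
HIERARCHY = {
    "tennis": "sports",
    "hockey": "sports",
    "violin": "music",
    "Vancouver": "Canada",
}

CONTAINERS = (list, tuple, set, frozenset)


def generalize_structure(node, hierarchy):
    """Generalize every leaf while keeping the nesting shape unchanged."""
    if isinstance(node, dict):
        return {k: generalize_structure(v, hierarchy) for k, v in node.items()}
    if isinstance(node, CONTAINERS):
        return type(node)(generalize_structure(v, hierarchy) for v in node)
    return hierarchy.get(node, node)


def flatten(node):
    """Yield the leaf values of a nested structure, ignoring its shape."""
    if isinstance(node, dict):
        for v in node.values():
            yield from flatten(v)
    elif isinstance(node, CONTAINERS):
        for v in node:
            yield from flatten(v)
    else:
        yield node


def overview(node):
    """Return a type-level description of the structure."""
    if isinstance(node, dict):
        return {k: overview(v) for k, v in node.items()}
    if isinstance(node, CONTAINERS):
        return f"{type(node).__name__} of {len(node)}"
    return type(node).__name__


person = {
    "hobbies": ["tennis", "hockey", "violin"],
    "address": {"city": "Vancouver"},
}

same_shape = generalize_structure(person, HIERARCHY)
flat = [HIERARCHY.get(v, v) for v in flatten(person)]  # flatten, then generalize
aggregated = Counter(flat)                             # summarize by aggregation
shape = overview(person)

print(same_shape)
print(flat)
print(dict(aggregated))
print(shape)
```

Which option is appropriate depends on how much of the original nesting the analysis still needs: `same_shape` keeps all of it, `flat` and `aggregated` discard it, and `shape` keeps only the structure without the values.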

raja
Updated on 17-Feb-2022 11:53:37
