What is AOI?


AOI stands for Attribute-Oriented Induction. The attribute-oriented induction approach to concept description was first proposed in 1989, a few years before the introduction of the data cube approach. The data cube approach is essentially based on materialized views of the data, which typically have been pre-computed in a data warehouse.

In general, the data cube approach performs off-line aggregation before an OLAP or data mining query is submitted for processing. In contrast, the attribute-oriented induction approach is generally a query-oriented, generalization-based, on-line data analysis method.

The general idea of attribute-oriented induction is to first collect the task-relevant data using a database query and then perform generalization based on the examination of the number of distinct values of each attribute in the relevant collection of data.
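The distinct-value examination described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the sample rows, the `plan_generalization` helper, and the threshold of 2 distinct values are hypothetical and are not taken from any particular system.

```python
def plan_generalization(rows, hierarchies, threshold=2):
    """Pick an action for each attribute of the task-relevant data.

    rows:        list of dicts, one per tuple (assumed sample data)
    hierarchies: names of attributes that have a concept hierarchy
    threshold:   assumed maximum number of distinct values kept as-is
    """
    plan = {}
    for attr in rows[0]:
        distinct = {row[attr] for row in rows}
        if len(distinct) <= threshold:
            plan[attr] = "keep"          # few distinct values already
        elif attr in hierarchies:
            plan[attr] = "generalize"    # climb the concept hierarchy
        else:
            plan[attr] = "remove"        # many values, nothing to climb
    return plan


# Hypothetical task-relevant data, as if returned by a database query.
students = [
    {"name": "Ann", "city": "Vancouver", "gender": "F"},
    {"name": "Bob", "city": "Seattle", "gender": "M"},
    {"name": "Cal", "city": "Toronto", "gender": "M"},
]
plan = plan_generalization(students, hierarchies={"city"})
```

Here `name` has many distinct values and no hierarchy, so it is marked for removal, while `city` can be generalized and `gender` is kept unchanged.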

The generalization is performed by either attribute removal or attribute generalization. Aggregation is performed by merging identical generalized tuples and accumulating their respective counts. This reduces the size of the generalized data set. The resulting generalized relation can be mapped into several forms for presentation to the user, such as charts or rules.
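As a small illustration of these two operations, the sketch below removes one attribute, generalizes another through a concept hierarchy, and then merges identical generalized tuples while accumulating counts. The sample rows, the `CITY_TO_COUNTRY` mapping, and the `generalize_and_merge` helper are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical concept hierarchy: city -> country.
CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Seattle": "USA",
}


def generalize_and_merge(rows):
    """Drop `name`, lift `city` to its country, and merge identical tuples."""
    counts = Counter()
    for row in rows:
        # Attribute removal: `name` is simply left out of the new tuple.
        # Attribute generalization: `city` is replaced by its country.
        generalized = (CITY_TO_COUNTRY[row["city"]], row["gender"])
        # Aggregation: identical generalized tuples share one counter.
        counts[generalized] += 1
    return dict(counts)


students = [
    {"name": "Ann", "city": "Vancouver", "gender": "F"},
    {"name": "Bob", "city": "Seattle", "gender": "M"},
    {"name": "Cal", "city": "Toronto", "gender": "M"},
    {"name": "Dee", "city": "Toronto", "gender": "F"},
]
generalized_relation = generalize_and_merge(students)
```

Four original tuples collapse into three generalized tuples, and the counts still sum to the original number of rows, which is what lets the result be presented as a chart or a rule with support figures.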

The process of attribute-oriented induction is as follows −

  • First, data focusing must be performed before attribute-oriented induction. This step corresponds to the specification of the task-relevant data (i.e., data for analysis). The data are collected based on the information provided in the data mining query.

  • Because a data mining query is usually relevant to only a portion of the database, selecting the relevant set of data not only makes mining more efficient, but also yields more meaningful results than mining the whole database.

  • Specifying the set of relevant attributes (i.e., attributes for mining, as indicated in DMQL with the in relevance to clause) may be difficult for the user. A user may select only a few attributes that seem important, while missing others that could also play a role in the description.

  • For example, suppose that the dimension birth place is defined by the attributes city, province or state, and country. To allow generalization on the birth place dimension, the other attributes defining this dimension should also be included.

  • In other words, having the system automatically include province or state and country as relevant attributes enables city to be generalized to these higher conceptual levels during the induction phase.

  • At the other extreme, the user may have introduced too many attributes by specifying all of the possible attributes with the clause “in relevance to *”. In this case, all of the attributes in the relation specified by the from clause would be included in the analysis.

  • Some of these attributes are unlikely to contribute to an interesting description. A correlation-based or entropy-based analysis method can be used to perform attribute relevance analysis and filter out statistically irrelevant or weakly relevant attributes from the descriptive mining process.
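One common entropy-based measure is information gain, and the sketch below uses it to filter attributes. It is a minimal illustration, not a prescribed method: the sample rows, the choice of `status` as the class attribute, and the 0.1 relevance threshold are all assumptions for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Reduction in entropy of `target` after splitting rows on `attr`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical task-relevant data; `status` is the assumed class attribute.
rows = [
    {"major": "CS", "gender": "F", "status": "graduate"},
    {"major": "CS", "gender": "M", "status": "graduate"},
    {"major": "Art", "gender": "F", "status": "undergraduate"},
    {"major": "Art", "gender": "M", "status": "undergraduate"},
]

THRESHOLD = 0.1  # assumed cut-off for "weakly relevant"
gains = {a: information_gain(rows, a, "status") for a in ("major", "gender")}
relevant = [a for a, g in gains.items() if g >= THRESHOLD]
```

In this toy data `major` fully separates the two classes while `gender` carries no information about them, so only `major` survives the filter.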

Updated on: 16-Feb-2022
