How can we find a good subset of the original attributes?


Attribute subset selection reduces the data set size by removing irrelevant or redundant attributes (or dimensions). The objective of attribute subset selection is to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution obtained using all attributes.

For n attributes, there are 2^n possible subsets. An exhaustive search for the optimal subset of attributes can be prohibitively expensive, especially as n and the number of data classes increase. For example, with only 20 attributes there are already 2^20 (more than one million) candidate subsets. Hence, heuristic approaches that explore a reduced search space are commonly used for attribute subset selection.

These approaches are typically greedy in that, while searching through attribute space, they always make what looks like the best choice at the time. Their strategy is to make a locally optimal choice in the hope that this will lead to a globally optimal solution. Such greedy techniques are effective in practice and can come close to estimating an optimal solution.

The "best" and "worst" attributes are generally decided using tests of statistical significance, which consider that the attributes are separate from one another. Some different attribute evaluation measures can be used, including the information gain measure used in constructing decision trees for classification.

Basic heuristic methods of attribute subset selection include the following techniques −

Stepwise forward selection − The procedure starts with an empty set of attributes as the reduced set. The best of the original attributes is determined and added to the reduced set. At each subsequent iteration or step, the best of the remaining original attributes is added to the set (a minimal sketch of this greedy loop follows this list).

Stepwise backward elimination − The procedure starts with the full set of attributes. At each step, it removes the worst attribute remaining in the set.

Combination of forward selection and backward elimination − The stepwise forward selection and backward elimination methods can be combined so that, at each step, the procedure selects the best attribute and removes the worst from among the remaining attributes.

Decision tree induction − Decision tree algorithms, such as ID3, C4.5, and CART, were originally designed for classification. Decision tree induction constructs a flowchart-like structure where each internal (non-leaf) node denotes a test on an attribute, each branch corresponds to an outcome of the test, and each external (leaf) node denotes a class prediction. At each node, the algorithm selects the "best" attribute to partition the data into individual classes.

When decision tree induction is used for attribute subset selection, a tree is built from the given data. All attributes that do not appear in the tree are assumed to be irrelevant. The set of attributes appearing in the tree forms the reduced subset of attributes (see the sketch after the forward-selection example below).
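
The following is a minimal sketch of the stepwise forward selection loop referenced above. It assumes the caller supplies a score function that evaluates a candidate attribute subset, for example by cross-validating a classifier restricted to those attributes; the names forward_select, score, and max_attrs are illustrative and not part of any standard library.

```python
def forward_select(attributes, score, max_attrs=None):
    """Greedy stepwise forward selection.

    attributes : list of candidate attribute names
    score      : callable that takes a list of attribute names and returns
                 a quality estimate for that subset (caller-supplied)
    max_attrs  : optional cap on the size of the reduced set
    """
    selected = []                       # the reduced set, initially empty
    remaining = list(attributes)
    best_score = score(selected)        # baseline score with no attributes
    while remaining and (max_attrs is None or len(selected) < max_attrs):
        # Try adding each remaining attribute and keep the best candidate.
        candidate, candidate_score = max(
            ((a, score(selected + [a])) for a in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best_score:
            break                       # no attribute improves the score: stop
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected
```

Stepwise backward elimination is the mirror image of this loop: start with all attributes and repeatedly drop the one whose removal hurts the score least, stopping once every removal makes the score worse.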
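
As a sketch of decision-tree-based selection, the snippet below builds a tree with scikit-learn (assumed to be installed) and keeps only the attributes that actually appear as test nodes in the tree; the data set and attribute names are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data set: three attributes, binary class label.
X = np.array([[1, 0, 3.2],
              [0, 1, 1.5],
              [1, 1, 2.7],
              [0, 0, 0.9]])
y = np.array([1, 0, 1, 0])
attribute_names = ["a1", "a2", "a3"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# tree_.feature records the attribute index tested at each internal node;
# leaf nodes are marked with a negative value and are filtered out.
used = {attribute_names[i] for i in tree.tree_.feature if i >= 0}
print("Reduced attribute subset:", used)
```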
