Attribute subset selection reduces the size of a data set by eliminating irrelevant or redundant attributes (dimensions). Its goal is to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the distribution obtained using all attributes. Mining on a reduced attribute set has an additional benefit: fewer attributes appear in the discovered patterns, which makes the patterns easier to understand.

For n attributes, there are 2^n possible subsets. An exhaustive search for the optimal subset of attributes can be prohibitively expensive, especially as n and the number of data classes increase. Heuristic methods that explore a reduced search space are therefore commonly used for attribute subset selection.
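The exponential growth is easy to see by enumeration. The sketch below (with hypothetical attribute names, purely for illustration) counts every non-empty candidate subset for a small n:

```python
from itertools import combinations

def all_subsets(attributes):
    """Enumerate every non-empty subset of the given attributes."""
    for r in range(1, len(attributes) + 1):
        for combo in combinations(attributes, r):
            yield combo

attrs = ["age", "income", "zip", "gender"]  # hypothetical attributes
subsets = list(all_subsets(attrs))
print(len(subsets))  # 2^4 - 1 = 15 non-empty subsets
```

At n = 4 an exhaustive search is trivial, but at n = 40 there are over a trillion subsets, which is why heuristic search is used instead.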

These methods are typically greedy: while searching through attribute space, they always make what looks like the best choice at the time. The strategy is to make a locally optimal choice in the hope that it will lead to a globally optimal solution. Such greedy methods are effective in practice and may come close to estimating an optimal solution.

The "best" and "worst" attributes are typically determined using tests of statistical significance, which assume that the attributes are independent of one another. Many other attribute evaluation measures can also be used, such as the information gain measure employed in building decision trees for classification.
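As a concrete example of one such evaluation measure, the following sketch computes information gain for a single categorical attribute: the reduction in the Shannon entropy of the class labels after splitting the data on that attribute. The toy data is hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in class entropy after splitting on one attribute."""
    total = len(labels)
    split = {}
    for v, y in zip(values, labels):
        split.setdefault(v, []).append(y)
    remainder = sum(len(part) / total * entropy(part)
                    for part in split.values())
    return entropy(labels) - remainder

# Toy data: the attribute perfectly predicts the class, so the gain
# equals the full class entropy (1 bit here).
values = ["a", "a", "b", "b"]
labels = ["yes", "yes", "no", "no"]
print(information_gain(values, labels))  # 1.0
```

An attribute with higher information gain is a better candidate to keep; an attribute with gain near zero is a candidate for removal.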

The main methods of attribute subset selection are as follows −

- **Stepwise forward selection** − The procedure starts with an empty set of attributes as the reduced set. The best of the original attributes is determined and added to the reduced set. At each subsequent iteration or step, the best of the remaining original attributes is added to the set.
- **Stepwise backward elimination** − The procedure starts with the full set of attributes. At each step, it removes the worst attribute remaining in the set.
- **Combination of forward selection and backward elimination** − The stepwise forward selection and backward elimination methods can be combined so that, at each step, the procedure selects the best attribute and removes the worst from among the remaining attributes.
- **Decision tree induction** − Decision tree algorithms such as ID3, C4.5, and CART were originally designed for classification. Decision tree induction constructs a flowchart-like structure in which each internal (non-leaf) node denotes a test on an attribute, each branch corresponds to an outcome of the test, and each external (leaf) node denotes a class prediction. At each node, the algorithm chooses the "best" attribute to partition the data into individual classes; the attributes that appear in the tree form the reduced subset.
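The first of these methods can be sketched directly. The function below implements greedy stepwise forward selection against a caller-supplied scoring function; the attribute names and the additive toy scorer are hypothetical, standing in for a real evaluation measure such as information gain or cross-validated accuracy.

```python
def forward_selection(attributes, score):
    """Greedy stepwise forward selection: start with an empty set and
    repeatedly add the attribute that most improves the subset's score,
    stopping when no addition improves it."""
    selected = []
    remaining = list(attributes)
    best_score = score(selected)
    while remaining:
        # Pick the candidate whose addition yields the highest score.
        candidate = max(remaining, key=lambda a: score(selected + [a]))
        cand_score = score(selected + [candidate])
        if cand_score <= best_score:  # no improvement: stop
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = cand_score
    return selected

# Toy scorer: pretend each attribute has a fixed individual merit and a
# subset's score is the sum of merits (purely illustrative).
merits = {"age": 0.5, "income": 0.3, "zip": 0.0}
result = forward_selection(merits, lambda s: sum(merits[a] for a in s))
print(result)  # ['age', 'income']
```

Backward elimination is the mirror image: start from the full set and repeatedly drop whichever attribute's removal least hurts (or most improves) the score.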
