What is a Decision Tree?

A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class or a class distribution. The topmost node in a tree is the root node.
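As a minimal illustration of this structure, the sketch below stores a tree as nested Python dictionaries and classifies one sample by walking from the root to a leaf. The attribute names (outlook, humidity, windy), the class labels, and the classify helper are hypothetical and chosen only for the example; they are not taken from any particular dataset or library.

```python
# Hypothetical tree: an internal node is a dict holding the tested attribute
# and one subtree per outcome (branch); a leaf is simply a class label.
tree = {
    "attribute": "outlook",  # root node
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rain": {
            "attribute": "windy",
            "branches": {"true": "no", "false": "yes"},
        },
    },
}


def classify(node, sample):
    """Follow the branch matching each test until a leaf (class label) is reached."""
    while isinstance(node, dict):
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node


sample = {"outlook": "sunny", "humidity": "normal", "windy": "false"}
print(classify(tree, sample))  # prints "yes"
```

Each dictionary corresponds to an internal node, each key under "branches" to one outcome of the test, and each plain string to a leaf.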

Algorithms for learning Decision Trees

Algorithm − Generate decision tree. Create a decision tree from the given training data.

Input − The training samples, samples, described by discrete-valued attributes; the set of candidate attributes, attribute-list.

Output − A decision tree.


  • Create a node N;

  • If samples are all of the same class C, then

  • Return N as a leaf node labeled with the class C

  • If attribute-list is empty, then

  • Return N as a leaf node labeled with the most common class in samples. // majority voting

  • Select test-attribute, the attribute among attribute-list with the highest information gain.

  • Label node N with test attribute.

  • For each known value ai of test-attribute // partition the samples

  • Grow a branch from node N for the condition test-attribute = ai.

  • Let si be the set of samples in samples for which test-attribute = ai.

  • If si is empty, then

  • Attach a leaf labeled with the most common class in samples.

  • Else attach the node returned by Generate decision tree (si, attribute-list − test-attribute).
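The steps above can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: each sample is assumed to be a (features, label) pair where features is a dictionary, the "known values" of each attribute are passed in explicitly through a hypothetical domains argument, and all function names are invented for the example.

```python
import math
from collections import Counter


def entropy(samples):
    """Entropy (in bits) of the class labels in samples."""
    counts = Counter(label for _, label in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, attribute):
    """Reduction in entropy obtained by partitioning samples on attribute."""
    partitions = {}
    for features, label in samples:
        partitions.setdefault(features[attribute], []).append((features, label))
    remainder = sum(len(p) / len(samples) * entropy(p) for p in partitions.values())
    return entropy(samples) - remainder


def majority_class(samples):
    """Most common class label in samples (majority voting)."""
    return Counter(label for _, label in samples).most_common(1)[0][0]


def generate_decision_tree(samples, attribute_list, domains):
    """Return a class label (leaf) or a dict describing an internal node."""
    classes = {label for _, label in samples}
    if len(classes) == 1:            # all samples belong to the same class C
        return classes.pop()
    if not attribute_list:           # no attributes left: majority voting
        return majority_class(samples)

    # Select the attribute with the highest information gain.
    test_attribute = max(attribute_list, key=lambda a: information_gain(samples, a))
    node = {"attribute": test_attribute, "branches": {}}
    remaining = [a for a in attribute_list if a != test_attribute]

    for value in domains[test_attribute]:   # one branch per known value a_i
        subset = [(f, c) for f, c in samples if f[test_attribute] == value]
        if not subset:                      # s_i is empty: attach a majority leaf
            node["branches"][value] = majority_class(samples)
        else:
            node["branches"][value] = generate_decision_tree(subset, remaining, domains)
    return node


# Tiny hypothetical training set, made up purely for illustration.
samples = [
    ({"outlook": "sunny", "windy": "false"}, "no"),
    ({"outlook": "sunny", "windy": "true"}, "no"),
    ({"outlook": "overcast", "windy": "false"}, "yes"),
    ({"outlook": "rain", "windy": "false"}, "yes"),
    ({"outlook": "rain", "windy": "true"}, "no"),
]
domains = {"outlook": ["sunny", "overcast", "rain"], "windy": ["true", "false"]}

tree = generate_decision_tree(samples, ["outlook", "windy"], domains)
print(tree)
```

On this toy data, outlook has the higher information gain and becomes the root; only the rain branch needs a further test on windy.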

Decision Tree Induction

The automatic generation of decision rules from examples is known as rule induction or automatic rule induction. Generating decision rules in the implicit form of a decision tree is also often called rule induction, but the terms tree induction or decision tree induction are usually preferred.

The basic algorithm for decision tree induction is a greedy algorithm that builds decision trees in a top-down, recursive, divide-and-conquer manner. The algorithm described here is a version of ID3, a well-known decision tree induction algorithm.

The basic strategy is as follows −

  • The tree starts as a single node representing the training samples.

  • If the samples are all of the same class, then the node becomes a leaf and is labeled with that class.

  • Otherwise, the algorithm uses an entropy-based measure known as information gain as a heuristic for selecting the attribute that best separates the samples into individual classes. This attribute becomes the “test” or “decision” attribute at the node. In this version of the algorithm, all attributes are categorical, i.e., discrete-valued. Continuous-valued attributes must be discretized first.

  • A branch is created for each known value of the test attribute, and the samples are partitioned accordingly.

  • The algorithm applies the same process recursively to form a decision tree for the samples at each partition. Once an attribute has been used at a node, it need not be considered in any of that node’s descendants.
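The information gain measure mentioned in the strategy above is usually defined as follows. The notation is an assumption introduced here for illustration: S is the set of samples at a node, p_i is the fraction of S belonging to class i out of m classes, and S_j is the subset of S for which attribute A takes its j-th known value a_j out of v values.

```latex
% Expected information (entropy) needed to classify a sample in S
\mathrm{Info}(S) = -\sum_{i=1}^{m} p_i \log_2 p_i

% Expected information still needed after partitioning S on attribute A
\mathrm{Info}_A(S) = \sum_{j=1}^{v} \frac{|S_j|}{|S|}\,\mathrm{Info}(S_j)

% Information gain: the reduction in entropy achieved by testing A
\mathrm{Gain}(A) = \mathrm{Info}(S) - \mathrm{Info}_A(S)

% Hypothetical example: 14 samples, 9 of one class and 5 of the other
\mathrm{Info}(S) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.940 \text{ bits}
```

The attribute with the largest Gain(A) is chosen as the test attribute, since its partitions are, on average, the purest.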