# What is Hoeffding Tree Algorithm?

The Hoeffding tree algorithm is a decision tree learning method for stream data classification. It was initially used to track Web clickstreams and construct models to predict which Web hosts and Web sites a user is likely to access. It typically runs in sublinear time and produces a nearly identical decision tree to that of traditional batch learners.

It uses Hoeffding trees, which exploit the idea that a small sample can often be enough to choose an optimal splitting attribute. This idea is supported mathematically by the Hoeffding bound (or additive Chernoff bound).

Suppose we make N independent observations of a random variable r with range R, where r is an attribute selection measure. (For a probability, R is 1; for information gain, it is log c, where c is the number of classes.) In the case of Hoeffding trees, r is the information gain. If we compute the mean, r’, of this sample, the Hoeffding bound states that the true mean of r is at least r’ − ε, with probability 1 − δ, where δ is user-specified and

$$\varepsilon=\sqrt{\frac{R^{2}\,\ln\frac{1}{\delta}}{2N}}$$
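The bound above is straightforward to compute. A minimal sketch in Python (the function name and the example values of R, δ, and N are illustrative, not from the source):

```python
import math

def hoeffding_bound(R: float, delta: float, N: int) -> float:
    """Epsilon for N independent observations of a variable with range R,
    holding with probability 1 - delta."""
    return math.sqrt((R * R * math.log(1.0 / delta)) / (2.0 * N))

# For information gain with c = 2 classes, R = log2(2) = 1.
eps = hoeffding_bound(R=1.0, delta=1e-7, N=1000)
```

Note that ε shrinks as N grows, so the more examples a node accumulates, the tighter the guarantee on the observed mean.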

The Hoeffding Tree algorithm uses the Hoeffding bound to determine, with high probability, the smallest number, N, of examples needed at a node when selecting a splitting attribute. The Hoeffding bound is independent of the probability distribution, unlike most other bound equations. This is desirable, as it may be impossible to know the probability distribution of the information gain, or whichever attribute selection measure is used.

The algorithm takes as input a sequence of training examples, S, described by attributes A, and the accuracy parameter, δ. The evaluation function G(A_{i}) is supplied, which could be information gain, gain ratio, Gini index, or some other attribute selection measure. At each node in the decision tree, we need to maximize G(A_{i}) for one of the remaining attributes, A_{i}. The goal is to find the smallest number of tuples, N, for which the Hoeffding bound is satisfied.

For a given node, let A_{a} be the attribute that achieves the highest G, and A_{b} be the attribute that achieves the second-highest G. If G(A_{a}) − G(A_{b}) > ε, where ε is calculated from the Hoeffding bound, then with probability 1 − δ, A_{a} is truly the best splitting attribute, and the node can be split on A_{a}. Otherwise, the node must accumulate more examples before a split decision can be made.
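This split test can be sketched as follows (a minimal illustration; `should_split` and the gain values are hypothetical, not from the source):

```python
import math

def hoeffding_bound(R: float, delta: float, N: int) -> float:
    """Epsilon for N observations of a variable with range R."""
    return math.sqrt((R * R * math.log(1.0 / delta)) / (2.0 * N))

def should_split(g_best: float, g_second: float,
                 R: float, delta: float, N: int) -> bool:
    """Split when the gap between the two best observed gains
    exceeds the Hoeffding bound epsilon."""
    return (g_best - g_second) > hoeffding_bound(R, delta, N)

# Hypothetical gains: gap of 0.15 is decisive after 500 examples,
# but not yet after only 100 (epsilon is still too large).
print(should_split(0.30, 0.15, R=1.0, delta=1e-6, N=500))  # -> True
print(should_split(0.30, 0.15, R=1.0, delta=1e-6, N=100))  # -> False
```

The same gap between the top two attributes thus becomes decisive only once enough examples have arrived to drive ε below it.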

The only statistics that must be maintained in the Hoeffding tree algorithm are the counts n_{ijk} for the value v_{j} of attribute A_{i} with class label y_{k}. Therefore, if d is the number of attributes, v is the maximum number of values for any attribute, c is the number of classes, and l is the maximum depth (or the number of levels) of the tree, then the total memory required is O(ldvc).
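The per-leaf statistics described above can be sketched as a simple counter table (a minimal sketch; the `LeafStats` class and the example attribute values are illustrative, not from the source):

```python
from collections import defaultdict

class LeafStats:
    """Sufficient statistics at a Hoeffding-tree leaf: counts n_ijk of
    (attribute index i, value j, class label k) triples, updated
    incrementally as each training example arrives."""
    def __init__(self):
        self.n = defaultdict(int)   # key: (attr_index, value, class_label)
        self.total = 0

    def update(self, x, y):
        """Record one example x (tuple of attribute values) with class y."""
        for i, v in enumerate(x):
            self.n[(i, v, y)] += 1
        self.total += 1

stats = LeafStats()
stats.update(("sunny", "hot"), "no")
stats.update(("sunny", "mild"), "yes")
# stats.n[(0, "sunny", "no")] == 1; stats.total == 2
```

Each leaf stores at most d · v · c such counters, which is where the O(ldvc) memory bound for a tree of depth l comes from.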
