# What is RIPPER Algorithm?

RIPPER is a widely used rule induction algorithm. It scales almost linearly with the number of training instances and is especially suited to building models from data sets with imbalanced class distributions. RIPPER also works well with noisy data sets because it uses a validation set to prevent model overfitting.

For two-class problems, RIPPER chooses the majority class as its default class and learns rules for detecting the minority class. For multiclass problems, the classes are ordered according to their frequencies.

Let (y_{1}, y_{2}, ..., y_{c}) be the ordered classes, where y_{1} is the least frequent class and y_{c} is the most frequent class. During the first iteration, instances that belong to y_{1} are labeled as positive examples, while those that belong to other classes are labeled as negative examples.

The sequential covering method is then used to generate rules that discriminate between the positive and negative examples. Next, RIPPER extracts rules that distinguish y_{2} from the remaining classes. This process is repeated until only y_{c} is left, which is designated as the default class.
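
As a minimal sketch of the class ordering described above, the following Python snippet sorts the classes from least to most frequent and lists the one-versus-rest problems that would be solved in turn. The function names `order_classes` and `binary_tasks` are illustrative, and breaking ties by label name is an assumption, not something the algorithm prescribes.

```python
from collections import Counter


def order_classes(labels):
    """Return the distinct classes sorted from least to most frequent.

    Ties are broken by label name (an assumption made for determinism).
    """
    counts = Counter(labels)
    return sorted(counts, key=lambda c: (counts[c], str(c)))


def binary_tasks(labels):
    """Return ([(positive_class, negative_classes), ...], default_class).

    Each task treats one class as positive and all more frequent classes
    as negative; the most frequent class becomes the default class.
    """
    ordered = order_classes(labels)
    tasks = [(cls, ordered[i + 1:]) for i, cls in enumerate(ordered[:-1])]
    return tasks, ordered[-1]


if __name__ == "__main__":
    labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 9
    tasks, default = binary_tasks(labels)
    for positive, negatives in tasks:
        print(f"learn rules for {positive!r} against {negatives}")
    print(f"default class: {default!r}")
```

With the sample labels, class `b` (2 instances) is handled first, then `a` (5 instances), and `c` (9 instances) is left as the default class.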

RIPPER uses a general-to-specific strategy to grow a rule and FOIL's information gain measure to choose the best conjunct to add to the rule antecedent. It stops adding conjuncts once the rule no longer covers any negative instances.
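
FOIL's information gain is commonly written as p1 × (log2(p1/(p1 + n1)) − log2(p0/(p0 + n0))), where p0 and n0 are the positive and negative instances covered by the rule before a conjunct is added, and p1 and n1 are the counts afterwards. The sketch below computes this value; the function name `foil_gain`, its parameter names, and the convention of returning 0 when the extended rule covers no positives are assumptions made for illustration.

```python
import math


def foil_gain(p0, n0, p1, n1):
    """FOIL's information gain for extending a rule with one conjunct.

    p0, n0: positives/negatives covered before adding the conjunct.
    p1, n1: positives/negatives covered after adding the conjunct.
    Returns 0.0 when the extended rule covers no positives (assumed convention).
    """
    if p0 <= 0:
        raise ValueError("the original rule must cover at least one positive")
    if p1 == 0:
        return 0.0
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


if __name__ == "__main__":
    # Hypothetical counts: the empty rule covers 100 positives and 400 negatives.
    print(foil_gain(100, 400, 4, 1))    # roughly 8.0
    print(foil_gain(100, 400, 30, 10))  # larger: covers many more positives
```

The measure rewards candidate conjuncts that both raise the fraction of positives and keep many positives covered, so a very accurate rule that covers only a handful of instances scores lower than a slightly less accurate rule with broad coverage.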

The new rule is then pruned based on its performance on the validation set. The following metric is computed to determine whether pruning is needed: (p - n)/(p + n), where p and n are the numbers of positive and negative examples, respectively, in the validation set covered by the rule.

This metric is monotonically related to the rule's accuracy on the validation set. If the metric improves after pruning, the conjunct is removed. Pruning starts from the last conjunct added to the rule. For example, given a rule ABCD → y, RIPPER first checks whether D should be pruned, followed by CD, BCD, and so on. While the original rule covers only positive instances, the pruned rule may cover some of the negative instances in the training set.
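
The pruning step can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: a rule is represented as a list of hypothetical `(attribute, value)` conjuncts, instances are dictionaries, a rule that covers nothing on the validation set is given the worst score of -1.0, and at least one conjunct is always kept.

```python
def covers(rule, instance):
    """True if the instance satisfies every (attribute, value) conjunct."""
    return all(instance.get(attr) == val for attr, val in rule)


def prune_metric(rule, validation):
    """(p - n) / (p + n) over the validation instances covered by the rule.

    validation is a list of (instance, is_positive) pairs.
    Returns -1.0 if nothing is covered (assumed convention).
    """
    p = sum(1 for x, pos in validation if pos and covers(rule, x))
    n = sum(1 for x, pos in validation if not pos and covers(rule, x))
    return (p - n) / (p + n) if p + n else -1.0


def prune_rule(rule, validation):
    """Drop trailing conjuncts (D, then CD, then BCD, ...) while the metric improves."""
    best, best_score = rule, prune_metric(rule, validation)
    for k in range(len(rule) - 1, 0, -1):
        candidate = rule[:k]
        score = prune_metric(candidate, validation)
        if score > best_score:  # keep the longer rule on ties
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    rule = [("A", 1), ("B", 1), ("C", 1)]
    validation = [
        ({"A": 1, "B": 1, "C": 1}, True),
        ({"A": 1, "B": 1, "C": 1}, False),
        ({"A": 1, "B": 1, "C": 0}, True),
        ({"A": 1, "B": 1, "C": 0}, True),
        ({"A": 1, "B": 1, "C": 0}, True),
        ({"A": 1, "B": 0, "C": 0}, False),
        ({"A": 1, "B": 0, "C": 1}, False),
        ({"A": 1, "B": 0, "C": 0}, False),
    ]
    print(prune_rule(rule, validation))
```

In this toy data the full rule scores 0 on the validation set, dropping C raises the score to 0.6, and dropping B as well lowers it back to 0, so the sketch keeps the rule with conjuncts A and B.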

After a rule is generated, all the positive and negative instances covered by the rule are removed from the training set. The rule is then added to the rule set as long as it does not violate the stopping condition, which is based on the minimum description length principle.

If the new rule increases the total description length of the rule set by at least d bits, RIPPER stops adding rules to its rule set (by default, d is 64 bits). Another stopping condition used by RIPPER is that the error rate of the rule on the validation set must not exceed 50%. RIPPER also performs additional optimization steps to determine whether some of the existing rules in the rule set can be replaced by better alternative rules.
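
The two stopping conditions can be expressed as a small check, sketched below in Python. The function `should_stop` and its parameters are hypothetical: the description lengths are assumed to be computed elsewhere (the encoding scheme is not covered here), and a rule that covers no validation instances is assumed not to trigger the error-rate condition.

```python
def should_stop(dl_before, dl_after, p, n, d=64):
    """Decide whether to stop adding rules to the rule set.

    dl_before, dl_after: description length of the rule set in bits,
        without and with the new rule (assumed to be supplied by the caller).
    p, n: positive and negative validation instances covered by the new rule.
    d: allowed growth in description length, 64 bits by default.
    """
    # Condition 1: the new rule lengthens the description by at least d bits.
    if dl_after - dl_before >= d:
        return True
    # Condition 2: the rule's error rate on the validation set exceeds 50%.
    covered = p + n
    if covered > 0 and n / covered > 0.5:
        return True
    return False


if __name__ == "__main__":
    print(should_stop(1000.0, 1070.0, p=10, n=2))  # description grows by 70 bits
    print(should_stop(1000.0, 1010.0, p=3, n=7))   # error rate is 70%
    print(should_stop(1000.0, 1010.0, p=8, n=2))   # neither condition holds
```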
