What are the benefits of k-NN Algorithms?
A k-nearest-neighbors (k-NN) algorithm is a classification approach that makes no assumptions about the form of the relationship between the class membership (Y) and the predictors X1, X2, …, Xn.
It is a nonparametric method because it does not involve estimating parameters in an assumed functional form, such as the linear form assumed in linear regression. Instead, it draws information from similarities among the predictor values of the records in the dataset.
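To make the idea concrete, here is a minimal sketch of k-NN classification in Python with NumPy. The names (train_X, train_y, query, k) are illustrative, not from the article; the prediction is simply a majority vote among the k training records whose predictor values are closest to the query.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training records."""
    # Euclidean distances from the query to every training record
    distances = np.linalg.norm(train_X - query, axis=1)
    # Indices of the k closest records
    nearest = np.argsort(distances)[:k]
    # Majority vote over their class labels
    return Counter(train_y[nearest]).most_common(1)[0][0]

# Toy example: two predictors, two classes
train_X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
train_y = np.array(["slow", "slow", "fast", "fast"])
print(knn_predict(train_X, train_y, np.array([4.9, 5.1]), k=3))  # -> "fast"
```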
The benefits of k-NN methods are their simplicity and the absence of parametric assumptions. In the presence of a large training set, these approaches perform especially well when each class is characterized by multiple combinations of predictor values.
For example, in real-estate databases there are likely to be multiple sets of {home type, number of rooms, neighborhood, asking price, etc.} that characterize homes that sell quickly versus those that remain on the market for a long time.
There are three difficulties with the practical exploitation of the power of the k-NN method.
Although no time is needed to estimate parameters from the training data (as would be the case for parametric models such as regression), the time required to find the nearest neighbors in a large training set can be prohibitive. Several ideas have been implemented to overcome this difficulty; the main ones are as follows (a combined sketch appears after this list) −
- Reduce the time taken to compute distances by working in a reduced dimension, using dimension reduction techniques such as principal components analysis.
- Use sophisticated data structures such as search trees to speed up the identification of the nearest neighbor. This approach often settles for an "almost nearest" neighbor to improve speed. An example is bucketing, where the records are grouped into buckets so that records within each bucket are close to each other.
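The sketch below illustrates both ideas using scikit-learn (an assumed choice of library, since the article names none): PCA reduces the dimension of the predictors, and a KD-tree then answers nearest-neighbor queries far faster than a brute-force scan over the full training set.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KDTree

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 50))      # a large training set with 50 predictors

# Idea 1: reduce the dimension with principal components analysis
pca = PCA(n_components=5).fit(X)
X_reduced = pca.transform(X)

# Idea 2: build a search tree on the reduced data for fast neighbor lookup
tree = KDTree(X_reduced)

query = rng.normal(size=(1, 50))
dist, idx = tree.query(pca.transform(query), k=3)  # 3 nearest neighbors
print(idx, dist)
```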
The number of records required in the training set to qualify as "large" increases exponentially with the number of predictors p. This is because the expected distance to the nearest neighbor grows rapidly with p unless the size of the training set increases exponentially with p. This phenomenon is known as the curse of dimensionality, a fundamental problem relevant to many classification, prediction, and clustering approaches.
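A small simulation (an illustrative sketch, not from the original article) shows the effect: with a fixed number of uniformly distributed training records, the average distance to the nearest neighbor grows quickly as the number of predictors p increases.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_queries = 1_000, 200

for p in (2, 5, 10, 20, 50):
    X = rng.uniform(size=(n_train, p))
    queries = rng.uniform(size=(n_queries, p))
    # Distance from each query to its nearest training record
    d = np.linalg.norm(X[None, :, :] - queries[:, None, :], axis=2).min(axis=1)
    print(f"p={p:3d}  mean nearest-neighbor distance = {d.mean():.3f}")
```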
k-NN is a "lazy learner" − the time-consuming computation is deferred to prediction time. For each record to be predicted, its distances from the entire set of training records are computed only when the prediction is requested. This behavior constrains the use of the algorithm for real-time prediction of many records simultaneously.
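This can be seen by timing scikit-learn's KNeighborsClassifier (again an assumed choice of library): "fitting" merely stores the training data, while predicting a batch of new records is where the real work happens.

```python
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(100_000, 20))
y = rng.integers(0, 2, size=100_000)
new_records = rng.normal(size=(5_000, 20))

model = KNeighborsClassifier(n_neighbors=5, algorithm="brute")

t0 = time.perf_counter()
model.fit(X, y)                    # essentially just stores the training data
t1 = time.perf_counter()
model.predict(new_records)         # all distance computations happen here
t2 = time.perf_counter()

print(f"fit:     {t1 - t0:.3f} s")
print(f"predict: {t2 - t1:.3f} s")  # typically much larger than the fit time
```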
