What are the characteristics of Nearest-Neighbor Classifiers?


The nearest-neighbor rule often achieves high performance without prior assumptions about the distribution from which the training instances are drawn. It uses a training set of both positive and negative cases. A new sample is classified by computing the distance to the nearest training case; the sign of that point then determines the classification of the sample.
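The rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nearest_neighbor_sign`, the toy training set, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math

def nearest_neighbor_sign(train, query):
    """Return the sign (+1 or -1) of the training case closest to `query`.

    `train` is a list of (point, sign) pairs. Euclidean distance is an
    assumption of this sketch; any other distance could be substituted.
    """
    _, nearest_sign = min(train, key=lambda case: math.dist(case[0], query))
    return nearest_sign

# Hypothetical toy data: positives near (1, 1), negatives near (4, 4).
train = [((1.0, 1.0), +1), ((1.5, 0.5), +1), ((4.0, 4.0), -1), ((4.5, 3.5), -1)]

print(nearest_neighbor_sign(train, (1.2, 0.9)))  # 1
print(nearest_neighbor_sign(train, (3.8, 4.1)))  # -1
```

Note that there is no training step here: the training cases themselves are the "model", and all the work happens when a query arrives.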

The k-NN classifier extends this idea by taking the k nearest points and assigning the sign of the majority. It is common to choose k small and odd to break ties (typically 1, 3, or 5). Larger k values help reduce the effect of noisy points in the training data set, and the choice of k is often made through cross-validation.
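The following sketch shows majority voting over the k nearest points and one possible way to pick k by leave-one-out cross-validation. The helper names (`knn_predict`, `loo_accuracy`), the one-dimensional toy data with a deliberately mislabeled point, and the use of leave-one-out rather than another cross-validation scheme are assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Return the majority label among the k training cases closest to `query`."""
    neighbors = sorted(train, key=lambda case: math.dist(case[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each point using all the other points."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(data)

# Hypothetical 1-D data: positives on the left, negatives on the right,
# plus one noisy negative at 1.5 sitting among the positives.
data = [((0.0,), +1), ((1.0,), +1), ((1.5,), -1), ((2.0,), +1), ((3.0,), +1),
        ((6.0,), -1), ((7.0,), -1), ((8.0,), -1), ((9.0,), -1)]

print(knn_predict(data, (1.6,), k=1))  # -1, follows the noisy point
print(knn_predict(data, (1.6,), k=3))  # 1, the noisy point is outvoted

# Choose k among small odd candidates by leave-one-out accuracy.
best_k = max([1, 3, 5], key=lambda k: loo_accuracy(data, k))
print(best_k)  # 3
```

With k = 1 the single mislabeled point decides the outcome for any query close to it, while k = 3 lets its two correctly labeled neighbors outvote it.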

Nearest-neighbor classifiers have several characteristics, which are as follows:

Nearest-neighbor classification is part of a more general technique known as instance-based learning. It uses specific training instances to make predictions without having to maintain an abstraction (or model) derived from the data.

Instance-based learning algorithms require a proximity measure to determine the similarity or distance between instances, and a classification function that returns the predicted class of a test instance based on its proximity to other instances.

Lazy learners such as nearest-neighbor classifiers do not require model building. However, classifying a test example can be quite expensive, because the proximity values between the test example and each training example must be computed individually. In contrast, eager learners spend the bulk of their computing resources on model building. Once a model has been built, classifying a test example is extremely fast.
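To make the cost contrast concrete, the sketch below counts distance computations at prediction time for a lazy learner and for a simple eager one. The eager learner shown is a nearest-centroid classifier, chosen here only as a convenient illustration (the text above does not name a specific eager method). The class names and the `last_distance_count` attribute are hypothetical.

```python
import math

class LazyNearestNeighbor:
    """Lazy learner: 'training' only stores the instances."""

    def fit(self, points, labels):
        self.points, self.labels = list(points), list(labels)
        return self

    def predict(self, query):
        # One distance per stored training instance, for every query.
        distances = [math.dist(p, query) for p in self.points]
        self.last_distance_count = len(distances)
        return self.labels[distances.index(min(distances))]

class EagerNearestCentroid:
    """Eager learner (illustrative): training condenses each class into its mean."""

    def fit(self, points, labels):
        groups = {}
        for p, y in zip(points, labels):
            groups.setdefault(y, []).append(p)
        # The model is one centroid per class, computed once up front.
        self.centroids = {
            y: tuple(sum(col) / len(col) for col in zip(*members))
            for y, members in groups.items()
        }
        return self

    def predict(self, query):
        # One distance per class, independent of the training set size.
        distances = {y: math.dist(c, query) for y, c in self.centroids.items()}
        self.last_distance_count = len(distances)
        return min(distances, key=distances.get)

# Hypothetical data: 100 "low" points along y = 0 and 100 "high" points along y = 10.
points = [(float(i), 0.0) for i in range(100)] + [(float(i), 10.0) for i in range(100)]
labels = ["low"] * 100 + ["high"] * 100

lazy = LazyNearestNeighbor().fit(points, labels)
eager = EagerNearestCentroid().fit(points, labels)
query = (42.0, 8.0)

print(lazy.predict(query), lazy.last_distance_count)    # high 200
print(eager.predict(query), eager.last_distance_count)  # high 2
```

The lazy learner's prediction cost grows with the number of stored instances, whereas the eager learner pays its cost once in `fit` and then answers each query with a fixed amount of work.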

Nearest-neighbor classifiers make their predictions based on local information, whereas decision tree and rule-based classifiers attempt to find a global model that fits the entire input space. Because the classification decisions are made locally, nearest-neighbor classifiers (especially with small values of k) are quite susceptible to noise.

Nearest-neighbor classifiers can produce arbitrarily shaped decision boundaries. Such boundaries provide a more flexible model representation than decision tree and rule-based classifiers, which are often constrained to rectilinear decision boundaries.

Nearest-neighbor classifiers can produce wrong predictions unless an appropriate proximity measure is chosen and suitable data preprocessing steps are taken. For instance, suppose we want to classify a group of people based on attributes such as height (measured in meters) and weight (measured in pounds).

The height attribute has low variability, ranging from 1.5 m to 1.85 m, whereas the weight attribute may vary from 90 lb to 250 lb. If the scale of the attributes is not taken into consideration, the proximity measure may be dominated by differences in people's weights.
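The effect of scale can be checked numerically with the ranges given above. The sketch below uses min-max rescaling to the interval [0, 1], which is only one possible preprocessing choice (standardizing to zero mean and unit variance is another); the three people and the helper names are made up for illustration.

```python
import math

HEIGHT_RANGE = (1.5, 1.85)    # meters, range taken from the text
WEIGHT_RANGE = (90.0, 250.0)  # pounds, range taken from the text

def min_max(value, low, high):
    """Map `value` from the interval [low, high] onto [0, 1]."""
    return (value - low) / (high - low)

def rescale(person):
    """Rescale a (height, weight) pair so both attributes span [0, 1]."""
    height, weight = person
    return (min_max(height, *HEIGHT_RANGE), min_max(weight, *WEIGHT_RANGE))

# Hypothetical people as (height in m, weight in lb).
a = (1.50, 150.0)  # shortest possible height
b = (1.85, 152.0)  # tallest possible height, almost the same weight as a
c = (1.51, 160.0)  # almost the same height as a, 10 lb heavier

# Raw attributes: the weight difference dominates, so b looks closer to a.
print(math.dist(a, b) < math.dist(a, c))  # True

# Rescaled attributes: both contribute comparably, so c is closer to a.
print(math.dist(rescale(a), rescale(b)) > math.dist(rescale(a), rescale(c)))  # True
```

On the raw values, a 10 lb difference outweighs a height difference that spans the entire observed range, so the neighbor ordering is driven almost entirely by weight. After rescaling, the ordering reverses.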

Updated on 11-Feb-2022 12:03:43