What are the issues in anomaly detection?


Anomaly detection raises several issues, which are as follows −

Number of Attributes Used to Define an Anomaly − Whether an object is anomalous with respect to a single attribute comes down to whether the object's value for that attribute is anomalous. Because an object can have many attributes, it can have anomalous values for some attributes but ordinary values for others.

Moreover, an object can be anomalous even if none of its attribute values is individually anomalous. For instance, it is common to find people who are two feet tall (children) or who weigh 300 pounds, but uncommon to find a two-foot-tall person who weighs 300 pounds.

The definition of an anomaly should specify how the values of multiple attributes are used to decide whether or not an object is an anomaly. This is a particularly important issue when the dimensionality of the data is high.
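The height and weight example can be sketched in code. The snippet below is only an illustration under stated assumptions: the population is synthetic, the helper names z_score and mahalanobis_2d are made up for this example, and the Mahalanobis distance is used as one possible way to combine two attributes into a single measure.

```python
import math

def z_score(value, reference):
    """Standard score of `value` relative to a list of reference values."""
    mean = sum(reference) / len(reference)
    std = math.sqrt(sum((v - mean) ** 2 for v in reference) / len(reference))
    return (value - mean) / std

def mahalanobis_2d(point, xs, ys):
    """Mahalanobis distance of a 2-D point from the sample (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form d^T S^-1 d, with the 2x2 inverse written out by hand.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Synthetic population (assumption): height in feet, weight in pounds,
# with weight rising roughly linearly with height plus a small wiggle.
heights = [2.0 + 0.05 * i for i in range(91)]
weights = [25 + 60 * (h - 2.0) + 10 * math.sin(i) for i, h in enumerate(heights)]

person = (2.0, 300.0)  # two feet tall, 300 pounds

z_height = z_score(person[0], heights)
z_weight = z_score(person[1], weights)
joint = mahalanobis_2d(person, heights, weights)

print(f"height z-score: {z_height:.2f}")
print(f"weight z-score: {z_weight:.2f}")
print(f"joint Mahalanobis distance: {joint:.2f}")
```

In this synthetic data, each attribute taken alone lies within two standard deviations of the mean, while the joint distance is far larger because the combination contradicts the relationship between height and weight.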

Global versus Local Perspective − An object can appear unusual with respect to all objects, but not with respect to objects in its local neighborhood. For instance, a person whose height is 6 feet 5 inches is unusually tall with respect to the general population, but not with respect to professional basketball players.
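A small sketch of the same idea, assuming made-up height samples (in inches) for the general population and for basketball players. The z-score is used here only as a convenient measure of unusualness; it is not the only choice.

```python
from statistics import mean, pstdev

def z_score(value, reference):
    """Standard score of `value` relative to a reference group."""
    return (value - mean(reference)) / pstdev(reference)

# Made-up heights in inches (assumption, for illustration only).
general_population = [62 + 0.1 * i for i in range(100)]  # about 62 to 72 inches
basketball_players = [74 + 0.5 * i for i in range(15)]   # 74 to 81 inches

height = 77  # 6 feet 5 inches

global_z = z_score(height, general_population)
local_z = z_score(height, basketball_players)

print(f"against the general population: {global_z:.2f}")
print(f"against basketball players:     {local_z:.2f}")
```

The same value is several standard deviations from the mean of the first group and close to the mean of the second, so the verdict depends on which reference set is chosen.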

Degree to Which a Point Is an Anomaly − Some methods report their assessment of an object in a binary fashion: an object is either an anomaly or it is not. Often, this does not reflect the underlying reality that some objects are more extreme anomalies than others. Therefore, it is desirable to have some assessment of the degree to which an object is anomalous. This assessment is called the anomaly score or outlier score.
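The difference between a binary label and a score can be illustrated as follows. This is a sketch, not a prescribed method: the function mad_scores, the sample values, and the threshold of 5 are all assumptions, and distance from the median in units of the median absolute deviation (MAD) is just one of many possible anomaly scores.

```python
from statistics import median

def mad_scores(values):
    """Score each value by its distance from the median, measured in
    units of the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Note: this simple version assumes the MAD is not zero.
    return [abs(v - med) / mad for v in values]

values = [9, 10, 10, 10, 11, 11, 12, 25, 60]
threshold = 5  # hypothetical cut-off for the binary decision

scores = mad_scores(values)
labels = [s > threshold for s in scores]

for v, s, flagged in zip(values, scores, labels):
    print(f"value={v:>3}  score={s:>5.1f}  anomaly={flagged}")
```

Both 25 and 60 receive the same binary label, but their scores differ, which preserves the information that 60 is the more extreme of the two.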

Identifying One Anomaly at a Time versus Many Anomalies at Once − In some methods, anomalies are removed one at a time; i.e., the most anomalous instance is identified and removed, and then the procedure repeats. In other techniques, a set of anomalies is identified together.

Techniques that try to identify one anomaly at a time are often subject to a problem called masking, where the presence of several anomalies masks the presence of all of them. On the other hand, techniques that identify multiple outliers at once can experience swamping, where normal objects are classified as outliers. In model-based methods, these effects can appear because the anomalies distort the data model.
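Masking can be reproduced with a very simple one-at-a-time procedure. The sketch below assumes a repeated "largest z-score" test with an arbitrary threshold of 2.5; it is not a formal statistical test, and the function name and data are made up for illustration.

```python
from statistics import mean, pstdev

def remove_one_at_a_time(values, threshold=2.5):
    """Repeatedly remove the single most extreme value while its z-score
    exceeds `threshold`. Returns the removed values in order."""
    remaining = list(values)
    removed = []
    while len(remaining) > 2:
        mu, sigma = mean(remaining), pstdev(remaining)
        if sigma == 0:
            break
        # The candidate is the value farthest from the current mean.
        candidate = max(remaining, key=lambda v: abs(v - mu))
        if abs(candidate - mu) / sigma <= threshold:
            break
        remaining.remove(candidate)
        removed.append(candidate)
    return removed

one_outlier = [9, 10, 11, 10, 9, 11, 10, 50]
two_outliers = [9, 10, 11, 10, 9, 11, 10, 50, 50]

found_one = remove_one_at_a_time(one_outlier)
found_two = remove_one_at_a_time(two_outliers)

print("one outlier present: ", found_one)
print("two outliers present:", found_two)
```

With a single value of 50 the procedure finds it. With two such values, each one inflates the mean and standard deviation enough that neither exceeds the threshold, so nothing is reported. The model (here, the mean and standard deviation) has been distorted by the anomalies themselves.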

Efficiency − There are significant differences in the computational cost of the various anomaly detection schemes. Classification-based schemes can require significant resources to build the classification model, but are generally inexpensive to apply. Likewise, statistical methods build a statistical model and can then categorize an object in constant time.
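The "build once, apply cheaply" pattern for statistical methods might look like the following. This is a minimal sketch assuming a one-dimensional Gaussian model; the class name GaussianModel and the training values are hypothetical.

```python
from statistics import mean, pstdev

class GaussianModel:
    """Toy statistical model: stores a mean and a standard deviation."""

    def __init__(self, training_values):
        # Building the model reads every training value once,
        # so its cost grows with the size of the training data.
        self.mu = mean(training_values)
        self.sigma = pstdev(training_values)

    def score(self, value):
        # Scoring uses only the two stored parameters, so the cost per
        # object does not depend on the size of the training data.
        return abs(value - self.mu) / self.sigma

model = GaussianModel([10, 12, 11, 9, 13, 10, 11, 12])
scores = [model.score(v) for v in (11, 20)]

print(scores)
```

Once the model is built, the training data is no longer needed to assess a new object.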

raja
Updated on 14-Feb-2022 13:07:37
