How does data mining help in intrusion detection and prevention systems?

An intrusion can be defined as any set of actions that threaten the integrity, confidentiality, or availability of a network resource (e.g., user accounts, file systems, system kernels).

Intrusion detection systems and intrusion prevention systems both monitor network traffic and system performance for malicious activity. The former produces reports, whereas the latter is placed in-line and can actively prevent or block the intrusions it detects.

The main functions of an intrusion prevention system are to identify malicious activity, log information about it, attempt to block or stop it, and report it. Data mining methods can help an intrusion detection and prevention system improve its performance in the following ways −

New data mining algorithms for intrusion detection − Data mining algorithms can be used for both signature-based and anomaly-based detection. In signature-based detection, training data are labeled as “normal” or “intrusion.”

A classifier can then be derived from the labeled data to detect known intrusions. Research in this area has included the application of classification algorithms, association rule mining, and cost-sensitive modeling.
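As a rough illustration of deriving a classifier from labeled records, the sketch below trains a nearest-centroid classifier in plain Python. The feature names (failed logins, kilobytes sent), the training values, and the choice of nearest-centroid are all assumptions made for this example; the research mentioned above covers far richer models.

```python
from math import dist

# Hypothetical labeled training records: (failed_logins, kilobytes_sent) -> label
TRAINING = [
    ((0, 12.0), "normal"),
    ((1, 15.0), "normal"),
    ((0, 9.0), "normal"),
    ((8, 300.0), "intrusion"),
    ((10, 280.0), "intrusion"),
    ((7, 350.0), "intrusion"),
]


def fit_centroids(records):
    """Average the feature vectors of each label into one centroid per class."""
    sums, counts = {}, {}
    for features, label in records:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
print(classify(centroids, (9, 310.0)))
print(classify(centroids, (0, 11.0)))
```

With this toy data the first record is labeled "intrusion" and the second "normal". The features are left unscaled for brevity; with real traffic they would normally be normalized so that one attribute does not dominate the distance.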

Anomaly-based detection constructs models of normal behavior and automatically detects significant deviations from them. Approaches include the application of clustering, outlier analysis, classification algorithms, and statistical methods. The techniques used should be efficient and scalable, and capable of handling network data of high volume, dimensionality, and heterogeneity.
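One of the simplest statistical approaches is to model normal behavior by its mean and standard deviation and flag values that lie far from the mean. The sketch below assumes a single numeric attribute (requests per minute) and a three-sigma threshold; both the data and the threshold are illustrative assumptions.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observations, threshold=3.0):
    """Return observations more than `threshold` standard deviations
    away from the mean of the baseline (the model of normal behavior)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A constant baseline gives no scale to measure deviation against.
        return [x for x in observations if x != mu]
    return [x for x in observations if abs(x - mu) / sigma > threshold]


# Hypothetical requests-per-minute counts observed during normal operation.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
observations = [101, 96, 250, 105]

print(find_anomalies(baseline, observations))
```

Here only the value 250 is reported, since the baseline has a mean of 100.5 and a sample standard deviation of about 2.45.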

Association, correlation, and discriminative pattern analyses help select and build discriminative classifiers − Association, correlation, and discriminative pattern mining can be used to discover relationships among the system attributes that describe the network data. Such findings can provide insight into which attributes are useful for intrusion detection. New attributes derived from aggregated data can also be helpful, such as summary counts of traffic matching a specific pattern.
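To make the idea of a derived attribute concrete, the sketch below aggregates raw connection records into one summary count per source: the number of distinct destination ports contacted. The record layout and the addresses are hypothetical, and this particular attribute is just one plausible example of a summary count.

```python
from collections import defaultdict

# Hypothetical connection records: (source_ip, dest_ip, dest_port)
CONNECTIONS = [
    ("10.0.0.5", "10.0.0.9", 22),
    ("10.0.0.5", "10.0.0.9", 23),
    ("10.0.0.5", "10.0.0.9", 80),
    ("10.0.0.5", "10.0.0.9", 443),
    ("10.0.0.7", "10.0.0.9", 443),
    ("10.0.0.7", "10.0.0.9", 443),
]


def distinct_ports_per_source(connections):
    """Derive a new attribute: how many distinct ports each source contacted."""
    ports = defaultdict(set)
    for src, _dst, port in connections:
        ports[src].add(port)
    return {src: len(seen) for src, seen in ports.items()}


print(distinct_ports_per_source(CONNECTIONS))
```

A source that touches many distinct ports (4 for 10.0.0.5 here, against 1 for 10.0.0.7) may be scanning, so the derived count can be fed to a classifier alongside the original attributes.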

Analysis of stream data − Because intrusions and malicious attacks are transient and dynamic in nature, it is important to perform intrusion detection in the data stream environment. Moreover, an event may be normal on its own but malicious when viewed as part of a sequence of events.

Therefore, it is essential to study which sequences of events are frequently encountered, find sequential patterns, and identify outliers. Data mining methods for discovering evolving clusters and building dynamic classification models in data streams are also essential for real-time intrusion detection.
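The point that an event can be harmless alone but suspicious in sequence can be sketched with a sliding window over a stream. In the example below, a successful login is flagged only when it follows several recent failed logins. The event format, the 60-second window, and the threshold of three failures are assumptions for illustration, and per-user tracking is omitted for brevity.

```python
from collections import deque


def flag_suspicious_logins(events, window_seconds=60, max_failures=3):
    """Return timestamps of successful logins that follow at least
    `max_failures` failed logins within the last `window_seconds`.

    `events` is an iterable of (timestamp_seconds, kind) in time order,
    where kind is "fail" or "success".
    """
    recent_failures = deque()
    flagged = []
    for timestamp, kind in events:
        # Drop failures that have fallen out of the sliding window.
        while recent_failures and timestamp - recent_failures[0] > window_seconds:
            recent_failures.popleft()
        if kind == "fail":
            recent_failures.append(timestamp)
        elif kind == "success":
            if len(recent_failures) >= max_failures:
                flagged.append(timestamp)
            recent_failures.clear()
    return flagged


# Hypothetical event stream: (seconds since start, event kind)
events = [
    (0, "success"),
    (100, "fail"),
    (105, "fail"),
    (110, "fail"),
    (112, "success"),
    (400, "fail"),
    (500, "success"),
]

print(flag_suspicious_logins(events))
```

Only the login at time 112 is flagged. The logins at 0 and 500 are the same kind of event but occur without a burst of failures just before them. Because the function keeps only the failures inside the window, memory use stays bounded as the stream grows.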

Distributed data mining − Intrusions can be launched from several locations and targeted at many different destinations. Distributed data mining methods can be used to analyze network data from multiple network locations to detect these distributed attacks.
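A very reduced sketch of the idea: each sensor reports per-source connection counts that look unremarkable locally, and the counts are merged to reveal a source that is active across many locations. The sensor reports, addresses, and threshold are hypothetical, and real distributed mining methods exchange far richer models than simple counts.

```python
from collections import Counter


def merge_sensor_counts(sensor_reports):
    """Sum per-source connection counts reported by each sensor."""
    total = Counter()
    for report in sensor_reports:
        total.update(report)  # adds counts key by key
    return total


def distributed_suspects(sensor_reports, threshold):
    """Return sources whose combined count across all sensors meets the threshold."""
    total = merge_sensor_counts(sensor_reports)
    return sorted(src for src, count in total.items() if count >= threshold)


# Hypothetical reports from three sensors at different network locations.
reports = [
    {"203.0.113.4": 40, "198.51.100.2": 5},
    {"203.0.113.4": 35},
    {"203.0.113.4": 45, "198.51.100.2": 3},
]

print(distributed_suspects(reports, threshold=100))
```

No single sensor sees more than 45 connections from 203.0.113.4, but the merged total of 120 crosses the threshold, so that source is reported.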

Visualization and querying tools − Visualization tools should be available for viewing any anomalous patterns that are detected. Such tools can include features for viewing associations, discriminative patterns, clusters, and outliers. Intrusion detection systems should also have a graphical user interface that lets security analysts pose queries about the network data or the intrusion detection results.