What are the methods of Data Mining for Intrusion Detection and Prevention?

Data mining is the process of discovering useful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies together with statistical and mathematical techniques. It is the analysis of observational datasets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner.

The security of our computer systems and data is at continual risk. The rapid growth of the Internet and the increasing availability of tools and techniques for intruding into and attacking networks have made intrusion detection and prevention an essential component of networked systems.

An intrusion can be defined as any set of actions that threaten the integrity, confidentiality, or availability of a network resource (e.g., user accounts, file systems, system kernels, etc.). Intrusion detection systems and intrusion prevention systems both monitor network traffic and system activity for malicious behavior. The former produces reports, whereas the latter is placed in-line and is able to actively prevent or block the intrusions it detects.

The main functions of an intrusion prevention system are to identify malicious activity, log information about that activity, attempt to block or stop it, and report it.

Most intrusion detection and prevention systems use either signature-based detection or anomaly-based detection.

Signature-based detection − This method of detection uses signatures, which are attack patterns that are preconfigured and predetermined by domain experts. A signature-based intrusion prevention system monitors the network traffic for matches to these signatures.

Once a match is found, the intrusion detection system reports the anomaly, and an intrusion prevention system takes additional appropriate actions. Because systems are usually quite dynamic, the signatures need to be laboriously updated whenever new software versions arrive, the network configuration changes, or other situations arise.

The major limitation is that such a detection mechanism can only identify cases that match the signatures. It is unable to detect new or previously unknown intrusion techniques.
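The matching step described above can be illustrated with a minimal Python sketch. The two signatures, the names `Signature`, `match_signatures`, and `handle`, and the use of regular expressions over a request string are all assumptions made for illustration; real systems typically use far richer rule languages and inspect packets rather than plain strings.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """A named attack pattern (hypothetical structure for illustration)."""
    name: str
    pattern: re.Pattern


# Hypothetical, hand-written signatures standing in for expert-defined ones.
SIGNATURES = [
    Signature("sql-injection", re.compile(r"union\s+select", re.IGNORECASE)),
    Signature("path-traversal", re.compile(r"\.\./")),
]


def match_signatures(payload, signatures=SIGNATURES):
    """Return the names of all signatures found in the payload."""
    return [s.name for s in signatures if s.pattern.search(payload)]


def handle(payload, prevent=False):
    """Detection only alerts; prevention (in-line) blocks the traffic."""
    hits = match_signatures(payload)
    if not hits:
        return "allow"
    return "block" if prevent else "alert"
```

A payload that matches no signature is allowed through, even if it is malicious, which is exactly the limitation noted above: the sketch cannot flag an attack for which no pattern has been written.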

Anomaly-based detection − This method builds models of normal network behavior (known as profiles) that are used to detect new patterns that deviate significantly from the profiles. Such deviations can represent actual intrusions or simply be new behaviors that need to be added to the profiles.

The advantage of anomaly detection is that it can detect novel intrusions that have not yet been observed. Typically, a human analyst must sort through the deviations to ascertain which represent real intrusions. A limiting factor of anomaly detection is the high percentage of false positives. New patterns of intrusion can be added to the set of signatures to enhance signature-based detection.
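As a rough illustration of profile-based detection, the following Python sketch builds a profile from a single numeric feature and flags values that deviate strongly from it. The choice of feature (for example, requests per minute), the z-score test, the threshold of 3.0, and the function names `build_profile` and `is_anomalous` are assumptions for this example; practical systems model many features and often use more sophisticated data mining techniques.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value whose z-score against the profile exceeds the threshold."""
    mu, sigma = profile
    if sigma == 0:
        # With no variation in the baseline, any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical baseline: requests per minute observed during normal operation.
baseline = [48, 52, 50, 49, 51, 50, 47, 53]
profile = build_profile(baseline)

# Flagged values are candidates only; an analyst still decides whether each
# one is a real intrusion or new normal behavior to add to the profile.
suspicious = [v for v in (51, 57, 120) if is_anomalous(v, profile)]
```

Lowering the threshold catches more deviations but also raises the number of false positives, which reflects the trade-off described above.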