Data Mining - Bayesian Classification
Bayesian classification is based on Bayes' Theorem. Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class.
Bayes' Theorem is named after Thomas Bayes. It involves two types of probability:
Posterior Probability [P(H/X)]
Prior Probability [P(H)]
where X is a data tuple and H is some hypothesis.
According to Bayes' Theorem,
P(H/X) = P(X/H) P(H) / P(X)
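As a small numeric illustration of the theorem, the following sketch computes the posterior P(H/X) from the prior P(H) and the two likelihoods P(X/H) and P(X/not H). The function name `posterior` and all of the numbers are assumptions made up for this example; they do not come from any data set.

```python
def posterior(prior_h, likelihood_x_given_h, likelihood_x_given_not_h):
    """Return P(H/X) = P(X/H) * P(H) / P(X)."""
    # P(X) is expanded with the law of total probability:
    # P(X) = P(X/H) * P(H) + P(X/not H) * P(not H)
    evidence = (likelihood_x_given_h * prior_h
                + likelihood_x_given_not_h * (1.0 - prior_h))
    return likelihood_x_given_h * prior_h / evidence


# Hypothetical values: P(H) = 0.3, P(X/H) = 0.8, P(X/not H) = 0.2
p = posterior(prior_h=0.3, likelihood_x_given_h=0.8, likelihood_x_given_not_h=0.2)
print(round(p, 4))  # 0.6316
```

Observing X raises the probability of H from the prior 0.3 to a posterior of about 0.63, because X is four times more likely under H than under its negation.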
Bayesian Belief Network
A Bayesian belief network specifies joint conditional probability distributions.
Bayesian belief networks are also known as belief networks, Bayesian networks, or probabilistic networks.
A Bayesian belief network allows class conditional independencies to be defined between subsets of variables.
It provides a graphical model of causal relationships on which learning can be performed.
A trained Bayesian network can be used for classification.
Two components define a Bayesian belief network:
Directed acyclic graph
A set of conditional probability tables
Directed Acyclic Graph
Each node in the directed acyclic graph represents a random variable.
These variables may be discrete or continuous valued.
These variables may correspond to actual attributes given in the data.
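One possible way to hold such a graph in code is a mapping from each node to the list of its parents. The sketch below is an illustration only: the node names are borrowed from the lung cancer example in this section (only the four variables named in the text are included), and the helper `topological_order` is a hypothetical name. It uses Kahn's algorithm to confirm that the graph has no cycles.

```python
from collections import deque

# Each node maps to the list of its parent nodes (assumed structure).
parents = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "PositiveXRay": ["LungCancer"],
}


def topological_order(parents):
    """Return the nodes so that every parent precedes its children.

    Raises ValueError if the graph contains a cycle (Kahn's algorithm).
    """
    # Number of parents of each node that have not been emitted yet.
    remaining = {node: len(ps) for node, ps in parents.items()}
    children = {node: [] for node in parents}
    for node, ps in parents.items():
        for parent in ps:
            children[parent].append(node)

    ready = deque(node for node, count in remaining.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(parents):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


order = topological_order(parents)
print(order)  # ['FamilyHistory', 'Smoker', 'LungCancer', 'PositiveXRay']
```

A topological order exists only when the graph is acyclic, so the same routine doubles as a validity check before any probabilities are attached to the nodes.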
Directed Acyclic Graph Representation
The following diagram shows a directed acyclic graph for six boolean variables.
The arcs in the diagram allow causal knowledge to be represented. For example, lung cancer is influenced by a person's family history of lung cancer, as well as by whether or not the person is a smoker. It is worth noting that the variable PositiveXRay is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer.
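The conditional independence described here can be checked numerically. The sketch below assumes a reduced network with only FamilyHistory, Smoker, LungCancer, and PositiveXRay, where the joint probability is the product of each variable's probability given its parents. Every probability value is a made-up placeholder, and the helper names (`bern`, `joint`, `p_xray_given`) are hypothetical.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_FH = 0.1                      # P(FamilyHistory = True)
P_S = 0.3                       # P(Smoker = True)
P_LC = {                        # P(LungCancer = True | FH, S)
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}
P_XRAY = {True: 0.9, False: 0.2}  # P(PositiveXRay = True | LungCancer)


def bern(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(fh, s, lc, xray):
    """Joint probability as a product of each node given its parents."""
    return (bern(P_FH, fh) * bern(P_S, s)
            * bern(P_LC[(fh, s)], lc) * bern(P_XRAY[lc], xray))


def p_xray_given(lc, fh, s):
    """P(PositiveXRay = True | LC, FH, S), derived from the joint."""
    positive = joint(fh, s, lc, True)
    negative = joint(fh, s, lc, False)
    return positive / (positive + negative)


# With LungCancer fixed to True, FH and S no longer change the result.
results = [p_xray_given(True, fh, s) for fh, s in product([True, False], repeat=2)]
print([round(r, 6) for r in results])  # [0.9, 0.9, 0.9, 0.9]
```

Under these assumptions, the four conditional probabilities are identical, which is what independence of PositiveXRay from FamilyHistory and Smoker given LungCancer means.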
Conditional Probability Table Representation
The conditional probability table for the variable LungCancer (LC) gives the probability of each of its values for each possible combination of the values of its parent nodes, FamilyHistory (FH) and Smoker (S).
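A table of this kind might be stored as a dictionary keyed by the parent values, as in the rough sketch below. The probabilities are hypothetical placeholders rather than values from the table itself, and `validate_cpt` and `lookup` are names invented for this example.

```python
# Keys are (FamilyHistory, Smoker); each value is the distribution over
# LungCancer for that parent combination. All numbers are hypothetical.
cpt_lung_cancer = {
    (True, True): {True: 0.8, False: 0.2},
    (True, False): {True: 0.5, False: 0.5},
    (False, True): {True: 0.7, False: 0.3},
    (False, False): {True: 0.1, False: 0.9},
}


def validate_cpt(cpt, tol=1e-9):
    """Check that the distribution for every parent combination sums to 1."""
    for parent_values, dist in cpt.items():
        total = sum(dist.values())
        if abs(total - 1.0) > tol:
            raise ValueError(f"entry {parent_values} sums to {total}, expected 1")


def lookup(cpt, lc, fh, s):
    """Return P(LungCancer = lc | FamilyHistory = fh, Smoker = s)."""
    return cpt[(fh, s)][lc]


validate_cpt(cpt_lung_cancer)
print(lookup(cpt_lung_cancer, lc=True, fh=False, s=True))  # 0.7
```

With two Boolean parents there are four parent combinations, and each one must carry a complete distribution over LC, which is what the validation step checks.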