How to create a confusion matrix for an rpart model in R?


To create a confusion matrix for an rpart model, first find the predicted values, then cross-tabulate them against the response variable in the original data. The resulting table is the confusion matrix for the model.

For example, if the predicted values are stored in a vector P and the original values in column O of a data frame df, the confusion matrix can be created with the following command:

table(P,df$O)
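
As a minimal sketch of this command, the snippet below uses small hand-written vectors (P and df$O here are hypothetical values chosen for illustration, not model output). Naming the arguments to table() labels the rows and columns, which makes it easier to see which dimension holds the predictions:

```r
# Hypothetical predicted (P) and original (O) class labels, for illustration only
P <- factor(c(0, 1, 1, 0, 1, 0), levels = c(0, 1))
df <- data.frame(O = factor(c(0, 1, 0, 0, 1, 1), levels = c(0, 1)))

# Argument names become dimension labels: rows = predicted, columns = actual
cm <- table(Predicted = P, Actual = df$O)
print(cm)
```

The first argument always forms the rows, so swapping the two arguments transposes the matrix.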

Check out the examples below to understand how it can be done.

Example 1

The following snippet creates a sample data frame:

Dep_Var1<-factor(sample(0:1,20,replace=TRUE))
Indep_Var1<-rpois(20,5)
df1<-data.frame(Dep_Var1,Indep_Var1)
df1

The following data frame is created (the values are random, so they will differ on each run):

   Dep_Var1 Indep_Var1
1         0          4
2         0         11
3         0          3
4         0          6
5         0          3
6         1          5
7         1          4
8         0          4
9         1          5
10        0          3
11        1          5
12        1          4
13        0          2
14        0          3
15        0          5
16        1          9
17        1          5
18        0          9
19        1          2
20        0          2

To load the rpart package, extend the snippet as follows:

Dep_Var1<-factor(sample(0:1,20,replace=TRUE))
Indep_Var1<-rpois(20,5)
df1<-data.frame(Dep_Var1,Indep_Var1)
library(rpart)

To create the rpart model and find the predicted values for the data in df1, extend the snippet as follows:

Dep_Var1<-factor(sample(0:1,20,replace=TRUE))
Indep_Var1<-rpois(20,5)
df1<-data.frame(Dep_Var1,Indep_Var1)
library(rpart)
Model_1<-rpart(Dep_Var1~Indep_Var1,data=df1)
Prediction_Model_1<-predict(Model_1,type="class")

To create the confusion matrix, extend the snippet as follows:

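# Sample data: a binary response and one numeric predictor
Dep_Var1<-factor(sample(0:1,20,replace=TRUE))
Indep_Var1<-rpois(20,5)
df1<-data.frame(Dep_Var1,Indep_Var1)
library(rpart)
# Fit a classification tree and predict the class of each row in df1
Model_1<-rpart(Dep_Var1~Indep_Var1,data=df1)
Prediction_Model_1<-predict(Model_1,type="class")
# Rows are predicted classes, columns are actual classes
table(Prediction_Model_1,df1$Dep_Var1)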

Output

Running the complete snippet above produces output like the following, where rows are the predicted classes and columns are the actual classes (the exact counts depend on the random data):

Prediction_Model_1 0 1
                 0 6 1
                 1 6 7
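
A common next step is to summarise the matrix as overall accuracy, the share of counts on the diagonal. The sketch below re-enters the counts printed above by hand; the names cm1 and accuracy are introduced here for illustration and are not part of the original snippet:

```r
# Counts copied from the output above (rows = predicted, columns = actual).
# matrix() fills column by column, so the first column is c(6, 6).
cm1 <- matrix(c(6, 6, 1, 7), nrow = 2,
              dimnames = list(Predicted = c("0", "1"), Actual = c("0", "1")))

# Correct predictions lie on the diagonal
accuracy <- sum(diag(cm1)) / sum(cm1)
accuracy  # 13 correct out of 20, i.e. 0.65
```

With a live table object, the same expression works directly on the result of table(Prediction_Model_1,df1$Dep_Var1).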

Example 2

The following snippet creates a sample data frame:

Dep_Var2<-factor(sample(0:1,20,replace=TRUE))
Indep_Var2<-rnorm(20)
df2<-data.frame(Dep_Var2,Indep_Var2)
df2

The following data frame is created (the values are random, so they will differ on each run):

   Dep_Var2   Indep_Var2
1         0  1.139577556
2         1  0.006968284
3         1  0.438159515
4         1  0.599715153
5         1  1.870112573
6         0 -0.810537941
7         0 -0.733628480
8         1  0.625663690
9         1  0.696501333
10        1 -0.967849897
11        1 -2.392595836
12        1  1.459343862
13        1 -0.026408590
14        0 -1.254218214
15        0 -0.865296394
16        0  0.443057916
17        0  1.172367014
18        0  1.334406228
19        1  1.262094268
20        0  0.887480542

To create the rpart model and find the predicted values for the data in df2, extend the snippet as follows:

Dep_Var2<-factor(sample(0:1,20,replace=TRUE))
Indep_Var2<-rnorm(20)
df2<-data.frame(Dep_Var2,Indep_Var2)
library(rpart)
Model_2<-rpart(Dep_Var2~Indep_Var2,data=df2)
Prediction_Model_2<-predict(Model_2,type="class")

To create the confusion matrix, extend the snippet as follows:

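# Sample data: a binary response and one standard-normal predictor
Dep_Var2<-factor(sample(0:1,20,replace=TRUE))
Indep_Var2<-rnorm(20)
df2<-data.frame(Dep_Var2,Indep_Var2)
library(rpart)
# Fit a classification tree and predict the class of each row in df2
Model_2<-rpart(Dep_Var2~Indep_Var2,data=df2)
Prediction_Model_2<-predict(Model_2,type="class")
# Rows are predicted classes, columns are actual classes
table(Prediction_Model_2,df2$Dep_Var2)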

Output

Running the complete snippet above produces output like the following, where rows are the predicted classes and columns are the actual classes (the exact counts depend on the random data):

Prediction_Model_2 0 1
                 0 4 3
                 1 5 8
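
Beyond overall accuracy, the four cells can be combined into per-class rates. The sketch below re-enters the counts printed above by hand and treats class 1 as the "positive" class, which is an assumption made here for illustration; the names cm2, TP, FN, FP and TN are likewise hypothetical:

```r
# Counts copied from the output above (rows = predicted, columns = actual)
cm2 <- matrix(c(4, 5, 3, 8), nrow = 2,
              dimnames = list(Predicted = c("0", "1"), Actual = c("0", "1")))

# Treat class "1" as the positive class (an assumption for this sketch)
TP <- cm2["1", "1"]  # predicted 1, actually 1
FN <- cm2["0", "1"]  # predicted 0, actually 1
FP <- cm2["1", "0"]  # predicted 1, actually 0
TN <- cm2["0", "0"]  # predicted 0, actually 0

sensitivity <- TP / (TP + FN)  # share of actual 1s predicted as 1: 8 / 11
specificity <- TN / (TN + FP)  # share of actual 0s predicted as 0: 4 / 9
c(sensitivity = sensitivity, specificity = specificity)
```

Indexing by dimension name rather than position guards against mixing up rows and columns.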

Example 3

The following snippet creates a sample data frame:

Dep_Var3<-factor(sample(0:1,20,replace=TRUE))
Indep_Var3<-sample(501:1000,20)
df3<-data.frame(Dep_Var3,Indep_Var3)
df3

The following data frame is created (the values are random, so they will differ on each run):

   Dep_Var3 Indep_Var3
1         1        530
2         0        554
3         0        510
4         1        782
5         0        648
6         1        546
7         1        762
8         0        666
9         1        733
10        0        928
11        0        902
12        1        602
13        1        933
14        1        987
15        1        743
16        0        515
17        1        867
18        1        945
19        0        503
20        1        512

To create the rpart model and find the predicted values for the data in df3, extend the snippet as follows:

Dep_Var3<-factor(sample(0:1,20,replace=TRUE))
Indep_Var3<-sample(501:1000,20)
df3<-data.frame(Dep_Var3,Indep_Var3)
library(rpart)
Model_3<-rpart(Dep_Var3~Indep_Var3,data=df3)
Prediction_Model_3<-predict(Model_3,type="class")

To create the confusion matrix, extend the snippet as follows:

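# Sample data: a binary response and one integer predictor
Dep_Var3<-factor(sample(0:1,20,replace=TRUE))
Indep_Var3<-sample(501:1000,20)
df3<-data.frame(Dep_Var3,Indep_Var3)
library(rpart)
# Fit a classification tree and predict the class of each row in df3
Model_3<-rpart(Dep_Var3~Indep_Var3,data=df3)
Prediction_Model_3<-predict(Model_3,type="class")
# Rows are predicted classes, columns are actual classes
table(Prediction_Model_3,df3$Dep_Var3)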

Output

Running the complete snippet above produces output like the following, where rows are the predicted classes and columns are the actual classes (the exact counts depend on the random data):

Prediction_Model_3 0 1
                 0 6 4
                 1 2 8
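
All three examples predict on the same rows the model was trained on and use unseeded random data, so the counts change on every run and tend to flatter the model. The sketch below is one possible variation, not part of the original examples: it fixes a seed with set.seed() and builds the confusion matrix on held-out rows via the newdata argument of predict(). The seed value, the sample size of 200, the 70/30 split and all object names are assumptions chosen for illustration:

```r
library(rpart)

set.seed(123)  # arbitrary seed so the random data is identical on every run

# Same construction as Example 3, but with more rows so a split is meaningful
n <- 200
Dep_Var <- factor(sample(0:1, n, replace = TRUE), levels = c(0, 1))
Indep_Var <- sample(501:1000, n)
df_demo <- data.frame(Dep_Var, Indep_Var)

# Hold out 30% of the rows for testing
test_rows <- sample(n, size = round(0.3 * n))
train <- df_demo[-test_rows, ]
test <- df_demo[test_rows, ]

# Fit on the training rows only, then predict classes for the held-out rows
Model <- rpart(Dep_Var ~ Indep_Var, data = train)
Prediction <- predict(Model, newdata = test, type = "class")

# Rows = predicted class, columns = actual class in the test set
cm <- table(Predicted = Prediction, Actual = test$Dep_Var)
cm
```

Because the response here is random noise, the held-out matrix should show roughly chance-level performance; with real data the same steps give a more honest estimate than predicting on the training rows.
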
raja
Published on 02-Nov-2021 08:37:32