Find the frequency of unique values and missing values for each column in an R data frame.


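To find the frequency of unique values and missing values for each column in an R data frame, we can use the apply function with the table function and the useNA argument set to "always". The apply function calls table on every column (margin 2), and useNA="always" makes table report a count for NA even when a column has no missing values. The result is a matrix when every column's table has the same length, and a list otherwise (as in Example 2 below).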

For example, if we have a data frame called df, we can find the frequency of unique values and missing values for each column in df by using the following command −

apply(df,2,table,useNA="always")
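
As a minimal sketch of the command above, the snippet below builds a small data frame with fixed values so that the counts are reproducible. The names df, a, b and counts are illustrative assumptions and do not come from the examples that follow.

```r
# Hypothetical data frame with fixed values, so the counts do not change between runs
df <- data.frame(
  a = c(1, 2, NA, 1, 2, 1),
  b = c(NA, 1, 1, 2, NA, 2)
)

# Tabulate every column (margin 2) and always include a count for NA
counts <- apply(df, 2, table, useNA = "always")
print(counts)
```

Both columns contain the same distinct values here (1, 2 and NA), so apply simplifies the result to a matrix with one row per value and one column per variable.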

Example 1

The following snippet creates a sample data frame −

x1<-sample(c(NA,1,2),20,replace=TRUE)
x2<-sample(c(NA,1,2),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

The following data frame is created −

   x1 x2
 1 1  NA
 2 1   1
 3 2   2
 4 2   2
 5 NA  1
 6 1   1
 7 1   1
 8 1  NA
 9 NA  1
10 1   2
11 2   1
12 2  NA
13 1   2
14 1  NA
15 1  NA
16 NA NA
17 NA  1
18 1   2
19 2   1
20 NA NA

To find the frequency of unique values and missing values for each column in df1, add the following code to the above snippet. Because sample() draws new random values on every call, reuse the df1 created above instead of re-running the sample() lines −

apply(df1,2,table,useNA="always")

Output

If you execute all the above given snippets as a single program, it generates the following output for the data frame shown above (your counts will differ because the sample is random) −

     x1 x2
1    10  8
2     5  5
<NA>  5  7
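
Because the examples build their data with sample(), each run produces different values. One way to make a run repeatable, sketched below, is to call set.seed() before sampling. The seed value 123 is an arbitrary choice for illustration and is not part of the original example.

```r
# Fix the random number generator state so sample() returns the same values on every run
# (123 is an arbitrary seed chosen for illustration)
set.seed(123)
x1 <- sample(c(NA, 1, 2), 20, replace = TRUE)
x2 <- sample(c(NA, 1, 2), 20, replace = TRUE)
df1 <- data.frame(x1, x2)

# Count each distinct value, including NA, in every column
counts <- apply(df1, 2, table, useNA = "always")
print(counts)
```

Whatever the seed, the counts in each column add up to 20, the number of rows in df1.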

Example 2

The following snippet creates a sample data frame −

y1<-sample(c(NA,5,10),20,replace=TRUE)
y2<-sample(c(NA,5,10,20),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2

The following data frame is created −

   y1 y2
 1 5  NA
 2 NA NA
 3 10 NA
 4 5   5
 5 5  NA
 6 5   5
 7 5  10
 8 NA 10
 9 NA 20
10 5  10
11 10 NA
12 NA  5
13 NA NA
14 10 10
15 10 10
16 10  5
17 NA 10
18 10 10
19 5  20
20 NA 10

To find the frequency of unique values and missing values for each column in df2, add the following code to the above snippet, reusing the df2 already created rather than sampling again −

apply(df2,2,table,useNA="always")

Output

If you execute all the above given snippets as a single program, it generates the following output for the data frame shown above −

$y1
   5   10 <NA>
   7    6    7

$y2
   5   10   20 <NA>
   4    8    2    6
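
The output here is a list rather than a matrix because y2 has one more distinct value than y1, so the two tables differ in length and apply cannot combine them into a matrix. If a consistent return type is preferred, one option is lapply, which returns a list for any data frame. The sketch below uses fixed, hypothetical values (not the sampled data above) to show both calls.

```r
# Hypothetical fixed data: y2 has one more distinct value (20) than y1
df <- data.frame(
  y1 = c(5, 10, NA, 5, 10, NA),
  y2 = c(5, 10, 20, NA, 5, 10)
)

# The per-column tables have different lengths, so apply() returns a list
res_apply <- apply(df, 2, table, useNA = "always")
print(is.list(res_apply))

# lapply() returns a list with one table per column, whatever the values are
res_lapply <- lapply(df, table, useNA = "always")
print(res_lapply$y2)
```

Individual tables can then be accessed by column name, for example res_lapply$y1.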

Example 3

The following snippet creates a sample data frame −

z1<-sample(c(NA,25,45),20,replace=TRUE)
z2<-sample(c(NA,25,45),20,replace=TRUE)
df3<-data.frame(z1,z2)
df3

The following data frame is created −

   z1 z2
 1 45 NA
 2 NA NA
 3 25 25
 4 25 25
 5 NA NA
 6 25 NA
 7 NA 45
 8 25 NA
 9 25 25
10 NA 45
11 45 25
12 25 25
13 25 45
14 NA 25
15 45 NA
16 NA 45
17 25 45
18 25 NA
19 45 NA
20 NA 45

To find the frequency of unique values and missing values for each column in df3, add the following code to the above snippet, reusing the df3 already created rather than sampling again −

apply(df3,2,table,useNA="always")

Output

If you execute all the above given snippets as a single program, it generates the following output for the data frame shown above −

     z1 z2
25    9  6
45    4  6
<NA>  7  8
raja
Updated on 10-Nov-2021 06:36:06
