How to find the frequency for all columns based on a condition in R?


To find the frequency of values that satisfy a condition in every column of a data frame, we can use a for loop that subsets each column by the condition and stores the length of the result.

For example, if we have a data frame called df1 and we want to find the number of values in each column that are greater than 5, we can use the following command −

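# Vector that will hold one count per column
Columns <- vector()
for(i in 1:ncol(df1)){
   # Keep the values of column i that are greater than 5 and count them
   Columns[i] <- length(df1[df1[,i] > 5, i])
}
Columns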
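
The same counts can also be obtained without an explicit loop. The sketch below is one possible alternative, not part of the original method: it assumes every column is numeric and uses a hypothetical data frame named df_demo. Comparing a data frame with a number gives a logical matrix, and colSums() adds up the TRUE values in each column.

```r
# Hypothetical data frame used only for illustration
df_demo <- data.frame(x = c(1, 6, 8, 3), y = c(7, 9, 10, 2), z = c(0, 1, 2, 3))

# Loop-based count, following the approach described above
loop_counts <- vector()
for (i in 1:ncol(df_demo)) {
  loop_counts[i] <- length(df_demo[df_demo[, i] > 5, i])
}

# Vectorized count: df_demo > 5 is a logical matrix,
# and colSums() counts the TRUE values in each column
vec_counts <- colSums(df_demo > 5)

loop_counts  # 2 3 0
vec_counts   # named vector: x = 2, y = 3, z = 0
```

Both approaches give the same numbers here; colSums() additionally keeps the column names, which can make the result easier to read.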

Example 1

The following snippet creates a sample data frame −

x1<-rpois(20,1)
x2<-rpois(20,2)
x3<-rpois(20,3)
df1<-data.frame(x1,x2,x3)
df1

The following data frame is created −

   x1 x2 x3
1   1  1  1
2   0  1  3
3   1  3  3
4   2  4  2
5   1  2  1
6   0  7  0
7   1  1  2
8   2  1  3
9   0  6  1
10  0  5  3
11  2  1  4
12  2  2 10
13  1  1  4
14  1  2  3
15  0  2  2
16  0  2  3
17  0  1  3
18  0  4  4
19  0  4  6
20  3  1  3
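
Since rpois() generates random values, the data frame above changes on every run. One common way to make such an example repeatable is to call set.seed() before drawing the values. The sketch below assumes an arbitrary seed of 123 and a hypothetical helper named make_df; neither comes from the original example, so the values it produces will not match the table above.

```r
# Hypothetical helper: builds the sample data frame after fixing the seed
make_df <- function(seed) {
  set.seed(seed)          # fix the random number generator state
  x1 <- rpois(20, 1)
  x2 <- rpois(20, 2)
  x3 <- rpois(20, 3)
  data.frame(x1, x2, x3)
}

# The seed value 123 is an arbitrary, illustrative choice
first_run <- make_df(123)
second_run <- make_df(123)

# Check whether both runs produced exactly the same data frame
identical(first_run, second_run)
```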

To count the values greater than 2 in each column of df1, add the following code to the above snippet −

Columns1 <- vector()
for(i in 1:ncol(df1)){
   # Count the values of column i that are greater than 2
   Columns1[i] <- length(df1[df1[,i] > 2, i])
}
Columns1

Output

Because rpois() draws random values, your data frame and counts will differ from run to run. For the data frame shown above, the output is −

[1] 1 7 13

Example 2

The following snippet creates a sample data frame −

y1<-rnorm(20)
y2<-rnorm(20)
y3<-rnorm(20)
df2<-data.frame(y1,y2,y3)
df2

The following data frame is created −

       y1         y2          y3
1  -0.7446072   0.2772768  -0.2099932
2   0.4497256  -1.5064792  -0.7166337
3   0.8316262  -1.0904581   0.5837854
4  -0.2955840   1.8329734   1.9440828
5   1.4989187   0.7655811  -1.7222717
6   1.6513081  -1.4800745   0.9092251
7   0.7703807  -1.3972957  -0.6070779
8   0.8522162  -0.3482059  -0.7727520
9  -0.8581488   1.6068537  -2.3097855
10 -0.6890322   1.8891767  -1.3816252
11 -0.2896339   1.9209137   0.5935030
12 -0.9241086  -2.0833818   0.7365296
13 -1.1093938   1.4950127   1.5394590
14 -0.1203023  -0.7265817  -0.1850344
15 -0.1747876  -0.3429473   0.9155441
16  0.2678002  -0.4080068  -0.5372238
17  0.1292888   0.8621264  -1.0343519
18  1.0656223   0.3492514  -1.8643609
19 -1.0106256   0.3237296  -0.3930171
20  0.7498458  -0.1454423  -1.2903053

To count the values greater than 0.5 in each column of df2, add the following code to the above snippet −

Columns2 <- vector()
for(i in 1:ncol(df2)){
   # Count the values of column i that are greater than 0.5
   Columns2[i] <- length(df2[df2[,i] > 0.5, i])
}
Columns2

Output

Because rnorm() draws random values, your data frame and counts will differ from run to run. For the data frame shown above, the output is −

[1] 7 7 7
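
One caveat, offered as a side note rather than as part of the original examples: if a column contains missing values, subsetting by a condition keeps an NA for each missing entry, so length() counts them as well. A minimal sketch, using a hypothetical data frame df_na, shows the difference and one way to ignore missing values with sum(..., na.rm = TRUE).

```r
# Hypothetical data frame with missing values, for illustration only
df_na <- data.frame(u = c(1, NA, 5, 7), v = c(3, 4, NA, NA))

with_na <- vector()
no_na <- vector()
for (i in 1:ncol(df_na)) {
  # Subsetting keeps an NA for every missing value, so they are counted too
  with_na[i] <- length(df_na[df_na[, i] > 2, i])
  # Summing the logical comparison with na.rm = TRUE skips missing values
  no_na[i] <- sum(df_na[, i] > 2, na.rm = TRUE)
}

with_na  # 3 4
no_na    # 2 2
```

The sample data frames in this article contain no missing values, so both forms would give the same counts for them.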

Updated on: 05-Nov-2021
