How to find the number of columns of an R data frame that satisfy a condition based on row values?

Sometimes we want to count, for each row of a data frame, how many columns hold a value that meets some condition. For example, if a data frame has three columns and fifty rows of integers between 1 and 100, we might want to know, for each row, how many columns have a value greater than 20. This can be done with the rowSums function: a comparison such as df>20 returns a logical matrix, and rowSums adds up the TRUE values (each counted as 1) in every row.
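The scenario described above can be sketched as follows. The data frame name scores, its column names, and the seed are assumptions made only for this illustration; the values are randomly generated.

```r
# Hypothetical data: 50 rows, three columns of integers between 1 and 100
set.seed(1)  # arbitrary seed, used only to make the sketch repeatable
scores <- data.frame(v1 = sample(1:100, 50, replace = TRUE),
                     v2 = sample(1:100, 50, replace = TRUE),
                     v3 = sample(1:100, 50, replace = TRUE))

# scores > 20 is a logical matrix with the same shape as scores;
# rowSums() counts the TRUE values in each row
above_20 <- rowSums(scores > 20)

# One count per row, each between 0 and 3
head(above_20)
```

Each element of above_20 is the number of columns in that row whose value exceeds 20.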

Example

Consider the below data frame −

> x1<-sample(1:10,20,replace=TRUE)
> x2<-sample(1:100,20)
> x3<-rpois(20,5)
> df<-data.frame(x1,x2,x3)
> df

Output

   x1 x2 x3
1   9 72  9
2   5 20  6
3   3 82  4
4   5 47  4
5   1 45 10
6   6 14  6
7  10 54  7
8  10 13  6
9   4 98  5
10  4 76  5
11  5 53  5
12  9 87  2
13  3 79  6
14  2 73  5
15 10 75  3
16  1  7  2
17  5 92  7
18  5 34  5
19  9 52  5
20  5 43  4

Adding a new column to df with the number of columns having values greater than 5 −

Example

> df$Number_of_columns_LargerThan5<-rowSums(df>5)
> df

Output

   x1 x2 x3 Number_of_columns_LargerThan5
1   9 72  9                             3
2   5 20  6                             2
3   3 82  4                             1
4   5 47  4                             1
5   1 45 10                             2
6   6 14  6                             3
7  10 54  7                             3
8  10 13  6                             3
9   4 98  5                             1
10  4 76  5                             1
11  5 53  5                             1
12  9 87  2                             2
13  3 79  6                             2
14  2 73  5                             1
15 10 75  3                             2
16  1  7  2                             1
17  5 92  7                             2
18  5 34  5                             1
19  9 52  5                             2
20  5 43  4                             1
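If the data contain missing values, a comparison with NA yields NA, and rowSums then returns NA for that row unless na.rm = TRUE is supplied. A minimal sketch, assuming a small hypothetical data frame df_na that is not part of the example above:

```r
# Hypothetical data frame with a missing value in each of the first two rows
df_na <- data.frame(x1 = c(9, NA, 3),
                    x2 = c(72, 20, 82),
                    x3 = c(NA, 6, 4))

# Default behaviour: any NA in a row makes that row's count NA
counts_default <- rowSums(df_na > 5)

# na.rm = TRUE drops the NA comparisons and counts the remaining TRUE values
counts_na_rm <- rowSums(df_na > 5, na.rm = TRUE)

counts_default
counts_na_rm
```

With na.rm = TRUE a missing value is treated as not satisfying the condition, which may or may not be the right choice for a given analysis.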

Adding a new column to df with the number of columns having values less than 5. Since df now also contains Number_of_columns_LargerThan5, whose values are all below 5, the comparison is restricted to x1, x2 and x3 so that this count column is not included −

Example

> df$Number_of_columns_LessThan5<-rowSums(df[,c("x1","x2","x3")]<5)
> df

Output

   x1 x2 x3 Number_of_columns_LargerThan5 Number_of_columns_LessThan5
1   9 72  9                             3                           0
2   5 20  6                             2                           0
3   3 82  4                             1                           2
4   5 47  4                             1                           1
5   1 45 10                             2                           1
6   6 14  6                             3                           0
7  10 54  7                             3                           0
8  10 13  6                             3                           0
9   4 98  5                             1                           1
10  4 76  5                             1                           1
11  5 53  5                             1                           0
12  9 87  2                             2                           1
13  3 79  6                             2                           1
14  2 73  5                             1                           1
15 10 75  3                             2                           1
16  1  7  2                             1                           2
17  5 92  7                             2                           0
18  5 34  5                             1                           0
19  9 52  5                             2                           0
20  5 43  4                             1                           1
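The same approach extends to compound conditions, because logical matrices can be combined element-wise with & or | before calling rowSums. A minimal sketch, assuming a hypothetical data frame vals and counting the columns whose values lie between 5 and 50 inclusive:

```r
# Hypothetical data frame used only for this illustration
vals <- data.frame(x1 = c(9, 5, 3),
                   x2 = c(72, 20, 82),
                   x3 = c(9, 6, 4))

# Each comparison gives a logical matrix; & combines them element by element,
# and rowSums() counts the positions where both conditions are TRUE
in_range <- rowSums(vals >= 5 & vals <= 50)
in_range
```

Here the first row has two values in the range (9 and 9), the second has three, and the third has none.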

Let’s have a look at another example −

Example

> y1<-sample(1:100,20)
> y2<-sample(1:1000,20)
> df_y<-data.frame(y1,y2)
> df_y

Output

    y1  y2
1   33 663
2   20 523
3   24 791
4  100 330
5   48 264
6   32 579
7   56  51
8   94  57
9   76 711
10  58 411
11  49 849
12  63 805
13  67 696
14   1 237
15  11 147
16  12 448
17  75 465
18  65 220
19  99 958
20  34 909
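
Adding a new column to df_y with the number of columns having values less than or equal to 50 −

Example

> df_y$Number_of_columns_less_than_equalto_50<-rowSums(df_y<=50)
> df_y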

Output

    y1  y2 Number_of_columns_less_than_equalto_50
1   33 663                                      1
2   20 523                                      1
3   24 791                                      1
4  100 330                                      0
5   48 264                                      1
6   32 579                                      1
7   56  51                                      0
8   94  57                                      0
9   76 711                                      0
10  58 411                                      0
11  49 849                                      1
12  63 805                                      0
13  67 696                                      0
14   1 237                                      1
15  11 147                                      1
16  12 448                                      1
17  75 465                                      0
18  65 220                                      0
19  99 958                                      0
20  34 909                                      1
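
The examples above use data frames that contain only numeric columns. When a data frame also holds non-numeric columns, comparing them against a number is rarely meaningful, so one option is to restrict the comparison to the numeric columns first. A minimal sketch, assuming a hypothetical data frame mixed with a character id column:

```r
# Hypothetical data frame with one character column and two numeric columns
mixed <- data.frame(id = c("a", "b", "c"),
                    y1 = c(33, 100, 56),
                    y2 = c(663, 330, 51),
                    stringsAsFactors = FALSE)

# Logical vector marking which columns are numeric
num_cols <- sapply(mixed, is.numeric)

# Compare only the numeric columns, then count the TRUE values per row
n_le_50 <- rowSums(mixed[num_cols] <= 50)
n_le_50
```

Only y1 and y2 take part in the count, so the first row gives 1 and the other two rows give 0.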
raja
Published on 04-Sep-2020 11:06:33