How to detect a binary column defined with 0 and 1 in an R data frame?


If a column in an R data frame contains only the two values 0 and 1, we call it a binary column. A binary column does not have to be coded as 0 and 1, but that is the general convention. To detect binary columns coded as 0 and 1 in an R data frame, we can use the apply function together with all and %in%, as shown in the examples below.

Example

Consider the below data frame −

x1<-sample(0:1,20,replace=TRUE)
x2<-rnorm(20,1,0.57)
x3<-sample(1:5,20,replace=TRUE)
x4<-rpois(20,5)
x5<-rpois(20,1)
df1<-data.frame(x1,x2,x3,x4,x5)
df1

Output

   x1        x2 x3 x4 x5
1   1 0.6203256  4  4  0
2   1 0.6133840  3  6  0
3   1 2.2600124  2  5  1
4   1 0.9189756  4  0  0
5   0 0.6335537  1  4  2
6   1 0.6631676  2  3  1
7   1 1.2910049  1  3  1
8   1 1.6811408  5  5  1
9   0 0.9246393  5  8  0
10  1 1.7744186  2  5  1
11  0 1.5409120  1  8  0
12  1 0.5852456  2  9  0
13  1 0.6707639  1  2  1
14  1 1.4045163  3  7  3
15  0 1.1463694  5  4  3
16  0 1.4744266  4  3  2
17  1 1.7846723  2  6  0
18  1 1.2694807  2  4  0
19  1 1.2146714  1  6  0
20  0 1.2323528  4  6  0

Detecting which column in df1 is binary −

Example

apply(df1,2,function(x) {all(x %in% 0:1)})

Output

   x1    x2    x3    x4    x5
 TRUE FALSE FALSE FALSE FALSE
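
apply converts the data frame to a matrix before checking the columns, and a column with a missing value is reported as FALSE because NA %in% 0:1 is FALSE. The following is a minimal sketch of a column-wise variant that uses sapply and can optionally ignore missing values. The helper name is_binary, the na.rm argument, and the demo data frame are assumptions made for illustration only, not part of the example above.

```r
# Hypothetical helper: TRUE if every value of x is 0 or 1.
# With na.rm = TRUE, missing values are dropped before the check.
# An empty column (or one that is entirely NA after dropping) is not
# treated as binary.
is_binary <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  length(x) > 0 && all(x %in% 0:1)
}

# Small illustrative data frame (not taken from the example above)
demo <- data.frame(
  a = c(0, 1, 1, 0),              # binary
  b = c(0, 1, NA, 1),             # binary apart from one NA
  c = c(2, 0, 1, 1),              # contains a value other than 0 and 1
  d = c("yes", "no", "yes", "no") # two values, but not coded as 0 and 1
)

strict  <- sapply(demo, is_binary)               # NA counts against a column
lenient <- sapply(demo, is_binary, na.rm = TRUE) # NA is ignored
print(strict)
print(lenient)
```

In this sketch only column a passes the strict check, while a and b pass once missing values are ignored. Whether NA should disqualify a column depends on the data, so the default here keeps the same behaviour as the apply call.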

Example

y1<-rnorm(20)
y2<-sample(0:1,20,replace=TRUE)
y3<-sample(0:1,20,replace=TRUE)
y4<-sample(0:1,20,replace=TRUE)
y5<-rexp(20,1.38)
df2<-data.frame(y1,y2,y3,y4,y5)
df2

Output

            y1 y2 y3 y4         y5
1  -0.41990195  1  1  0 0.17901907
2  -0.82665045  0  0  0 0.61638486
3   0.30680950  0  1  1 0.46992402
4   1.00525636  1  0  0 0.30043897
5   0.88557771  0  1  1 0.30998419
6   0.36112442  0  1  1 0.48023858
7  -1.13961239  1  1  1 1.12290153
8  -0.27722960  1  0  0 0.61928866
9  -0.42098660  0  0  1 0.12449119
10  0.17711381  0  0  0 0.07402737
11 -0.37249602  0  1  1 2.71841887
12  1.16715519  0  0  0 0.43469615
13  1.14925253  0  1  0 1.78815398
14  0.35758175  1  1  1 2.73568363
15  0.42990962  0  1  1 2.08840339
16 -0.57891804  1  1  1 1.45000159
17 -0.84741815  1  0  0 0.33356979
18  0.97735362  0  1  1 0.19783634
19  0.92523768  1  0  1 0.14692741
20 -0.07879863  0  0  0 0.49631861

Detecting which columns in df2 are binary −

Example

apply(df2,2,function(x) {all(x %in% 0:1)})

Output

   y1    y2    y3    y4    y5
FALSE  TRUE  TRUE  TRUE FALSE
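
The result of the check is a named logical vector, so it can be used directly to list or extract the binary columns. The following is a small sketch of that step. The data frame toy and the names flags, binary_names and binary_only are hypothetical, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Illustrative data frame: one continuous column and two 0/1 columns
toy <- data.frame(
  u = rnorm(6),
  v = sample(0:1, 6, replace = TRUE),
  w = c(0, 1, 0, 1, 1, 0)
)

# Same check as in the examples above
flags <- apply(toy, 2, function(x) {all(x %in% 0:1)})

binary_names <- names(flags)[flags]         # names of the binary columns
binary_only  <- toy[, flags, drop = FALSE]  # data frame with only those columns

print(binary_names)
print(binary_only)
```

Using drop = FALSE keeps the result a data frame even when only one column is binary.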

Updated on: 05-Dec-2020
