# How to create a column with binary variable based on a condition of other variable in an R data frame?

Sometimes we need to create extra variable to add more information about the present data because it adds value. This is especially used while we do feature engineering. If we come to know about something that may affect our response then we prefer to use it as a variable in our data, hence we make up that with the data we have. For example, creating another variable applying conditions on other variable such as creating a binary variable for goodness if the frequency matches a certain criterion.

## Example

Consider the below data frame −

Live Demo

set.seed(100)
Group<-rep(c("A","B","C","D","E"),times=4)
Frequency<-sample(20:30,20,replace=TRUE)
df1<-data.frame(Group,Frequency)
df1

## Output

 Group Frequency
1  A    29
2  B    26
3  C    25
4  D    22
5  E    28
6  A    29
7  B    26
8  C    25
9  D    25
10 E    23
11 A    26
12 B    25
13 C    21
14 D    26
15 E    26
16 A    26
17 B    30
18 C    27
19 D    21
20 E    22

Creating a column category having two levels as Good and Bad, where Good is for those that have Frequency greater than 25−

## Example

df1$Category<-ifelse(df1$Frequency>25,"Good","Bad")
df1

## Output

 Group Frequency Category
1  A       29       Good
2  B       26       Good
5  E       28       Good
6  A       29       Good
7  B       26       Good
11 A       26       Good
14 D       26       Good
15 E       26       Good
16 A       26       Good
17 B       30       Good
18 C       27       Good
20 E       22       Bad

Let’s have a look at another example −

## Example

Live Demo

Class<-rep(c("Lower","Middle","Upper Middle","Higher"),times=5)
Ratings<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Class,Ratings)
df2

## Output

     Class    Ratings
1    Lower       3
2    Middle      8
3 Upper Middle   2
4    Higher      9
5    Lower       2
6    Middle      3
7 Upper Middle   4
8    Higher      4
9    Lower       4
10   Middle      5
11 Upper Middle  7
12    Higher     9
13    Lower      4
14    Middle     2
15 Upper Middle  6
16    Higher     7
17    Lower      1
18    Middle     6
19 Upper Middle  9
20    Higher     9

## Example

df2$Group<-ifelse(df2$Ratings>5,"Royal","Standard")
df2

## Output

      Class    Ratings    Group
1    Lower       3       Standard
2    Middle      8         Royal
3 Upper Middle   2       Standard
4    Higher      9         Royal
5    Lower       2       Standard
6    Middle      3       Standard
7 Upper Middle   4       Standard
8    Higher      4       Standard
9    Lower       4       Standard
10   Middle      5       Standard
11 Upper Middle  7         Royal
12    Higher     9         Royal
13    Lower      4       Standard
14    Middle     2       Standard
15 Upper Middle  6         Royal
16    Higher     7         Royal
17    Lower      1       Standard
18    Middle     6         Royal
19 Upper Middle  9         Royal
20    Higher     9         Royal

Updated on: 09-Sep-2020

7K+ Views