How to join two data frames on a factor column with different levels and different column names in R using dplyr?

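When two data frames share a key column that is stored as a factor with different levels, and possibly under different column names, they can still be joined with dplyr. The left_join function keeps every row of the first data frame and fills the columns taken from the second data frame with NA wherever there is no match, so the result contains every level that appears in the first data frame. The two data frames do not need to have the same number of rows; the by argument maps the differently named key columns onto each other.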


Consider the below data frames −

> Class<-c("Statistics","Maths","Chemistry","Physics","Economics","Political Science",
+ "Geography")
> df1<-data.frame(Class)
> df1
Class
1 Statistics
2 Maths
3 Chemistry
4 Physics
5 Economics
6 Political Science
7 Geography
> Subject<-c("Maths","Chemistry","Physics","Economics","Political Science",
+ "Geography")
> Age<-c(18,21,22,25,21,23)
> df2<-data.frame(Subject,Age)
> df2
Subject Age
1 Maths 18
2 Chemistry 21
3 Physics 22
4 Economics 25
5 Political Science 21
6 Geography 23

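In these two data frames, the factors Class and Subject share six levels, but Class has one extra level, Statistics, that does not appear in Subject. The columns also have different names, so the by argument of the join must map Class onto Subject. Note that data.frame() creates factor columns from strings by default only in R versions before 4.0.0; in R 4.0.0 and later, stringsAsFactors = TRUE has to be passed to get factors.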
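Before joining, it can help to check which levels the two key columns do not have in common. The following is a minimal sketch using base R only; it assumes the columns are factors, so stringsAsFactors = TRUE is passed explicitly (an addition that is not in the original code), and the helper names only_in_class and shared are chosen here for illustration.

```r
# Rebuild the two data frames with factor key columns.
# stringsAsFactors = TRUE is an assumption so that this also works in R >= 4.0.0.
df1 <- data.frame(
  Class = c("Statistics", "Maths", "Chemistry", "Physics", "Economics",
            "Political Science", "Geography"),
  stringsAsFactors = TRUE
)
df2 <- data.frame(
  Subject = c("Maths", "Chemistry", "Physics", "Economics",
              "Political Science", "Geography"),
  Age = c(18, 21, 22, 25, 21, 23),
  stringsAsFactors = TRUE
)

# Levels present in Class but missing from Subject
only_in_class <- setdiff(levels(df1$Class), levels(df2$Subject))
only_in_class   # "Statistics"

# Levels the two factors have in common
shared <- intersect(levels(df1$Class), levels(df2$Subject))
length(shared)  # 6
```

Any level listed in only_in_class will end up with NA in the columns that come from df2 after a left join.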

Loading dplyr package −

> library(dplyr)

Joining the two data frames −

> left_join(df1,df2, by = c("Class" = "Subject"))
Class Age
1 Statistics NA
2 Maths 18
3 Chemistry 21
4 Physics 22
5 Economics 25
6 Political Science 21
7 Geography 23
Warning message:
Column `Class`/`Subject` joining factors with different levels, coercing to character vector

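The warning is not an error. It reports that the two factors had different levels, so dplyr coerced the join columns to character vectors before matching them. As a result, Class in the output is a character column rather than a factor, and Statistics, which has no match in df2, gets NA for Age.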
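If the coercion should be explicit rather than left to the join, one option is to convert both key columns to character first. The following is a sketch of that approach, not the only way to do it; the names df1_chr, df2_chr and joined are illustrative, and stringsAsFactors = TRUE is passed only to reproduce the factor columns discussed here.

```r
library(dplyr)

# Same data as in the example, with factor key columns made explicit
df1 <- data.frame(
  Class = c("Statistics", "Maths", "Chemistry", "Physics", "Economics",
            "Political Science", "Geography"),
  stringsAsFactors = TRUE
)
df2 <- data.frame(
  Subject = c("Maths", "Chemistry", "Physics", "Economics",
              "Political Science", "Geography"),
  Age = c(18, 21, 22, 25, 21, 23),
  stringsAsFactors = TRUE
)

# Convert the key columns to character so the join compares plain strings
df1_chr <- mutate(df1, Class = as.character(Class))
df2_chr <- mutate(df2, Subject = as.character(Subject))

# Left join: every row of df1_chr is kept, unmatched rows get NA in Age
joined <- left_join(df1_chr, df2_chr, by = c("Class" = "Subject"))

# Optionally turn the key back into a factor after the join
joined$Class <- factor(joined$Class)
joined
```

The joined data frame has the same seven rows as the output above, with NA in Age for Statistics, and Class is a factor again with all seven levels.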

Published on 10-Aug-2020 13:20:20