How to join two data frames based on one factor column with different levels and different column names in R using dplyr?
When two data frames share a factor column whose levels differ, they can still be joined with dplyr, and the result keeps every level found in the first data frame. We can use the left_join function for this: it keeps all rows of the first data frame and fills in NA where the second data frame has no match, so the data frame with the larger set of levels should come first if no rows are to be dropped. The two data frames do not need to have the same number of rows.
Consider the below data frames −
> Class<-c("Statistics","Maths","Chemistry","Physics","Economics","Political Science",
+ "Geography")
> df1<-data.frame(Class)
> df1
              Class
1        Statistics
2             Maths
3         Chemistry
4           Physics
5         Economics
6 Political Science
7         Geography
> Subject<-c("Maths","Chemistry","Physics","Economics","Political Science",
+ "Geography")
> Age<-c(18,21,22,25,21,23)
> df2<-data.frame(Subject,Age)
> df2
            Subject Age
1             Maths  18
2         Chemistry  21
3           Physics  22
4         Economics  25
5 Political Science  21
6         Geography  23
In these two data frames the factor columns Class and Subject hold the same kind of values under different names. Six of their levels are common to both, and Class has one extra level, Statistics. Therefore we can use them as the key to join the data frames.
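Before joining, it can help to check which levels the two key columns do not share. The following is a minimal sketch using only base R. It assumes the columns are factors, which is stated explicitly here with factor() because R 4.0 and later no longer convert strings to factors by default in data.frame(). The names only_in_class and only_in_subject are introduced just for this illustration.

```r
# Recreate the two key columns as factors
Class <- factor(c("Statistics", "Maths", "Chemistry", "Physics", "Economics",
                  "Political Science", "Geography"))
Subject <- factor(c("Maths", "Chemistry", "Physics", "Economics",
                    "Political Science", "Geography"))

# Levels that appear in Class but not in Subject
only_in_class <- setdiff(levels(Class), levels(Subject))
only_in_class

# Levels that appear in Subject but not in Class
only_in_subject <- setdiff(levels(Subject), levels(Class))
only_in_subject
```

Any level listed in only_in_class has no partner row in the second data frame, so a left join will give it NA in the columns that come from that data frame.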
Loading dplyr package −
> library(dplyr)
Joining the two data frames −
> left_join(df1,df2, by = c("Class" = "Subject"))
              Class Age
1        Statistics  NA
2             Maths  18
3         Chemistry  21
4           Physics  22
5         Economics  25
6 Political Science  21
7         Geography  23
Warning message:
Column `Class`/`Subject` joining factors with different levels, coercing to character vector
The warning is not a problem. It only tells us that the two factor columns had different levels, so dplyr converted the join column to a character vector. Statistics has no matching row in df2, which is why its Age is NA.
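If the coercion should happen explicitly rather than through a warning, one option is to convert both key columns to character before the join and, if needed, turn the result back into a factor afterwards. This is a sketch under some assumptions: the data frames are rebuilt with stringsAsFactors = TRUE so the keys really are factors, and the name joined is introduced only for this example. The exact warning text, and whether it appears at all, may depend on the installed dplyr version.

```r
library(dplyr)

# Rebuild the example data with factor key columns
df1 <- data.frame(
  Class = c("Statistics", "Maths", "Chemistry", "Physics", "Economics",
            "Political Science", "Geography"),
  stringsAsFactors = TRUE
)
df2 <- data.frame(
  Subject = c("Maths", "Chemistry", "Physics", "Economics",
              "Political Science", "Geography"),
  Age = c(18, 21, 22, 25, 21, 23),
  stringsAsFactors = TRUE
)

# Convert both keys to character first, so the join has nothing to coerce
joined <- left_join(
  mutate(df1, Class = as.character(Class)),
  mutate(df2, Subject = as.character(Subject)),
  by = c("Class" = "Subject")
)

# Optionally restore the key as a factor covering all levels in the result
joined$Class <- factor(joined$Class)
joined
```

The rows and values are the same as in the join above. The only difference is that the type conversion is done by hand, which keeps the behaviour the same regardless of how the join treats mismatched factor levels.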