To replace the NA values that appear after merging two data frames in R with another value, we can follow the steps below:
First of all, create two data frames.
Then, merge the data frames by a common column between the two.
After that, replace the NAs with another value.
Create the first data frame
Let's create the first data frame, df1, as shown below:
ID <- 1:10
x <- sample(1:100, 10)
df1 <- data.frame(ID, x)
df1
Executing the above script generates the following output (the values will vary on your system because of randomization):
   ID  x
1   1 28
2   2 50
3   3 13
4   4 43
5   5 48
6   6 49
7   7 52
8   8 54
9   9 72
10 10 32
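Because sample draws random numbers, the values above change on every run. If reproducible output is wanted, one option is to call set.seed before sampling. The following is a minimal sketch; the seed value 123 and the name x_again are arbitrary choices for illustration and are not part of the original script.

```r
# Fix the random number generator so that sample() returns the same
# values on every run. The seed 123 is an arbitrary illustrative choice.
set.seed(123)
ID <- 1:10
x <- sample(1:100, 10)
df1 <- data.frame(ID, x)

# Re-seeding with the same value and sampling again reproduces x exactly.
set.seed(123)
x_again <- sample(1:100, 10)
identical(x, x_again)  # TRUE
```

Any integer works as a seed; what matters is that the same seed is set before the same sequence of sample calls.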
Create the second data frame
Let's create the second data frame, df2, which has 15 IDs, as shown below:
ID <- 1:15
y <- sample(1:10, 15, replace = TRUE)
df2 <- data.frame(ID, y)
df2
Executing the above script generates the following output (the values will vary on your system because of randomization):
   ID  y
1   1  2
2   2  9
3   3 10
4   4  8
5   5 10
6   6  7
7   7  9
8   8  4
9   9  2
10 10  9
11 11  3
12 12  5
13 13 10
14 14  7
15 15  1
Merge the data frames
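Use the merge function to merge df1 and df2 by the ID column. Setting all=TRUE keeps the rows of df2 that have no matching ID in df1, which is where the NAs come from: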
ID <- 1:10
x <- sample(1:100, 10)
df1 <- data.frame(ID, x)
ID <- 1:15
y <- sample(1:10, 15, replace = TRUE)
df2 <- data.frame(ID, y)
# all = TRUE keeps every ID from both data frames
DF <- merge(df1, df2, all = TRUE, by = "ID")
DF
   ID  x  y
1   1 28  2
2   2 50  9
3   3 13 10
4   4 43  8
5   5 48 10
6   6 49  7
7   7 52  9
8   8 54  4
9   9 72  2
10 10 32  9
11 11 NA  3
12 12 NA  5
13 13 NA 10
14 14 NA  7
15 15 NA  1
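The NAs in x appear for the IDs that exist only in the second data frame. Before replacing them, it can help to check how many there are and which rows they belong to. The following is a minimal sketch using small fixed data frames so that the result does not depend on randomization; the names a, b, m, na_counts and missing_ids are hypothetical stand-ins, not objects from the script above.

```r
# Small fixed data frames; 'a' and 'b' are hypothetical stand-ins
# for df1 and df2.
a <- data.frame(ID = 1:3, x = c(10, 20, 30))
b <- data.frame(ID = 1:5, y = c(5, 6, 7, 8, 9))

# Full outer join: IDs 4 and 5 have no match in 'a', so x is NA there.
m <- merge(a, b, by = "ID", all = TRUE)

# Count the NAs in each column.
na_counts <- colSums(is.na(m))
na_counts

# IDs of the rows where x is missing.
missing_ids <- m$ID[is.na(m$x)]
missing_ids
```

Here na_counts reports two NAs in x and none in ID or y, and missing_ids holds 4 and 5.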
Replace NAs with another value
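Use the is.na function to replace the NAs in DF with a dot (.), as shown below. Because the dot is a character string, the x column becomes a character column after the replacement: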
ID <- 1:10
x <- sample(1:100, 10)
df1 <- data.frame(ID, x)
ID <- 1:15
y <- sample(1:10, 15, replace = TRUE)
df2 <- data.frame(ID, y)
DF <- merge(df1, df2, all = TRUE, by = "ID")
# is.na(DF) marks every NA cell; those cells are set to "."
DF[is.na(DF)] <- "."
DF
   ID  x  y
1   1 28  2
2   2 50  9
3   3 13 10
4   4 43  8
5   5 48 10
6   6 49  7
7   7 52  9
8   8 54  4
9   9 72  2
10 10 32  9
11 11  .  3
12 12  .  5
13 13  . 10
14 14  .  7
15 15  .  1
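Filling NAs with a string such as "." turns the affected column into text, so it can no longer be used directly in arithmetic. If the column needs to stay numeric, one option is to fill with a number instead. The following is a minimal sketch, assuming 0 is an acceptable fill value for the data at hand; the names a, b, m, m_chr and m_num are hypothetical and the data is fixed rather than random.

```r
# Small fixed data frames; all names here are hypothetical.
a <- data.frame(ID = 1:3, x = c(10, 20, 30))
b <- data.frame(ID = 1:5, y = c(5, 6, 7, 8, 9))
m <- merge(a, b, by = "ID", all = TRUE)

# Filling with a string coerces x to character.
m_chr <- m
m_chr[is.na(m_chr)] <- "."
class(m_chr$x)  # "character"

# Filling only the x column with a number keeps it numeric.
m_num <- m
m_num$x[is.na(m_num$x)] <- 0
class(m_num$x)  # "numeric"
sum(m_num$x)    # 60
```

Whether 0, a column mean, or a placeholder string is the right choice depends on how the merged data will be used afterwards.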