How to impute missing values by random value for a single column in R?


To impute missing values in a single column with random values in R, we can use the impute function from the Hmisc package. With the "random" option, each NA is replaced by a value drawn at random from the non-missing values of that column.

For example, if we have a data frame called df that contains a column, say C, with some missing values, then we can use the command below to fill those missing values randomly:

df$C<-with(df,impute(C,"random"))
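As a rough illustration of what random imputation does, the following base-R sketch fills each NA with a value sampled, with replacement, from the observed values of the same column. The helper name impute_random, the toy data, and the seed are hypothetical. This is not Hmisc's implementation, only an approximation of the idea.

```r
# Hypothetical helper (not part of Hmisc): fill NAs by sampling
# from the observed values of the same vector.
impute_random <- function(v) {
  missing <- is.na(v)
  observed <- v[!missing]
  if (any(missing) && length(observed) > 0) {
    # Sample positions rather than values, so a single observed
    # number is not misread by sample() as the range 1:n.
    picks <- sample.int(length(observed), sum(missing), replace = TRUE)
    v[missing] <- observed[picks]
  }
  v
}

set.seed(1)  # assumed seed, used only to make the sketch repeatable
df <- data.frame(C = c(4, NA, 7, NA, 9))
df$C <- impute_random(df$C)
df
```

The non-missing entries are left untouched; only the NA positions receive sampled values.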

Example 1

The following snippet creates a sample data frame:

x<-sample(c(NA,2,5),20,replace=TRUE)
df1<-data.frame(x)
df1

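This creates a data frame df1 with 20 rows whose column x holds NA, 2, or 5. Because sample() is random, the exact values differ on every run.

To load the Hmisc package and impute the missing values in x randomly, add the following code to the above snippet: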

library(Hmisc)
df1$x<-with(df1,impute(x,"random"))
df1

Output

If you execute all the above snippets as a single program, it prints df1 with every NA in x replaced by 2 or 5, drawn at random from the non-missing values of x.

Example 2

The following snippet creates a sample data frame:

y<-sample(c(NA,rnorm(3)),20,replace=TRUE)
df2<-data.frame(y)
df2

The following data frame is created (the values will differ on each run):

      y
1   0.1912368
2   0.1912368
3   NA
4   0.1912368
5  -0.8921644
6   NA
7  -0.8921644
8  -0.8921644
9   0.3934629
10  NA
11  NA
12  0.3934629
13  0.1912368
14  0.3934629
15  0.3934629
16  0.1912368
17 -0.8921644
18  0.3934629
19  0.1912368
20  0.1912368

To impute missing values in y randomly, add the following code to the above snippet −

df2$y<-with(df2,impute(y,"random"))
df2

Output

If you execute all the above given snippets as a single program, it generates the following output −

     y
1   0.1912368
2   0.1912368
3   0.1912368
4   0.1912368
5  -0.8921644
6   0.3934629
7  -0.8921644
8  -0.8921644
9   0.3934629
10  0.1912368
11 -0.8921644
12  0.3934629
13  0.1912368
14  0.3934629
15  0.3934629
16  0.1912368
17 -0.8921644
18  0.3934629
19  0.1912368
20  0.1912368
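Because impute(..., "random") draws random numbers, the filled-in values change from run to run. The sketch below, which assumes Hmisc is installed, fixes a seed, records which rows were imputed using Hmisc's is.imputed(), and converts the column back to a plain numeric vector. The names df3, z, and was_imputed and the seed value are illustrative only.

```r
library(Hmisc)

set.seed(123)  # arbitrary seed so the imputation repeats across runs

# Small illustrative column with two missing values
df3 <- data.frame(z = c(2, NA, 5, 5, NA, 2))

# Fill the NAs with values drawn at random from the observed ones
df3$z <- with(df3, impute(z, "random"))

# Flag the rows that were filled in, then drop the "impute" class
df3$was_imputed <- is.imputed(df3$z)
df3$z <- as.numeric(df3$z)
df3
```

Keeping a flag column like this makes it possible to tell observed values from imputed ones in later analysis.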

Nizamuddin Siddiqui

Updated on: 11-Nov-2021
