How to impute missing values by random value for a single column in R?
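To impute missing values by random value for a single column in R, we can use the impute function from the Hmisc package. Passing "random" as the second argument fills each missing value with a value drawn at random from the column's observed (non-missing) values.
For example, if we have a data frame called df that contains a column, say C, which has some missing values, then we can use the command given below to fill those missing values randomly −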
df$C <- with(df, impute(C, "random"))
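To make the idea behind the command concrete, here is a minimal base R sketch of random imputation, assuming it means drawing replacements from the column's observed values with replacement. The helper impute_random and the small data frame are hypothetical names used only for illustration. This is not the Hmisc implementation.

```r
# Hypothetical helper (not part of Hmisc): fill the NAs in a vector by
# sampling, with replacement, from that vector's observed values.
impute_random <- function(v) {
  missing <- is.na(v)
  observed <- v[!missing]
  # Nothing to do if there are no NAs or no observed values to draw from
  if (!any(missing) || length(observed) == 0L) {
    return(v)
  }
  # Sample positions rather than values, so a single observed number
  # is not treated by sample() as the range 1:n
  picks <- sample.int(length(observed), sum(missing), replace = TRUE)
  v[missing] <- observed[picks]
  v
}

set.seed(1)  # arbitrary seed, only to make this illustration repeatable
df <- data.frame(C = c(4, NA, 7, NA, 9, NA))
df$C <- impute_random(df$C)
df
```

The observed values stay where they were, and each NA is replaced by one of 4, 7, or 9.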
Example 1
The following snippet creates a sample data frame −
x <- sample(c(NA, 2, 5), 20, replace = TRUE)
df1 <- data.frame(x)
df1
The following dataframe is created −
    x
1  NA
2  NA
3   2
4   2
5   2
6  NA
7  NA
8  NA
9   2
10  5
11 NA
12 NA
13 NA
14  2
15  2
16 NA
17  5
18  5
19  5
20 NA
To load the Hmisc package and impute the missing values in x randomly, add the following code to the above snippet −
library(Hmisc)
df1$x <- with(df1, impute(x, "random"))
df1
Output
If you execute all the above snippets as a single program, it generates output like the following (the sample data and the imputed values are random, so your values will differ) −
   x
1  2
2  5
3  2
4  2
5  2
6  2
7  2
8  5
9  2
10 5
11 2
12 5
13 2
14 2
15 2
16 2
17 5
18 5
19 5
20 2
Example 2
The following snippet creates a sample data frame −
y <- sample(c(NA, rnorm(3)), 20, replace = TRUE)
df2 <- data.frame(y)
df2
The following dataframe is created −
            y
1   0.1912368
2   0.1912368
3          NA
4   0.1912368
5  -0.8921644
6          NA
7  -0.8921644
8  -0.8921644
9   0.3934629
10         NA
11         NA
12  0.3934629
13  0.1912368
14  0.3934629
15  0.3934629
16  0.1912368
17 -0.8921644
18  0.3934629
19  0.1912368
20  0.1912368
To impute missing values in y randomly, add the following code to the above snippet −
df2$y <- with(df2, impute(y, "random"))
df2
Output
If you execute all the above snippets as a single program, it generates output like the following (the sample data and the imputed values are random, so your values will differ) −
            y
1   0.1912368
2   0.1912368
3   0.1912368
4   0.1912368
5  -0.8921644
6   0.3934629
7  -0.8921644
8  -0.8921644
9   0.3934629
10  0.1912368
11 -0.8921644
12  0.3934629
13  0.1912368
14  0.3934629
15  0.3934629
16  0.1912368
17 -0.8921644
18  0.3934629
19  0.1912368
20  0.1912368
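After imputing, it can be worth checking that no missing values remain and that every filled value comes from the values already present in the column. The sketch below shows one way to do that. The data frame df3, its column z, and the seed are hypothetical choices for illustration, and the snippet assumes the Hmisc package is installed.

```r
library(Hmisc)

# Hypothetical data with a known pattern of missing values
df3 <- data.frame(z = c(10, NA, 20, NA, 30, 10, NA, 20))

# Record the observed values and the NA count before imputing
observed <- df3$z[!is.na(df3$z)]
n_missing_before <- sum(is.na(df3$z))

set.seed(2024)  # arbitrary seed so the random draws can be repeated
df3$z <- with(df3, impute(z, "random"))

# Checks: no NAs left, and every value is one of the observed values
n_missing_after <- sum(is.na(df3$z))
all_from_observed <- all(as.numeric(df3$z) %in% observed)

n_missing_before
n_missing_after
all_from_observed
```

Setting a seed immediately before the imputation is what lets a later run reproduce the same filled values.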
