How to create an ID column in R based on categories?
A categorical column in an R data frame can be used to create an ID column in which every row receives a numeric ID determined by its category, so all rows in the same category share one ID.
To do this, convert the categorical column to a factor with as.factor() and then convert that factor to its integer codes with as.numeric(), as shown in the examples below. The IDs follow the order of the factor levels, which for a character column is the sorted order of its distinct values.
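As a minimal sketch of the two-step conversion, the small data frame below uses fixed values instead of random ones so the result is predictable. The data frame name df and its values are made up for illustration.

```r
# Hypothetical toy data; names and values are illustrative only.
df <- data.frame(Group = c("Male", "Female", "Female", "Male"),
                 Score = c(50, 20, 27, 42))

# as.factor() sorts the distinct values into levels: "Female", "Male".
# as.numeric() then returns the position of each row's level.
df$ID <- as.numeric(as.factor(df$Group))

levels(as.factor(df$Group))  # "Female" "Male"
df$ID                        # 2 1 1 2
```

Because "Female" sorts before "Male", Female rows get ID 1 and Male rows get ID 2.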
Example 1
The following snippet creates a sample data frame −
Group<-sample(c("Male","Female"),20,replace=TRUE)
Score<-sample(20:50,20)
df1<-data.frame(Group,Score)
df1
Output
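The following data frame is created (sample() draws at random, so the values differ from run to run unless a seed is set with set.seed()) −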
    Group Score
1  Female    20
2  Female    27
3  Female    29
4    Male    50
5    Male    42
6  Female    41
7    Male    32
8    Male    25
9  Female    21
10 Female    49
11 Female    31
12 Female    28
13 Female    36
14 Female    26
15   Male    43
16 Female    45
17   Male    23
18 Female    46
19   Male    48
20   Male    33
To create an ID column based on Group in df1, add the df1$ID line to the above snippet, which gives the following program −
Group<-sample(c("Male","Female"),20,replace=TRUE)
Score<-sample(20:50,20)
df1<-data.frame(Group,Score)
df1$ID<-as.numeric(as.factor(df1$Group))
df1
Output
Running the above snippets as a single program generates the following output −
    Group Score ID
1  Female    20  1
2  Female    27  1
3  Female    29  1
4    Male    50  2
5    Male    42  2
6  Female    41  1
7    Male    32  2
8    Male    25  2
9  Female    21  1
10 Female    49  1
11 Female    31  1
12 Female    28  1
13 Female    36  1
14 Female    26  1
15   Male    43  2
16 Female    45  1
17   Male    23  2
18 Female    46  1
19   Male    48  2
20   Male    33  2
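In the output above, Female is 1 and Male is 2 because the factor levels are sorted alphabetically, not because of the order in which the categories appear. If IDs numbered by first appearance are wanted instead, one possible alternative (not part of the original example) is match() against unique(). The sketch below uses a small fixed vector chosen for illustration.

```r
# Hypothetical fixed vector so the result does not depend on sample().
Group <- c("Male", "Female", "Female", "Male")

# Factor approach: IDs follow the sorted levels ("Female" = 1, "Male" = 2).
id_alpha <- as.numeric(as.factor(Group))
id_alpha  # 2 1 1 2

# Alternative: IDs follow the order of first appearance ("Male" = 1, "Female" = 2).
id_first <- match(Group, unique(Group))
id_first  # 1 2 2 1
```

Both approaches give every category a single consistent ID; they differ only in which category gets which number.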
Example 2
The following snippet creates a sample data frame −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Rank<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Class,Rank)
df2
Output
The following data frame is created −
    Class Rank
1   Third    5
2   Third    7
3   First    3
4   Third    8
5  Second    9
6   Third    9
7   First    3
8  Second   10
9   First    4
10  Third    2
11  Third    8
12  Third    1
13  Third   10
14  First    6
15  Third    5
16 Second    6
17  Third    7
18  Third    5
19  Third    2
20 Second    5
To create an ID column based on Class in df2, add the df2$ID line to the above snippet, which gives the following program −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Rank<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Class,Rank)
df2$ID<-as.numeric(as.factor(df2$Class))
df2
Output
Running the above snippets as a single program generates the following output −
    Class Rank ID
1   Third    5  3
2   Third    7  3
3   First    3  1
4   Third    8  3
5  Second    9  2
6   Third    9  3
7   First    3  1
8  Second   10  2
9   First    4  1
10  Third    2  3
11  Third    8  3
12  Third    1  3
13  Third   10  3
14  First    6  1
15  Third    5  3
16 Second    6  2
17  Third    7  3
18  Third    5  3
19  Third    2  3
20 Second    5  2
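Here First, Second and Third receive 1, 2 and 3 only because those names also happen to sort alphabetically. For categories whose natural order is not alphabetical, the levels can be given explicitly with factor(). The sketch below uses a made-up Size vector that is not part of the original example.

```r
# Hypothetical categories whose natural order is not alphabetical.
Size <- c("Medium", "Low", "High", "Low")

# Default levels are sorted: "High", "Low", "Medium".
id_default <- as.numeric(as.factor(Size))
id_default  # 3 2 1 2

# Explicit levels fix the ID of each category: Low = 1, Medium = 2, High = 3.
id_ordered <- as.numeric(factor(Size, levels = c("Low", "Medium", "High")))
id_ordered  # 2 1 3 1
```

Supplying the levels also keeps the IDs stable if the data are later filtered or extended, as long as every value appears in the levels vector.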
