How to create an ID column in R based on categories?
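If an R data frame has a categorical column, that column can be used to create an ID column in which every category gets its own numeric ID, so all rows in the same category share the same ID.
To do this, convert the categorical column to a factor with as.factor and then convert that factor to its integer codes with as.numeric, as shown in the examples below. By default the factor levels are the sorted distinct values, so the IDs follow that sorted order (for example, Female = 1 and Male = 2).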
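As a minimal sketch of what the two conversions do, the snippet below uses a small fixed vector instead of random data so that the result is repeatable. The names group, f and id are hypothetical and chosen only for illustration.

```r
# Hypothetical fixed vector, used so the result is the same on every run
group <- c("Male", "Female", "Female", "Male")

# as.factor() stores the sorted distinct values as levels: "Female", "Male"
f <- as.factor(group)
levels(f)

# as.numeric() returns each value's position among those levels
id <- as.numeric(f)
id
# [1] 2 1 1 2
```

Because the ID is just the position of the level, rows with the same category always receive the same number.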
Example 1
The following snippet creates a sample data frame:
# Draw 20 random group labels and 20 distinct random scores
Group <- sample(c("Male", "Female"), 20, replace = TRUE)
Score <- sample(20:50, 20)
df1 <- data.frame(Group, Score)
df1
Output
The following data frame is created (the values will differ between runs because sample() draws at random):
    Group Score
1  Female    20
2  Female    27
3  Female    29
4    Male    50
5    Male    42
6  Female    41
7    Male    32
8    Male    25
9  Female    21
10 Female    49
11 Female    31
12 Female    28
13 Female    36
14 Female    26
15   Male    43
16 Female    45
17   Male    23
18 Female    46
19   Male    48
20   Male    33
To create an ID column based on Group in df1, add the following code to the above snippet:
# Do not call sample() again here, or df1 would be rebuilt with new random values
df1$ID <- as.numeric(as.factor(df1$Group))
df1
Output
Running all of the above snippets as a single program produces the following output:
    Group Score ID
1  Female    20  1
2  Female    27  1
3  Female    29  1
4    Male    50  2
5    Male    42  2
6  Female    41  1
7    Male    32  2
8    Male    25  2
9  Female    21  1
10 Female    49  1
11 Female    31  1
12 Female    28  1
13 Female    36  1
14 Female    26  1
15   Male    43  2
16 Female    45  1
17   Male    23  2
18 Female    46  1
19   Male    48  2
20   Male    33  2
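To confirm which category received which ID, one option is to keep a single row per category. The sketch below rebuilds only the first six rows of the output above as fixed data, and the name key is hypothetical.

```r
# First six rows of the Example 1 output, entered as fixed values
df1 <- data.frame(
  Group = c("Female", "Female", "Female", "Male", "Male", "Female"),
  Score = c(20, 27, 29, 50, 42, 41)
)
df1$ID <- as.numeric(as.factor(df1$Group))

# Keep the first row seen for each distinct Group/ID pair
key <- unique(df1[, c("Group", "ID")])
key
#    Group ID
# 1 Female  1
# 4   Male  2
```

The same check works for any data frame built this way, since each category maps to exactly one ID.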
Example 2
The following snippet creates a sample data frame:
# Draw 20 random class labels and 20 random ranks (repeats allowed)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Rank <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(Class, Rank)
df2
Output
The following data frame is created (the values will differ between runs because sample() draws at random):
    Class Rank
1   Third    5
2   Third    7
3   First    3
4   Third    8
5  Second    9
6   Third    9
7   First    3
8  Second   10
9   First    4
10  Third    2
11  Third    8
12  Third    1
13  Third   10
14  First    6
15  Third    5
16 Second    6
17  Third    7
18  Third    5
19  Third    2
20 Second    5
To create an ID column based on Class in df2, add the following code to the above snippet:
# Do not call sample() again here, or df2 would be rebuilt with new random values
df2$ID <- as.numeric(as.factor(df2$Class))
df2
Output
Running all of the above snippets as a single program produces the following output:
    Class Rank ID
1   Third    5  3
2   Third    7  3
3   First    3  1
4   Third    8  3
5  Second    9  2
6   Third    9  3
7   First    3  1
8  Second   10  2
9   First    4  1
10  Third    2  3
11  Third    8  3
12  Third    1  3
13  Third   10  3
14  First    6  1
15  Third    5  3
16 Second    6  2
17  Third    7  3
18  Third    5  3
19  Third    2  3
20 Second    5  2
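In this output the IDs follow the sorted order of the class names (First = 1, Second = 2, Third = 3). If a different numbering is wanted, two possible variations are sketched below on the first six Class values from the output above. The level order passed to factor is a hypothetical choice, and the variable names are illustrative only.

```r
# First six Class values from the Example 2 output, entered as fixed values
Class <- c("Third", "Third", "First", "Third", "Second", "Third")

# Default behaviour: levels are sorted, so First = 1, Second = 2, Third = 3
id_sorted <- as.numeric(as.factor(Class))
id_sorted
# [1] 3 3 1 3 2 3

# Variation 1: number categories in order of first appearance
id_appearance <- match(Class, unique(Class))
id_appearance
# [1] 1 1 2 1 3 1

# Variation 2: set the level order explicitly (hypothetical order)
id_custom <- as.numeric(factor(Class, levels = c("Third", "Second", "First")))
id_custom
# [1] 1 1 3 1 2 1
```

Setting the levels explicitly is useful when the IDs need to stay stable even if a category is absent from a particular sample.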