How to find the proportion of categories based on another categorical column in R's data.table object?
To find the proportion of each category within the groups defined by another categorical column of a data.table object in R, follow these steps:
- Create a data.table object.
- Find the proportion of one categorical column's values within each level of the other.
Create a data.table object
Load the data.table package and create a data.table object with two categorical columns:
library(data.table)

# Two categorical columns drawn at random: 30 values each
Category1 <- sample(LETTERS[1:3], 30, replace = TRUE)
Category2 <- sample(letters[1:4], 30, replace = TRUE)

DT <- data.table(Category1, Category2)
DT
Running the script produces output like the following (your output will differ because the values are sampled at random):
    Category1 Category2
 1:         C         d
 2:         B         a
 3:         C         d
 4:         C         a
 5:         B         c
 6:         C         b
 7:         B         a
 8:         B         b
 9:         B         c
10:         B         d
11:         B         a
12:         A         d
13:         C         d
14:         B         a
15:         A         a
16:         C         a
17:         B         d
18:         C         c
19:         C         c
20:         A         b
21:         A         d
22:         C         c
23:         B         d
24:         C         c
25:         A         d
26:         A         b
27:         B         a
28:         A         c
29:         B         d
30:         A         d
    Category1 Category2
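Because sample() draws at random, the rows above will not be reproduced exactly on another machine. A minimal sketch of a repeatable version, assuming a fixed seed is acceptable; the seed value 42 is arbitrary and not part of the original script:

```r
library(data.table)

# Fix the random number generator so every run draws the same values.
# The seed 42 is an arbitrary choice for illustration only.
set.seed(42)
Category1 <- sample(LETTERS[1:3], 30, replace = TRUE)
Category2 <- sample(letters[1:4], 30, replace = TRUE)

DT <- data.table(Category1, Category2)
DT
```

With the seed set, re-running the script yields the same table each time within a given R version, which makes the later proportions easier to check.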
Find the proportion based on a categorical column
Use the tabulate function together with the factor function to find the proportion of each Category2 value within each Category1 group:
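library(data.table)

# Reuse the DT object created in the previous step; calling sample() again would draw new data.
# Fixing the factor levels keeps a zero count for any Category2 value that is missing from a group,
# so the four proportions always line up with letters[1:4].
DT[order(Category2),
   .(Category2 = letters[1:4],
     Proportion = tabulate(factor(Category2, levels = letters[1:4]), nbins = 4) / .N),
   by = Category1]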
For the data.table printed in the previous step, the output is:
    Category1 Category2 Proportion
 1:         B         a 0.41666667
 2:         B         b 0.08333333
 3:         B         c 0.16666667
 4:         B         d 0.33333333
 5:         C         a 0.20000000
 6:         C         b 0.10000000
 7:         C         c 0.40000000
 8:         C         d 0.30000000
 9:         A         a 0.12500000
10:         A         b 0.25000000
11:         A         c 0.12500000
12:         A         d 0.50000000
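Proportions of this kind can also be computed by counting each pair of categories and dividing by the group total. The following is an alternative sketch rather than the method used above; the small hand-made table and the name counts are illustrative assumptions, chosen so the result can be checked by eye:

```r
library(data.table)

# Small hand-made table (not the article's data) so the result is easy to verify
DT <- data.table(
  Category1 = c("A", "A", "A", "A", "B", "B"),
  Category2 = c("a", "a", "b", "c", "a", "d")
)

# Count each (Category1, Category2) pair
counts <- DT[, .N, by = .(Category1, Category2)]

# Divide each count by the total for its Category1 group
counts[, Proportion := N / sum(N), by = Category1]
counts

# Sanity check: proportions within each Category1 group sum to 1
counts[, .(Total = sum(Proportion)), by = Category1]
```

Unlike the tabulate approach, this version lists only the combinations that actually occur, so a Category2 value that is absent from a group has no row instead of a zero proportion. In base R, prop.table(table(DT$Category1, DT$Category2), margin = 1) gives the same row-wise proportions as a matrix, zeros included.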