How to find the proportion of categories based on another categorical column in R's data.table object?
To find the proportion of the categories of one column within each category of another column in R's data.table object, follow the steps below:
- Create a data.table object.
- Find the proportions, grouped by the other categorical column.
Create a data.table object
Load the data.table package and create a data.table object with two categorical columns:
library(data.table)
Category1 <- sample(LETTERS[1:3], 30, replace = TRUE)
Category2 <- sample(letters[1:4], 30, replace = TRUE)
DT <- data.table(Category1, Category2)
DT
Running the above script produces output like the following (your output will differ because the values are sampled at random):
    Category1 Category2
 1:         C         d
 2:         B         a
 3:         C         d
 4:         C         a
 5:         B         c
 6:         C         b
 7:         B         a
 8:         B         b
 9:         B         c
10:         B         d
11:         B         a
12:         A         d
13:         C         d
14:         B         a
15:         A         a
16:         C         a
17:         B         d
18:         C         c
19:         C         c
20:         A         b
21:         A         d
22:         C         c
23:         B         d
24:         C         c
25:         A         d
26:         A         b
27:         B         a
28:         A         c
29:         B         d
30:         A         d
    Category1 Category2
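Because the data is sampled at random, each run gives a different table. The following is a minimal sketch of a reproducible variant. It assumes a seed value of 123, which is an arbitrary choice and not part of the original script, and it adds a base R cross-tabulation as a quick way to inspect the counts.

```r
library(data.table)

# Assumed seed: any fixed integer makes the sampling repeatable
set.seed(123)

Category1 <- sample(LETTERS[1:3], 30, replace = TRUE)
Category2 <- sample(letters[1:4], 30, replace = TRUE)
DT <- data.table(Category1, Category2)

# Count of each Category2 value within each Category1 value
counts <- table(DT$Category1, DT$Category2)
print(counts)
```

The row totals of `counts` are the group sizes that the proportions are later divided by.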
Finding the proportion based on categorical column
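Use the tabulate function together with the factor function, grouped by Category1, to find the proportion of each Category2 value within each Category1 value. The counts returned by tabulate are divided by .N, the number of rows in the current group: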
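library(data.table)
# Recreate the sample data (values differ on every run)
Category1 <- sample(LETTERS[1:3], 30, replace = TRUE)
Category2 <- sample(letters[1:4], 30, replace = TRUE)
DT <- data.table(Category1, Category2)
# Fixed factor levels and nbins keep all four categories aligned,
# even when a group has no rows for one of them
DT[order(Category2),
   .(Category2 = letters[1:4],
     Proportion = tabulate(factor(Category2, levels = letters[1:4]), nbins = 4) / .N),
   by = Category1]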
    Category1 Category2 Proportion
 1:         B         a 0.41666667
 2:         B         b 0.08333333
 3:         B         c 0.16666667
 4:         B         d 0.33333333
 5:         C         a 0.20000000
 6:         C         b 0.10000000
 7:         C         c 0.40000000
 8:         C         d 0.30000000
 9:         A         a 0.12500000
10:         A         b 0.25000000
11:         A         c 0.12500000
12:         A         d 0.50000000
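To see how the grouped calculation behaves when a group has no rows for one category, here is a small sketch on hand-made data. The table `DT2` and the vector `lv` are hypothetical names introduced only for this illustration. Group "B" has no "b" rows, so its proportion for "b" should be 0. Base R's prop.table is used as a cross-check.

```r
library(data.table)

# Hypothetical data: group "B" contains no "b" values
DT2 <- data.table(
  Category1 = c("A", "A", "A", "A", "B", "B"),
  Category2 = c("a", "a", "b", "c", "a", "c")
)
lv <- c("a", "b", "c")  # all categories expected in Category2

# Fixed levels plus nbins give one count per category in every group
res <- DT2[, .(Category2 = lv,
               Proportion = tabulate(factor(Category2, levels = lv),
                                     nbins = length(lv)) / .N),
           by = Category1]
print(res)
# Group A: 0.50, 0.25, 0.25; group B: 0.50, 0.00, 0.50

# Cross-check with base R: row-wise proportions of the contingency table
pt <- prop.table(table(DT2$Category1, DT2$Category2), margin = 1)
print(pt)
```

Within each Category1 group the proportions sum to 1, which is a quick sanity check for this kind of result.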