How to find the percentage of each category in a data.table object column in R?
To find the percentage of each category in a column of a data.table object in R, we can follow the steps below −
First of all, create a data.table object.
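Then, group the data by that column with group_by and use the summarise function of the dplyr package, dividing the group size returned by n() by the total row count returned by nrow().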
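The idea behind these steps is simply count per category divided by total rows, scaled by 100. A minimal base R sketch of that arithmetic is given here, assuming a small hypothetical vector `x` that is made up for illustration and is not part of the example that follows:

```r
# Hypothetical category vector, for illustration only
x <- c("low", "high", "low", "medium", "low")

# Count the rows in each category (table() sorts categories alphabetically)
counts <- table(x)

# Divide each count by the total number of values and scale to percent
percent <- 100 * counts / length(x)
percent
```

With this vector, "high" and "medium" each account for 20 percent and "low" for 60 percent.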
Example
Create the data.table object
Let’s create a data.table object as shown below −
library(data.table)
Factor<-sample(c("very low","low","medium","high","very high"),25,replace=TRUE)
Response<-rnorm(25)
DT<-data.table(Factor,Response)
DT
Output
On executing, the above script generates the below output (this output will vary on your system due to randomization) −
       Factor    Response
 1:       low -1.61215323
 2: very high -0.44482842
 3:  very low -0.08886876
 4: very high -0.23859749
 5:      high  0.24527395
 6:       low  1.36309052
 7:       low  0.85434533
 8:       low -0.04598830
 9:      high -0.09693464
10:       low -0.66470019
11:      high -0.27726887
12:    medium  0.30070368
13:  very low -1.10935270
14:      high  2.05570820
15:       low -0.61390253
16:       low  1.04354500
17:      high -1.74834030
18:  very low -0.84208037
19:    medium  2.26732484
20: very high -0.30309612
21:       low  0.72720852
22:      high -2.52034038
23:  very low -1.50381843
24:      high -0.61927487
25:      high  0.14854389
       Factor    Response
Find the percentage of each category in data.table object
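Use the summarise function of the dplyr package after grouping by Factor, together with n() and nrow(), to find the share of each category in the Factor column of the data.table object DT. The Percent column holds a proportion between 0 and 1; multiply it by 100 to get a percentage −
library(data.table)
# Note: calling sample() and rnorm() again draws new random data
Factor<-sample(c("very low","low","medium","high","very high"),25,replace=TRUE)
Response<-rnorm(25)
DT<-data.table(Factor,Response)
library(dplyr)
# n() is the row count of each group; nrow(.) is the total row count
DT %>% group_by(Factor) %>% summarise(Percent=n()/nrow(.))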
Output
# A tibble: 5 x 2
  Factor    Percent
  <chr>       <dbl>
1 high         0.32
2 low          0.32
3 medium       0.08
4 very high    0.12
5 very low     0.16
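The same result can also be computed without dplyr, using data.table's own grouping syntax. The sketch below is an alternative to the article's method, not a replacement for it. It assumes fixed data built with rep() so that the category counts match the output above (8 high, 8 low, 2 medium, 3 very high, 4 very low), and it scales the result by 100 so the values are true percentages:

```r
library(data.table)

# Fixed data with the same category counts as the output above
# (an illustrative assumption, used so the result is reproducible)
DT <- data.table(
  Factor = rep(c("high", "low", "medium", "very high", "very low"),
               times = c(8, 8, 2, 3, 4))
)

# .N is the number of rows in each group; dividing by the total
# row count and multiplying by 100 gives the percentage per category
result <- DT[, .(Percent = 100 * .N / nrow(DT)), by = Factor]
result
```

Here the groups appear in order of first occurrence in the Factor column, and the Percent values are 32, 32, 8, 12 and 16.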