How to find the frequency of a particular string in a column based on another column in an R data frame using the dplyr package?
When an R data frame has two or more categorical columns whose levels are strings (or numbers stored as strings or integers), we can find the frequency of the values in one column for a particular value of another. These cross-column frequencies show how one categorical variable is distributed within each category of the other. With the dplyr package, we can do this by using the filter function to select the rows for the value of interest and then the count function to tally the other column.
Example
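library(dplyr)
# Draw 20 random values for each categorical column (results vary between runs)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df2 <- data.frame(Gender, Class)
df2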
Output
   Gender  Class
1  Female  Third
2  Female  First
3  Female Second
4    Male  Third
5    Male  Third
6  Female Second
7    Male  First
8  Female  Third
9  Female Second
10 Female Second
11 Female  First
12 Female Second
13   Male  First
14 Female  Third
15 Female  Third
16   Male  Third
17   Male  Third
18   Male Second
19 Female Second
20   Male Second
Example
df2 %>% filter(Class == "Third") %>% count(Gender)
Output
  Gender n
1 Female 4
2   Male 4
Example
df2 %>% filter(Class == "First") %>% count(Gender)
Output
  Gender n
1 Female 2
2   Male 2
Example
df2 %>% filter(Class == "Second") %>% count(Gender)
Output
  Gender n
1 Female 6
2   Male 2
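The filter-then-count pattern handles one class at a time. As a minimal sketch, assuming the goal is to see every class at once, the two columns can be passed to count together. The data frame df below is a small hypothetical example with fixed values (it is not the random sample used above), chosen so that the counts are reproducible.

```r
library(dplyr)

# Hypothetical fixed data (not the article's random sample), so counts are reproducible
df <- data.frame(
  Gender = c("Female", "Male", "Male", "Female", "Female", "Male"),
  Class  = c("Third", "Third", "First", "First", "Third", "Second")
)

# Frequency of Gender within a single class: filter to that class, then count
third_counts <- df %>% filter(Class == "Third") %>% count(Gender)

# Frequency of every Class/Gender combination in one call
all_counts <- df %>% count(Class, Gender)

print(third_counts)
print(all_counts)
```

Counting both columns together returns one row per Class/Gender combination that occurs in the data, which avoids repeating the filter step for each class. Combinations that never occur are simply absent from the result.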
