How to find the number of unique values of multiple categorical columns based on one categorical column in R?
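To find the number of unique values in several categorical columns for each level of another categorical column, we can follow these steps:
- First, create a data frame.
- Group the data by the categorical column, then use the summarise_each function with the n_distinct function (both from the dplyr package) to count the unique values in each remaining column. Note that summarise_each and funs are deprecated in recent dplyr releases in favour of summarise with across.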
Create the data frame
Let's create a data frame as shown below −
x <- sample(c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"),
            25, replace = TRUE)
C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
C2 <- sample(letters[1:4], 25, replace = TRUE)
df <- data.frame(x, C1, C2)
df
On executing, the above script generates the below output (this output will vary on your system due to randomization):
         x C1 C2
1  Seventh  B  a
2    Third  C  c
3   Nineth  A  a
4    Third  D  c
5  Seventh  D  d
6   Fourth  A  c
7  Seventh  B  a
8    Third  D  a
9  Seventh  D  c
10   First  A  a
11  Eighth  D  d
12   Tenth  C  b
13   Fifth  A  c
14  Second  A  c
15  Fourth  B  d
16  Nineth  C  b
17   Fifth  D  a
18   First  A  a
19   Tenth  B  a
20  Nineth  A  b
21   Third  B  b
22   Tenth  A  a
23   Fifth  A  a
24   Sixth  D  b
25   First  A  c
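Because sample() draws new values on every run, the rows above cannot be reproduced as-is. A minimal sketch of one way to get repeatable data is to fix the random seed before sampling. The helper name make_df and the seed value 123 are assumptions chosen for illustration; they are not part of the original script, and the rows they produce will differ from the output shown above.

```r
# Hypothetical helper: builds the same kind of data frame as above,
# but fixes the random seed first so the draws are repeatable.
make_df <- function(seed) {
  set.seed(seed)
  groups <- c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth")
  x  <- sample(groups, 25, replace = TRUE)
  C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
  C2 <- sample(letters[1:4], 25, replace = TRUE)
  data.frame(x, C1, C2)
}

df <- make_df(123)           # 123 is an arbitrary seed
identical(df, make_df(123))  # TRUE: the same seed gives the same data frame
```

Any integer seed works; what matters is that set.seed is called with the same value before the sample calls each time the script is run.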
Find the number of unique values based on a categorical column
Use the n_distinct function and the summarise_each function of the dplyr package to find the number of unique values in C1 and C2 for each value of x:
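# Create the data frame (sample() draws new random values on each run)
x <- sample(c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"),
            25, replace = TRUE)
C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
C2 <- sample(letters[1:4], 25, replace = TRUE)
df <- data.frame(x, C1, C2)

# Group by x and count the distinct values of C1 and C2 within each group
library(dplyr)
df %>% group_by(x) %>% summarise_each(funs(n_distinct(.)))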
Output
# A tibble: 10 x 3
   x          C1    C2
   <chr>   <int> <int>
 1 Eighth      1     1
 2 Fifth       2     2
 3 First       1     2
 4 Fourth      2     2
 5 Nineth      2     2
 6 Second      1     1
 7 Seventh     2     3
 8 Sixth       1     1
 9 Tenth       3     2
10 Third       3     3
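In recent dplyr releases, summarise_each and funs are deprecated, and the same per-group count is usually written with summarise and across. The sketch below assumes dplyr 1.0.0 or later and uses a small hand-written data frame (df_small, with illustrative values rather than the random data above) so that the result can be checked by eye.

```r
library(dplyr)

# Small illustrative data frame; these values are made up for the example
df_small <- data.frame(
  x  = c("First", "First", "First", "Second", "Second", "Third"),
  C1 = c("A", "A", "B", "C", "C", "D"),
  C2 = c("a", "b", "c", "a", "a", "b")
)

# Group by x, then count the distinct values of C1 and C2 within each group
result <- df_small %>%
  group_by(x) %>%
  summarise(across(c(C1, C2), n_distinct))

result
# First has 2 distinct C1 values and 3 distinct C2 values;
# Second and Third each have 1 distinct value in both columns.
```

Listing the columns explicitly in across keeps the intent clear; across(everything(), n_distinct) would count every non-grouping column instead.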