How to find the most frequent factor value in an R data frame column?


To find the most frequent factor value in an R data frame column, we can build a frequency table of the column with the table function, pass it to which.max to locate the highest count, and wrap the result in names to get the label of that level. This is useful when analysing categorical data and we want to know which factor level occurs most often.
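The three steps can be sketched one at a time as follows. The vector x and its values are hypothetical and fixed by hand, so that the result does not depend on random sampling −

```r
# Hypothetical hand-written data (not from the examples below)
x <- factor(c("B", "D", "C", "C", "A", "C", "D"))

counts <- table(x)        # step 1: count how often each level occurs
idx <- which.max(counts)  # step 2: position of the first largest count, named by its level
names(idx)                # step 3: the level label as a character string, here "C"
```

Here the counts are A = 1, B = 1, C = 3 and D = 2, so which.max points at the third level and names returns "C".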

Check out the below examples to understand how it can be done.

Example 1

The following snippet creates a sample data frame −

Factor_1<-factor(sample(LETTERS[1:4],20,replace=TRUE))
df1<-data.frame(Factor_1)
df1

The following data frame is created −

 Factor_1
1  B
2  D
3  B
4  D
5  C
6  D
7  D
8  C
9  C
10 C
11 C
12 C
13 C
14 C
15 A
16 D
17 C
18 C
19 B
20 C

To find which factor level occurs most often in df1, add the following code to the above snippet −

# df1 already exists from the snippet above; calling sample() again would draw new data
names(which.max(table(df1$Factor_1)))

Output

If you execute all the above snippets as a single program on the data frame shown, it generates the following output −

[1] "C"
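Because sample draws at random, the data frame and the result will usually differ from one run to the next. A minimal sketch of a reproducible version is given below, assuming we fix the random number generator with set.seed. The seed value 42 is arbitrary and is not part of the original example, and the name most_frequent is used only for illustration −

```r
set.seed(42)  # arbitrary seed (an assumption), chosen only to make the draw repeatable

Factor_1 <- factor(sample(LETTERS[1:4], 20, replace = TRUE))
df1 <- data.frame(Factor_1)

# Most frequent level of the column, as a character string
most_frequent <- names(which.max(table(df1$Factor_1)))
most_frequent
```

With the seed in place, running the program twice gives the same data frame and therefore the same most frequent level.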

Example 2

The following snippet creates a sample data frame −

Factor_2<-factor(sample(c("Male","Female"),20,replace=TRUE))
df2<-data.frame(Factor_2)
df2

The following data frame is created −

   Factor_2
1  Female
2  Female
3  Male
4  Female
5  Male
6  Male
7  Female
8  Male
9  Male
10 Female
11 Female
12 Female
13 Female
14 Male
15 Female
16 Female
17 Female
18 Female
19 Female
20 Female

To find which factor level occurs most often in df2, add the following code to the above snippet −

# df2 already exists from the snippet above; calling sample() again would draw new data
names(which.max(table(df2$Factor_2)))

Output

If you execute all the above snippets as a single program on the data frame shown, it generates the following output −

[1] "Female"

Example 3

The following snippet creates a sample data frame −

Factor_3<-factor(sample(c("Hot","Cold","Warm","Lukewarm"),20,replace=TRUE))
df3<-data.frame(Factor_3)
df3

The following data frame is created −

   Factor_3
1  Hot
2  Lukewarm
3  Warm
4  Warm
5  Cold
6  Hot
7  Hot
8  Warm
9  Warm
10 Warm
11 Hot
12 Lukewarm
13 Cold
14 Lukewarm
15 Lukewarm
16 Lukewarm
17 Hot
18 Lukewarm
19 Lukewarm
20 Lukewarm

To find which factor level occurs most often in df3, add the following code to the above snippet −

# df3 already exists from the snippet above; calling sample() again would draw new data
names(which.max(table(df3$Factor_3)))

Output

If you execute all the above snippets as a single program on the data frame shown, it generates the following output −

[1] "Lukewarm"
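
One caveat applies to all of the examples: which.max returns only the first maximum, so when two or more levels share the highest count, only the one that comes first in the level order is reported. A minimal sketch of how all tied levels could be listed is given below. The data and the names first_only and all_modes are hypothetical −

```r
# Hypothetical data in which "A" and "B" both occur twice
x <- factor(c("A", "B", "B", "A", "C"))
counts <- table(x)

# which.max stops at the first largest count, so only "A" is reported
first_only <- names(which.max(counts))

# Comparing every count with the maximum keeps all tied levels: "A" and "B"
all_modes <- names(counts)[as.vector(counts) == max(counts)]

first_only
all_modes
```

If ties matter for the analysis, the comparison with max is the safer choice, since it returns every level that reaches the highest count.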
raja
Updated on 11-Nov-2021 05:17:15
