How to perform one-way ANOVA with unequal sample sizes in R?


To perform a one-way ANOVA when the groups have unequal sample sizes, we can use the aov function exactly as we would for balanced groups. Suppose we have a categorical column named Group with four categories and a continuous variable named Response, both stored in a data frame called df. The one-way ANOVA can then be performed as:

aov(Response~Group,data=df)
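
As a minimal sketch, the call can be tried on a small made-up data frame whose groups have 3, 5 and 4 observations. The values below are hypothetical and only illustrate that aov accepts unbalanced groups without any extra arguments.

```r
# Hypothetical data (not from the article): groups A, B and C
# have 3, 5 and 4 observations respectively
df <- data.frame(
  Group = rep(c("A", "B", "C"), times = c(3, 5, 4)),
  Response = c(4, 5, 6, 7, 8, 9, 8, 7, 3, 4, 5, 4)
)

# Group sizes differ, so the design is unbalanced
group_sizes <- table(df$Group)
print(group_sizes)

# The call is the same as for a balanced design
fit <- aov(Response ~ Group, data = df)
print(summary(fit))
```

With 3 groups and 12 observations, the table reports 2 degrees of freedom for Group and 9 for Residuals.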

Example

Consider the following data frame:

Group<-sample(LETTERS[1:4],20,replace=TRUE)
Response<-rpois(20,2)
df1<-data.frame(Group,Response)
df1

Output

 Group Response
1  B    1
2  B    2
3  A    1
4  D    2
5  B    1
6  B    0
7  A    2
8  B    3
9  B    2
10 A    2
11 A    3
12 C    2
13 B    0
14 C    1
15 C    3
16 C    2
17 C    1
18 D    4
19 A    1
20 A    4
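
Because Group is drawn at random, the group sizes are not fixed in advance. A quick way to check them is sketched below; since the sample was generated without a seed, the data frame is rebuilt here from the printed output rather than re-sampled.

```r
# Rebuild df1 from the printed output above (the original draw was random)
Group <- c("B", "B", "A", "D", "B", "B", "A", "B", "B", "A",
           "A", "C", "B", "C", "C", "C", "C", "D", "A", "A")
Response <- c(1, 2, 1, 2, 1, 0, 2, 3, 2, 2,
              3, 2, 0, 1, 3, 2, 1, 4, 1, 4)
df1 <- data.frame(Group, Response)

# Number of observations in each group
group_sizes <- table(df1$Group)
print(group_sizes)

# Mean response in each group
group_means <- tapply(df1$Response, df1$Group, mean)
print(round(group_means, 3))
```

For this data the groups A, B, C and D contain 6, 7, 5 and 2 observations, so the sample sizes are unequal.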

Example

str(df1)

Output

'data.frame':   20 obs. of  2 variables:
 $ Group   : chr  "B" "B" "A" "D" ...
 $ Response: int  1 2 1 2 1 0 2 3 2 2 ...

Performing one-way ANOVA on the data in df1:

Example

ANOVA_model_1<-aov(Response~Group,data=df1)
summary(ANOVA_model_1)

Output

            Df Sum Sq Mean Sq F value Pr(>F)
Group        3  5.488   1.829   1.536  0.244
Residuals   16 19.062   1.191
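
To see where the unequal sample sizes enter, the table can be recomputed by hand: each group's squared deviation from the grand mean is weighted by that group's own size. The sketch below rebuilds df1 from the printed output and uses only base R functions; the variable names are illustrative.

```r
# Rebuild df1 from the printed output above
Group <- c("B", "B", "A", "D", "B", "B", "A", "B", "B", "A",
           "A", "C", "B", "C", "C", "C", "C", "D", "A", "A")
Response <- c(1, 2, 1, 2, 1, 0, 2, 3, 2, 2,
              3, 2, 0, 1, 3, 2, 1, 4, 1, 4)
df1 <- data.frame(Group, Response)

# Per-group sizes and means, plus the overall mean
n_i <- tapply(df1$Response, df1$Group, length)
mean_i <- tapply(df1$Response, df1$Group, mean)
grand_mean <- mean(df1$Response)

# Between-group sum of squares: each group is weighted by its size n_i
ss_between <- sum(n_i * (mean_i - grand_mean)^2)

# Within-group sum of squares: deviations from each row's own group mean
ss_within <- sum((df1$Response - ave(df1$Response, df1$Group))^2)

# Degrees of freedom, F statistic and p-value
df_between <- length(n_i) - 1
df_within <- nrow(df1) - length(n_i)
f_value <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(round(c(ss_between = ss_between, ss_within = ss_within,
              f_value = f_value, p_value = p_value), 3))
```

The rounded values agree with the Sum Sq, F value and Pr(>F) columns reported by summary above.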

Example

Class<-sample(c("I","II","III"),20,replace=TRUE)
Score<-sample(1:100,20)
df2<-data.frame(Class,Score)
df2

Output

 Class Score
1  I    35
2 II    74
3 II    24
4 III   27
5  I    63
6 II    92
7 II    50
8 III   30
9  I    1
10 I    23
11 II   84
12 I    48
13 I    36
14 I    58
15 II   16
16 II   18
17 I    28
18 III  70
19 II   47
20 I    75

Performing one-way ANOVA on the data in df2:

Example

ANOVA_model_2<-aov(Score~Class,data=df2)
summary(ANOVA_model_2)

Output

            Df Sum Sq Mean Sq F value Pr(>F)
Class        2    435   217.4   0.317  0.732
Residuals   17  11642   684.8
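
The same one-way table can also be obtained by fitting a linear model with lm and passing it to anova. This is an alternative route, not the method used in this article; the sketch rebuilds df2 from the printed output and compares the two results.

```r
# Rebuild df2 from the printed output above (the original draw was random)
Class <- c("I", "II", "II", "III", "I", "II", "II", "III", "I", "I",
           "II", "I", "I", "I", "II", "II", "I", "III", "II", "I")
Score <- c(35, 74, 24, 27, 63, 92, 50, 30, 1, 23,
           84, 48, 36, 58, 16, 18, 28, 70, 47, 75)
df2 <- data.frame(Class, Score)

# Group sizes are unequal: check with table()
class_sizes <- table(df2$Class)
print(class_sizes)

# One-way ANOVA table from aov()
aov_table <- summary(aov(Score ~ Class, data = df2))[[1]]

# The same table from lm() followed by anova()
lm_table <- anova(lm(Score ~ Class, data = df2))

print(round(aov_table[["F value"]][1], 3))
print(round(lm_table[["F value"]][1], 3))
```

Both routes report the same F statistic for Class, with the classes I, II and III holding 9, 8 and 3 observations.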

Example

Categories<-sample(c("C1","C2","C3"),20,replace=TRUE)
NetScore<-sample(1:10,20,replace=TRUE)
df3<-data.frame(Categories,NetScore)
df3

Output

   Categories NetScore
1    C1       7
2    C2       5
3    C2       4
4    C2      10
5    C2       9
6    C1       1
7    C3       1
8    C2       4
9    C1       2
10   C2       3
11   C3       5
12   C3       6
13   C1       9
14   C1       4
15   C2       2
16   C3       6
17   C1       6
18   C2       3
19   C1       5
20   C2       6

Performing one-way ANOVA on the data in df3:

Example

ANOVA_model_3<-aov(NetScore~Categories,data=df3)
summary(ANOVA_model_3)

Output

            Df Sum Sq Mean Sq F value Pr(>F)
Categories   2   1.05   0.527   0.072  0.931
Residuals   17 124.75   7.338
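
To use the result in a script, the p-value can be pulled out of the summary object and compared with a significance level. The sketch below rebuilds df3 from the printed output; the 0.05 threshold is an assumption for illustration and is not stated in this article.

```r
# Rebuild df3 from the printed output above (the original draw was random)
Categories <- c("C1", "C2", "C2", "C2", "C2", "C1", "C3", "C2", "C1", "C2",
                "C3", "C3", "C1", "C1", "C2", "C3", "C1", "C2", "C1", "C2")
NetScore <- c(7, 5, 4, 10, 9, 1, 1, 4, 2, 3,
              5, 6, 9, 4, 2, 6, 6, 3, 5, 6)
df3 <- data.frame(Categories, NetScore)

fit <- aov(NetScore ~ Categories, data = df3)

# summary() returns a list; its first element is the ANOVA table
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]

# Assumed significance level (not from the article)
alpha <- 0.05
reject_null <- p_value < alpha

print(round(p_value, 3))
print(reject_null)
```

Since the p-value is far above 0.05, this data gives no evidence that the mean NetScore differs between the three categories.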

Updated on: 07-Dec-2020
