How to convert a MANOVA data frame with two dependent variables into a count table in R?


MANOVA (multivariate analysis of variance) is used when there are two or more dependent variables and one or more independent variables. The dependent variables are compared across the levels, or combinations of levels, of the independent variables. To convert a MANOVA data frame with two dependent variables into a count table, we can use the cast function of the reshape package, but we need to melt the data frame first so that the casting can be done appropriately. The resulting table shows how many observations of each dependent variable fall in each combination of the independent variables.

Example


Consider the below data frame −

Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
Score <- sample(1:100, 20)
Rating <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(Gender, Class, Score, Rating)
df1

Output

   Gender Class Score Rating
1  Male   II    96    9
2  Male   I     38    3
3  Female III   32    5
4  Male   I     77    2
5  Male   I     62    2
6  Female II    81    9
7  Male   II    90    2
8  Female III   79    8
9  Male   III   34    8
10 Male   II    36    9
11 Male   I     57    5
12 Male   I     29    1
13 Female III   100   7
14 Female II    94    5
15 Male   I     35    9
16 Female III   78    4
17 Female I     18    3
18 Female I     47    9
19 Female III   61    1
20 Male   III   60    3
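Because sample() draws at random, the values printed above will differ on every run. A minimal sketch of making the example reproducible is shown below; the seed value 123 is an arbitrary assumption and is not part of the original example, so the rows it produces will not match the output above.

```r
# Fix the random number generator state so the draws repeat on every run.
# The seed value (123) is arbitrary; any integer works.
set.seed(123)

Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Class  <- sample(c("I", "II", "III"), 20, replace = TRUE)
Score  <- sample(1:100, 20)                  # drawn without replacement
Rating <- sample(1:10, 20, replace = TRUE)

df1 <- data.frame(Gender, Class, Score, Rating)
str(df1)   # 20 observations of 4 variables
```

Running the block twice with the same seed gives the same data frame, which makes the count table easier to check by hand.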

Loading reshape package −

library(reshape)

Melting df1 −

df1_melt <- melt(df1)

Using Gender, Class as id variables
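To see what melting does, it can help to build the long format by hand. The following is a base-R sketch only, not the reshape implementation; the small data frame toy and the name toy_long are hypothetical and reuse three rows from the output above. The column names variable and value mirror the ones melt produces.

```r
# A small hypothetical data frame with two id columns and two measured columns.
toy <- data.frame(Gender = c("Male", "Female", "Male"),
                  Class  = c("I", "II", "I"),
                  Score  = c(38, 81, 77),
                  Rating = c(3, 9, 2))

# Stack the two measured columns into a single 'value' column and record
# which column each value came from in 'variable'. The id columns repeat.
toy_long <- data.frame(
  Gender   = rep(toy$Gender, times = 2),
  Class    = rep(toy$Class, times = 2),
  variable = rep(c("Score", "Rating"), each = nrow(toy)),
  value    = c(toy$Score, toy$Rating)
)
toy_long
```

Each original row now appears once per dependent variable, so three rows become six. Casting then only has to count the rows in each Gender, Class and variable combination.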

Finding the counts based on Gender and Class −

cast(df1_melt,Gender~Class+variable)

Aggregation requires fun.aggregate: length used as default

  Gender I_Score I_Rating II_Score II_Rating III_Score III_Rating
1 Female 2       2        2        2         5         5
2 Male   6       6        3        3         2         2
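The Score and Rating columns carry the same counts because neither variable has missing values, so every row contributes one Score and one Rating. As a cross-check, a plain base-R table of Gender by Class should give the same numbers. The sketch below retypes the Gender and Class columns from the output printed above rather than regenerating them; the name counts is chosen here for illustration.

```r
# Gender and Class copied from the 20 rows printed above.
Gender <- c("Male", "Male", "Female", "Male", "Male", "Female", "Male",
            "Female", "Male", "Male", "Male", "Male", "Female", "Female",
            "Male", "Female", "Female", "Female", "Female", "Male")
Class  <- c("II", "I", "III", "I", "I", "II", "II", "III", "III", "II",
            "I", "I", "III", "II", "I", "III", "I", "I", "III", "III")

# Cross-tabulate: one cell per Gender and Class combination.
counts <- table(Gender, Class)
counts
```

The Female row reads 2, 2, 5 and the Male row reads 6, 3, 2, which agrees with the cast result.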

Let’s have a look at another example −

ID <- sample(c("Y1", "Y2", "Y3", "Y4"), 20, replace = TRUE)
Grade <- sample(LETTERS[1:3], 20, replace = TRUE)
Sal <- sample(20000:50000, 20)
Count <- sample(200:210, 20, replace = TRUE)
df2 <- data.frame(ID, Grade, Sal, Count)
df2

Output

   ID Grade Sal   Count
1  Y3 B     28528 204
2  Y3 C     40854 207
3  Y3 A     31199 207
4  Y4 B     25338 207
5  Y3 B     30180 209
6  Y2 B     29921 209
7  Y4 C     46134 210
8  Y4 B     46829 205
9  Y3 B     42607 205
10 Y1 A     38174 202
11 Y2 A     41451 207
12 Y1 C     23912 200
13 Y4 B     44047 209
14 Y2 B     32236 200
15 Y2 A     24851 203
16 Y2 B     36341 207
17 Y3 B     37003 208
18 Y2 C     37285 207
19 Y3 B     45113 207
20 Y3 A     40034 203

Melting df2 −

df2_melt <- melt(df2)

Using ID, Grade as id variables

Finding counts based on ID and Grade −

cast(df2_melt,ID~Grade+variable)

Aggregation requires fun.aggregate: length used as default

      ID    A_Sal    A_Count    B_Sal    B_Count    C_Sal    C_Count
1    Y1       1          1       0          0          1       1
2    Y2       2          2       3          3          1       1
3    Y3       2          2       5          5          1       1
4    Y4       0          0       3          3          1       1
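Both messages in the examples come from defaults: melt guesses the id variables, and cast falls back to length as the aggregation function. A sketch of the same workflow with both choices written out explicitly is shown below. It assumes the reshape package is installed, and the six-row data frame toy is a hypothetical subset built from rows of the output above, not the full df2.

```r
library(reshape)

# Hypothetical six-row subset, taken from rows of the df2 output above.
toy <- data.frame(ID    = c("Y1", "Y1", "Y2", "Y2", "Y2", "Y3"),
                  Grade = c("A", "C", "A", "B", "B", "B"),
                  Sal   = c(38174, 23912, 41451, 29921, 32236, 28528),
                  Count = c(202, 200, 207, 209, 200, 204))

# Name the id variables explicitly instead of letting melt guess them.
toy_melt <- melt(toy, id.vars = c("ID", "Grade"))

# Name the aggregation function explicitly: length counts the rows
# in each ID, Grade and variable combination.
toy_counts <- cast(toy_melt, ID ~ Grade + variable, fun.aggregate = length)
toy_counts
```

Stating id.vars and fun.aggregate makes the intent visible in the code and avoids relying on the messages to confirm what was done.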

Updated on: 06-Nov-2020
