If a data frame has a column that represents a factor, we might want to find the mean of the values in the other column(s) for each level of that factor. This helps in comparing the levels of the factor. In R, we can find these means with the aggregate function. The examples below show how it can be done.
Consider the below data frame:
> x1<-sample(c(LETTERS[1:4]),20,replace=TRUE)
> y1<-rnorm(20,5,1)
> df1<-data.frame(x1,y1)
> df1
   x1       y1
1   D 5.801197
2   B 3.432060
3   B 6.154168
4   A 5.466655
5   D 5.171689
6   C 5.175170
7   B 5.353469
8   D 4.840470
9   C 4.158980
10  B 4.711343
11  D 4.348326
12  A 5.933382
13  A 3.484782
14  A 2.004760
15  C 4.963307
16  D 4.728794
17  B 3.606417
18  B 6.234446
19  C 4.625489
20  B 6.569928
Finding the mean of y1 based on values in x1:
> aggregate(.~x1,data=df1,mean)
  x1       y1
1  A 4.222395
2  B 5.151690
3  C 4.730736
4  D 4.978095
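Because the data above come from sample() and rnorm(), the exact values change on every run. As a minimal sketch on a small hand-typed data frame (the names toy, g, and v are hypothetical and not part of the example above), the grouped means from aggregate can be cross-checked against tapply:

```r
# Hypothetical toy data: g is the grouping column, v holds the values
toy <- data.frame(
  g = c("A", "B", "A", "B", "C"),
  v = c(2, 4, 6, 8, 10)
)

# Mean of v for each level of g, naming both columns explicitly in the formula
res <- aggregate(v ~ g, data = toy, FUN = mean)
print(res)

# Cross-check: tapply computes the same group means as a named vector
chk <- tapply(toy$v, toy$g, mean)
print(chk)
```

Writing v ~ g instead of . ~ g names the value column explicitly, which may be easier to read when the data frame has many columns.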
Consider another data frame, in which the grouping column holds the values 0 and 1:
> x2<-sample(0:1,20,replace=TRUE)
> y2<-rpois(20,5)
> df2<-data.frame(x2,y2)
> df2
   x2 y2
1   1  6
2   0  5
3   1  3
4   0  3
5   1  4
6   0  7
7   0  5
8   0  3
9   0  5
10  0  4
11  0  4
12  0  7
13  0  4
14  0  6
15  0  2
16  1  7
17  0  9
18  1  2
19  0  6
20  0  5
Finding the mean of y2 based on values in x2:
> aggregate(.~x2,data=df2,mean)
  x2  y2
1  0 5.0
2  1 4.4
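In the formula . ~ x2, the dot stands for every column other than the grouping column, so the same call also works when there are several value columns. A minimal sketch, assuming a hypothetical data frame toy2 with a grouping column g and two numeric columns a and b (none of these names come from the examples above):

```r
# Hypothetical toy data with one grouping column and two numeric columns
toy2 <- data.frame(
  g = c(0, 1, 0, 1),
  a = c(1, 2, 3, 4),
  b = c(10, 20, 30, 40)
)

# The dot expands to all columns except g, so both a and b are averaged per group
res2 <- aggregate(. ~ g, data = toy2, FUN = mean)
print(res2)
```

The result has one row per distinct value of g and one mean column for each of a and b.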