How to find the mean of a numerical column by two categorical columns in an R data frame?
If an R data frame has a numerical column and two categorical columns, the mean of the numerical column for each combination of the two categorical columns can be found with the aggregate function. For example, if a data frame df contains a numerical column X and two categorical columns C1 and C2, the mean of X for each combination of C1 and C2 is given by the command below:
aggregate(X~C1+C2,data=df,FUN="mean")
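As a minimal sketch of the command above, assume a small hand-made data frame. The column names follow the text, but the six values are hypothetical and chosen only so that each group mean is easy to verify by hand:

```r
# Hypothetical toy data: six rows, two levels in each categorical column.
df <- data.frame(
  C1 = c("A", "A", "B", "B", "A", "B"),
  C2 = factor(c(1, 1, 1, 2, 2, 2)),
  X  = c(10, 20, 30, 40, 50, 60)
)

# Mean of X for every observed combination of C1 and C2.
res <- aggregate(X ~ C1 + C2, data = df, FUN = mean)
print(res)
# Group means by hand:
#   A/1 = (10 + 20) / 2 = 15
#   B/1 = 30
#   A/2 = 50
#   B/2 = (40 + 60) / 2 = 50
```

The result has one row per combination that actually occurs in the data, with C1 varying fastest within each level of C2.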
Example
Consider the data frame below (the values are randomly generated, so your output will differ):
C1<-sample(LETTERS[1:4],20,replace=TRUE)
C2<-factor(sample(1:2,20,replace=TRUE))
X<-rnorm(20,30,2.87)
df1<-data.frame(C1,C2,X)
df1
Output
   C1 C2        X
1   A  2 30.56001
2   D  2 32.18580
3   A  1 36.63182
4   B  1 32.35519
5   A  1 30.40990
6   B  2 31.57616
7   B  1 28.53280
8   D  1 32.35574
9   B  1 30.53733
10  A  1 27.79314
11  C  2 29.54564
12  A  2 27.64586
13  D  1 27.27475
14  D  2 33.99874
15  D  1 30.41017
16  C  1 27.66988
17  A  1 30.69182
18  A  2 34.12661
19  C  2 34.07609
20  C  1 32.29219
Finding the mean of X for each combination of C1 and C2:
Example
aggregate(X~C1+C2,data=df1,FUN="mean")
Output
  C1 C2        X
1  A  1 31.38167
2  B  1 30.47510
3  C  1 29.98104
4  D  1 30.01355
5  A  2 30.77749
6  B  2 31.57616
7  C  2 31.81087
8  D  2 33.09227
Example
C1<-sample(c("Hot","Cold"),20,replace=TRUE)
C2<-sample(0:1,20,replace=TRUE)
Y<-rpois(20,5)
df2<-data.frame(C1,C2,Y)
df2
Output
     C1 C2  Y
1  Cold  1  7
2   Hot  1  5
3  Cold  0  5
4   Hot  1  3
5   Hot  0  5
6  Cold  1  6
7  Cold  1 10
8  Cold  0  2
9   Hot  1  7
10  Hot  1  4
11 Cold  1  7
12  Hot  0  4
13 Cold  0  4
14  Hot  1  3
15  Hot  1  4
16 Cold  0  5
17 Cold  0  8
18 Cold  0  5
19 Cold  0  3
20  Hot  1  7
Finding the mean of Y for each combination of C1 and C2:
Example
aggregate(Y~C1+C2,data=df2,FUN="mean")
Output
    C1 C2        Y
1 Cold  0 4.571429
2  Hot  0 4.500000
3 Cold  1 7.500000
4  Hot  1 4.714286
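One way to cross-check these group means is base R's tapply, which returns the same values as a two-way table instead of a long data frame. This is an alternative sketch, not the method used in the article. Because the original data came from random draws, the snippet retypes the 20 printed rows of df2 by hand so that it is reproducible:

```r
# df2 rebuilt from the printed output above (the original values were random).
df2 <- data.frame(
  C1 = c("Cold", "Hot", "Cold", "Hot", "Hot", "Cold", "Cold", "Cold", "Hot", "Hot",
         "Cold", "Hot", "Cold", "Hot", "Hot", "Cold", "Cold", "Cold", "Cold", "Hot"),
  C2 = c(1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1),
  Y  = c(7, 5, 5, 3, 5, 6, 10, 2, 7, 4, 7, 4, 4, 3, 4, 5, 8, 5, 3, 7)
)

# Long format: one row per combination, as in the article.
long <- aggregate(Y ~ C1 + C2, data = df2, FUN = mean)

# Wide format: rows are the levels of C1, columns are the levels of C2.
wide <- tapply(df2$Y, list(df2$C1, df2$C2), mean)

print(long)
print(wide)
```

The wide table is convenient for reading the means at a glance, while the data frame from aggregate is easier to pass on to further processing such as merging or plotting.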