How to find the table of means of a numerical column based on two factors in an R data frame?
To find the table of means of a numerical column based on two factors in an R data frame, we can follow the steps below:
First, create a data frame with two factor columns and one numerical column.
Then, compute the mean of the numerical column for each combination of the two factor columns using the tapply function.
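The two steps above can be sketched on a small fixed data set, which makes the result easy to check by hand. The values below are made up for illustration and are not taken from the examples that follow.

```r
# Step 1: a data frame with two grouping columns and one numerical column.
# The values are illustrative assumptions chosen so the means are easy to verify.
df <- data.frame(
  Group1 = c("I", "I", "II", "II", "II"),
  Group2 = c("Low", "High", "Low", "Low", "High"),
  Score  = c(10, 20, 30, 50, 70)
)

# Step 2: mean of Score for every Group1 x Group2 combination.
# tapply converts the grouping columns to factors internally,
# so rows and columns appear in sorted level order.
means <- tapply(df$Score, list(df$Group1, df$Group2), mean)
print(means)
```

Rows of the result correspond to the first grouping column and columns to the second. For instance, the cell for II and Low averages the scores 30 and 50.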
Example 1
Let’s create a data frame as shown below:
Group1 <- sample(c("I", "II", "III"), 25, replace = TRUE)
Group2 <- sample(c("Low", "Medium", "High"), 25, replace = TRUE)
Score <- sample(1:100, 25)
df <- data.frame(Group1, Group2, Score)
df
Running the script above produces output like the following (your output will differ because the data are sampled randomly):
Output
   Group1 Group2 Score
1     III    Low     3
2      II   High    45
3     III Medium    17
4     III    Low    50
5       I Medium    40
6     III    Low    77
7      II Medium     1
8       I   High    73
9      II   High    62
10      I   High     5
11     II Medium    88
12      I   High    98
13      I Medium    60
14     II    Low    84
15    III    Low    12
16      I Medium    66
17    III Medium    23
18     II   High    61
19      I   High    15
20    III   High    94
21     II Medium    87
22     II Medium    37
23      I Medium    11
24      I   High    26
25      I Medium    93
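Because sample draws random values, each run produces a different data frame. If reproducible output is wanted, one option is to call set.seed before sampling. The sketch below assumes an arbitrary seed of 123 and a hypothetical helper named make_df; neither appears in the original example.

```r
# Hypothetical helper: fixes the seed, then builds the example data frame.
make_df <- function(seed) {
  set.seed(seed)  # fixes the random number generator state
  Group1 <- sample(c("I", "II", "III"), 25, replace = TRUE)
  Group2 <- sample(c("Low", "Medium", "High"), 25, replace = TRUE)
  Score  <- sample(1:100, 25)
  data.frame(Group1, Group2, Score)
}

df_a <- make_df(123)  # 123 is an arbitrary seed, chosen only for illustration
df_b <- make_df(123)

# The same seed yields the same data within one R installation.
print(identical(df_a, df_b))
```

The exact values drawn for a given seed can still differ between R versions, so a fixed seed guarantees repeatability on one setup rather than identical numbers everywhere.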
Find the table of means based on two factor columns
Use the tapply function with mean to compute the mean of the Score column for each combination of Group1 and Group2. Apply it to the data frame created above; calling sample again would draw new data, and the result would no longer match the output shown:
# df is the data frame created in the previous step
tapply(df$Score, list(df$Group1, df$Group2), mean)
Output
    High  Low Medium
I   43.4   NA  54.00
II  56.0 84.0  53.25
III 94.0 35.5  20.00
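The NA in row I, column Low appears because no row of the data has Group1 equal to I and Group2 equal to Low, so there is nothing to average. A minimal sketch of the same situation, using made-up values, shows how such empty combinations can be spotted with is.na or with a table of counts:

```r
# Illustrative data (assumed values): several combinations never occur.
df <- data.frame(
  Group1 = c("I", "I", "II"),
  Group2 = c("High", "Medium", "Low"),
  Score  = c(40, 60, 84)
)

means <- tapply(df$Score, list(df$Group1, df$Group2), mean)
print(means)

# TRUE marks the combinations for which no rows exist.
print(is.na(means))

# Counts per combination; a count of 0 lines up with an NA mean.
counts <- table(df$Group1, df$Group2)
print(counts)
```

Checking the counts first is a simple way to tell an empty cell apart from a cell whose mean is NA because the data themselves contain missing values.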
Example 2
Let’s create a data frame as shown below:
f1 <- sample(c("Male", "Female"), 25, replace = TRUE)
f2 <- sample(c("Slow", "Fast"), 25, replace = TRUE)
Result <- sample(1:100, 25)
dat <- data.frame(f1, f2, Result)
dat
Running the script above produces output like the following (your output will differ because the data are sampled randomly):
Output
       f1   f2 Result
1    Male Slow     37
2    Male Fast     13
3  Female Fast     87
4    Male Slow     36
5    Male Fast     22
6  Female Fast     86
7    Male Slow     42
8  Female Fast     17
9  Female Slow     46
10   Male Fast     27
11 Female Fast     49
12   Male Slow     24
13   Male Fast     53
14   Male Fast     67
15 Female Fast     28
16   Male Fast      6
17 Female Slow     61
18   Male Slow     90
19   Male Fast     12
20   Male Fast     47
21 Female Fast      9
22 Female Fast     66
23   Male Fast     73
24   Male Slow     14
25 Female Fast     81
Find the table of means based on two factor columns
Use the tapply function with mean to compute the mean of the Result column for each combination of f1 and f2. Apply it to the data frame created above; calling sample again would draw new data, and the result would no longer match the output shown:
# dat is the data frame created in the previous step
tapply(dat$Result, list(dat$f1, dat$f2), mean)
Output
           Fast Slow
Female 52.87500 53.5
Male   35.55556 40.5
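The table above does not say which factor runs along the rows and which along the columns. One possible refinement is to pass a named list to tapply so that the dimensions carry labels. In the sketch below, the labels Gender and Speed and the data values are assumptions made for illustration, and with() is used only to avoid repeating the data frame name.

```r
# Illustrative data (assumed values), same column names as Example 2.
dat <- data.frame(
  f1     = c("Male", "Male", "Female", "Female"),
  f2     = c("Slow", "Fast", "Fast", "Fast"),
  Result = c(40, 30, 50, 70)
)

# Naming the list elements labels the dimensions of the result.
# "Gender" and "Speed" are hypothetical labels, not part of the original example.
means <- with(dat, tapply(Result, list(Gender = f1, Speed = f2), mean))
print(means)

# The labels are stored as the names of the dimnames.
print(names(dimnames(means)))
```

The computed means are the same as with an unnamed list; only the printed header changes, which can help when the table is shared without the code that produced it.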