How to perform the Shapiro-Wilk test for all columns in an R data frame?
The Shapiro-Wilk test is used to test the normality of a variable. Its null hypothesis is that the variable is normally distributed, so a small p-value is evidence against normality. If we have numerical columns in an R data frame, we might want to check the normality of all of them. This can be done with the apply function and shapiro.test, as shown in the example below. Note that apply first converts the data frame to a matrix, so this works as written only when every column is numeric.
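Before running the test on a whole data frame, it may help to see the decision rule on a single variable. The following is a minimal sketch, not part of the original example: the vector x, the seed, and the 0.05 significance level are all assumptions chosen for illustration.

```r
# Illustrative data (an assumption, not from the article):
# 30 draws from a standard normal distribution
set.seed(1)
x <- rnorm(30)

# shapiro.test returns an object of class "htest"
result <- shapiro.test(x)

w_stat <- unname(result$statistic)  # the W statistic
p_val  <- result$p.value            # the p-value of the test

# Assumed significance level; choose one that suits your analysis
alpha <- 0.05

if (p_val < alpha) {
  message("Reject the null hypothesis: x does not look normally distributed")
} else {
  message("No evidence against normality at the chosen level")
}
```

The same fields, statistic and p.value, are available for each column when the test is applied to a data frame.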
Example
Consider the following data frame:
set.seed(321)
x1 <- rnorm(20, 2, 0.34)
x2 <- rpois(20, 5)
x3 <- rpois(20, 2)
x4 <- rpois(20, 5)
x5 <- rpois(20, 6)
x6 <- runif(20, 1, 5)
x7 <- rexp(20, 0.62)
x8 <- rpois(20, 10)
df <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8)
df
Output
         x1 x2 x3 x4 x5       x6         x7 x8
1  2.579667  7  0  2  4 4.712527 2.69354358  9
2  1.757907  4  0  3  3 1.519762 2.63275896  9
3  1.905485  5  2  5  4 3.087971 1.83827735  5
4  1.959319  7  0 10 14 3.564951 1.19092513 10
5  1.957853  7  3  5  5 4.576069 0.61126332 10
6  2.091182  4  0  4 10 3.316821 2.56506184  8
7  2.247126  3  4  5  7 1.636518 1.88751338  9
8  2.079266  8  4  7  7 3.018356 0.11237261  8
9  2.115299  3  2  7  4 4.516734 0.17862062 13
10 1.812349  3  0  6  5 3.009659 0.57255735  8
11 2.118218  5  2  6  4 1.025079 0.09536165 10
12 2.504761  4  1  3  4 1.936312 3.11482640 14
13 2.064031  1  0  5  7 2.388424 2.96859719 13
14 2.830708  2  4  9  6 3.779138 0.61244047  6
15 1.607831  6  5  7  7 2.740338 1.15703781 12
16 1.726412  6  3  5  7 4.690268 2.78394417 10
17 2.155064  3  2  8 11 4.043131 0.12627601  7
18 2.142913  3  4  8  4 1.481830 0.14825531  8
19 2.196379  4  2  3  6 1.490243 4.61761476  5
20 2.151761  6  1  5  2 1.914817 0.26060923 11
Applying the Shapiro-Wilk test to all columns of df:
Example
apply(df,2,shapiro.test)
Output
$x1
Shapiro-Wilk normality test
data: newX[, i]
W = 0.94053, p-value = 0.2453

$x2
Shapiro-Wilk normality test
data: newX[, i]
W = 0.95223, p-value = 0.4022

$x3
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88855, p-value = 0.02529

$x4
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96244, p-value = 0.5938

$x5
Shapiro-Wilk normality test
data: newX[, i]
W = 0.87904, p-value = 0.017

$x6
Shapiro-Wilk normality test
data: newX[, i]
W = 0.93067, p-value = 0.1591

$x7
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88531, p-value = 0.02208

$x8
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96271, p-value = 0.5992
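The list printed by apply can be hard to scan when there are many columns. One possible way to condense it is sketched below. The helper shapiro_summary, its alpha argument, and the small example_df are hypothetical names introduced here for illustration and are not part of the original example. The sketch uses lapply instead of apply so that non-numeric columns can be skipped rather than causing an error.

```r
# Hypothetical helper (not from the original article): run shapiro.test on
# every numeric column of a data frame and return one row per column.
# Note: shapiro.test itself requires between 3 and 5000 non-missing values.
shapiro_summary <- function(data, alpha = 0.05) {
  # Keep only the numeric columns
  numeric_cols <- data[vapply(data, is.numeric, logical(1))]

  # One "htest" result per column, named after the column
  tests <- lapply(numeric_cols, shapiro.test)

  data.frame(
    variable = names(tests),
    W = vapply(tests, function(res) unname(res$statistic), numeric(1)),
    p_value = vapply(tests, function(res) res$p.value, numeric(1)),
    reject_normality = vapply(tests, function(res) res$p.value < alpha, logical(1)),
    row.names = NULL
  )
}

# Usage on illustrative data (an assumption, not the article's df);
# the character column "label" is skipped automatically
set.seed(321)
example_df <- data.frame(
  a = rnorm(20),
  b = rexp(20),
  label = letters[1:20]
)
summary_table <- shapiro_summary(example_df)
summary_table
```

Read this way, the output above shows p-values below an assumed 0.05 level for x3 (0.02529), x5 (0.017) and x7 (0.02208), so normality would be rejected for those three columns at that level and not for the others.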