How to perform the Shapiro-Wilk test for all columns in an R data frame?


The Shapiro-Wilk test is used to check whether a variable is normally distributed. Its null hypothesis is that the variable follows a normal distribution, so a small p-value (for example, below 0.05) is evidence against normality. If a data frame contains numerical columns, we might want to check the normality of all of them at once. This can be done with the apply function and shapiro.test, as shown in the example below.
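One caveat is worth sketching first. apply converts the data frame to a matrix, so a single character or factor column turns every column into text and shapiro.test then fails. The snippet below is a minimal sketch of one way around this, assuming a hypothetical mixed-type data frame named mixed (its column names are illustrative, not from the article): keep only the numeric columns with Filter and test each with lapply.

```r
# Hypothetical mixed-type data frame; names and values are illustrative only
set.seed(1)
mixed <- data.frame(
  height = rnorm(30, mean = 170, sd = 8),
  weight = rnorm(30, mean = 70, sd = 10),
  group  = rep(c("a", "b", "c"), each = 10)
)

# Keep only the numeric columns (drops 'group')
numeric_cols <- Filter(is.numeric, mixed)

# Run the Shapiro-Wilk test on each remaining column;
# lapply works column by column without converting to a matrix
results <- lapply(numeric_cols, shapiro.test)

# Names of the columns that were tested
names(results)
```

Each element of results is an ordinary shapiro.test result, so results$height prints the same kind of report as the output in the example.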

Example


Consider the data frame below:

set.seed(321)
x1 <- rnorm(20, 2, 0.34)
x2 <- rpois(20, 5)
x3 <- rpois(20, 2)
x4 <- rpois(20, 5)
x5 <- rpois(20, 6)
x6 <- runif(20, 1, 5)
x7 <- rexp(20, 0.62)
x8 <- rpois(20, 10)
df <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8)
df

Output

         x1 x2 x3 x4 x5       x6         x7 x8
1  2.579667  7  0  2  4 4.712527 2.69354358  9
2  1.757907  4  0  3  3 1.519762 2.63275896  9
3  1.905485  5  2  5  4 3.087971 1.83827735  5
4  1.959319  7  0 10 14 3.564951 1.19092513 10
5  1.957853  7  3  5  5 4.576069 0.61126332 10
6  2.091182  4  0  4 10 3.316821 2.56506184  8
7  2.247126  3  4  5  7 1.636518 1.88751338  9
8  2.079266  8  4  7  7 3.018356 0.11237261  8
9  2.115299  3  2  7  4 4.516734 0.17862062 13
10 1.812349  3  0  6  5 3.009659 0.57255735  8
11 2.118218  5  2  6  4 1.025079 0.09536165 10
12 2.504761  4  1  3  4 1.936312 3.11482640 14
13 2.064031  1  0  5  7 2.388424 2.96859719 13
14 2.830708  2  4  9  6 3.779138 0.61244047  6
15 1.607831  6  5  7  7 2.740338 1.15703781 12
16 1.726412  6  3  5  7 4.690268 2.78394417 10
17 2.155064  3  2  8 11 4.043131 0.12627601  7
18 2.142913  3  4  8  4 1.481830 0.14825531  8
19 2.196379  4  2  3  6 1.490243 4.61761476  5
20 2.151761  6  1  5  2 1.914817 0.26060923 11

Applying the Shapiro-Wilk test to all columns of df:

Example

apply(df, 2, shapiro.test)

Output

$x1
Shapiro-Wilk normality test
data: newX[, i]
W = 0.94053, p-value = 0.2453
$x2
Shapiro-Wilk normality test
data: newX[, i]
W = 0.95223, p-value = 0.4022
$x3
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88855, p-value = 0.02529
$x4
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96244, p-value = 0.5938
$x5
Shapiro-Wilk normality test
data: newX[, i]
W = 0.87904, p-value = 0.017
$x6
Shapiro-Wilk normality test
data: newX[, i]
W = 0.93067, p-value = 0.1591
$x7
Shapiro-Wilk normality test
data: newX[, i]
W = 0.88531, p-value = 0.02208
$x8
Shapiro-Wilk normality test
data: newX[, i]
W = 0.96271, p-value = 0.5992
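The full list above is verbose when all we need is the decision for each column. As a minimal sketch, the same tests can be collected into one compact table with sapply. The names summary_tbl, alpha and out, and the 5% cutoff, are assumptions introduced for illustration and are not part of the original example.

```r
# Rebuild the data frame from the example (same seed and generator calls)
set.seed(321)
x1 <- rnorm(20, 2, 0.34)
x2 <- rpois(20, 5)
x3 <- rpois(20, 2)
x4 <- rpois(20, 5)
x5 <- rpois(20, 6)
x6 <- runif(20, 1, 5)
x7 <- rexp(20, 0.62)
x8 <- rpois(20, 10)
df <- data.frame(x1, x2, x3, x4, x5, x6, x7, x8)

# For each column, keep only the W statistic and the p-value;
# sapply returns a 2 x 8 matrix, so transpose to get one row per column
summary_tbl <- t(sapply(df, function(col) {
  res <- shapiro.test(col)
  c(W = unname(res$statistic), p.value = res$p.value)
}))

# Assumed significance level; flag columns where normality is rejected
alpha <- 0.05
out <- data.frame(summary_tbl,
                  reject_normality = summary_tbl[, "p.value"] < alpha)
out
```

In the output shown above, x3, x5 and x7 have p-values below 0.05, so those are the columns for which normality would be rejected at that level.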

Updated on: 10-Feb-2021
