How to find the sum of squared deviations for an R data frame column?


The sum of squared deviations is the total of the squared differences between each value and the mean of all values. To find it, we can write the formula directly in R. For example, if we have a data frame called df that contains a numeric column x, then the sum of squared deviations for x can be calculated by using sum((df$x - mean(df$x))^2).
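As a minimal sketch, the formula can be wrapped in a small helper function so that it can be reused on any numeric vector. The name ss_dev and the sample vector are illustrative assumptions; neither is part of base R.

```r
# Hypothetical helper: sum of squared deviations from the mean
ss_dev <- function(x) {
  sum((x - mean(x))^2)
}

# Small illustrative vector; its mean is 4
x <- c(2, 4, 4, 6)

# Deviations are -2, 0, 0, 2, so the result is 4 + 0 + 0 + 4 = 8
ss_dev(x)
```

For a data frame column, the same helper would be called as ss_dev(df$x).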

Example 1

Consider the data frame below:

set.seed(1021)
x1 <- letters[1:20]
x2 <- rpois(20, 5)
df1 <- data.frame(x1, x2)
df1

Output

   x1 x2
1   a  4
2   b  2
3   c  2
4   d  4
5   e  4
6   f  6
7   g  4
8   h  4
9   i  8
10  j  4
11  k  4
12  l  3
13  m  6
14  n  3
15  o  7
16  p  0
17  q  2
18  r  8
19  s  3
20  t  5

Finding the sum of squared deviations for column x2 in df1:

Example

sum((df1$x2 - mean(df1$x2))^2)

Output

[1] 80.55
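One way to cross-check this result is to use the fact that the sample variance is the sum of squared deviations divided by n - 1. The sketch below copies the x2 values from the df1 output above into a plain vector (so it does not depend on the random number generator) and compares the two calculations; the variable names direct and via_var are illustrative.

```r
# x2 values copied from the printed df1 output
x2 <- c(4, 2, 2, 4, 4, 6, 4, 4, 8, 4, 4, 3, 6, 3, 7, 0, 2, 8, 3, 5)

# Direct formula: sum of squared differences from the mean
direct <- sum((x2 - mean(x2))^2)

# Equivalent form: sample variance multiplied by (n - 1)
via_var <- (length(x2) - 1) * var(x2)

# Compare with a tolerance, since both are floating-point results
all.equal(direct, via_var)
direct
```

Both expressions should agree with the 80.55 shown above, up to floating-point rounding.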

Example 2

y1 <- 1:20
y2 <- rnorm(20, 2525, 301.2)
df2 <- data.frame(y1, y2)
df2

Output

   y1       y2
1   1 2643.340
2   2 2682.804
3   3 2555.982
4   4 2906.473
5   5 1771.400
6   6 2763.651
7   7 2818.183
8   8 3184.697
9   9 2731.398
10 10 2530.297
11 11 2361.374
12 12 2534.605
13 13 2266.180
14 14 2237.827
15 15 3178.079
16 16 2761.979
17 17 2224.662
18 18 2351.776
19 19 2200.108
20 20 2067.530

Finding the sum of squared deviations for column y2 in df2:

Example

sum((df2$y2 - mean(df2$y2))^2)

Output

[1] 2464370
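The same formula returns NA if the column contains missing values, because mean() and sum() both propagate NA by default. The sketch below shows one possible way to handle that and to apply the calculation to several numeric columns at once with sapply(). The data frame df_demo, its columns, and the helper ss_dev_na are hypothetical and exist only for illustration.

```r
# Hypothetical data frame with one missing value in column a
df_demo <- data.frame(a = c(1, 2, 3, NA), b = c(10, 20, 30, 40))

# Hypothetical helper: drop missing values, then apply the formula
ss_dev_na <- function(x) {
  x <- x[!is.na(x)]
  sum((x - mean(x))^2)
}

# Without NA handling, column a gives NA
sum((df_demo$a - mean(df_demo$a))^2)

# Apply the helper to every column; returns a named numeric vector
result <- sapply(df_demo, ss_dev_na)
result
```

Here column a uses only the values 1, 2, 3 (mean 2), and column b uses all four values (mean 25).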

Updated on: 09-Feb-2021