How to find the high leverage values for a regression model in R?


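To find the high leverage values for a regression model, we first need the hat values. These are the diagonal elements of the hat matrix and measure how far each observation's predictor values lie from the rest of the data; they are not the predicted values. They can be found by using the hatvalues function, after which we define a cutoff for high leverage and extract the observations that exceed it. The cutoff is a judgment call: a common rule of thumb is two or three times the mean leverage p/n, where p is the number of coefficients and n is the number of observations. For example, if we have a regression model M, the hat values can be found by using the command hatvalues(M), and the observations with hat values greater than 0.05 can be found by using the below code −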

which(hatvalues(M)>0.05)
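
As a rough check on what hatvalues returns, the sketch below rebuilds the leverages from the hat matrix H = X(X'X)^-1 X' and compares them with the function's output. It assumes the built-in cars data set, and the names M, X, H and manual_hat are illustrative only; none of them come from the examples in this article.

```r
# Fit a simple linear model on the built-in cars data set (an assumption for illustration)
M <- lm(dist ~ speed, data = cars)

# Design matrix: an intercept column plus the speed column
X <- model.matrix(M)

# Hat matrix H = X (X'X)^{-1} X'
H <- X %*% solve(crossprod(X)) %*% t(X)

# The leverages are the diagonal entries of H
manual_hat <- diag(H)

# Compare with hatvalues(), ignoring names and floating-point noise
all.equal(unname(manual_hat), unname(hatvalues(M)))

# The leverages sum to the number of estimated coefficients (2 here)
sum(hatvalues(M))
```

Because the leverages always sum to the number of coefficients p, their mean is p/n, which is why cutoffs are often expressed as a multiple of p/n.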

Example 1

Consider the below data frame −


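# Observation 1 (100, 100) lies far from the other points; the rest are random
# draws, so the values below will differ on each run unless a seed is set
x1<-c(100,rnorm(20))
y1<-c(100,rnorm(20))
df1<-data.frame(x1,y1)
df1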

Output

      x1               y1
1   100.000000000 1.000000e+02
2   0.719522993   2.605494e-01
3   1.090771537  -1.865098e-04
4  -1.011001579  -8.651429e-01
5   0.371659744  -4.091896e-01
6  -1.604978659   2.156159e-01
7   0.062783668   3.888361e-01
8  -1.039950551  -1.191512e+00
9   0.221366328  -3.761471e-01
10 -1.034533695  -1.374490e+00
11  0.802379399   2.332555e+00
12 -0.005172285   8.169747e-01
13  2.082001055  -8.485883e-01
14  1.078353650   1.659481e+00
15  1.493539847   1.546731e+00
16 -0.504001706  -6.429116e-01
17 -0.251490099  -1.853292e-01
18  0.802616056  -1.794599e+00
19  0.333122883   8.486888e-02
20  1.284447868   4.464472e-01
21 -0.158662245   6.892470e-01

Creating the regression model between x1 and y1 −

Model1<-lm(y1~x1,data=df1)

Finding the hat values using Model1 −

hatvalues(Model1)
         1          2          3          4          5          6          7
0.99823012 0.04953700 0.04921783 0.05140777 0.04986241 0.05219527 0.05017270
         8          9         10         11         12         13         14
0.05144442 0.05001088 0.05143755 0.04946325 0.05024367 0.04850787 0.04922804
        15         16         17         18         19         20         21
0.04890439 0.05079436 0.05050904 0.04946304 0.04990002 0.04906284 0.05040753

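Finding the high leverage observations by considering hat values greater than 0.05. The result is a named vector, so the observation names are printed above their positions. Only observation 1 (hat value 0.998) stands far apart from the rest; the other flagged observations barely exceed the cutoff −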

which(hatvalues(Model1)>0.05)
1 4 6 7 8 9 10 12 16 17 21
1 4 6 7 8 9 10 12 16 17 21
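
The 0.05 cutoff is one choice among many. A minimal sketch of a common rule of thumb, flagging hat values above twice the mean leverage p/n, is given below. The seed, the multiplier 2, and the names x, y, fit, h and cutoff are assumptions made for illustration, so the data will not match the output printed in this example.

```r
set.seed(1)               # hypothetical seed so the sketch is repeatable
x <- c(100, rnorm(20))    # one extreme point plus 20 standard normal draws
y <- c(100, rnorm(20))
fit <- lm(y ~ x)

h <- hatvalues(fit)
p <- length(coef(fit))    # number of estimated coefficients (2 here)
n <- length(h)            # number of observations (21 here)

# Rule-of-thumb cutoff: twice the mean leverage p/n
cutoff <- 2 * p / n

# Observations whose leverage exceeds the cutoff
which(h > cutoff)
```

With data built this way, only the first observation should exceed the cutoff, because the extreme point at 100 dominates the spread of x.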

Example 2


# Random Poisson draws; the values below will differ on each run unless a seed is set
x2<-rpois(20,5)
y2<-rpois(20,1)
df2<-data.frame(x2,y2)
df2

Output

   x2 y2
1  3  2
2  4  1
3  7  0
4  6  0
5  6  0
6  6  1
7  2  3
8  7  1
9  3  2
10 5  0
11 6  1
12 6  0
13 3  0
14 5  1
15 6  0
16 5  0
17 9  1
18 3  0
19 5  0
20 3  4

Creating the regression model between x2 and y2 −

Model2<-lm(y2~x2,data=df2)

Finding the hat values using Model2 −

hatvalues(Model2)
         1          2          3          4          5          6          7
0.11666667 0.06666667 0.11666667 0.06666667 0.06666667 0.06666667 0.20000000
         8          9         10         11         12         13         14
0.11666667 0.11666667 0.05000000 0.06666667 0.06666667 0.11666667 0.05000000
        15         16         17         18         19         20
0.06666667 0.05000000 0.31666667 0.11666667 0.05000000 0.11666667
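
For a model with a single predictor, the hat values have a closed form: h_i = 1/n + (x_i - mean(x))^2 / sum((x_j - mean(x))^2). The sketch below re-enters the x2 and y2 values printed in the Output of this example and checks that formula against hatvalues. The names n and closed_form are assumptions made for illustration.

```r
# Values re-entered from the printed data frame of this example
x2 <- c(3, 4, 7, 6, 6, 6, 2, 7, 3, 5, 6, 6, 3, 5, 6, 5, 9, 3, 5, 3)
y2 <- c(2, 1, 0, 0, 0, 1, 3, 1, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 4)
Model2 <- lm(y2 ~ x2)

n <- length(x2)

# Leverage in simple regression: 1/n plus the squared distance of x from its
# mean, scaled by the total squared deviation of x
closed_form <- 1 / n + (x2 - mean(x2))^2 / sum((x2 - mean(x2))^2)

# Should agree with hatvalues() up to floating-point error
all.equal(unname(hatvalues(Model2)), closed_form)

# Observations 7, 10 and 17 have x2 = 2, 5 and 9
round(closed_form[c(7, 10, 17)], 4)
```

The formula depends only on x2, which is why rows sharing the same x2 value share the same hat value, and why rows at the mean of x2 (here 5) sit at the minimum of 1/n = 0.05.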

Finding the high leverage observations by considering hat values greater than 0.08 −

which(hatvalues(Model2)>0.08)
1 3 7 8 9 13 17 18 20
1 3 7 8 9 13 17 18 20
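
which returns only the positions of the flagged observations. As a possible extension, the sketch below stores the hat values next to the data and lists the flagged rows from highest to lowest leverage. It re-enters the x2 and y2 values printed in the Output of this example; the column name leverage and the object high are hypothetical.

```r
# Values re-entered from the printed data frame of this example
x2 <- c(3, 4, 7, 6, 6, 6, 2, 7, 3, 5, 6, 6, 3, 5, 6, 5, 9, 3, 5, 3)
y2 <- c(2, 1, 0, 0, 0, 1, 3, 1, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 4)
df2 <- data.frame(x2, y2)
Model2 <- lm(y2 ~ x2, data = df2)

# Keep each observation's hat value alongside its data
df2$leverage <- hatvalues(Model2)

# Rows above the 0.08 cutoff, ordered from highest to lowest leverage
high <- df2[df2$leverage > 0.08, ]
high <- high[order(high$leverage, decreasing = TRUE), ]
high
```

Keeping the hat values in the data frame makes it easier to see which predictor values drive the leverage, here the rows with x2 furthest from its mean.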
raja
Updated on 06-Mar-2021 12:08:26
