How to perform group-wise linear regression for a data frame in R?
Group-wise linear regression means fitting a separate regression model for each level of a grouping variable. For example, if we have a dependent variable y, an independent variable x, and a grouping variable G that divides the (x, y) observations into multiple groups, then we can fit a linear regression model for each group. In R, one convenient way to do this is to convert the data frame to a data.table object and fit the model with the by argument set to the grouping column.
Example
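Consider the following data frame. The values are drawn at random without a seed, so your output will differ from what is shown here: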
G1 <- sample(LETTERS[1:4], 20, replace=TRUE)
x1 <- rnorm(20, 2, 0.96)
y1 <- rnorm(20, 5, 1)
df1 <- data.frame(G1, x1, y1)
df1
Output
   G1         x1       y1
1   C  1.2692290 3.994126
2   C  1.6317682 4.474443
3   D  1.3686734 5.444823
4   D  2.4969567 5.818360
5   C  2.3882221 3.766412
6   A  2.7568873 5.506297
7   A  2.1352764 4.548771
8   B  2.5232049 5.378314
9   A  2.8695959 4.735447
10  C -0.2317400 5.280478
11  A  1.1473469 5.064822
12  A  2.9099241 4.090654
13  A  2.4095434 6.538454
14  C  2.5310162 7.137598
15  A  2.4097431 4.778472
16  C  0.4945313 5.511772
17  C  1.3427334 5.030479
18  A  1.5200120 6.758618
19  A  2.4414779 5.854175
20  B -0.6968409 4.594522
Loading the data.table package and converting the data frame df1 to a data.table object:
library(data.table)
df1 <- data.table(df1)
Fitting a linear regression model of y1 on x1 for each group defined in column G1 and returning the coefficients:
df1[,as.list(coef(lm(y1 ~ x1))), by=G1]
Output
   G1 (Intercept)          x1
1:  C    4.959098  0.05109642
2:  D    4.991700  0.33106700
3:  A    6.536957 -0.53189331
4:  B    4.764140  0.24341026
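In the data above, groups B and D contain only two rows each, so their fitted lines pass exactly through both points and say little about any underlying relationship. A minimal sketch of reporting the group size next to the coefficients follows. It assumes the data.table package is installed and uses a hypothetical table dt with columns G, x and y (not the article's df1), with group sizes chosen to mirror the example.

```r
library(data.table)

# Hypothetical data (not the article's df1): group sizes 9, 2, 7 and 2.
set.seed(1)
dt <- data.table(
  G = rep(c("A", "B", "C", "D"), times = c(9, 2, 7, 2)),
  x = rnorm(20, 2, 0.96),
  y = rnorm(20, 5, 1)
)

# For each group, return the row count (.N) followed by the fitted
# intercept and slope of y ~ x.
res <- dt[, c(list(n = .N), as.list(coef(lm(y ~ x)))), by = G]
res
```

Groups with a small n can then be filtered out or flagged before the coefficients are interpreted.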
Let's have a look at another example:
Class <- sample(c("I","II","III"), 20, replace=TRUE)
Ratings <- sample(1:10, 20, replace=TRUE)
Salary <- sample(20000:50000, 20)
df2 <- data.frame(Class, Ratings, Salary)
df2
Output
   Class Ratings Salary
1      I       4  28423
2    III       1  34728
3     II       1  26975
4      I       9  26777
5     II       6  29501
6      I       8  33061
7     II       4  43584
8      I       4  42525
9     II       9  30526
10     I       1  32872
11     I       7  21198
12     I       3  20971
13   III       9  49071
14     I       1  40314
15   III       1  36269
16     I       6  45482
17    II       1  48595
18     I       8  44054
19     I       1  25294
20   III      10  34944
Converting the data frame df2 to a data.table object:
df2 <- data.table(df2)
Fitting a regression model of Salary on Ratings for each of the three classes:
df2[,as.list(coef(lm(Salary~Ratings))),by=Class]
Output
   Class (Intercept)    Ratings
1:     I    31894.13   194.9152
2:   III    35270.10   663.4089
3:    II    40405.42 -1087.9103
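The same per-group coefficients can also be obtained without data.table, which is a handy way to cross-check the result. The sketch below uses only base R: split() breaks the data frame into one piece per group and lapply() fits the model on each piece. The data frame df is hypothetical (not the article's df2), with fixed group sizes and a seed so that it is reproducible.

```r
# Hypothetical data (not the article's df2): group sizes 8, 7 and 5.
set.seed(42)
df <- data.frame(
  Class   = rep(c("I", "II", "III"), times = c(8, 7, 5)),
  Ratings = rep(1:10, times = 2),
  Salary  = sample(20000:50000, 20)
)

# Split the data frame by Class and fit Salary ~ Ratings within each piece.
fits <- lapply(split(df, df$Class),
               function(d) coef(lm(Salary ~ Ratings, data = d)))

# Stack the coefficient vectors into a matrix with one row per class.
coefs <- do.call(rbind, fits)
coefs
```

Here the rows of coefs are named after the classes and the columns after the model terms, so the values should line up with what the data.table call returns for the same data.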