How to deal with glm.fit error “NA/NaN/Inf” for logistic regression model in R?
When we fit a logistic regression model with glm(), we need to specify the distribution family as binomial. glm() fits generalized linear models and its default family is gaussian, so if the family is omitted and the response is a factor, glm.fit cannot do arithmetic on the response and stops with the error “NA/NaN/Inf in 'y'”. Hence, family="binomial" needs to be passed to glm() when creating the logistic regression model.
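The default can be checked from the function definition itself. The following is a minimal sketch, assuming only base R with the stats package attached; the variable names default_family and binomial_link are illustrative and not part of the original example.

```r
# Inspect the default value of the `family` argument of glm()
default_family <- deparse(formals(glm)$family)
print(default_family)

# Inspect the link function that the binomial family uses by default
binomial_link <- binomial()$link
print(binomial_link)
```

If the first value is not "binomial", any call that omits family is not a logistic regression, even when it happens to run without an error.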
Example 1
The following snippet creates a sample data frame:
iv1<-rpois(20,5)
iv2<-rpois(20,2)
iv3<-rpois(20,5)
DV1<-sample(0:1,20,replace=TRUE)
df1<-data.frame(iv1,iv2,iv3,DV1)
df1
The following data frame is created:
   iv1 iv2 iv3 DV1
1    5   2   6   0
2    3   1   3   1
3    3   4   8   1
4    5   3   3   1
5    8   2   6   1
6    3   1   4   0
7    6   1   8   1
8    3   1   7   0
9    9   2   6   0
10   7   2   4   0
11   6   4   5   1
12  12   2   4   1
13   6   2   2   0
14   5   1   3   0
15   4   1  10   0
16   3   3   4   0
17   4   1   6   1
18   9   3   4   1
19   7   1   3   1
20   4   3   4   0
To try fitting the model for the data in df1 without specifying the family, add the following code to the above snippet. This call fails:
Model_1<-glm(factor(DV1)~iv1+iv2+iv3,data=df1)
Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, :
NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
2: In Ops.factor(eta, offset) : ‘-’ not meaningful for factors
3: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
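To see the failure without halting a script, the call can be wrapped in tryCatch(). This is a sketch under stated assumptions: the seed, the data frame df, and the column names x and y are hypothetical and chosen only to make the run repeatable.

```r
set.seed(1)  # hypothetical seed, used only so the sketch is repeatable

# Small illustrative data set with a 0/1 response
df <- data.frame(
  x = rpois(20, 5),
  y = sample(0:1, 20, replace = TRUE)
)

# Fit with a factor response and no `family` argument.
# tryCatch() returns the error message instead of stopping the script;
# suppressWarnings() hides the accompanying Ops.factor warnings.
result <- tryCatch(
  suppressWarnings(glm(factor(y) ~ x, data = df)),
  error = function(e) conditionMessage(e)
)
print(result)
```

The captured message should mention NA/NaN/Inf in 'y', which matches the error shown above.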
To create a logistic regression model for the data in df1 with the distribution family set to binomial, use the following code:
iv1<-rpois(20,5)
iv2<-rpois(20,2)
iv3<-rpois(20,5)
DV1<-sample(0:1,20,replace=TRUE)
df1<-data.frame(iv1,iv2,iv3,DV1)
Model_1<-glm(factor(DV1)~iv1+iv2+iv3,data=df1,family="binomial")
summary(Model_1)
Output
Running all of the above code as a single program produces output like the following (exact values differ between runs because the data are random and no seed is set):
Call:
glm(formula = factor(DV1) ~ iv1 + iv2 + iv3, family = "binomial", data = df1)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.61472  -1.05484  -0.07657   1.07422   1.71351

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.59874    2.15616  -1.205    0.228
iv1          0.26684    0.22055   1.210    0.226
iv2          0.38736    0.47527   0.815    0.415
iv3          0.06822    0.23316   0.293    0.770

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 27.726 on 19 degrees of freedom
Residual deviance: 25.223 on 16 degrees of freedom
AIC: 33.223

Number of Fisher Scoring iterations: 4
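Wrapping the response in factor() is not required once the family is binomial. A numeric 0/1 response gives the same fit. The sketch below illustrates this with hypothetical data; the seed, the data frame df, and the object names fit_factor, fit_numeric and same_coefs are assumptions made for the illustration.

```r
set.seed(42)  # hypothetical seed, used only so the sketch is repeatable

# Illustrative data: two count predictors and a balanced 0/1 response
df <- data.frame(
  iv1 = rpois(30, 5),
  iv2 = rpois(30, 2),
  DV  = sample(rep(0:1, each = 15))
)

# Same model, once with a factor response and once with a numeric 0/1 response
fit_factor  <- glm(factor(DV) ~ iv1 + iv2, data = df, family = "binomial")
fit_numeric <- glm(DV ~ iv1 + iv2, data = df, family = "binomial")

# Compare the estimated coefficients of the two fits
same_coefs <- isTRUE(all.equal(coef(fit_factor), coef(fit_numeric)))
print(same_coefs)
```

With a factor response, the binomial family treats the first level as failure and all other levels as success, so a factor with levels 0 and 1 matches the numeric coding.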
Example 2
The following snippet creates a sample data frame:
x1<-runif(20,2,10)
x2<-rnorm(20)
DV2<-sample(0:1,20,replace=TRUE)
df2<-data.frame(x1,x2,DV2)
df2
The following data frame is created:
         x1          x2 DV2
1  9.599662 -0.37487878   1
2  3.670901 -1.05763026   0
3  5.856532 -1.63384915   1
4  5.140322  0.70749809   1
5  7.215530 -0.45739769   0
6  2.347001  0.25501067   1
7  7.997737  0.32140975   0
8  4.880330  0.45770428   1
9  4.680856  1.36704134   1
10 3.720922  0.45992890   0
11 9.192565  0.91105622   0
12 7.699731 -0.35100775   1
13 3.183395  1.31957271   1
14 5.571414  0.82899477   0
15 6.724491  0.01077159   0
16 8.844951 -0.27490769   1
17 6.509826  0.25185960   1
18 9.098870 -1.75332078   1
19 2.230271 -0.52357984   1
20 4.004921  0.51763553   1
To try fitting the model for the data in df2 without specifying the family, add the following code to the above snippet. This call fails:
Model_2<-glm(factor(DV2)~x1+x2,data=df2)
Error in glm.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, :
NA/NaN/Inf in 'y'
In addition: Warning messages:
1: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
2: In Ops.factor(eta, offset) : ‘-’ not meaningful for factors
3: In Ops.factor(y, mu) : ‘-’ not meaningful for factors
To create a logistic regression model for the data in df2 with the distribution family set to binomial, use the following code:
x1<-runif(20,2,10)
x2<-rnorm(20)
DV2<-sample(0:1,20,replace=TRUE)
df2<-data.frame(x1,x2,DV2)
Model_2<-glm(factor(DV2)~x1+x2,data=df2,family="binomial")
summary(Model_2)
Output
Running all of the above code as a single program produces output like the following (exact values differ between runs because the data are random and no seed is set):
Call:
glm(formula = factor(DV2) ~ x1 + x2, family = "binomial", data = df2)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7809  -1.2987   0.8107   0.9623   1.0866

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.5657     1.4628   1.070    0.284
x1           -0.1536     0.2236  -0.687    0.492
x2           -0.3353     0.6104  -0.549    0.583

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 25.898 on 19 degrees of freedom
Residual deviance: 25.267 on 17 degrees of freedom
AIC: 31.267

Number of Fisher Scoring iterations: 4
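One way to confirm that a fitted object really is a logistic regression is to inspect its family and link after fitting. The sketch below assumes hypothetical data in the same shape as df2; the seed and the names df, fit, fam and probs are illustrative only.

```r
set.seed(7)  # hypothetical seed, used only so the sketch is repeatable

# Illustrative data in the same shape as the example above
df <- data.frame(
  x1 = runif(20, 2, 10),
  x2 = rnorm(20),
  DV = sample(rep(0:1, each = 10))
)

fit <- glm(factor(DV) ~ x1 + x2, data = df, family = "binomial")

# family() reports the distribution and link stored in the fitted model
fam <- family(fit)
print(fam$family)
print(fam$link)

# Fitted values on the response scale are probabilities
probs <- predict(fit, type = "response")
print(range(probs))
```

A family other than binomial here would mean the family argument was omitted or overridden somewhere in the call.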
