Why do we get warning 'newdata' had 1 row but variables found have X rows while predicting a linear model in R?
This warning appears when the data frame passed to newdata does not contain a column with the same name as the explanatory (independent) variable used in the model. Because predict() cannot find that variable in newdata, it falls back to the variable of the same name in the environment where the model formula was created, here the original x vector. As a result, it returns predicted values for every observation in the sample instead of a single prediction. Naming the column in newdata after the explanatory variable tells the model which variable the supplied value belongs to.
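One way to see where the extra rows come from is to fit a model whose predictor exists only as a data frame column. The sketch below uses hypothetical data (the names d, fit, res and ok are illustrative) and assumes no object named x exists in the workspace. In that situation predict() has nothing to fall back on, so the unnamed newdata produces an error rather than a warning.

```r
# Hypothetical data: x and y exist only as columns of d, not as free variables
set.seed(1)
d <- data.frame(x = rnorm(10), y = rpois(10, 5))
fit <- lm(y ~ x, data = d)

# Unnamed column: newdata has no column called x, and (by assumption) there is
# no x in the workspace, so predict() stops with an error we capture as text
res <- tryCatch(
  predict(fit, newdata = data.frame(mean(d$x))),
  error = function(e) conditionMessage(e)
)
res

# Named column: predict() finds x in newdata and returns one prediction
ok <- predict(fit, newdata = data.frame(x = mean(d$x)))
length(ok)
```

When a vector named x does exist in the workspace, the same unnamed call silently uses it, which is what produces the warning discussed here.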
Example
Consider the data frame below −
set.seed(123)
x<-rnorm(20,0.05)
y<-rpois(20,5)
df<-data.frame(x,y)
df
            x y
1  -0.5104756 3
2  -0.1801775 4
3   1.6087083 4
4   0.1205084 4
5   0.1792877 3
6   1.7650650 3
7   0.5109162 3
8  -1.2150612 5
9  -0.6368529 4
10 -0.3956620 7
11  1.2740818 2
12  0.4098138 5
13  0.4507715 7
14  0.1606827 2
15 -0.5058411 5
16  1.8369131 3
17  0.5478505 3
18 -1.9166172 6
19  0.7513559 8
20 -0.4227914 4
Creating the linear model −
M<-lm(y~x,data=df)
Now let’s predict the value of y for the mean of x −
predict(M,newdata=data.frame(mean(df$x)),interval="confidence")
        fit      lwr      upr
1  4.645695 3.690676 5.600715
2  4.459543 3.635161 5.283925
3  3.451347 2.071115 4.831579
4  4.290080 3.520452 5.059707
5  4.256952 3.489416 5.024489
6  3.363226 1.876124 4.850329
7  4.070050 3.260221 4.879880
8  5.042792 3.669549 6.416034
9  4.716920 3.697691 5.736149
10 4.580988 3.678189 5.483786
11 3.639939 2.475080 4.804798
12 4.127031 3.339496 4.914565
13 4.103947 3.308320 4.899575
14 4.267438 3.499558 5.035318
15 4.643083 3.690292 5.595875
16 3.322734 1.785518 4.859949
17 4.049235 3.229372 4.869097
18 5.438181 3.566862 7.309500
19 3.934541 3.043288 4.825795
20 4.596277 3.681723 5.510832
Warning message −
'newdata' had 1 row but variables found have 20 rows
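To see why the model cannot match the supplied value to x, it can help to compare the column name that data.frame() generates for an unnamed argument with the predictor names the model expects. The following is a minimal sketch that rebuilds the example data; the object names unnamed and predictors are illustrative, and extracting predictors with all.vars() is one possible approach rather than the only one.

```r
set.seed(123)
x <- rnorm(20, 0.05)
y <- rpois(20, 5)
df <- data.frame(x, y)
M <- lm(y ~ x, data = df)

# Column name generated automatically when none is supplied
unnamed <- data.frame(mean(df$x))
names(unnamed)

# Predictor names the model expects (response removed)
predictors <- all.vars(delete.response(terms(M)))
predictors

# FALSE here means predict() will not find the predictor in newdata
all(predictors %in% names(unnamed))
```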
To get rid of this warning, name the column in newdata after the x variable, as shown below −
predict(M,newdata=data.frame(x=mean(df$x)),interval="confidence")
   fit      lwr      upr
1 4.25 3.482529 5.017471
The same warning appears when we try to predict y for a fixed value of x without naming the column −
predict(M,newdata=data.frame(1.2),interval="confidence")
        fit      lwr      upr
1  4.645695 3.690676 5.600715
2  4.459543 3.635161 5.283925
3  3.451347 2.071115 4.831579
4  4.290080 3.520452 5.059707
5  4.256952 3.489416 5.024489
6  3.363226 1.876124 4.850329
7  4.070050 3.260221 4.879880
8  5.042792 3.669549 6.416034
9  4.716920 3.697691 5.736149
10 4.580988 3.678189 5.483786
11 3.639939 2.475080 4.804798
12 4.127031 3.339496 4.914565
13 4.103947 3.308320 4.899575
14 4.267438 3.499558 5.035318
15 4.643083 3.690292 5.595875
16 3.322734 1.785518 4.859949
17 4.049235 3.229372 4.869097
18 5.438181 3.566862 7.309500
19 3.934541 3.043288 4.825795
20 4.596277 3.681723 5.510832
Warning message −
'newdata' had 1 row but variables found have 20 rows
Naming the column x removes the warning and returns a single prediction −
predict(M,newdata=data.frame(x=1.2),interval="confidence")
       fit     lwr      upr
1 3.681691 2.56125 4.802131
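Since a misnamed column only produces a warning, it can be useful to check newdata before predicting. The sketch below defines a hypothetical helper, predict_checked(), which is not part of base R, and uses it to predict y for several x values at once. It assumes every variable on the right-hand side of the formula should be a column of newdata, which may not hold for formulas that refer to constants stored outside the data.

```r
set.seed(123)
x <- rnorm(20, 0.05)
y <- rpois(20, 5)
df <- data.frame(x, y)
M <- lm(y ~ x, data = df)

# Hypothetical helper: refuse to predict when newdata lacks a predictor column
predict_checked <- function(model, newdata, ...) {
  needed <- all.vars(delete.response(terms(model)))
  missing_cols <- setdiff(needed, names(newdata))
  if (length(missing_cols) > 0) {
    stop("newdata is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  predict(model, newdata = newdata, ...)
}

# One row of output per requested x value
pred <- predict_checked(M, data.frame(x = c(0, 1.2, 2)), interval = "confidence")
pred
```

Calling predict_checked(M, data.frame(1.2)) stops with an error that names the missing column, rather than returning 20 rows with a warning.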