How to deal with error “var(x) : Calling var(x) on a factor x is defunct.” in R?


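The error “Calling var(x) on a factor x is defunct” occurs when a function that computes a variance, such as t.test, is applied to factor data. The var function no longer accepts factors, so any function that calls var internally fails with this message.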

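For example, if we have a factor column in a data frame, then applying such numerical functions to that column results in the above error. To deal with this problem, convert the factor to numeric before passing it to the function, as shown in the examples below. Keep in mind that as.numeric applied directly to a factor returns its internal integer level codes, whereas as.numeric(as.character(x)) recovers the original values.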
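The difference between the two conversions can be checked on a small factor. The following is a minimal sketch; the factor f is made up for illustration and is not part of the examples in this article.

```r
# A small factor whose labels are numbers (illustrative values)
f <- factor(c(7, 3, 7, 4))

# Levels are sorted in increasing order: "3" "4" "7"
levels(f)

# as.numeric() on the factor returns the level codes, not the labels
codes <- as.numeric(f)
codes   # 3 1 3 2

# Converting through character recovers the original values
vals <- as.numeric(as.character(f))
vals    # 7 3 7 4
```

Here 7 is the third level, so its code is 3. Any statistic computed on the codes describes the positions of the levels, not the data.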

Example 1

Following snippet creates a sample data frame −

x<-factor(rpois(20,5))
df<-data.frame(x)
df

Output

The following dataframe is created −

    x
1   7
2   3
3   7
4   4
5   7
6   6
7   4
8   6
9   8
10  2
11  6
12  6
13  9
14  5
15  3
16  4
17 10
18  2
19  4
20  2

Now, to apply a one-sample t-test to the data in x, add the following code to the above snippet −

t.test(df$x,mu=2)

Output

If you execute all the above given snippets as a single program, it generates the following Output −

Error in var(x) : Calling var(x) on a factor x is defunct.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
In addition: Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
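Before converting anything, it can help to confirm that the column really is a factor. The following is a minimal sketch, assuming a small illustrative data frame; the values and the variable msg are not from the example above.

```r
# A factor column similar to the one in the example (illustrative values)
df <- data.frame(x = factor(c(7, 3, 7, 4, 7, 6)))

# class() and is.factor() show that the column is not numeric
class(df$x)        # "factor"
is.factor(df$x)    # TRUE
is.numeric(df$x)   # FALSE

# tryCatch() captures the error message instead of stopping the script;
# suppressWarnings() hides the accompanying warning from mean()
msg <- tryCatch(
  {
    suppressWarnings(t.test(df$x, mu = 2))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If is.factor returns TRUE, the column has to be converted before it is passed to t.test.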

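Now, use the as.numeric function on x while applying the t.test function by adding the following code to the above snippet. Note that as.numeric applied directly to a factor returns the internal level codes (1 for the smallest level, 2 for the next, and so on), not the original values. The levels here start at 2, so every code is one less than the value it stands for, and the mean of 4.25 reported below is the mean of the codes, while the mean of the values printed in the data frame is 5.25. To test the original values, use as.numeric(as.character(df$x)) in place of as.numeric(df$x) −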

t.test(as.numeric(df$x),mu=2)

Output

If you execute all the above given snippets as a single program, it generates the following Output for the one sample t-test −

data: as.numeric(df$x)
t = 4.3061, df = 19, p-value = 0.0003811
alternative hypothesis: true mean is not equal to 2
95 percent confidence interval:
   3.156355 5.343645
sample estimates:
mean of x
   4.25
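To run the test on the original counts rather than on the level codes, the factor can be converted through as.character first. The following is a sketch that hard-codes the values printed in the Example 1 data frame so that it can be reproduced; the names x_num and res are introduced here for illustration.

```r
# Values copied from the data frame printed in Example 1
x <- factor(c(7, 3, 7, 4, 7, 6, 4, 6, 8, 2,
              6, 6, 9, 5, 3, 4, 10, 2, 4, 2))
df <- data.frame(x)

# Mean of the level codes, which is what as.numeric(df$x) produces
mean(as.numeric(df$x))                   # 4.25

# Original counts, recovered through as.character()
x_num <- as.numeric(as.character(df$x))
mean(x_num)                              # 5.25

# One-sample t-test on the original counts
res <- t.test(x_num, mu = 2)
res
```

Because the codes differ from the values by a constant shift in this data set, the standard error is the same in both cases, but the mean, the t statistic, the p-value and the confidence interval all change.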

Example 2

Following snippet creates a sample data frame −

y<-factor(rpois(20,2))
dat<-data.frame(y)
dat

The following dataframe is created −

    y
1   1
2   2
3   4
4   4
5   2
6   0
7   0
8   1
9   3
10  0
11  2
12  0
13  0
14  2
15  2
16  2
17  2
18  0
19  4
20  2

Now, to apply a one-sample t-test to the data in y, add the following code to the above snippet −

t.test(dat$y,mu=0)

Output

If you execute all the above given snippets as a single program, it generates the following output −

Error in var(x) : Calling var(x) on a factor x is defunct.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
In addition: Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA

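Now, use the as.numeric function on y while applying the t.test function by adding the following code to the above snippet. As in Example 1, as.numeric applied directly to the factor returns the level codes. The levels here start at 0, so every code is one more than the value it stands for, and the mean of 2.65 reported below is the mean of the codes, while the mean of the values printed in the data frame is 1.65. To test the original values, use as.numeric(as.character(dat$y)) in place of as.numeric(dat$y) −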

t.test(as.numeric(dat$y),mu=0)

Output

If you execute all the above given snippets as a single program, it generates the following Output for the one sample t-test −

data: as.numeric(dat$y)
t = 8.5446, df = 19, p-value = 6.216e-08
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
   2.000878 3.299122
sample estimates:
mean of x
   2.65
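An alternative way to recover the original values is to index the numeric levels by the factor itself. The following is a sketch that hard-codes the values printed in the Example 2 data frame; the column name y_num and the variable res are introduced here for illustration.

```r
# Values copied from the data frame printed in Example 2
y <- factor(c(1, 2, 4, 4, 2, 0, 0, 1, 3, 0,
              2, 0, 0, 2, 2, 2, 2, 0, 4, 2))
dat <- data.frame(y)

# Convert the level labels once, then index them by the factor codes
dat$y_num <- as.numeric(levels(dat$y))[dat$y]

# Mean of the original counts
mean(dat$y_num)   # 1.65

# One-sample t-test on the original counts
res <- t.test(dat$y_num, mu = 0)
res$estimate
```

This form converts each distinct label only once, which can matter for long vectors with few levels. It gives the same result as as.numeric(as.character(dat$y)).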
raja
Published on 02-Nov-2021 06:42:00