How to Perform an F-Test in Python

Statisticians use the F-test to check whether two datasets have the same variance. The F-test is named after Sir Ronald Fisher. To perform it, we state two hypotheses: a null hypothesis and an alternate hypothesis. We then accept one of the two based on the F-test results.

Variance is a measure of dispersion that quantifies how far data points deviate from the mean. Higher values indicate greater dispersion than smaller values.
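As a quick illustration, NumPy's `var` function computes both the population and the sample variance (the `data` values below are made up for the demo):

```python
import numpy as np

# Made-up demo data
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: mean squared deviation from the mean
pop_var = np.var(data)

# Sample variance divides by n - 1 instead of n (ddof=1)
sample_var = np.var(data, ddof=1)

print(pop_var)     # 4.0
print(sample_var)
```

Note that `ddof=1` (the sample variance) is what the F-test examples later in this article use.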

In this article, you will learn how to perform an F-Test in Python programming language with its use cases.

F-Test Process

The process to perform the F-Test is as follows −

  • To begin with, define the null and alternate hypotheses.

    • Null Hypothesis or H₀: σ₁² = σ₂² (the variances of the populations are equal)

    • Alternate Hypothesis or H₁: σ₁² ≠ σ₂² (the variances of the populations are unequal)

  • Choose the statistic for testing.

  • Calculate the degrees of freedom for the samples. For instance, if m and n are the sample sizes, the degrees of freedom are df1 = m − 1 and df2 = n − 1 respectively.

  • Now find the F value from the F-table.

  • At last, divide the value of alpha by 2 for two-tailed tests to calculate the critical value.

We then look up the F value using the two degrees of freedom: df1 is read along the first row of the F-table, while df2 is read down the first column.

There are separate F-tables for different significance levels and degrees of freedom. We compare the F statistic from step 2 with the critical value from step 4: if the F statistic is greater than the critical value, we reject the null hypothesis; otherwise, we accept (fail to reject) the null hypothesis at the chosen significance level.
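Instead of looking up a printed F-table, the critical value can be computed with SciPy's percent-point function (the inverse CDF); the degrees of freedom and alpha below are illustrative:

```python
from scipy.stats import f

# Illustrative degrees of freedom and significance level
df1, df2 = 4, 4
alpha = 0.05

# Right-tailed critical value (what an F-table lookup gives)
critical_right = f.ppf(1 - alpha, df1, df2)

# Two-tailed test: split alpha between the tails, so use alpha / 2
critical_two_tailed = f.ppf(1 - alpha / 2, df1, df2)

print(f"Right-tailed critical value: {critical_right:.2f}")
print(f"Two-tailed upper critical value: {critical_two_tailed:.2f}")
```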

[Figure: F-Test decision process — compute the F-statistic F = σ₁²/σ₂² (larger variance in the numerator), find the critical value from the F-table using df1, df2, and α, then reject H₀ if the F-statistic exceeds the critical value; otherwise accept H₀.]

Assumptions

We make some assumptions before performing the F-Test based on the dataset −

  • The data populations follow the normal distribution (i.e., they fit the bell curve).

  • Samples are independent of each other (i.e., no correlation between samples).

Apart from these assumptions, we should also consider the following key points while performing the F-Test −

  • The larger variance should be placed in the numerator to perform a right-tailed test.

  • Determine the critical value after dividing alpha by 2 in the case of the two-tailed test.

  • Check whether you are given variances or standard deviations; standard deviations must be squared to obtain variances.

  • If your exact degrees of freedom are not listed in the F-table, use the larger of the adjacent critical values.
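As a sketch of the first point, a small helper function (hypothetical, not part of SciPy) can place the larger sample variance in the numerator automatically:

```python
import numpy as np

def f_statistic(sample_a, sample_b):
    """Return the F-statistic with the larger sample variance in the
    numerator, plus the matching degrees of freedom.
    (A hypothetical helper for illustration.)"""
    var_a = np.var(sample_a, ddof=1)
    var_b = np.var(sample_b, ddof=1)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Made-up samples: the second has the larger variance, so it is swapped
# into the numerator and the F-statistic is always >= 1
f_stat, df1, df2 = f_statistic([0.2, 0.23, 0.26], [0.28, 0.2, 0.5])
print(f_stat >= 1)  # True
```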

F-Test in Python

Syntax

scipy.stats.f()

Parameters

  • x: quantiles

  • q: lower or upper tail probability

  • dfn, dfd: shape parameters (degrees of freedom)

  • loc: location parameter

  • scale: scale parameter (default=1)

  • size: random variate shape

  • moments: ['mvsk'] letters, specifying which moments to compute
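A minimal sketch of how these parameters map onto the methods of the `scipy.stats.f` distribution object (the degrees of freedom below are arbitrary example values):

```python
from scipy.stats import f

dfn, dfd = 5, 10   # shape parameters: example degrees of freedom

density = f.pdf(1.0, dfn, dfd)      # x: quantile at which to evaluate
prob = f.cdf(2.0, dfn, dfd)         # cumulative probability up to x
quantile = f.ppf(0.95, dfn, dfd)    # q: inverse of the CDF
samples = f.rvs(dfn, dfd, size=3)   # size: random variate shape

# moments='mvsk': mean, variance, skew, kurtosis
mean, var, skew, kurt = f.stats(dfn, dfd, moments='mvsk')
print(quantile, mean)
```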

Example

Let's perform an F-test to compare the variances of two groups −

import numpy as np
import scipy.stats

# Create sample data
group1 = [0.28, 0.2, 0.26, 0.28, 0.5]
group2 = [0.2, 0.23, 0.26, 0.21, 0.23]

# Converting the lists to arrays
x = np.array(group1)
y = np.array(group2)

# Calculate the variance of each group
var1 = np.var(group1, ddof=1)
var2 = np.var(group2, ddof=1)
print(f"Variance of group1: {var1}")
print(f"Variance of group2: {var2}")

def f_test(group1, group2):
    # Calculate F-statistic (ratio of variances)
    f_stat = np.var(group1, ddof=1) / np.var(group2, ddof=1)
    
    # Degrees of freedom
    df1 = len(group1) - 1
    df2 = len(group2) - 1
    
    # Calculate one-tailed (right-tail) p-value
    p_value = 1 - scipy.stats.f.cdf(f_stat, df1, df2)
    
    return f_stat, p_value

# Perform F-test
f_statistic, p_val = f_test(group1, group2)
print(f"F-statistic: {f_statistic:.5f}")
print(f"P-value: {p_val:.6f}")

# Decision based on significance level (? = 0.05)
alpha = 0.05
if p_val < alpha:
    print("Reject null hypothesis: Variances are significantly different")
else:
    print("Accept null hypothesis: Variances are not significantly different")

Output

Variance of group1: 0.01308
Variance of group2: 0.00053
F-statistic: 24.67925
P-value: 0.019127
Reject null hypothesis: Variances are significantly different

Using scipy.stats.f_oneway for Multiple Groups

To compare the means of three or more groups using an F-statistic (a one-way ANOVA), you can use f_oneway −

import numpy as np
from scipy.stats import f_oneway

# Create three sample groups
group1 = [23, 25, 28, 30, 32]
group2 = [20, 22, 24, 26, 28]
group3 = [18, 20, 22, 24, 26]

# Perform one-way F-test
f_stat, p_value = f_oneway(group1, group2, group3)

print(f"F-statistic: {f_stat:.4f}")
print(f"P-value: {p_value:.6f}")

if p_value < 0.05:
    print("Groups have significantly different means")
else:
    print("Groups do not have significantly different means")

Output

F-statistic: 10.5000
P-value: 0.001895
Groups have significantly different means

Conclusion

The F-test is a powerful statistical tool for comparing variances between populations. Use scipy.stats.f.cdf() to calculate p-values and make decisions about variance equality based on your chosen significance level.

Updated on: 2026-03-27T06:03:33+05:30
