How to Perform a Chi-Square Goodness of Fit Test in Python

Data scientists often use hypothesis tests to draw conclusions from datasets. The Chi-Square Goodness of Fit Test checks whether observed categorical data follow an expected distribution. It helps determine whether the sample differs significantly from what we would expect under a specific hypothesis.

Understanding Chi-Square Goodness of Fit Test

The Chi-Square Goodness of Fit test compares observed frequencies with expected frequencies to determine if there is a significant difference. This test makes several important assumptions:

  • Observations are independent of one another

  • Only one categorical feature is present

  • Each category has an expected frequency of at least 5

  • Dataset is randomly sampled

  • Categories are mutually exclusive
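
The expected-count assumption is easy to check before running the test. The following is a minimal sketch: the helper name `check_expected_counts` and the sample counts are hypothetical, and the threshold of 5 is the usual rule of thumb rather than a hard limit.

```python
def check_expected_counts(expected, minimum=5):
    """Return the indices of categories whose expected count is below `minimum`."""
    return [i for i, e in enumerate(expected) if e < minimum]


# Hypothetical expected counts for four categories
expected = [12.0, 8.5, 4.2, 15.3]

too_small = check_expected_counts(expected)
if too_small:
    print(f"Categories with expected count below 5: {too_small}")
else:
    print("All expected counts are at least 5")
```

If a category is flagged, common remedies are to merge it with a neighboring category or to collect more data.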

Chi-Square Test Statistic Formula

The test statistic is calculated using the following formula:

χ² = Σ (O - E)² / E, summed over all categories

Where:

  • χ² denotes the chi-square statistic

  • O represents observed values

  • E represents expected values
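
As a small worked example of the formula, consider a hypothetical coin flipped 100 times that lands heads 60 times. Under a fair-coin hypothesis the expected counts are 50 and 50, so the statistic is (60 - 50)²/50 + (40 - 50)²/50 = 4. The sketch below reproduces that arithmetic in plain Python; the data are made up for illustration.

```python
# Hypothetical data: 100 coin flips, 60 heads and 40 tails
observed = [60, 40]
expected = [50, 50]  # fair coin: half of the 100 flips per side

# Each category contributes (O - E)^2 / E to the statistic
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

print(contributions)  # [2.0, 2.0]
print(chi_square)     # 4.0
```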

Hypothesis Testing Steps

The Chi-Square test follows these standard hypothesis testing steps:

  • Step 1: Define null hypothesis (H₀) and alternative hypothesis (H₁)

  • Step 2: Set significance level (typically α = 0.05)

  • Step 3: Calculate the test statistic

  • Step 4: Compare with critical value or use p-value to make decision
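
The four steps can be sketched as a small function. This is an illustrative outline, not a definitive implementation: the function name `goodness_of_fit_decision` and the sample counts are hypothetical. It uses `scipy.stats.chi2.ppf` for the critical value and `scipy.stats.chi2.sf` for the p-value, and shows that both decision rules in Step 4 lead to the same conclusion.

```python
from scipy import stats


def goodness_of_fit_decision(observed, expected, alpha=0.05):
    """Walk through the four steps and return the intermediate values."""
    # Step 1: H0 says the data follow the expected distribution; H1 says they do not
    # Step 2: the significance level is the `alpha` argument
    # Step 3: compute the test statistic, the sum of (O - E)^2 / E
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1

    # Step 4a: compare the statistic with the right-tail critical value
    critical_value = stats.chi2.ppf(1 - alpha, df)
    reject_by_critical = bool(statistic > critical_value)

    # Step 4b: compare the right-tail p-value with alpha
    p_value = stats.chi2.sf(statistic, df)
    reject_by_p = bool(p_value < alpha)

    return statistic, critical_value, p_value, reject_by_critical, reject_by_p


# Hypothetical counts for four equally likely categories (100 observations)
result = goodness_of_fit_decision([30, 20, 25, 25], [25, 25, 25, 25])
statistic, critical_value, p_value, by_critical, by_p = result
print(f"statistic={statistic:.4f}, critical value={critical_value:.4f}, p-value={p_value:.4f}")
print(f"Reject H0 (critical value rule): {by_critical}")
print(f"Reject H0 (p-value rule): {by_p}")
```

The two rules agree because the p-value falls below alpha exactly when the statistic exceeds the critical value.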

Method 1: Using scipy.stats.chisquare()

SciPy's scipy.stats.chisquare() function provides a convenient way to perform the test:

import scipy.stats as stats

# Observed frequencies (e.g., 100 die rolls)
observed = [18, 22, 16, 25, 12, 7]

# Expected frequencies for a fair die: equal counts that sum to the observed total
# (a rounded value such as 16.67 makes the totals differ, which SciPy rejects)
expected = [sum(observed) / len(observed)] * len(observed)

# Perform Chi-Square test
chi_square_stat, p_value = stats.chisquare(observed, expected)

print(f"Chi-square statistic: {chi_square_stat:.4f}")
print(f"P-value: {p_value:.4f}")

# Critical value for α = 0.05, df = 5
critical_value = stats.chi2.ppf(1 - 0.05, df=5)
print(f"Critical value: {critical_value:.4f}")

# Decision
if p_value < 0.05:
    print("Reject null hypothesis: Data does not follow expected distribution")
else:
    print("Fail to reject null hypothesis: No significant deviation from expected distribution")

Chi-square statistic: 12.9200
P-value: 0.0241
Critical value: 11.0705
Reject null hypothesis: Data does not follow expected distribution
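
The expected distribution does not have to be uniform. When the hypothesis is stated as proportions, one approach is to scale them by the total number of observations so that the observed and expected totals agree, which `scipy.stats.chisquare` expects. The sketch below uses hypothetical survey counts and proportions. When `f_exp` is omitted, the function assumes all categories are equally likely.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 200 customers choosing among three plans
observed = np.array([90, 70, 40])

# Hypothesized proportions (assumed for illustration); they must sum to 1
proportions = np.array([0.5, 0.3, 0.2])

# Scale proportions by the sample size so both totals are equal
expected = proportions * observed.sum()

stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"Chi-square statistic: {stat:.4f}")
print(f"P-value: {p_value:.4f}")

# Omitting f_exp tests against equal expected counts in every category
stat_uniform, p_uniform = stats.chisquare(observed)
print(f"Statistic against a uniform distribution: {stat_uniform:.4f}")
```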

Method 2: Manual Calculation Using Formula

Implementing the chi-square formula manually helps clarify the underlying calculation:

import scipy.stats as stats

# Same data as above
observed = [18, 22, 16, 25, 12, 7]
expected = [sum(observed) / len(observed)] * len(observed)

# Manual calculation: sum of (O - E)^2 / E over all categories
chi_square_manual = 0
for o, e in zip(observed, expected):
    chi_square_manual += (o - e) ** 2 / e

print(f"Manual chi-square statistic: {chi_square_manual:.4f}")

# Compare with built-in function
chi_square_builtin, _ = stats.chisquare(observed, expected)
print(f"Built-in chi-square statistic: {chi_square_builtin:.4f}")
print(f"Results match: {abs(chi_square_manual - chi_square_builtin) < 0.0001}")

Manual chi-square statistic: 12.9200
Built-in chi-square statistic: 12.9200
Results match: True
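
The same calculation can be written without an explicit loop, and the p-value can be obtained from the chi-square distribution directly. This is a sketch of one possible vectorized version. It takes the p-value as the right-tail probability from `scipy.stats.chi2.sf`, with degrees of freedom equal to the number of categories minus 1.

```python
import numpy as np
from scipy import stats

observed = np.array([18, 22, 16, 25, 12, 7])
# Uniform expected counts with the same total as the observed counts
expected = np.full(observed.size, observed.sum() / observed.size)

# Element-wise (O - E)^2 / E, then summed over categories
chi_square_vec = np.sum((observed - expected) ** 2 / expected)

# Right-tail probability with k - 1 degrees of freedom
df = observed.size - 1
p_value_vec = stats.chi2.sf(chi_square_vec, df)

builtin_stat, builtin_p = stats.chisquare(observed, expected)
print(f"Vectorized statistic: {chi_square_vec:.4f}")
print(f"Statistics match: {np.isclose(chi_square_vec, builtin_stat)}")
print(f"P-values match: {np.isclose(p_value_vec, builtin_p)}")
```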

Practical Example: Testing Dice Fairness

Let's test whether a die is fair by comparing observed rolls with the expected uniform distribution:

import scipy.stats as stats
import numpy as np

# Dice roll results (100 rolls)
dice_results = [1, 3, 2, 6, 4, 1, 5, 2, 3, 4, 6, 1, 2, 5, 3, 6, 4, 1, 5, 2,
                3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6,
                4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1,
                5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2,
                3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6, 4, 1, 5, 2, 3, 6]

# Count frequencies for each face
observed_freq = [dice_results.count(i) for i in range(1, 7)]
expected_freq = [100/6] * 6  # Equal probability for fair dice

print("Observed frequencies:", observed_freq)
print("Expected frequencies:", [f"{x:.2f}" for x in expected_freq])

# Perform test
chi_stat, p_val = stats.chisquare(observed_freq, expected_freq)

print(f"\nChi-square statistic: {chi_stat:.4f}")
print(f"P-value: {p_val:.4f}")

if p_val < 0.05:
    print("Conclusion: The die appears to be biased (p < 0.05)")
else:
    print("Conclusion: The results are consistent with a fair die (p ≥ 0.05)")

Observed frequencies: [17, 17, 17, 16, 16, 17]
Expected frequencies: ['16.67', '16.67', '16.67', '16.67', '16.67', '16.67']

Chi-square statistic: 0.0800
P-value: 0.9999
Conclusion: The results are consistent with a fair die (p ≥ 0.05)
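
For larger datasets, counting with `list.count` once per face rescans the list each time. One alternative, sketched here with a short hypothetical sequence of rolls, is `numpy.bincount`, which tallies all faces in one pass. Passing `minlength=7` keeps a slot for every face from 1 to 6 even when a face never appears, so a zero count is not silently dropped.

```python
import numpy as np
from scipy import stats

# Hypothetical rolls; face 6 never appears in this short sample
rolls = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 1, 2])

# bincount indexes by value, so index 0 is unused for die faces 1..6
counts = np.bincount(rolls, minlength=7)[1:]
print("Observed frequencies:", counts.tolist())  # [2, 3, 3, 2, 2, 0]

# Expected counts under a fair die, matching the observed total
expected = np.full(6, rolls.size / 6)

# Note: with only 12 rolls each expected count is 2, below the usual minimum of 5,
# so this sample is too small for a reliable test; it only illustrates the counting
stat, p_value = stats.chisquare(counts, expected)
print(f"Chi-square statistic: {stat:.4f}")
```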

Key Considerations

  • Sample size: Each expected frequency should be at least 5

  • Degrees of freedom: Number of categories minus 1

  • Significance level: Commonly set at 0.05 (5%)

  • Right-tailed test: Only large values of the statistic count as evidence against the null hypothesis
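
When parameters of the expected distribution are estimated from the same data, the degrees of freedom drop by one for each estimated parameter. `scipy.stats.chisquare` exposes this through its `ddof` argument, which computes the p-value with k - 1 - ddof degrees of freedom for k categories. The sketch below uses hypothetical counts and assumes, purely for illustration, that one parameter was estimated.

```python
from scipy import stats

# Hypothetical observed and expected counts for five categories (same totals)
observed = [22, 31, 25, 14, 8]
expected = [20, 30, 25, 15, 10]

k = len(observed)

# Default: k - 1 = 4 degrees of freedom
stat, p_default = stats.chisquare(observed, expected)

# One parameter assumed to be estimated from the data: k - 1 - 1 = 3 degrees of freedom
_, p_adjusted = stats.chisquare(observed, expected, ddof=1)

# The adjustment changes only the p-value, not the statistic
print(f"Statistic: {stat:.4f}")
print(f"P-value with df={k - 1}: {p_default:.4f}")
print(f"P-value with df={k - 2}: {p_adjusted:.4f}")
```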

Conclusion

The Chi-Square Goodness of Fit test is essential for validating whether observed data follows an expected distribution. Use scipy.stats.chisquare() for convenience or implement manually for deeper understanding. The test helps make data-driven decisions about distributional assumptions in statistical analysis.

Updated on: 2026-03-27T06:02:56+05:30
