Python - Calculate the variance of a column in a Pandas DataFrame

To calculate the variance of a column in a Pandas DataFrame, select the column and call its var() method. Variance measures how far the values are spread around their mean, based on the squared deviation of each value from that mean.

Syntax

The basic syntax for calculating variance is:

DataFrame['column_name'].var()

Creating a DataFrame

First, import the Pandas library and create a DataFrame:

import pandas as pd

# Create DataFrame with car data
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame1:")
print(dataFrame1)

Output:

DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3    Tesla     80
4  Bentley    110
5   Jaguar     90

Calculating Variance of a Single Column

Use the var() method to find the variance of the "Units" column:

import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

# Calculate variance of Units column
variance = dataFrame1['Units'].var()
print("Variance of Units column:", variance)

Output:

Variance of Units column: 586.6666666666666
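
To see where this number comes from, the calculation can be reproduced by hand. The following is a minimal sketch, assuming the default behaviour of var() (sample variance, dividing by N-1); the variable names are only illustrative.

```python
import pandas as pd

# Same "Units" values as in the example above
units = pd.Series([100, 150, 110, 80, 110, 90], name="Units")

# Sample variance by hand: sum of squared deviations from the mean,
# divided by (N - 1)
mean = units.mean()
squared_deviations = (units - mean) ** 2
manual_variance = squared_deviations.sum() / (len(units) - 1)

print("Manual variance:", manual_variance)
print("var() result:   ", units.var())
```

Both values should agree (about 586.67), up to floating-point rounding.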

Multiple DataFrame Example

Here's a complete example calculating variance for two different DataFrames:

import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame1:")
print(dataFrame1)

# Finding Variance of "Units" column values
print("\nVariance of Units column from DataFrame1 =", dataFrame1['Units'].var())

# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Product": ['TV', 'PenDrive', 'HeadPhone', 'EarPhone', 'HDD', 'SSD'],
    "Price": [8000, 500, 3000, 1500, 3000, 4000]
})

print("\nDataFrame2:")
print(dataFrame2)

# Finding Variance of "Price" column values
print("\nVariance of Price column from DataFrame2 =", dataFrame2['Price'].var())

Output:

DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3    Tesla     80
4  Bentley    110
5   Jaguar     90

Variance of Units column from DataFrame1 = 586.6666666666666

DataFrame2:
     Product  Price
0         TV   8000
1   PenDrive    500
2  HeadPhone   3000
3   EarPhone   1500
4        HDD   3000
5        SSD   4000

Variance of Price column from DataFrame2 = 6766666.666666667
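
var() can also be called on the DataFrame itself instead of on one selected column. The sketch below reuses dataFrame1 and assumes a pandas version in which DataFrame.var() accepts the numeric_only parameter; passing numeric_only=True is meant to skip the text column "Car", which has no variance.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

# Variance of every numeric column; the non-numeric "Car" column is skipped
column_variances = dataFrame1.var(numeric_only=True)
print(column_variances)
```

The result is a Series indexed by column name, so the value for a single column can be read with column_variances["Units"].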

Key Points

  • The var() method calculates sample variance by default (divides by N-1)
  • For population variance, use var(ddof=0) which divides by N
  • Higher variance indicates more spread out data points
  • Variance is always non-negative
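
The difference between sample and population variance can be checked on the "Units" values used earlier. This is a small sketch rather than a recommendation of one form over the other; which one is appropriate depends on whether the data is a sample or the whole population.

```python
import pandas as pd

units = pd.Series([100, 150, 110, 80, 110, 90], name="Units")

sample_variance = units.var()            # ddof=1 by default: divide by N - 1
population_variance = units.var(ddof=0)  # ddof=0: divide by N

print("Sample variance:    ", sample_variance)
print("Population variance:", population_variance)
```

For these six values the sample variance is about 586.67 and the population variance about 488.89; the population value is always the smaller of the two when the data is not constant.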

Conclusion

Use the var() method to calculate variance of DataFrame columns. This statistical measure helps understand data dispersion and variability in your datasets.

Updated on: 2026-03-26T02:03:22+05:30
