Absolute and Relative Frequency in Pandas

In statistics, frequency indicates how many times a value appears in a dataset. Absolute frequency is the raw count, while relative frequency is the proportion (count divided by total observations). Pandas provides built-in methods for calculating both.

Absolute Frequency

Using value_counts()

The simplest way to count occurrences of each value:

import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]
# value_counts() returns a Series sorted by count, largest first
counts = pd.Series(data).value_counts()
print(counts)

Output:

Pune          3
Chandigarh    2
Hyderabad     1
dtype: int64
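
Because the result is an ordinary Series, individual counts can be looked up by label, and the ordering can be changed. The following is a small sketch on the same list; the variable names are illustrative only:

```python
import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]
counts = pd.Series(data).value_counts()

# Look up the absolute frequency of a single value by its label
pune_count = counts["Pune"]

# Re-order alphabetically by city name instead of by count
by_name = counts.sort_index()

# The counts add up to the number of non-missing observations
total = counts.sum()

print(pune_count)
print(by_name)
print(total)
```

Sorting by index gives the same alphabetical row order that crosstab() produces.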

Using crosstab()

An alternative that produces a tabular format:

import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]
df = pd.DataFrame(data, columns=["City"])
tab_result = pd.crosstab(index=df["City"], columns=["count"])
print(tab_result)

Output:

col_0       count
City
Chandigarh      2
Hyderabad       1
Pune            3
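
crosstab() becomes more useful when a second column is involved, because it counts every combination of the two. The sketch below adds a hypothetical Mode column (invented here purely for illustration) to the same cities:

```python
import pandas as pd

# "Mode" is a made-up column used only to illustrate a two-way table
df = pd.DataFrame({
    "City": ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"],
    "Mode": ["Online", "Offline", "Online", "Offline", "Online", "Online"],
})

# Rows are cities, columns are modes, cells are absolute frequencies
two_way = pd.crosstab(index=df["City"], columns=df["Mode"])
print(two_way)
```

Each row of the resulting table sums to that city's absolute frequency from the one-column version.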

Relative Frequency

Relative frequency is the ratio of each value's count to the total number of observations. It can be expressed as a decimal or as a percentage:

$$\mathrm{Relative\:Frequency = \frac{Absolute\:Frequency}{Total\:Observations}}$$
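
To make the formula concrete, here is a minimal sketch that applies it by hand with Python's standard-library collections.Counter rather than pandas. It uses the same city list as the absolute-frequency examples:

```python
from collections import Counter

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]

absolute = Counter(data)   # absolute frequency of each city
total = len(data)          # total observations

# Relative frequency = absolute frequency / total observations
relative = {city: count / total for city, count in absolute.items()}

# Show each value both as a decimal and as a percentage
for city, value in relative.items():
    print(f"{city}: {value:.2f} ({value:.0%})")
```

The relative frequencies of all values always sum to 1, which is a quick sanity check on any such calculation.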

Using value_counts(normalize=True)

import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]

# Method 1: normalize parameter
print(pd.Series(data).value_counts(normalize=True))

Output:

Pune          0.500000
Chandigarh    0.333333
Hyderabad     0.166667
dtype: float64

Manual Calculation

import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]

# Method 2: divide each absolute count by the total number of observations
freq = pd.Series(data).value_counts()
relative = freq / len(data)
print(relative)

Output:

Pune          0.500000
Chandigarh    0.333333
Hyderabad     0.166667
dtype: float64
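
The two methods agree here because the list has no missing values. As a hedged aside: value_counts() skips missing values by default, so normalize=True divides by the number of non-missing observations, whereas len(data) counts every entry. The sketch below uses a small made-up list containing one None to show the difference:

```python
import pandas as pd

# Illustrative data with one missing entry (not from the examples above)
data = ["Pune", None, "Pune", "Hyderabad"]
s = pd.Series(data)

# Denominator is the 3 non-missing values
normalized = s.value_counts(normalize=True)

# Denominator is all 4 entries, including the missing one
manual = s.value_counts() / len(data)

print(normalized["Pune"])  # 2 / 3
print(manual["Pune"])      # 2 / 4
```

If missing values should be counted as a category of their own, value_counts() accepts dropna=False.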

Pune appears in 3 of the 6 observations, so its relative frequency is 3/6 = 0.50 (50%). Likewise, Chandigarh's is 2/6 ≈ 0.33 (33%) and Hyderabad's is 1/6 ≈ 0.17 (17%).

Comparison

City        Absolute Frequency  Relative Frequency  Percentage
Pune        3                   0.50                50%
Chandigarh  2                   0.33                33%
Hyderabad   1                   0.17                17%
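
The comparison table can also be produced programmatically. This is a minimal sketch on the same city list; the column labels are chosen here to mirror the table and are not part of any pandas API:

```python
import pandas as pd

data = ["Chandigarh", "Hyderabad", "Pune", "Pune", "Chandigarh", "Pune"]
s = pd.Series(data)

absolute = s.value_counts()
relative = s.value_counts(normalize=True)

# Combine the series column-wise; rows are aligned on the city labels
summary = pd.DataFrame({
    "Absolute Frequency": absolute,
    "Relative Frequency": relative.round(2),
    "Percentage": (relative * 100).round(0).astype(int),
})
print(summary)
```

The rounding is applied only for display. Keep the unrounded relative Series if the values feed into further calculations.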

Conclusion

Use value_counts() for absolute frequency and value_counts(normalize=True) for relative frequency directly. The crosstab() method is useful when you need tabular output or cross-tabulation between multiple columns. Relative frequency is preferred when comparing datasets of different sizes.

Updated on: 2026-03-15T16:43:52+05:30
