Write a Python program to quantify the shape of a distribution in a dataframe

Distribution shape analysis helps data scientists understand the characteristics of their data. The pandas library provides built-in methods for two standard shape measures: kurtosis (tail weight, often described as peakedness) and skewness (asymmetry).

What are Kurtosis and Skewness?

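Kurtosis measures how heavy the tails of a distribution are relative to a normal distribution; it is often described loosely as how peaked or flat the distribution looks. pandas reports excess kurtosis (Fisher's definition), so a normal distribution scores 0. Values above 0 indicate heavier tails and a sharper peak, while values below 0 indicate lighter tails and a flatter shape.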

Skewness measures the asymmetry of a distribution. Positive skewness indicates a tail extending toward higher values, while negative skewness indicates a tail extending toward lower values.
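To make the two definitions concrete, the sketch below computes both statistics by hand and compares them with pandas. It assumes the bias-corrected sample estimators (normalized by N-1), which is what pandas documents for skew() and kurt(). The helper names sample_skewness and sample_excess_kurtosis are hypothetical and exist only for this illustration.

```python
import pandas as pd


def sample_skewness(values):
    """Bias-corrected sample skewness (hypothetical helper for illustration)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance = sum(d ** 2 for d in deviations) / (n - 1)  # sample variance
    third = sum(d ** 3 for d in deviations)
    return n / ((n - 1) * (n - 2)) * third / variance ** 1.5


def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis (hypothetical helper for illustration)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance = sum(d ** 2 for d in deviations) / (n - 1)  # sample variance
    fourth = sum(d ** 4 for d in deviations)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))  # makes a normal distribution score 0
    return scale * fourth / variance ** 2 - correction


values = [12, 34, 56, 78, 90]
series = pd.Series(values)

# Each pair should agree up to floating-point rounding.
print("skewness:", sample_skewness(values), series.skew())
print("kurtosis:", sample_excess_kurtosis(values), series.kurt())
```

Both formulas need at least four values for the kurtosis and three for the skewness, because of the (n - 2) and (n - 3) terms in the denominators.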

Creating a Sample DataFrame

Let's start by creating a sample DataFrame with numerical data:

import pandas as pd

data = {"Column1": [12, 34, 56, 78, 90],
        "Column2": [23, 30, 45, 50, 90]}
df = pd.DataFrame(data)
print("DataFrame is:")
print(df)

Output:

DataFrame is:
   Column1  Column2
0       12       23
1       34       30
2       56       45
3       78       50
4       90       90

Calculating Kurtosis

Use the kurt() method to calculate kurtosis for each column:

import pandas as pd

data = {"Column1": [12, 34, 56, 78, 90],
        "Column2": [23, 30, 45, 50, 90]}
df = pd.DataFrame(data)

kurtosis = df.kurt(axis=0)
print("Kurtosis is:")
print(kurtosis)

Output:

Kurtosis is:
Column1   -1.526243
Column2    1.948382
dtype: float64
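As a rough illustration of how the kurtosis value responds to the data, the sketch below compares an evenly spaced series with the same series after one value is replaced by an extreme one. The two series are made-up inputs chosen for this example. It also shows that kurtosis() is an alias of kurt(), so either name can be used.

```python
import pandas as pd

# Evenly spaced values: no heavy tails, so excess kurtosis is negative (about -1.2).
flat = pd.Series([1, 2, 3, 4, 5])

# Same values with one extreme observation: the tail dominates and kurtosis rises.
with_outlier = pd.Series([1, 2, 3, 4, 50])

print("Evenly spaced:", flat.kurt())
print("With outlier: ", with_outlier.kurt())

# kurtosis() is an alias of kurt() on both Series and DataFrame.
df = pd.DataFrame({"Column1": [12, 34, 56, 78, 90],
                   "Column2": [23, 30, 45, 50, 90]})
print(df.kurtosis())
```

A single extreme value is enough to flip the sign here, which is why kurtosis on small samples should be read with caution.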

Calculating Skewness

Use the skew() method to calculate skewness for each column:

import pandas as pd

data = {"Column1": [12, 34, 56, 78, 90],
        "Column2": [23, 30, 45, 50, 90]}
df = pd.DataFrame(data)

skewness = df.skew(axis=0)
print("Asymmetry distribution - skewness is:")
print(skewness)

Output:

Asymmetry distribution - skewness is:
Column1   -0.280389
Column2    1.309355
dtype: float64
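The sign convention can be checked with a small sketch: a symmetric series has skewness of zero, and mirroring a series (negating every value) flips the sign of its skewness. The symmetric series is a made-up input; the right-tailed one reuses the Column2 values from the example.

```python
import pandas as pd

symmetric = pd.Series([1, 2, 3, 4, 5])
right_tailed = pd.Series([23, 30, 45, 50, 90])  # Column2 values from the example
mirrored = -right_tailed                        # reflect the data around zero

print("Symmetric:   ", symmetric.skew())     # zero, up to floating-point rounding
print("Right-tailed:", right_tailed.skew())  # positive: tail toward higher values
print("Mirrored:    ", mirrored.skew())      # same magnitude, negative sign
```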

Complete Example

Here's the complete program that calculates both kurtosis and skewness:

import pandas as pd

data = {"Column1": [12, 34, 56, 78, 90],
        "Column2": [23, 30, 45, 50, 90]}
df = pd.DataFrame(data)

print("DataFrame is:")
print(df)

kurtosis = df.kurt(axis=0)
print("\nKurtosis is:")
print(kurtosis)

skewness = df.skew(axis=0)
print("\nAsymmetry distribution - skewness is:")
print(skewness)

Output:

DataFrame is:
   Column1  Column2
0       12       23
1       34       30
2       56       45
3       78       50
4       90       90

Kurtosis is:
Column1   -1.526243
Column2    1.948382
dtype: float64

Asymmetry distribution - skewness is:
Column1   -0.280389
Column2    1.309355
dtype: float64

Interpreting the Results

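| Column  | Kurtosis          | Skewness                   | Interpretation                                                |
|---------|-------------------|----------------------------|---------------------------------------------------------------|
| Column1 | -1.526 (negative) | -0.280 (slightly negative) | Flatter distribution with lighter tails, slightly left-skewed |
| Column2 | 1.948 (positive)  | 1.309 (positive)           | More peaked distribution with heavier tails, right-skewed     |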
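The same interpretation can be produced programmatically. The following is a minimal sketch that labels each column purely by the sign of its statistics. The helper names label_kurtosis and label_skewness are hypothetical, and the sign-only rule is an assumption made for simplicity; in practice a tolerance band around zero is usually more sensible.

```python
import pandas as pd


def label_kurtosis(value):
    """Sign-only label for excess kurtosis (hypothetical helper)."""
    if value > 0:
        return "more peaked"
    if value < 0:
        return "flatter"
    return "normal-like"


def label_skewness(value):
    """Sign-only label for skewness (hypothetical helper)."""
    if value > 0:
        return "right-skewed"
    if value < 0:
        return "left-skewed"
    return "symmetric"


df = pd.DataFrame({"Column1": [12, 34, 56, 78, 90],
                   "Column2": [23, 30, 45, 50, 90]})

# One row per column of df, with both statistics and their labels.
summary = pd.DataFrame({"kurtosis": df.kurt(), "skewness": df.skew()})
summary["kurtosis_label"] = summary["kurtosis"].apply(label_kurtosis)
summary["skewness_label"] = summary["skewness"].apply(label_skewness)
print(summary)
```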

Conclusion

Use df.kurt(axis=0) to measure excess kurtosis and df.skew(axis=0) to measure asymmetry; axis=0 is the default and returns one value per column. Together, these two metrics summarize the shape of each column's distribution and help guide further statistical analysis.

Updated on: 2026-03-25T16:24:49+05:30
