How can the Seaborn library be used to display kernel density estimates in Python?
Visualizing data is an important step because it reveals patterns in the data without inspecting the raw numbers or performing complicated computations. Seaborn is a Python visualization library built on Matplotlib that provides built-in themes and a high-level interface.
Kernel Density Estimation (KDE) is a non-parametric method for estimating the probability density function of a continuous random variable. It does not assume that the data follow a particular distribution. Instead, it places a smooth kernel on every observation and averages the kernels into one continuous curve.
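To make the definition concrete, here is a minimal sketch of a Gaussian KDE written with NumPy only. It is an illustration rather than Seaborn's internal code: the function name `gaussian_kde_1d`, the synthetic sample, and the fixed bandwidth of 0.5 are all assumptions chosen for the example. The estimate at a point x is the average of the kernels centered on each observation, f(x) = 1/(n*h) * sum K((x - x_i)/h).

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid` (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every observation.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Standard normal kernel evaluated at each scaled distance.
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and rescale by the bandwidth so the curve is a density.
    return kernels.sum(axis=1) / (samples.size * bandwidth)


# Synthetic data (an assumption for the example, not the iris dataset).
rng = np.random.default_rng(0)
samples = rng.normal(loc=4.0, scale=1.5, size=200)

# Evaluate well beyond the data range so the tails are included.
grid = np.linspace(samples.min() - 5.0, samples.max() + 5.0, 500)
density = gaussian_kde_1d(samples, grid, bandwidth=0.5)

# A density should be non-negative and integrate to roughly 1.
area = float(np.sum(density) * (grid[1] - grid[0]))
print(f"approximate area under the curve: {area:.3f}")
```

Seaborn handles the bandwidth choice and the evaluation grid automatically, so in practice the plotting functions below are all that is needed.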
Seaborn provides multiple ways to display KDE plots. Let's explore the different approaches:
Using distplot() with KDE Only
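distplot() is deprecated as of seaborn 0.11 and emits a warning in newer releases, where kdeplot() (covered in the next section) is the recommended replacement. In versions that still provide it, setting kde=True and hist=False displays only the kernel density estimate: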
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.distplot(df['petal_length'], kde=True, hist=False)
plt.title('KDE Plot using distplot()')
plt.show()
Using kdeplot() Function
Seaborn also provides a dedicated kdeplot() function for creating kernel density plots:
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.kdeplot(df['petal_length'])
plt.title('KDE Plot using kdeplot()')
plt.show()
Comparing Multiple Variables
You can overlay multiple KDE plots to compare distributions across different variables or categories:
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
# Plot KDE for different species
for species in df['species'].unique():
    # Select the rows for one species and draw its density curve
    subset = df[df['species'] == species]
    sb.kdeplot(subset['petal_length'], label=species)
plt.title('KDE Comparison by Species')
plt.legend()
plt.show()
KDE with Histogram
To show both the histogram and the KDE together, set both kde=True and hist=True (these are also the default values in distplot()):
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.distplot(df['petal_length'], kde=True, hist=True)
plt.title('Histogram with KDE Overlay')
plt.show()
Customizing KDE Plots
You can customize the appearance by adjusting bandwidth and other parameters:
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
# Custom bandwidth
sb.kdeplot(df['petal_length'], bw_adjust=0.5, label='bw=0.5')
sb.kdeplot(df['petal_length'], bw_adjust=2, label='bw=2')
plt.title('KDE with Different Bandwidths')
plt.legend()
plt.show()
Key Parameters
| Parameter | Function | Description |
|---|---|---|
| kde | distplot() | Enable/disable KDE line |
| hist | distplot() | Enable/disable histogram |
| bw_adjust | kdeplot() | Adjust bandwidth (smoothness) |
| shade | kdeplot() | Fill area under curve (deprecated since seaborn 0.11 in favor of fill) |
Conclusion
Seaborn offers flexible options for KDE visualization through distplot() and kdeplot(). Use KDE to understand data distribution patterns and compare multiple datasets effectively.
