How to add Regression Line Per Group with Seaborn in Python?
One of the most useful tools provided by Seaborn is the ability to add regression lines to scatterplots. Regression lines help analyze relationships between two variables and identify trends in data, especially when comparing different groups within your dataset.
In this article, we will learn how to add regression lines per group with Seaborn in Python using lmplot() and regplot() functions. These methods allow you to visualize relationships separately for different categories in your data.
What is a Regression Line?
A regression line shows the best-fit trend through a set of data points. It represents the linear relationship between an independent variable (x-axis) and a dependent variable (y-axis). When working with grouped data, separate regression lines for each group help identify how relationships vary across categories.
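To make the idea of one best-fit line per group concrete, here is a minimal sketch that fits a straight line within each group using NumPy's polyfit. The toy DataFrame and its column names (x, y, group) are made up for illustration and are not part of any Seaborn dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two groups with opposite linear trends
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 1, 2, 3, 4],
    "y": [2, 4, 6, 8, 5, 4, 3, 2],
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
})

# Fit y = slope * x + intercept separately within each group
fits = {}
for name, sub in df.groupby("group"):
    slope, intercept = np.polyfit(sub["x"], sub["y"], deg=1)
    fits[name] = (slope, intercept)
    print(f"{name}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Group a rises with x while group b falls, so a single line fitted to all points together would hide both trends. This is the situation where per-group regression lines are most informative.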
Using lmplot() for Multiple Groups
The lmplot() function automatically creates separate regression lines for each group when you specify the hue parameter:
import seaborn as sns
import matplotlib.pyplot as plt
# Load the penguins dataset
penguins = sns.load_dataset('penguins')
# Create scatterplot with regression lines per species
sns.lmplot(x="bill_length_mm",
           y="flipper_length_mm",
           hue="species",
           data=penguins,
           height=6,
           aspect=1.2)
plt.xlabel("Bill Length (mm)")
plt.ylabel("Flipper Length (mm)")
plt.title("Bill Length vs Flipper Length by Species")
plt.show()
This creates separate regression lines for each penguin species (Adelie, Chinstrap, Gentoo), allowing you to compare how bill length relates to flipper length across different species.
Using regplot() with Subplots
While regplot() doesn't directly support grouping, you can create separate subplots for each group:
import seaborn as sns
import matplotlib.pyplot as plt
# Load the penguins dataset
penguins = sns.load_dataset('penguins')
# Get unique species
species = penguins['species'].unique()
# Create subplots
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
for i, spec in enumerate(species):
    # Filter data for each species
    species_data = penguins[penguins['species'] == spec]
    # Create regplot for each species
    sns.regplot(x="bill_length_mm",
                y="flipper_length_mm",
                data=species_data,
                ax=axes[i])
    axes[i].set_title(f'{spec} Penguins')
    axes[i].set_xlabel("Bill Length (mm)")
    axes[i].set_ylabel("Flipper Length (mm)")
plt.tight_layout()
plt.show()
This approach creates individual plots for each species with their own regression lines, making it easy to compare relationships side by side.
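If you would rather keep every group on one set of axes while still using regplot(), one possible variation is to call it once per group and pass the same Axes each time. The sketch below assumes a hypothetical toy DataFrame with columns x, y, and group rather than the penguins data, so it runs without downloading anything.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for a real dataset
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "y": [2.1, 3.9, 6.2, 7.8, 10.1, 5.2, 4.1, 2.9, 2.2, 0.8],
    "group": ["a"] * 5 + ["b"] * 5,
})

fig, ax = plt.subplots(figsize=(7, 5))

# Draw one regplot per group on the same Axes; label feeds the legend
for name, sub in df.groupby("group"):
    sns.regplot(x="x", y="y", data=sub, ax=ax, label=name)

ax.legend(title="group")
ax.set_title("Regression line per group on shared axes")
# Call plt.show() to display the figure in an interactive session
```

Each call picks the next color from the Matplotlib color cycle, so the groups stay visually distinct without any extra styling.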
Comparison of Methods
| Method | Groups on Same Plot | Automatic Grouping | Best For |
|---|---|---|---|
| lmplot() | Yes | Yes (with hue) | Comparing groups directly |
| regplot() with subplots | No | No (manual filtering) | Detailed individual analysis |
Key Parameters
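Important lmplot() parameters for group-wise regression lines:
- hue: Name of the column that defines the groups; each level gets its own color and regression line
- markers: Marker style for each group, given as a single marker or a list with one entry per hue level
- palette: Color scheme used for the groups
- col: Name of a column whose levels are drawn as separate subplots, one per group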
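As a rough illustration of how these parameters combine, the sketch below calls lmplot() twice on a hypothetical toy DataFrame (the x, y, and group columns are invented for the example). The first call overlays both groups on one plot; the second splits them into one subplot per group.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data with two groups
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "y": [2.1, 3.9, 6.2, 7.8, 10.1, 5.2, 4.1, 2.9, 2.2, 0.8],
    "group": ["a"] * 5 + ["b"] * 5,
})

# hue + markers + palette: both groups on one plot, visually distinguished
g = sns.lmplot(data=df, x="x", y="y", hue="group",
               markers=["o", "s"], palette="Set2")

# col: one subplot per group instead of overlaying them
g_cols = sns.lmplot(data=df, x="x", y="y", col="group", hue="group")
```

Both calls return a FacetGrid, so titles and axis labels can be adjusted afterwards through that object.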
Conclusion
Use lmplot() with the hue parameter for easy group-wise regression lines on the same plot. For more detailed analysis, combine regplot() with subplots to examine each group separately.
