How to overplot a line on a scatter plot in Python?

Overplotting a line on a scatter plot combines scattered data points with a trend line or reference line. This technique is useful for showing relationships, trends, or theoretical models alongside actual data points.

Basic Approach

Create the scatter plot first using Matplotlib's scatter(), then add the line using plot() on the same axes:

import matplotlib.pyplot as plt
import numpy as np

# Generate sample data
x_data = np.linspace(0, 10, 20)
y_data = 2 * x_data + 1 + np.random.normal(0, 2, 20)  # Linear with noise

# Create scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x_data, y_data, color='blue', alpha=0.6, label='Data points')

# Add the known underlying line (y = 2x + 1); it is not fitted to the noisy points
x_line = np.linspace(0, 10, 100)
y_line = 2 * x_line + 1  # Theoretical line
plt.plot(x_line, y_line, color='red', linewidth=2, label='Theoretical line')

plt.xlabel('X values')
plt.ylabel('Y values')
plt.legend()
plt.title('Scatter Plot with Overplotted Line')
plt.show()
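
The same idea can be written with Matplotlib's object-oriented interface, which makes "the same axes" explicit. The following is a minimal sketch rather than the only way to do it; the helper name scatter_with_line and its slope and intercept parameters are illustrative assumptions, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np


def scatter_with_line(x, y, slope, intercept):
    """Draw a scatter plot and a straight reference line on one Axes.

    Hypothetical helper for illustration; returns the Figure and Axes.
    """
    fig, ax = plt.subplots(figsize=(8, 6))

    # Both calls target the same Axes object, so the line overlays the points
    ax.scatter(x, y, color='blue', alpha=0.6, label='Data points')
    x_line = np.linspace(np.min(x), np.max(x), 100)
    ax.plot(x_line, slope * x_line + intercept,
            color='red', linewidth=2, label='Reference line')

    ax.set_xlabel('X values')
    ax.set_ylabel('Y values')
    ax.legend()
    return fig, ax


rng = np.random.default_rng(0)  # fixed seed so the figure is reproducible
x_data = np.linspace(0, 10, 20)
y_data = 2 * x_data + 1 + rng.normal(0, 2, 20)

fig, ax = scatter_with_line(x_data, y_data, slope=2, intercept=1)
```

Call plt.show() afterwards to display the figure. Holding on to ax is convenient when a figure has several subplots, because each overlay is then attached to a specific Axes instead of whichever one is currently active.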

Multiple Lines on Scatter Plot

You can add multiple lines to show different relationships or models:

import matplotlib.pyplot as plt
import numpy as np

# Sample data
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 14.1, 15.9, 18.2, 20.1])

plt.figure(figsize=(10, 6))

# Scatter plot
plt.scatter(x, y, color='blue', s=50, alpha=0.7, label='Actual data')

# Linear trend line
linear_fit = np.polyfit(x, y, 1)
plt.plot(x, np.polyval(linear_fit, x), color='red', linewidth=2, label='Linear fit')

# Quadratic trend line
quad_fit = np.polyfit(x, y, 2)
x_smooth = np.linspace(1, 10, 100)
plt.plot(x_smooth, np.polyval(quad_fit, x_smooth), color='green', linewidth=2, label='Quadratic fit')

plt.xlabel('X values')
plt.ylabel('Y values')
plt.legend()
plt.title('Scatter Plot with Multiple Trend Lines')
plt.grid(True, alpha=0.3)
plt.show()
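
The example above works with x values that are already in ascending order. When the data is unsorted, plot() connects points in the order given, so a curved fit evaluated at the raw x values can zigzag. One possible remedy, sketched here with made-up data, is to evaluate the fitted polynomial on a sorted, evenly spaced grid; the names x_grid and y_grid are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 40)  # deliberately unsorted x values
y = 0.5 * x**2 - 2 * x + 3 + rng.normal(0, 2, 40)

# Fit a quadratic; polyfit returns coefficients with the highest power first
coeffs = np.polyfit(x, y, 2)

# Evaluate the fit on a sorted grid spanning the data range
x_grid = np.linspace(x.min(), x.max(), 200)
y_grid = np.polyval(coeffs, x_grid)

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(x, y, color='blue', s=50, alpha=0.7, label='Actual data')
ax.plot(x_grid, y_grid, color='green', linewidth=2, label='Quadratic fit')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.legend()
```

Call plt.show() to display the result. The scatter points keep their original order; only the line is drawn from the sorted grid.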

Using Seaborn for Enhanced Plots

Seaborn provides scatterplot() for the points and regplot() for a fitted regression line; drawn one after the other, they share the same axes:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create sample dataset
np.random.seed(42)
data = pd.DataFrame({
    'x': np.linspace(1, 10, 30),
    'y': 3 * np.linspace(1, 10, 30) + np.random.normal(0, 3, 30)
})

plt.figure(figsize=(8, 6))

# Scatter plot with seaborn
sns.scatterplot(data=data, x='x', y='y', color='blue', s=60, alpha=0.7)

# Add regression line
sns.regplot(data=data, x='x', y='y', scatter=False, color='red', line_kws={'linewidth': 2})

plt.title('Scatter Plot with Regression Line (Seaborn)')
plt.show()
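
As an alternative sketch, regplot() can draw both the points and the regression line in a single call, since its scatter option is on by default. The styling values below simply mirror the example above, and ci=None is an added choice that turns off the shaded confidence band.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Same kind of sample data as above: a linear trend plus noise
rng = np.random.default_rng(42)
x = np.linspace(1, 10, 30)
data = pd.DataFrame({'x': x, 'y': 3 * x + rng.normal(0, 3, 30)})

fig, ax = plt.subplots(figsize=(8, 6))

# One call draws the scatter points and the fitted line on the given Axes
sns.regplot(
    data=data, x='x', y='y',
    ci=None,  # no confidence band, line only
    scatter_kws={'color': 'blue', 's': 60, 'alpha': 0.7},
    line_kws={'color': 'red', 'linewidth': 2},
    ax=ax,
)
ax.set_title('Scatter Plot with Regression Line (single regplot call)')
```

Call plt.show() to display the figure. Keeping scatterplot() and regplot() separate, as in the first version, is still useful when the points need options that regplot() does not expose, such as hue.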

Key Parameters

Parameter   Function    Description
alpha       scatter()   Controls point transparency (0 to 1)
linewidth   plot()      Sets line thickness
label       Both        Adds a legend label
color       Both        Sets the color of points or lines

Best Practices

  • Use different colors for scatter points and lines for clarity

  • Set alpha below 1 so that dense, overlapping points stay distinguishable and do not visually compete with the line

  • Add legends to identify different elements

  • Use grid(True, alpha=0.3) for better readability
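
The practices above can be combined in one figure. The sketch below also sets zorder explicitly, which is an addition not covered in the list: Matplotlib draws artists with a higher zorder on top, so giving the line a larger value than the points states the stacking order outright. The data is made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 300)  # many points, so overlap is likely
y = 1.5 * x + rng.normal(0, 3, 300)

# Straight-line fit: polyfit with degree 1 returns (slope, intercept)
slope, intercept = np.polyfit(x, y, 1)

fig, ax = plt.subplots(figsize=(8, 6))

# Semi-transparent points in one color
points = ax.scatter(x, y, color='blue', alpha=0.4, zorder=2, label='Data points')

# Contrasting line with a higher zorder so it is drawn above the points
x_line = np.array([x.min(), x.max()])
(line,) = ax.plot(x_line, slope * x_line + intercept,
                  color='red', linewidth=2, zorder=3, label='Linear fit')

ax.grid(True, alpha=0.3)  # light grid for readability
ax.legend()               # identifies points and line
```

Call plt.show() to display the figure.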

Conclusion

Overplotting lines on scatter plots is achieved by calling scatter() followed by plot() on the same axes. Use different colors and add legends to distinguish between data points and trend lines for clear visualization.

Updated on: 2026-03-25T18:06:20+05:30
