How to overplot a line on a scatter plot in Python?

Overplotting a line on a scatter plot combines scattered data points with a trend line or reference line. This technique is useful for showing relationships, trends, or theoretical models alongside actual data points.

Basic Approach

Create the scatter plot first using scatter(), then add the line using plot() on the same axes:

import matplotlib.pyplot as plt
import numpy as np

# Generate sample data (seeded so the plot is reproducible)
np.random.seed(42)
x_data = np.linspace(0, 10, 20)
y_data = 2 * x_data + 1 + np.random.normal(0, 2, 20)  # Linear with noise

# Create scatter plot
plt.figure(figsize=(8, 6))
plt.scatter(x_data, y_data, color='blue', alpha=0.6, label='Data points')

# Add trend line
x_line = np.linspace(0, 10, 100)
y_line = 2 * x_line + 1  # Theoretical line
plt.plot(x_line, y_line, color='red', linewidth=2, label='Trend line')

plt.xlabel('X values')
plt.ylabel('Y values')
plt.legend()
plt.title('Scatter Plot with Overplotted Line')
plt.show()
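
The same result can be reached through Matplotlib's object-oriented interface, which makes "the same axes" explicit: both scatter() and plot() are called on one Axes object. The following is a minimal sketch, not the only way to structure it; the helper name scatter_with_line and its slope and intercept parameters are hypothetical and introduced only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def scatter_with_line(x, y, slope, intercept):
    """Draw a scatter plot and overplot the line y = slope * x + intercept.

    Hypothetical helper for illustration; returns the Figure and Axes.
    """
    fig, ax = plt.subplots(figsize=(8, 6))

    # Scatter points and line are both drawn on the same Axes object
    ax.scatter(x, y, color='blue', alpha=0.6, label='Data points')

    # Evaluate the line on a dense grid spanning the data range
    x_line = np.linspace(np.min(x), np.max(x), 100)
    ax.plot(x_line, slope * x_line + intercept,
            color='red', linewidth=2, label='Reference line')

    ax.set_xlabel('X values')
    ax.set_ylabel('Y values')
    ax.legend()
    return fig, ax


rng = np.random.default_rng(0)  # fixed seed so the sample data is reproducible
x_data = np.linspace(0, 10, 20)
y_data = 2 * x_data + 1 + rng.normal(0, 2, 20)

fig, ax = scatter_with_line(x_data, y_data, slope=2, intercept=1)
```

Call plt.show() afterwards to display the figure. Returning the Axes keeps it available for further additions, such as a second line.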

Multiple Lines on Scatter Plot

You can add multiple lines to show different relationships or models:

import matplotlib.pyplot as plt
import numpy as np

# Sample data
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 14.1, 15.9, 18.2, 20.1])

plt.figure(figsize=(10, 6))

# Scatter plot
plt.scatter(x, y, color='blue', s=50, alpha=0.7, label='Actual data')

# Linear trend line
linear_fit = np.polyfit(x, y, 1)
plt.plot(x, np.polyval(linear_fit, x), color='red', linewidth=2, label='Linear fit')

# Quadratic trend line
quad_fit = np.polyfit(x, y, 2)
x_smooth = np.linspace(1, 10, 100)
plt.plot(x_smooth, np.polyval(quad_fit, x_smooth), color='green', linewidth=2, label='Quadratic fit')

plt.xlabel('X values')
plt.ylabel('Y values')
plt.legend()
plt.title('Scatter Plot with Multiple Trend Lines')
plt.grid(True, alpha=0.3)
plt.show()
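
The linear fit above is plotted against x directly, which works because x is already in increasing order. plot() connects points in the order they are given, so with unsorted data the fitted line would zigzag. The sketch below reuses the same values in shuffled order (the shuffling is an assumption made for illustration) and evaluates the fit on an evenly spaced grid instead.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same (x, y) pairs as above, but in no particular x order
x = np.array([7, 2, 9, 4, 1, 10, 5, 3, 8, 6])
y = np.array([14.1, 3.9, 18.2, 8.1, 2.1, 20.1, 9.8, 6.2, 15.9, 12.2])

# Least-squares quadratic fit; the order of the points does not affect it
coeffs = np.polyfit(x, y, 2)

# Evaluate the fit on an increasing grid so the line is drawn left to right
x_grid = np.linspace(x.min(), x.max(), 100)
y_grid = np.polyval(coeffs, x_grid)

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(x, y, color='blue', s=50, alpha=0.7, label='Actual data')
ax.plot(x_grid, y_grid, color='green', linewidth=2, label='Quadratic fit')
ax.legend()
```

Sorting the data with np.argsort(x) before plotting would also work; the grid approach has the added benefit of producing a smooth curve for non-linear fits.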

Using Seaborn for Enhanced Plots

Seaborn's scatterplot() draws the points, and regplot() with scatter=False adds a fitted regression line (with a confidence band) on the same axes:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# Create sample dataset
np.random.seed(42)
data = pd.DataFrame({
    'x': np.linspace(1, 10, 30),
    'y': 3 * np.linspace(1, 10, 30) + np.random.normal(0, 3, 30)
})

plt.figure(figsize=(8, 6))

# Scatter plot with seaborn
sns.scatterplot(data=data, x='x', y='y', color='blue', s=60, alpha=0.7)

# Add regression line
sns.regplot(data=data, x='x', y='y', scatter=False, color='red', line_kws={'linewidth': 2})

plt.title('Scatter Plot with Regression Line (Seaborn)')
plt.show()

Key Parameters

Parameter	Function	Description
alpha	scatter()	Controls point transparency (0 = fully transparent, 1 = opaque)
linewidth	plot()	Sets line thickness in points
label	scatter(), plot()	Sets the text shown for the element in the legend
color	scatter(), plot()	Sets the color of the points or line
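
As a quick check of how these parameters are applied, the sketch below reads them back from the objects that scatter() and plot() return. The data and the variable names points and line are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = 2 * x + 1

fig, ax = plt.subplots()

# scatter() returns a single collection holding all the points
points = ax.scatter(x, y, color='blue', alpha=0.6, label='Data points')

# plot() returns a list of lines; unpack the single line it created here
(line,) = ax.plot(x, y, color='red', linewidth=2, label='Trend line')

# Read the settings back from the returned artists
print('alpha:', points.get_alpha())
print('linewidth:', line.get_linewidth())
print('labels:', points.get_label(), '/', line.get_label())
```

Keeping references to the returned objects also allows later changes, for example line.set_linewidth(3), without redrawing the plot from scratch.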

Best Practices

Use different colors for scatter points and lines for clarity
Set alpha below 1 for dense scatter points so that overlapping points do not hide the line
Add legends to identify different elements
Use grid(True, alpha=0.3) for better readability

Conclusion

Overplotting lines on scatter plots is achieved by calling scatter() followed by plot() on the same axes. Use different colors and add legends to distinguish between data points and trend lines for clear visualization.

Rishikesh Kumar Rishi

Updated on: 2026-03-25T18:06:20+05:30
