Creating Matplotlib Scatter Plots from pandas DataFrames in Python

A scatter plot drawn with matplotlib from a pandas DataFrame is a quick way to see how two variables relate. The DataFrame keeps the x values, y values, and per-point colors together in named columns, and those columns can be passed directly to the plotting call.

Steps to Create a Scatter Plot

  • Import matplotlib and pandas libraries

  • Create lists for your data variables (x-axis, y-axis, and colors)

  • Build a pandas DataFrame from your data

  • Create figure and axes objects using plt.subplots()

  • Add axis labels using plt.xlabel() and plt.ylabel()

  • Generate the scatter plot using ax.scatter() method

  • Display the plot with plt.show()

Example

Here's how to create a scatter plot showing the relationship between student count and marks obtained:

from matplotlib import pyplot as plt
import pandas as pd

# Raw data: x values, y values, and one color name per point
no_of_students = [1, 2, 3, 5, 7, 8, 9, 10, 30, 50]
marks_obtained_by_student = [100, 95, 91, 90, 89, 76, 55, 10, 3, 19]
color_coding = ['red', 'blue', 'yellow', 'green', 'red', 'blue', 'yellow', 'green', 'yellow', 'green']

# Collect the lists into named DataFrame columns
df = pd.DataFrame(dict(students_count=no_of_students,
                       marks=marks_obtained_by_student,
                       color=color_coding))

fig, ax = plt.subplots()

# Label the axes of the current (and only) Axes
plt.xlabel('Students count')
plt.ylabel('Obtained marks')

# One point per row, colored by the 'color' column
ax.scatter(df['students_count'], df['marks'], c=df['color'])

plt.show()
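
The example above labels the axes through pyplot (plt.xlabel(), plt.ylabel()) but draws the points through the Axes object. As a minimal sketch, the same plot can be written with Axes methods only, which can be easier to follow once a figure holds more than one subplot. The function name scatter_with_axes_labels is hypothetical and used only for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_with_axes_labels(df):
    """Draw the marks scatter plot using only Axes methods (illustrative helper)."""
    fig, ax = plt.subplots()
    # One point per row; the 'color' column supplies a color name per point.
    ax.scatter(df['students_count'], df['marks'], c=df['color'])
    # Axes-level equivalents of plt.xlabel() and plt.ylabel().
    ax.set_xlabel('Students count')
    ax.set_ylabel('Obtained marks')
    return fig, ax


df = pd.DataFrame({
    'students_count': [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    'marks': [100, 95, 91, 90, 89, 76, 55, 10, 3, 19],
    'color': ['red', 'blue', 'yellow', 'green', 'red',
              'blue', 'yellow', 'green', 'yellow', 'green'],
})
fig, ax = scatter_with_axes_labels(df)
```

Calling plt.show() afterwards displays the figure, as in the example above.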

Alternative Method Using DataFrame.plot()

You can also create a scatter plot directly from a pandas DataFrame using its built-in plot method:

import matplotlib.pyplot as plt
import pandas as pd

# Create sample data
data = {
    'students_count': [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    'marks': [100, 95, 91, 90, 89, 76, 55, 10, 3, 19]
}

df = pd.DataFrame(data)

# Create scatter plot using pandas
df.plot.scatter(x='students_count', y='marks', 
                title='Student Performance', 
                figsize=(8, 6))

plt.show()
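
DataFrame.plot.scatter() returns the matplotlib Axes it drew on and also accepts an existing Axes through the ax parameter, so the two approaches can be combined. The following is a small sketch under that assumption; the variable name returned_ax is chosen here only for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'students_count': [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    'marks': [100, 95, 91, 90, 89, 76, 55, 10, 3, 19],
})

# Create the figure first so its size and layout stay under our control.
fig, ax = plt.subplots(figsize=(8, 6))

# Passing ax= makes pandas draw on the existing Axes, which is also returned.
returned_ax = df.plot.scatter(x='students_count', y='marks', ax=ax)

# Customize the result with ordinary matplotlib calls.
returned_ax.set_title('Student Performance')
returned_ax.set_xlabel('Students count')
returned_ax.set_ylabel('Obtained marks')
```

This keeps the one-line pandas call for plotting while leaving titles, labels, and any further styling to matplotlib. Call plt.show() to display the result.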

Key Parameters

Parameter | Description                                  | Example
c         | Color of the scatter points                  | c='red' or c=df['color']
s         | Marker size (area, in points squared)        | s=50 or s=df['size']
alpha     | Opacity from 0 (transparent) to 1 (opaque)   | alpha=0.7
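
As a rough illustration of how these three parameters combine in a single ax.scatter() call, the sketch below reuses the sample data from the first example. The 'size' column is hypothetical: it is derived here from the marks purely to have a per-point size to pass, and the scaling factor of 3 is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'students_count': [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    'marks': [100, 95, 91, 90, 89, 76, 55, 10, 3, 19],
    'color': ['red', 'blue', 'yellow', 'green', 'red',
              'blue', 'yellow', 'green', 'yellow', 'green'],
})

# Hypothetical 'size' column: marker area grows with the marks value.
df['size'] = df['marks'] * 3

fig, ax = plt.subplots()
points = ax.scatter(
    df['students_count'],
    df['marks'],
    c=df['color'],   # one color name per point
    s=df['size'],    # one marker area per point
    alpha=0.7,       # same opacity for every point
)
ax.set_xlabel('Students count')
ax.set_ylabel('Obtained marks')
```

A partly transparent alpha helps when large markers overlap, since the points underneath remain visible.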

Output

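The first example displays one point per DataFrame row, colored according to the color column, with student count on the x-axis and marks obtained on the y-axis. The DataFrame.plot.scatter() example draws the same points in a single default color, since no color is passed, and adds the title 'Student Performance'.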

Conclusion

Creating scatter plots from pandas DataFrames is straightforward using ax.scatter() or DataFrame.plot.scatter(). This approach allows you to visualize correlations and patterns in your data effectively with customizable colors and styling.

Updated on: 2026-03-25T18:00:55+05:30
