Making matplotlib scatter plots from dataframes in Python's pandas
A scatter plot drawn with matplotlib from a pandas DataFrame shows how two variables relate to each other. The DataFrame keeps the x values, the y values, and optionally a color for each point in named columns, which are then passed to matplotlib together with the axis labels.
Steps to Create a Scatter Plot
Import matplotlib and pandas libraries
Create lists for your data variables (x-axis, y-axis, and colors)
Build a pandas DataFrame from your data
Create figure and axes objects using plt.subplots()
Add axis labels using plt.xlabel() and plt.ylabel()
Generate the scatter plot using the ax.scatter() method
Display the plot with plt.show()
Example
Here's how to create a scatter plot showing the relationship between student count and marks obtained:
from matplotlib import pyplot as plt
import pandas as pd

# Raw data: x values, y values, and one color per point
no_of_students = [1, 2, 3, 5, 7, 8, 9, 10, 30, 50]
marks_obtained_by_student = [100, 95, 91, 90, 89, 76, 55, 10, 3, 19]
color_coding = ['red', 'blue', 'yellow', 'green', 'red', 'blue', 'yellow', 'green', 'yellow', 'green']

# Collect the three lists as columns of a DataFrame
df = pd.DataFrame(dict(students_count=no_of_students,
                       marks=marks_obtained_by_student,
                       color=color_coding))

# Create the figure and axes, then label both axes
fig, ax = plt.subplots()
plt.xlabel('Students count')
plt.ylabel('Obtained marks')

# Plot marks against student count, taking each point's color from the 'color' column
ax.scatter(df['students_count'], df['marks'], c=df['color'])
plt.show()
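The example above types the color of every point by hand. As a minimal sketch of an alternative, the color column can be derived from the data itself. The pass mark of 50 and the green/red rule below are assumptions made for illustration only and are not part of the original example. The sketch also labels the axes through ax.set_xlabel() and ax.set_ylabel(), which for a single axes has the same effect as plt.xlabel() and plt.ylabel().

```python
import pandas as pd
from matplotlib import pyplot as plt

# Same sample data as the example above, without the hand-written color list.
df = pd.DataFrame({
    "students_count": [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    "marks": [100, 95, 91, 90, 89, 76, 55, 10, 3, 19],
})

# Hypothetical rule (an assumption, not from the article):
# green for marks at or above PASS_MARK, red otherwise.
PASS_MARK = 50
df["color"] = df["marks"].map(lambda m: "green" if m >= PASS_MARK else "red")

fig, ax = plt.subplots()
ax.set_xlabel("Students count")
ax.set_ylabel("Obtained marks")

# One point per DataFrame row, colored from the derived 'color' column.
points = ax.scatter(df["students_count"], df["marks"], c=df["color"])
```

Calling plt.show() afterwards displays the figure. Keeping the rule in one place means the colors stay consistent if rows are added to the DataFrame later.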
Alternative Method Using DataFrame.plot()
You can also create scatter plots directly from a pandas DataFrame using its built-in plot accessor:
import matplotlib.pyplot as plt
import pandas as pd
# Create sample data
data = {
'students_count': [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
'marks': [100, 95, 91, 90, 89, 76, 55, 10, 3, 19]
}
df = pd.DataFrame(data)
# Create scatter plot using pandas
df.plot.scatter(x='students_count', y='marks',
                title='Student Performance',
                figsize=(8, 6))
plt.show()
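DataFrame.plot.scatter() returns the matplotlib Axes it drew on, so the result can be customized further with the usual Axes methods. The following is a small sketch under that assumption, reusing the sample data above. The axis label texts are borrowed from the first example, since pandas would otherwise label the axes with the column names.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "students_count": [1, 2, 3, 5, 7, 8, 9, 10, 30, 50],
    "marks": [100, 95, 91, 90, 89, 76, 55, 10, 3, 19],
})

# Keep the returned Axes instead of discarding it.
ax = df.plot.scatter(x="students_count", y="marks",
                     title="Student Performance",
                     figsize=(8, 6))

# Replace the default labels (the column names) with readable text.
ax.set_xlabel("Students count")
ax.set_ylabel("Obtained marks")
```

As before, plt.show() displays the finished figure.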
Key Parameters
| Parameter | Description | Example |
|---|---|---|
| c | Color of scatter points | c='red' or c=df['color'] |
| s | Size of scatter points | s=50 or s=df['size'] |
| alpha | Transparency (0-1) | alpha=0.7 |
Output
The first example displays ten points, one per DataFrame row, each drawn in the color given in its color column, with student count on the x-axis and marks obtained on the y-axis. The DataFrame.plot.scatter() example shows the same points in a single default color.
Conclusion
Creating scatter plots from pandas DataFrames is straightforward using ax.scatter() or DataFrame.plot.scatter(). This approach allows you to visualize correlations and patterns in your data effectively with customizable colors and styling.
