Python Pandas - Plot multiple data columns in a DataFrame?
To plot multiple columns from a DataFrame, call the plot() method and pass the column names as a list to its y parameter. This lets you compare several data series in a single chart, using chart types such as bar graphs and line plots. Scatter plots follow a similar pattern, with one column on each axis.
Import Required Libraries
First, import pandas and matplotlib for data manipulation and plotting:
import pandas as pd
import matplotlib.pyplot as plt
Creating Sample Data
Let's create a DataFrame with cricket team rankings data:
import pandas as pd
import matplotlib.pyplot as plt
# Sample cricket team data
data = [["Australia", 2500, 85],
["Bangladesh", 1000, 70],
["England", 2000, 80],
["India", 3000, 90],
["Sri Lanka", 1500, 75]]
# Create DataFrame
df = pd.DataFrame(data, columns=["Team", "Rank_Points", "Win_Percentage"])
print(df)
Team Rank_Points Win_Percentage
0 Australia 2500 85
1 Bangladesh 1000 70
2 England 2000 80
3 India 3000 90
4 Sri Lanka 1500 75
Plotting Multiple Columns as Bar Chart
Use the plot() method with kind="bar" to create a bar graph:
import pandas as pd
import matplotlib.pyplot as plt
data = [["Australia", 2500, 85],
["Bangladesh", 1000, 70],
["England", 2000, 80],
["India", 3000, 90],
["Sri Lanka", 1500, 75]]
df = pd.DataFrame(data, columns=["Team", "Rank_Points", "Win_Percentage"])
# Plot multiple columns
df.plot(x="Team", y=["Rank_Points", "Win_Percentage"], kind="bar", figsize=(10, 6))
plt.title("Team Rankings: Points vs Win Percentage")
plt.ylabel("Values")
plt.show()
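Because Rank_Points runs into the thousands while Win_Percentage stays below 100, the percentage bars are hard to read on a shared axis. One possible workaround, sketched here with the same sample data, is the secondary_y parameter of DataFrame.plot(), which draws the listed columns against a second y-axis on the right. The axis labels are illustrative choices, not part of the original example.

```python
import pandas as pd
import matplotlib.pyplot as plt

data = [["Australia", 2500, 85],
        ["Bangladesh", 1000, 70],
        ["England", 2000, 80],
        ["India", 3000, 90],
        ["Sri Lanka", 1500, 75]]
df = pd.DataFrame(data, columns=["Team", "Rank_Points", "Win_Percentage"])

# Draw Win_Percentage against a second y-axis on the right-hand side
ax = df.plot(x="Team", y=["Rank_Points", "Win_Percentage"], kind="bar",
             secondary_y=["Win_Percentage"], figsize=(10, 6))

# ax is the left-hand axis; pandas exposes the right-hand one as ax.right_ax
ax.set_ylabel("Rank points")
ax.right_ax.set_ylabel("Win percentage")
ax.set_title("Team Rankings: Points vs Win Percentage")
```

Call plt.show() afterwards to display the figure. By default pandas marks the secondary column in the legend with "(right)".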
Alternative Plotting Methods
Line Plot
For trend analysis, use a line plot:
import pandas as pd
import matplotlib.pyplot as plt
# Time series data
# Month-start dates; the 'M' (month-end) alias is deprecated in pandas 2.2+
dates = pd.date_range('2023-01-01', periods=5, freq='MS')
sales_data = pd.DataFrame({
'Month': dates,
'Product_A': [100, 120, 140, 110, 160],
'Product_B': [80, 95, 105, 120, 135]
})
sales_data.plot(x='Month', y=['Product_A', 'Product_B'], kind='line', marker='o')
plt.title("Monthly Sales Comparison")
plt.show()
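If the series are easier to read separately, the same call can draw one panel per column. The sketch below assumes the sales data from the line plot example and uses the subplots and sharex parameters of DataFrame.plot(). It uses the 'MS' (month-start) frequency alias, since the 'M' alias is deprecated in recent pandas releases.

```python
import pandas as pd
import matplotlib.pyplot as plt

sales_data = pd.DataFrame({
    "Month": pd.date_range("2023-01-01", periods=5, freq="MS"),
    "Product_A": [100, 120, 140, 110, 160],
    "Product_B": [80, 95, 105, 120, 135],
})

# subplots=True gives each selected column its own panel;
# sharex=True keeps the month axis aligned across panels
axes = sales_data.plot(x="Month", y=["Product_A", "Product_B"], kind="line",
                       marker="o", subplots=True, sharex=True, figsize=(8, 6))

# With subplots=True, plot() returns an array of Axes, one per column
axes[0].set_title("Monthly Sales Comparison")
```

Call plt.show() afterwards to display the figure.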
Scatter Plot
To show the correlation between two variables, use a scatter plot and color the points by a third column:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'Height': [160, 165, 170, 175, 180],
'Weight': [55, 60, 65, 70, 75],
'Age': [20, 25, 30, 35, 40]
})
# Scatter plot with color coding
plt.scatter(df['Height'], df['Weight'], c=df['Age'], cmap='viridis')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.colorbar(label='Age')
plt.title('Height vs Weight (Color = Age)')
plt.show()
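The scatter example calls matplotlib directly. A roughly equivalent chart can be produced through the DataFrame itself with DataFrame.plot.scatter(), which accepts column names for x, y, and c. This is a minimal sketch using the same sample data. When c names a numeric column, pandas normally adds the colorbar itself.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Height": [160, 165, 170, 175, 180],
    "Weight": [55, 60, 65, 70, 75],
    "Age": [20, 25, 30, 35, 40],
})

# x and y are column names; c names the column used to color the points
ax = df.plot.scatter(x="Height", y="Weight", c="Age", colormap="viridis")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.set_title("Height vs Weight (Color = Age)")
```

Call plt.show() afterwards to display the figure.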
Plot Customization Options
| Parameter | Description | Example |
|---|---|---|
| kind | Chart type | 'bar', 'line', 'scatter', 'hist' |
| figsize | Figure dimensions | (10, 6) |
| color | Colors for each column | ['red', 'blue'] |
| title | Chart title | 'My Chart' |
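As a brief illustration, the parameters in the table can be combined in a single plot() call. The sketch below reuses the cricket sample data and the example values from the table.

```python
import pandas as pd
import matplotlib.pyplot as plt

data = [["Australia", 2500, 85],
        ["Bangladesh", 1000, 70],
        ["England", 2000, 80],
        ["India", 3000, 90],
        ["Sri Lanka", 1500, 75]]
df = pd.DataFrame(data, columns=["Team", "Rank_Points", "Win_Percentage"])

# kind picks the chart type, figsize the size in inches,
# color assigns one color per plotted column, title sets the heading
ax = df.plot(x="Team", y=["Rank_Points", "Win_Percentage"],
             kind="bar", figsize=(10, 6),
             color=["red", "blue"], title="My Chart")
```

Call plt.show() afterwards to display the figure.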
Conclusion
Use DataFrame.plot() with column selection to visualize multiple data series. Choose appropriate chart types like bar graphs for comparisons or line plots for trends based on your data analysis needs.
