Different Plot Types Using Pandas and Matplotlib
Pandas and Matplotlib are powerful Python libraries for data analysis and visualization. Pandas excels at data manipulation while Matplotlib provides comprehensive plotting capabilities. Together, they offer various plot types to visualize different aspects of your data.
Line Plot
Line plots are ideal for visualizing trends over time or across a continuous variable. Matplotlib's plot() function draws line segments connecting consecutive data points:
Syntax
import matplotlib.pyplot as plt
plt.plot(x, y)
plt.show()
Example
import matplotlib.pyplot as plt
import pandas as pd
# Create sample data
data = {"year": [1999, 2000, 2002, 2020, 2023], "sales": [34, 20, 19, 4, 25]}
df = pd.DataFrame(data)
print("Data:")
print(df)
# Create line plot
plt.figure(figsize=(8, 5))
plt.plot(df["year"], df["sales"], marker='o')
plt.title("Sales Over Years")
plt.xlabel("Year")
plt.ylabel("Sales")
plt.grid(True)
plt.show()
Output
Data:
   year  sales
0  1999     34
1  2000     20
2  2002     19
3  2020      4
4  2023     25
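The same line plot can also be drawn through pandas itself, since DataFrame.plot() is a wrapper around Matplotlib. The following is a minimal sketch rather than the only way to do it; the helper name plot_sales is hypothetical and introduced here only for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_sales(df):
    """Draw sales against year using pandas' plot() wrapper around Matplotlib."""
    # DataFrame.plot returns the Matplotlib Axes it drew on;
    # extra keyword arguments such as marker are passed through to Matplotlib
    ax = df.plot(x="year", y="sales", kind="line", marker="o",
                 figsize=(8, 5), title="Sales Over Years",
                 grid=True, legend=False)
    ax.set_xlabel("Year")
    ax.set_ylabel("Sales")
    return ax


df = pd.DataFrame({"year": [1999, 2000, 2002, 2020, 2023],
                   "sales": [34, 20, 19, 4, 25]})
ax = plot_sales(df)

if __name__ == "__main__":
    plt.show()
```

Because the call returns a regular Axes object, any further Matplotlib customization (labels, limits, annotations) can be applied to it afterwards.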
Scatter Plot
Scatter plots display relationships between two numerical variables, with each point representing an observation:
Example
import matplotlib.pyplot as plt
import pandas as pd
# Create sample data
data = {"height": [150, 160, 170, 180, 190], "weight": [50, 60, 70, 80, 90]}
df = pd.DataFrame(data)
print("Data:")
print(df)
# Create scatter plot
plt.figure(figsize=(8, 5))
plt.scatter(df["height"], df["weight"], alpha=0.7, s=100)
plt.title("Height vs Weight")
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.grid(True, alpha=0.3)
plt.show()
Output
Data:
   height  weight
0     150      50
1     160      60
2     170      70
3     180      80
4     190      90
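A scatter plot shows a relationship visually; a correlation coefficient puts a number on it. The sketch below reuses the height and weight sample data and, as an illustrative addition, computes the Pearson correlation with pandas' Series.corr() and displays it in the plot title. In this sample every weight equals height minus 100, so the relationship is perfectly linear.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"height": [150, 160, 170, 180, 190],
                   "weight": [50, 60, 70, 80, 90]})

# Series.corr computes the Pearson correlation coefficient by default
r = df["height"].corr(df["weight"])
print(f"Pearson r = {r:.2f}")

# Draw the scatter plot and report the coefficient in the title
fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(df["height"], df["weight"], alpha=0.7, s=100)
ax.set_title(f"Height vs Weight (r = {r:.2f})")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.grid(True, alpha=0.3)

if __name__ == "__main__":
    plt.show()
```

Real measurements rarely line up this neatly, so with actual data the coefficient would typically fall somewhere below 1.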
Histogram
Histograms show the frequency distribution of a single numerical variable by dividing data into bins:
Example
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Create sample data
np.random.seed(42)
scores = np.random.normal(75, 15, 100)  # 100 draws from a normal distribution (mean 75, std 15)
df = pd.DataFrame({"test_scores": scores})
print("Sample of data:")
print(df.head())
# Create histogram
plt.figure(figsize=(8, 5))
plt.hist(df["test_scores"], bins=15, alpha=0.7, color='skyblue', edgecolor='black')
plt.title("Distribution of Test Scores")
plt.xlabel("Test Scores")
plt.ylabel("Frequency")
plt.grid(True, alpha=0.3)
plt.show()
Output
Sample of data:
   test_scores
0    82.450712
1    72.926035
2    84.715328
3    97.845448
4    71.487699
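To see what "dividing data into bins" means in practice, it can help to inspect the values that hist() returns. The sketch below uses the same seeded sample as the example above and captures the per-bin counts and bin edges; the variable names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same seeded sample as in the histogram example
np.random.seed(42)
scores = np.random.normal(75, 15, 100)

fig, ax = plt.subplots(figsize=(8, 5))

# hist returns the count in each bin, the bin edges, and the drawn bars
counts, edges, patches = ax.hist(scores, bins=15,
                                 color="skyblue", edgecolor="black")

# 15 bins are delimited by 16 edges, and every observation lands in one bin
print("Number of bins:", len(counts))
print("Number of edges:", len(edges))
print("Observations counted:", int(counts.sum()))

if __name__ == "__main__":
    plt.show()
```

Changing the bins argument changes how coarse or fine the distribution appears, so it is worth trying a few values before drawing conclusions from the shape.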
Bar Chart
Bar charts compare different categories using rectangular bars with heights proportional to the values:
Example
import matplotlib.pyplot as plt
import pandas as pd
# Create sample data
data = {"product": ["A", "B", "C", "D", "E"], "revenue": [25000, 35000, 20000, 45000, 30000]}
df = pd.DataFrame(data)
print("Data:")
print(df)
# Create bar chart
plt.figure(figsize=(8, 5))
plt.bar(df["product"], df["revenue"], color=['red', 'green', 'blue', 'orange', 'purple'])
plt.title("Revenue by Product")
plt.xlabel("Product")
plt.ylabel("Revenue ($)")
plt.grid(True, alpha=0.3)
plt.show()
Output
Data:
  product  revenue
0       A    25000
1       B    35000
2       C    20000
3       D    45000
4       E    30000
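Category comparisons are often easier to read when the bars are ordered by value. As one possible variation on the example above, this sketch sorts the DataFrame with sort_values() before plotting and limits the grid to the y-axis; the variable name ranked is illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"product": ["A", "B", "C", "D", "E"],
                   "revenue": [25000, 35000, 20000, 45000, 30000]})

# Order products from highest to lowest revenue before plotting
ranked = df.sort_values("revenue", ascending=False)

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(ranked["product"], ranked["revenue"], color="steelblue")
ax.set_title("Revenue by Product (Sorted)")
ax.set_xlabel("Product")
ax.set_ylabel("Revenue ($)")
# Horizontal grid lines only, since only bar heights carry information
ax.grid(True, axis="y", alpha=0.3)

if __name__ == "__main__":
    plt.show()
```

Sorting only makes sense when the categories have no natural order of their own; for ordered categories such as months, the original order is usually the better choice.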
Comparison of Plot Types
| Plot Type | Best For | Data Requirements |
|---|---|---|
| Line Plot | Trends over time | Continuous x-axis |
| Scatter Plot | Relationships between variables | Two numerical variables |
| Histogram | Data distribution | Single numerical variable |
| Bar Chart | Comparing categories | Categorical and numerical |
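To compare the four plot types side by side, they can be placed in a single figure with plt.subplots(). The following is a rough sketch that reuses the sample data from the earlier examples; the 2x2 layout and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data reused from the individual examples
sales = pd.DataFrame({"year": [1999, 2000, 2002, 2020, 2023],
                      "sales": [34, 20, 19, 4, 25]})
people = pd.DataFrame({"height": [150, 160, 170, 180, 190],
                       "weight": [50, 60, 70, 80, 90]})
np.random.seed(42)
scores = np.random.normal(75, 15, 100)
products = pd.DataFrame({"product": ["A", "B", "C", "D", "E"],
                         "revenue": [25000, 35000, 20000, 45000, 30000]})

# One figure, four Axes arranged in a 2x2 grid
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(sales["year"], sales["sales"], marker="o")
axes[0, 0].set_title("Line Plot")

axes[0, 1].scatter(people["height"], people["weight"])
axes[0, 1].set_title("Scatter Plot")

axes[1, 0].hist(scores, bins=15, edgecolor="black")
axes[1, 0].set_title("Histogram")

axes[1, 1].bar(products["product"], products["revenue"])
axes[1, 1].set_title("Bar Chart")

# Prevent titles and tick labels from overlapping
fig.tight_layout()

if __name__ == "__main__":
    plt.show()
```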
Conclusion
Choose line plots for time series data, scatter plots for correlations, histograms for distributions, and bar charts for categorical comparisons. Pandas and Matplotlib together provide a complete toolkit for data visualization in Python.
