Create a grouped bar plot in Matplotlib


What is Matplotlib?

Matplotlib is a popular, open-source data visualization library for Python, widely used in science, engineering, and data science. It is known for its flexibility and wide range of customization options, which make it a good choice for building complex visualizations for research or data analysis. One commonly used chart type is the grouped bar chart, which places the bars for several groups side by side within each category so that differences between groups are easy to compare. In this tutorial, we will show you how to create a grouped bar chart in Matplotlib using a worked example with sample data.

What is the use and significance of a bar plot in data analysis?

A bar plot is a common type of visualization used in data analysis to summarize categorical data. In a bar plot, the categories are typically displayed on the x-axis, while the height of each bar represents a value for that category, most often the frequency or proportion of observations.

The significance of a bar plot depends on the context and the purpose of the analysis. Here are some examples of how bar plots can be useful −

Comparing frequencies or proportions − Bar plots can be used to compare the frequencies or proportions of observations in different categories. For example, a bar plot could be used to compare the number of patients with different types of medical conditions in a clinical trial, or the proportion of customers who prefer different brands of products in a market research study. The height of each bar represents the frequency or proportion of observations in each category, making it easy to compare them visually.
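As a minimal sketch of this kind of frequency comparison, the snippet below counts category occurrences and draws one bar per category. The survey responses and brand names are hypothetical and serve only as an illustration.

```python
from collections import Counter

import matplotlib.pyplot as plt

# Hypothetical survey responses: each entry is one customer's preferred brand
responses = ["Brand A", "Brand B", "Brand A", "Brand C", "Brand A", "Brand B"]

# Count how often each category occurs
counts = Counter(responses)
categories = sorted(counts)  # ['Brand A', 'Brand B', 'Brand C']
frequencies = [counts[c] for c in categories]

# Draw one bar per category; the bar height is the frequency
fig, ax = plt.subplots()
bars = ax.bar(categories, frequencies)
ax.set_xlabel("Brand")
ax.set_ylabel("Number of customers")
ax.set_title("Preferred brand (hypothetical data)")

# plt.show()  # uncomment to display the figure
```

Dividing each frequency by len(responses) before plotting would show proportions instead of counts.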

Displaying trends over time or other variables − Bar plots can also be used to display changes in the frequency or proportion of observations over time or other variables. For example, a bar plot could be used to display the number of customers who purchase different products in a store over different time periods. By using different colors or patterns for each time period, it's easy to see how the distribution of purchases changes over time.

Identifying outliers or unusual observations − Bar plots can also be used to identify outliers or unusual observations. If one category has a much higher or lower frequency or proportion than the other categories, it may be an indication of an unusual observation or an error in the data.

In terms of statistical significance, bar plots are typically used as a descriptive tool rather than a formal statistical test. However, they can still provide useful information about the distribution of categorical data and help identify patterns or trends that may be of interest. It's important to keep in mind that bar plots can be misleading if they are not constructed properly or if the data is not appropriate for this type of visualization. Therefore, it's always important to carefully consider the purpose of the analysis and choose the appropriate visualization tool accordingly.

Prerequisites

Before we dive into the task, a few things are expected to be installed on your system −

List of recommended settings −

  • pip install numpy matplotlib

  • It is expected that the user will have access to an editor or IDE such as VS Code, PyCharm, Atom, or Sublime Text.

  • Online Python environments such as Kaggle.com or Google Cloud Platform can also be used.

  • An up-to-date version of Python. This article was written using version 3.10.9.

  • Knowledge of the use of Jupyter notebook.

  • Knowledge and application of virtual environment would be beneficial but not required.

  • It is also expected that the person will have a good understanding of statistics and mathematics.
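Once the packages are installed, a quick check like the following can confirm that they import correctly and show which versions are in use. This is only a sanity check under the assumption that NumPy and Matplotlib were installed into the active environment; the printed versions will differ from system to system.

```python
import sys

import matplotlib
import numpy as np

# Print the interpreter and library versions found in the active environment
print("Python    :", sys.version.split()[0])
print("NumPy     :", np.__version__)
print("Matplotlib:", matplotlib.__version__)
```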

Steps required to accomplish the task

Let’s see the code samples and steps −

Example

import numpy as np
import matplotlib.pyplot as plt

# Define the data for the plot
data = {
   'Group 1': [20, 35, 30, 35, 27],
   'Group 2': [25, 32, 34, 20, 25],
   'Group 3': [12, 20, 22, 30, 15],
}

# Define the x-axis labels and the width of each bar
labels = ['Category 1', 'Category 2', 'Category 3', 'Category 4', 'Category 5']
bar_width = 0.2

# Create a numpy array of the x-axis positions for each group of bars
x_pos = np.arange(len(labels))

# Create a figure and axis object
fig, ax = plt.subplots()

# Loop through each group of bars and create a set of bars for each group
for i, (group, values) in enumerate(data.items()):
   # Calculate the x-axis position for the current group of bars
   pos = x_pos + (i * bar_width)
   # Create a set of bars for the current group
   ax.bar(pos, values, width=bar_width, label=group)

# Set the x-axis labels and tick positions
ax.set_xticks(x_pos + ((len(data) - 1) / 2) * bar_width)
ax.set_xticklabels(labels)

# Add a legend to the plot
ax.legend()

# Add axis labels and a title to the plot
ax.set_xlabel('Categories')
ax.set_ylabel('Values')
ax.set_title('Grouped Bar Plot')

# Show the plot
plt.show()

In this example, we have three groups of bars (Group 1, Group 2, and Group 3) and five categories (Category 1, Category 2, Category 3, Category 4, and Category 5). We define the data for the plot using a Python dictionary, where the keys represent the groups and the values are lists of values for each category. We also define the width of each bar and the labels for the x-axis.

We then create a NumPy array of base x-axis positions, one per category, using np.arange(), and a figure and axes object using plt.subplots(). Inside the loop, each group is shifted to the right by i * bar_width, so the bars belonging to one category sit side by side instead of overlapping, and ax.bar() draws the bars for that group. Because the bar centers for each category run from x_pos to x_pos + (len(data) - 1) * bar_width, we place the ticks at the midpoint, x_pos + ((len(data) - 1) / 2) * bar_width, with ax.set_xticks() and label them with ax.set_xticklabels(). We add a legend with ax.legend(), then axis labels and a title with ax.set_xlabel(), ax.set_ylabel(), and ax.set_title(). Finally, plt.show() displays the plot.
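The position arithmetic can be checked on its own, without drawing anything. The sketch below reuses the values from the example (five categories, three groups, a bar width of 0.2); the variable names n_categories, n_groups, offsets, and tick_pos are introduced here for illustration only.

```python
import numpy as np

n_categories = 5
n_groups = 3
bar_width = 0.2

# Base position of each category: 0, 1, 2, 3, 4
x_pos = np.arange(n_categories)

# Bar centers for each group: group i is shifted right by i * bar_width
offsets = [x_pos + i * bar_width for i in range(n_groups)]

# Tick positions: midpoint between the first and last bar center of each category
tick_pos = x_pos + ((n_groups - 1) / 2) * bar_width

for i, pos in enumerate(offsets):
    print(f"Group {i + 1} bar centers:", pos)
print("Tick positions:", tick_pos)
```

With three groups, the ticks coincide with the centers of the middle group's bars. Keeping n_groups * bar_width below 1 (here 0.6) leaves a gap between neighboring categories.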

This code should create a grouped bar plot with three groups of bars, five categories, and appropriate labels, legends, and axis titles. You can modify the data, labels, and other parameters to suit your specific needs.
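One way to make such modifications easier is to wrap the plotting steps in a small helper, sketched below. The function name grouped_bar and its cluster_width parameter are hypothetical and not part of Matplotlib. Unlike the example above, this variant derives the bar width from the number of groups and centers each cluster on the integer tick positions, so the ticks need no extra offset.

```python
import matplotlib.pyplot as plt
import numpy as np


def grouped_bar(ax, data, labels, cluster_width=0.8):
    """Draw one bar per group for every category, centered on integer ticks.

    `grouped_bar` and `cluster_width` are illustrative names, not Matplotlib API.
    """
    n_groups = len(data)
    bar_width = cluster_width / n_groups  # bars shrink as groups are added
    x_pos = np.arange(len(labels))
    containers = []
    for i, (group, values) in enumerate(data.items()):
        # Shift each group so the whole cluster is centered on x_pos
        offset = (i - (n_groups - 1) / 2) * bar_width
        containers.append(
            ax.bar(x_pos + offset, values, width=bar_width, label=group)
        )
    ax.set_xticks(x_pos)
    ax.set_xticklabels(labels)
    ax.legend()
    return containers


# Sample data in the same dictionary layout as the example above
data = {"Group 1": [20, 35, 30], "Group 2": [25, 32, 34]}
labels = ["Category 1", "Category 2", "Category 3"]

fig, ax = plt.subplots()
containers = grouped_bar(ax, data, labels)

# plt.show()  # uncomment to display the figure
```

Because the bar width is computed from cluster_width, adding or removing a group in the dictionary does not require adjusting any other parameter.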

Output

[Figure: the grouped bar plot produced by the code above, with three groups of bars across five categories.]

Conclusion

In this tutorial, we have shown step by step how to create a grouped bar chart in Matplotlib using sample data. We covered the basic structure of a grouped bar chart: computing the x-axis positions for each group, drawing the bars with ax.bar(), centering the tick labels under each category, and adding a legend, axis labels, and a title. By following these steps and adapting the data and parameters, you should be able to create your own customized grouped bar charts for research or data analysis.

Updated on: 20-Apr-2023
