How to Make Boxplots with Data Points using Seaborn in Python?


The boxplot is a powerful tool for summarizing a dataset's distribution. It shows key statistics such as the median, the quartiles, and possible outliers. However, a traditional boxplot shows only these summary statistics, so it may not give a complete picture of the data.
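
To make these summary statistics concrete, here is a minimal sketch that computes them with NumPy for the sample values used in the examples later in this article. It assumes the common convention that points lying more than 1.5 times the interquartile range beyond the box are treated as outliers. The variable names are only illustrative.

```python
import numpy as np

# Sample values (the same numbers used in the examples later in this article)
data = np.array([10, 15, 20, 22, 25, 30, 32, 35, 40, 45, 50])

# Quartiles: the box spans q1 to q3 and the line inside it marks the median
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Assumed whisker rule: points beyond 1.5 * IQR from the box count as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"fences=({lower_fence}, {upper_fence}), outliers={outliers.tolist()}")
```

For this sample no value falls outside the fences, so a boxplot of it would show no outlier markers.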

In this article, we will learn how to make boxplots with data points using Seaborn in Python. Seaborn is a popular data visualization library built on Matplotlib that offers a high-level interface for drawing attractive statistical graphics. By combining Seaborn and Matplotlib, we can draw the individual data points on top of a boxplot, which allows a closer examination of the dataset.

Multiple Approaches

To make boxplots with data points using Seaborn in Python, we can follow two methods −

  • By overlaying Data Points on Boxplots.

  • Utilizing the Swarmplot with Boxplots.

Let us investigate both approaches −

Approach-1: By overlaying Data Points on Boxplots

In this method, we improve a standard boxplot by drawing the individual data points on top of it. Seeing the individual points alongside the summary statistics gives a better understanding of the dataset's distribution and makes outliers and patterns easier to spot.

Algorithm

The steps are as follows −

Step 1 − Import seaborn and matplotlib.pyplot.

Step 2 − Prepare or load the dataset.

Step 3 − Use the seaborn.boxplot() function to create a boxplot, passing the dataset and any desired settings.

Step 4 − Retrieve the Axes object that holds the boxplot.

Step 5 − Use the scatter() method to plot the individual data points over the box.

Step 6 − Adjust the appearance of the boxplot and the data points as necessary.

Step 7 − Call matplotlib.pyplot.show() to display the plot.

Example

# Import the required modules
import seaborn as sns
import matplotlib.pyplot as plt

# Create the dataset
data = [10, 15, 20, 22, 25, 30, 32, 35, 40, 45, 50]

# Construct a boxplot (a single box, drawn at x = 0)
sns.boxplot(data=data)

# Retrieve the Axes object that holds the boxplot
ax = plt.gca()

# Plot every data point at the box's x-position (x = 0);
# zorder=3 keeps the points in front of the filled box
ax.scatter([0] * len(data), data, color='red', alpha=0.5, zorder=3)

# Customize appearance
ax.set_xticklabels([])  # Hide x-axis labels (optional)
ax.set_xlabel('Data')
ax.set_ylabel('Values')

# Display the plot
plt.show()

Output
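
When all points are drawn at the same x-position, similar values can hide one another. One possible variation, sketched below, swaps the manual scatter call for seaborn.stripplot(), which spreads the points with a small random horizontal jitter. The jitter width, the colors, and the choice to hide the boxplot's own outlier markers with showfliers=False are assumptions made for illustration, not part of the original example.

```python
import matplotlib.pyplot as plt
import seaborn as sns

data = [10, 15, 20, 22, 25, 30, 32, 35, 40, 45, 50]

fig, ax = plt.subplots()

# Draw the box first; showfliers=False avoids drawing outliers twice,
# because the strip plot below already shows every point
sns.boxplot(y=data, color="lightblue", showfliers=False, ax=ax)

# Overlay the raw points with a little random horizontal jitter
sns.stripplot(y=data, color="red", alpha=0.5, jitter=0.1, ax=ax)

ax.set_ylabel("Values")
```

Call plt.show() afterwards to display the figure. Because the jitter is random, the horizontal positions of the points change from run to run.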

Approach-2: Utilizing the Swarmplot with Boxplots

In this method, we combine a swarmplot with a boxplot. A swarmplot shifts points sideways so that they do not overlap, which keeps every observation visible even when several share the same value. Overlaying it on a boxplot shows the individual data points and the summary statistics at the same time. This is most useful when overlapping points would otherwise mask patterns in the data. Note that swarmplots work best for small to moderately sized datasets, because the layout becomes crowded and slow to compute as the number of points grows.
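
The non-overlapping layout can be checked directly. The sketch below draws a swarmplot of a few hypothetical tied values, invented only for this illustration, and then reads back the x-positions that Seaborn assigned to the markers. It assumes the swarm markers are the first collection on the Axes, which holds when nothing else has been drawn on it.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical values with many ties, chosen only to make overlap visible
tied = [20, 20, 20, 20, 30, 30, 30, 40]

fig, ax = plt.subplots()
sns.swarmplot(y=tied, color="grey", ax=ax)

# Force a render so the swarm layout is final before reading positions
# (harmless if seaborn has already drawn the figure)
fig.canvas.draw()

# Assumption: the swarm markers are the first collection on this Axes
x_positions = ax.collections[0].get_offsets()[:, 0]
print(sorted(round(float(x), 3) for x in x_positions))
```

The tied values end up at different x-positions, which is what separates a swarmplot from a plain scatter of points at a single x-position.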

Algorithm

The steps are as follows −

Step 1 − Import seaborn and matplotlib.pyplot.

Step 2 − Create or load the dataset.

Step 3 − Use the seaborn.swarmplot() function to draw a swarmplot, passing the dataset and any required settings.

Step 4 − Adjust the swarmplot's appearance as needed.

Step 5 − Use the seaborn.boxplot() function to overlay a boxplot on the swarmplot.

Step 6 − Adjust the boxplot's appearance as needed.

Step 7 − Call matplotlib.pyplot.show() to display the plot.

Example

# Import the required modules
import seaborn as sns
import matplotlib.pyplot as plt

# Create the dataset
data = [10, 15, 20, 22, 25, 30, 32, 35, 40, 45, 50]

# Build a swarmplot
sns.swarmplot(data=data, color='grey')

# Label the axes
plt.xlabel('Data')
plt.ylabel('Values')

# Overlay a narrow white boxplot on the same axes
sns.boxplot(data=data, width=0.2, color='white')

# Set the y-axis range
plt.ylim(0, 60)

# Show the plot
plt.show()

Output
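
The same pattern extends to several groups. The following is a minimal sketch, assuming a pandas DataFrame with a categorical column and a numeric column. The column names group and value and the numbers themselves are hypothetical and serve only as an illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements for two groups, invented for illustration
df = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 6,
    "value": [10, 15, 20, 22, 25, 30, 32, 35, 40, 45, 50, 55],
})

fig, ax = plt.subplots()

# One box per group; a white fill keeps the points readable
sns.boxplot(data=df, x="group", y="value", color="white", width=0.4, ax=ax)

# Swarm of the raw observations on the same axes and categories
sns.swarmplot(data=df, x="group", y="value", color="grey", ax=ax)

ax.set_xlabel("Group")
ax.set_ylabel("Values")
```

Call plt.show() afterwards to display the figure. Passing the same x and y columns to both functions keeps each swarm aligned with its box.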

Conclusion

In this article, we looked at two methods for making boxplots with overlaid data points in Python using the Seaborn package. The first approach draws the individual data points on top of a boxplot with Matplotlib's scatter function. The second approach combines a swarmplot with a boxplot, which keeps points visible when they would otherwise overlap. In both cases, showing the individual data points gives a more in-depth view of the dataset's distribution than the summary statistics alone. Because Seaborn is built on Matplotlib, the plots can be customized extensively to meet your requirements. These methods help you gain insight into your data and present your findings clearly.

Updated on: 28-Jul-2023
