Boxplot with variable length data in Matplotlib

To make a boxplot with variable length data in Matplotlib, we can take the following steps:

  • Set the figure size and adjust the padding between and around the subplots.
  • Make a list of data points with different lengths.
  • Make a box and whisker plot using boxplot() method.
  • To display the figure, use show() method.

Basic Example

Here's how to create a boxplot with datasets of different lengths:

import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True

# Three datasets with 5, 4, and 3 values respectively
data = [[2, 4, 1, 3, 5], [0, 4, 3, 2], [0, 0, 1]]

plt.boxplot(data)
plt.show()

Variable Length Data Example

boxplot() accepts a list of sequences and draws one box per sequence, so the datasets do not need to contain the same number of values:

import matplotlib.pyplot as plt

# Create datasets with different lengths
data1 = [1, 2, 3, 4, 5, 6, 7, 8]  # 8 values
data2 = [2, 4, 6, 8, 10]          # 5 values  
data3 = [1, 3, 5]                 # 3 values
data4 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]  # 10 values

variable_data = [data1, data2, data3, data4]

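plt.figure(figsize=(8, 5))
# Note: Matplotlib 3.9 renamed the labels parameter to tick_labels;
# use tick_labels=... on newer versions to avoid a deprecation warning
plt.boxplot(variable_data, labels=['Dataset 1', 'Dataset 2', 'Dataset 3', 'Dataset 4'])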
plt.title('Boxplot with Variable Length Data')
plt.ylabel('Values')
plt.show()
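
To see that each box is computed from its own dataset only, the summary statistics can be inspected without drawing anything. The following is a minimal sketch that assumes the helper matplotlib.cbook.boxplot_stats is available in your Matplotlib version; the dictionary name datasets and the summary structure are illustrative choices, not part of the original example.

```python
from matplotlib import cbook

# Same four datasets as in the example above, keyed by an illustrative name
datasets = {
    "Dataset 1": [1, 2, 3, 4, 5, 6, 7, 8],
    "Dataset 2": [2, 4, 6, 8, 10],
    "Dataset 3": [1, 3, 5],
    "Dataset 4": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
}

# boxplot_stats returns one dict of statistics per input sequence
stats = cbook.boxplot_stats(list(datasets.values()))

summary = {}
for (name, values), s in zip(datasets.items(), stats):
    summary[name] = {
        "n": len(values),
        "q1": float(s["q1"]),
        "median": float(s["med"]),
        "q3": float(s["q3"]),
    }
    print(name, summary[name])
```

Each entry depends only on the values in its own list, so adding or removing values in one dataset leaves the other boxes unchanged.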

Customizing the Boxplot

You can enhance the appearance with colors and styling:

import matplotlib.pyplot as plt
import numpy as np

# Generate random data with different lengths
np.random.seed(42)
data1 = np.random.normal(100, 10, 50)    # 50 values
data2 = np.random.normal(80, 15, 30)     # 30 values
data3 = np.random.normal(90, 12, 75)     # 75 values

variable_data = [data1, data2, data3]

plt.figure(figsize=(8, 6))
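# patch_artist=True draws each box as a filled patch so it can be colored.
# Note: Matplotlib 3.9 renamed the labels parameter to tick_labels.
box_plot = plt.boxplot(variable_data,
                       labels=['Group A (n=50)', 'Group B (n=30)', 'Group C (n=75)'],
                       patch_artist=True)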

# Customize colors
colors = ['lightblue', 'lightgreen', 'lightcoral']
for patch, color in zip(box_plot['boxes'], colors):
    patch.set_facecolor(color)

plt.title('Customized Boxplot with Variable Length Data')
plt.ylabel('Values')
plt.grid(True, alpha=0.3)
plt.show()
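
The sample sizes in the tick labels above are typed by hand, which can drift out of sync if the data changes. As a hedged alternative, the sketch below derives each label from len() of its dataset and sets the tick text with set_xticklabels after plotting. The group names, the groups dictionary, and the use of np.random.default_rng are assumptions made for illustration, so the random values differ from the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative groups with different sample sizes
rng = np.random.default_rng(42)
groups = {
    "Group A": rng.normal(100, 10, 50),
    "Group B": rng.normal(80, 15, 30),
    "Group C": rng.normal(90, 12, 75),
}

# Build each tick label from the actual length of its dataset
tick_text = [f"{name} (n={len(values)})" for name, values in groups.items()]

fig, ax = plt.subplots(figsize=(8, 6))
ax.boxplot(list(groups.values()))

# boxplot places one tick per box, so the labels map one-to-one
ax.set_xticklabels(tick_text)
ax.set_ylabel("Values")
plt.show()
```

Because the labels are computed from the data, they stay correct when a dataset grows or shrinks.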

Key Points

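  • Automatic handling: boxplot() accepts a list of sequences of different lengths and draws one box per sequence
  • Independent statistics: Each box shows the median and quartiles of its own dataset only; with very small samples (such as 3 values) these are coarse estimates
  • Labels: Use the labels parameter (renamed to tick_labels in Matplotlib 3.9) to identify each dataset
  • Customization: Pass patch_artist=True so the boxes are drawn as filled patches, then set each box's face color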

Conclusion

Matplotlib's boxplot() method handles variable length data by calculating the statistics for each dataset independently. This makes it well suited to comparing groups with different sample sizes, although boxes drawn from very few values should be interpreted with caution.

Updated on: 2026-03-25T23:21:01+05:30
