How to save a Librosa spectrogram plot as a specific sized image?

Librosa is a Python package for analyzing audio and music files, and it is commonly used to build music information retrieval systems. This article shows how to save a Librosa spectrogram plot as an image of a specific size.

Understanding Spectrogram Parameters

Before creating the spectrogram, we need to understand the key parameters that control the output image dimensions:

  • hl (hop_length) − Number of samples per time-step in spectrogram

  • hi (height) − Height of the output image (number of mel bins)

  • wi (width) − Width of the output image (time frames)
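The relationship between these parameters and the shape of the spectrogram array can be checked with a little arithmetic. The sketch below assumes librosa's default centered framing (center=True), under which a signal of n samples yields 1 + n // hop_length frames. The helper name expected_frames is hypothetical and only for illustration.

```python
def expected_frames(n_samples: int, hop_length: int) -> int:
    """Number of spectrogram frames for a centered STFT (assumes center=True)."""
    return 1 + n_samples // hop_length


hl = 512  # hop_length: samples between successive frames
hi = 128  # n_mels: number of rows in the spectrogram
wi = 384  # desired number of columns (time frames)

# A window of wi * hl samples gives one frame more than wi
frames_full = expected_frames(wi * hl, hl)
print((hi, frames_full))  # (128, 385)

# A window of (wi - 1) * hl samples gives exactly wi frames
frames_exact = expected_frames((wi - 1) * hl, hl)
print((hi, frames_exact))  # (128, 384)
```

If the audio clip is shorter than wi * hl samples, the spectrogram has fewer columns than wi, so it is worth checking S.shape before relying on a fixed width.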

Creating and Saving a Spectrogram

Here's how to create a mel-scaled spectrogram and save it as an image of a specific size:

import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Set figure size for the output
plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True

# Create figure and subplot
fig, ax = plt.subplots()

# Define image dimensions
hl = 512  # number of samples per time-step in spectrogram
hi = 128  # Height of image (mel bins)
wi = 384  # Width of image (time frames)

# Load demo audio track
y, sr = librosa.load(librosa.ex('trumpet'))

# Take the first wi * hl samples (fewer if the clip is shorter)
window = y[0:wi*hl]

# Compute mel-scaled spectrogram
S = librosa.feature.melspectrogram(y=window, sr=sr, n_mels=hi, fmax=8000, hop_length=hl)

# Convert power spectrogram to decibel units
S_dB = librosa.power_to_db(S, ref=np.max)

# Display the spectrogram
img = librosa.display.specshow(S_dB, x_axis='time', y_axis='mel', sr=sr, fmax=8000, ax=ax)

# Add colorbar for better visualization
fig.colorbar(img, ax=ax, format='%+2.0f dB')

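# Save the spectrogram as an image: 7.5 x 3.5 inches at 150 DPI gives 1125 x 525 pixels
# (bbox_inches='tight' is left out because it crops the figure and changes the size)
plt.savefig("spectrogram_output.png", dpi=150)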
plt.show()
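When the goal is an image with exactly hi x wi pixels and no axes, labels, or colorbar, one option is to write the array directly with plt.imsave, which stores one pixel per array element. The sketch below is a minimal illustration and not the only approach: it uses random data as a stand-in for S_dB so that it runs without downloading audio, and the file name and colormap are arbitrary choices.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

hi, wi = 128, 384  # target image height and width in pixels

# Stand-in for S_dB: random values in the dB range produced by power_to_db
rng = np.random.default_rng(0)
S_dB = rng.uniform(-80.0, 0.0, size=(hi, wi + 1))

# Keep exactly wi frames, then write one pixel per array element.
# origin="lower" puts the lowest frequency bin at the bottom of the image.
S_img = S_dB[:, :wi]
plt.imsave("spectrogram_pixels.png", S_img, cmap="magma", origin="lower")

# Read the file back to confirm its pixel dimensions (height, width)
saved = plt.imread("spectrogram_pixels.png")
print(saved.shape[:2])  # (128, 384)
```

With real audio, S_dB from librosa.power_to_db can be passed in place of the random array, provided it has at least wi columns.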

Controlling Output Image Size

To control the exact output dimensions, you can modify the figure size and DPI settings:

import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# Load audio data
y, sr = librosa.load(librosa.ex('trumpet'))

# Define spectrogram parameters
hl = 512
hi = 128
wi = 384
window = y[0:wi*hl]

# Create spectrogram
S = librosa.feature.melspectrogram(y=window, sr=sr, n_mels=hi, fmax=8000, hop_length=hl)
S_dB = librosa.power_to_db(S, ref=np.max)

# Create figure with specific size (width, height in inches)
fig, ax = plt.subplots(figsize=(10, 6))

# Display spectrogram
librosa.display.specshow(S_dB, x_axis='time', y_axis='mel', sr=sr, fmax=8000, ax=ax)

# Save with a specific DPI for exact pixel dimensions
# Final image size = figsize * DPI = 1000 x 600 pixels
# (bbox_inches='tight' is omitted because it crops the figure and changes the size)
plt.savefig("custom_size_spectrogram.png", dpi=100,
            facecolor='white', edgecolor='none')
plt.show()
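Since the saved image measures figsize times DPI in pixels, the figure size for a given pixel target can be worked out by dividing. The helper below is a hypothetical convenience function, not part of Matplotlib or Librosa. The pixel size only holds if savefig is called without bbox_inches='tight', which re-crops the figure.

```python
def figsize_for_pixels(width_px, height_px, dpi):
    """Return (width, height) in inches so that figsize * dpi equals the target pixels."""
    return (width_px / dpi, height_px / dpi)


figsize = figsize_for_pixels(1000, 600, dpi=100)
print(figsize)  # (10.0, 6.0)

# Intended use (not executed here):
# fig, ax = plt.subplots(figsize=figsize)
# fig.savefig("out.png", dpi=100)
```

Choosing a DPI that divides the pixel targets evenly avoids fractional pixel counts, which could otherwise produce an image one pixel smaller than intended.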

Key Parameters

Parameter    Purpose                        Effect on Output
figsize      Figure dimensions in inches    Controls aspect ratio and, with dpi, pixel size
dpi          Dots per inch                  Controls final pixel resolution
n_mels       Number of mel frequency bins   Controls frequency resolution
hop_length   Samples between frames         Controls time resolution

Conclusion

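Use the figsize argument when creating the figure and the dpi argument of savefig() to control the output image dimensions. The pixel size is figsize multiplied by dpi, provided bbox_inches='tight' is not used. The n_mels and hop_length parameters control the spectrogram's frequency and time resolution.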

Updated on: 2026-03-25T23:54:28+05:30
