How to plot MFCC in Python using Matplotlib?

Mel-frequency cepstral coefficients (MFCCs) are widely used features in audio processing and speech recognition. The python_speech_features library extracts them from an audio signal, and Matplotlib can display the result as a heatmap.

What are MFCCs?

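MFCCs describe the shape of a signal's short-term spectral envelope on the mel scale, which spaces frequencies roughly the way human hearing perceives them. Each short frame of audio is summarized by a small number of coefficients, which makes MFCCs common input features for speech recognition and audio analysis tasks.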
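
To make the "mel" part concrete, here is a minimal sketch of one widely used Hz-to-mel mapping, m = 2595 * log10(1 + f / 700). The function names hz_to_mel and mel_to_hz are illustrative choices for this example only.

```python
import math

def hz_to_mel(hz):
    """Convert a frequency in Hz to mels using a common formula."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# Equal steps in Hz cover fewer and fewer mels as frequency rises
for f in (100, 1000, 4000, 8000):
    print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

The mapping is close to linear at low frequencies and logarithmic at high ones, so a mel-spaced filterbank gives more resolution to the low end of the spectrum.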

Installing Required Libraries

First, install the necessary packages:

pip install python_speech_features matplotlib scipy numpy

Step-by-Step Implementation

Here's how to extract and plot MFCC features from an audio file:

from python_speech_features import mfcc
import scipy.io.wavfile as wav
import matplotlib.pyplot as plt
import numpy as np

# Set figure size and layout
plt.rcParams["figure.figsize"] = [10, 6]
plt.rcParams["figure.autolayout"] = True

# Read the audio file
(sample_rate, audio_signal) = wav.read("audio_file.wav")

# Extract MFCC features
mfcc_features = mfcc(audio_signal, sample_rate)

# Create the plot
fig, ax = plt.subplots()

# Transpose the MFCC data for proper visualization
mfcc_features = np.swapaxes(mfcc_features, 0, 1)

# Display MFCC as a heatmap; aspect='auto' stretches the few coefficient rows to fill the axes
cax = ax.imshow(mfcc_features, interpolation='nearest', cmap='viridis', origin='lower', aspect='auto')

# Add labels and title
ax.set_xlabel('Time Frames')
ax.set_ylabel('MFCC Coefficients')
ax.set_title('MFCC Features Visualization')

# Add colorbar
plt.colorbar(cax)

plt.show()
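
Real recordings often differ from the mono, 16 kHz case, so the input may need some preparation before the mfcc call above. As a hedged sketch: scipy.io.wavfile.read returns a 2-D array of shape (samples, channels) for multi-channel files, and one simple option is to average the channels. At higher sample rates a 25 ms window holds more than 512 samples, so it can help to pass an explicit nfft at least as large as the window. The helpers to_mono and fft_size_for are hypothetical names used only for illustration.

```python
import numpy as np

def to_mono(audio_signal):
    """Average the channels if the signal is 2-D (samples, channels)."""
    audio_signal = np.asarray(audio_signal, dtype=np.float64)
    if audio_signal.ndim == 2:
        return audio_signal.mean(axis=1)
    return audio_signal

def fft_size_for(sample_rate, winlen=0.025):
    """Smallest power of two that covers one analysis window."""
    window_samples = int(round(winlen * sample_rate))
    nfft = 1
    while nfft < window_samples:
        nfft *= 2
    return nfft

# Example with a tiny fake stereo signal (3 samples, 2 channels)
stereo = np.array([[1, 3], [2, 4], [0, 2]])
print(to_mono(stereo))
print(fft_size_for(16000), fft_size_for(44100))
```

With these helpers, the extraction step could be written as mfcc(to_mono(audio_signal), sample_rate, nfft=fft_size_for(sample_rate)).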

Creating Sample Audio for Testing

If you don't have an audio file, you can create a synthetic signal:

import numpy as np
import matplotlib.pyplot as plt
from python_speech_features import mfcc

# Create synthetic audio signal (sine wave)
sample_rate = 16000  # Hz
duration = 2.0  # seconds
# endpoint=False keeps the sample spacing at exactly 1 / sample_rate
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)

# Mix of different frequencies to simulate speech-like signal
frequencies = [440, 880, 1320]  # Hz
audio_signal = np.zeros_like(t)

for freq in frequencies:
    audio_signal += np.sin(2 * np.pi * freq * t) * np.exp(-t/2)

# Add some noise
audio_signal += 0.1 * np.random.randn(len(t))

# Extract MFCC features
mfcc_features = mfcc(audio_signal, sample_rate)

# Plot the MFCC
plt.figure(figsize=(12, 8))

# Subplot 1: Original signal
plt.subplot(2, 1, 1)
plt.plot(t[:1000], audio_signal[:1000])
plt.title('Original Audio Signal (first 1000 samples)')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')

# Subplot 2: MFCC features
plt.subplot(2, 1, 2)
mfcc_transposed = np.swapaxes(mfcc_features, 0, 1)
plt.imshow(mfcc_transposed, interpolation='nearest', cmap='viridis', origin='lower', aspect='auto')
plt.title('MFCC Features')
plt.xlabel('Time Frames')
plt.ylabel('MFCC Coefficients')
plt.colorbar()

plt.tight_layout()
plt.show()

[Displays a two-panel plot showing the original audio signal and its corresponding MFCC visualization]

Understanding the Visualization

In the MFCC plot:

X-axis: Time frames (short, overlapping windows of audio)
Y-axis: The 13 cepstral coefficients (the default value of numcep)
Colors: Coefficient values
Lower coefficients: Capture the overall spectral shape
Higher coefficients: Capture finer spectral detail
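
Because each column of the plot is one frame, frame indices can be converted to seconds by multiplying by the step between windows. This is a minimal sketch, assuming the default winstep of 0.01 s; frame_start_times is an illustrative helper, not a library function.

```python
import numpy as np

def frame_start_times(n_frames, winstep=0.01):
    """Start time in seconds of each analysis frame."""
    return np.arange(n_frames) * winstep

times = frame_start_times(199)
print(times[:3], times[-1])
```

Passing extent=[0, n_frames * winstep, 0, 13] to imshow is one way to label the x-axis in seconds instead of frame indices.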

Key Parameters

Parameter	Description	Default
`numcep`	Number of cepstral coefficients	13
`nfilt`	Number of filters in filterbank	26
`winlen`	Window length in seconds	0.025
`winstep`	Step between windows in seconds	0.01
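
These parameters determine the shape of the array that mfcc returns. The sketch below estimates that shape under stated assumptions: frames start every winstep seconds, the last partial frame is padded, and the number of coefficients cannot exceed nfilt. expected_mfcc_shape is a hypothetical helper, and the exact frame count produced by a given library version may differ slightly.

```python
import math

def expected_mfcc_shape(n_samples, sample_rate, winlen=0.025, winstep=0.01,
                        numcep=13, nfilt=26):
    """Estimate (n_frames, n_coefficients) for a signal of n_samples."""
    # Window length and step in samples, rounded half up
    frame_len = math.floor(winlen * sample_rate + 0.5)
    frame_step = math.floor(winstep * sample_rate + 0.5)
    if n_samples <= frame_len:
        n_frames = 1
    else:
        n_frames = 1 + math.ceil((n_samples - frame_len) / frame_step)
    return n_frames, min(numcep, nfilt)

# Two seconds of audio at 16 kHz with the default parameters
print(expected_mfcc_shape(32000, 16000))
# A larger step halves the number of frames; more coefficients add rows to the plot
print(expected_mfcc_shape(32000, 16000, winstep=0.02, numcep=20, nfilt=40))
```

A smaller winstep gives a finer time axis at the cost of more frames, while numcep sets how many rows the heatmap has.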

Conclusion

MFCCs provide a compact representation of audio signals that is well suited to machine learning applications. Plotting them as a heatmap shows how the spectral characteristics of your audio evolve over time.

Rishikesh Kumar Rishi

Updated on: 2026-03-25T23:14:16+05:30
