How to plot MFCC in Python using Matplotlib?

MFCCs (Mel-Frequency Cepstral Coefficients) are widely used features in audio processing and speech recognition. The python_speech_features library extracts them from an audio signal, and Matplotlib can display the result as a heatmap of coefficients over time.

What are MFCCs?

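MFCCs describe the shape of the spectral envelope of an audio signal. The signal is split into short frames; for each frame, the energies in a bank of mel-spaced filters are computed, their logarithm is taken, and a discrete cosine transform is applied. The result is a small set of coefficients per frame that summarizes the characteristics most useful for speech recognition and audio analysis tasks.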
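To make the idea concrete, here is a minimal NumPy-only sketch of one common way to compute MFCC-like features: framing, power spectrum, mel filterbank, logarithm, and DCT. It is an illustration under stated assumptions, not the python_speech_features implementation: the helper names (`simple_mfcc`, `mel_filterbank`, `dct_ii_ortho`) are hypothetical, a Hamming window is assumed, and pre-emphasis and liftering are left out, so the values will differ from the library's output.

```python
import numpy as np


def hz_to_mel(hz):
    # A widely used mel-scale formula
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):       # rising edge of the triangle
            fb[i, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            fb[i, k] = (right - k) / (right - center)
    return fb


def dct_ii_ortho(x, n_out):
    """Orthonormal DCT-II along axis 1, keeping the first n_out coefficients."""
    n = x.shape[1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n)) * np.sqrt(2.0 / n)
    basis[0] /= np.sqrt(2.0)
    return x @ basis.T


def simple_mfcc(signal, sample_rate, win_len=0.025, win_step=0.01,
                n_filters=26, n_ceps=13, n_fft=512):
    frame_len = int(round(win_len * sample_rate))
    frame_step = int(round(win_step * sample_rate))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis window")
    # Overlapping frames; a trailing partial frame is dropped in this sketch
    n_frames = 1 + (len(signal) - frame_len) // frame_step
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (assumes frame_len <= n_fft, so frames are zero-padded)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, np.finfo(float).eps))
    return dct_ii_ortho(log_energies, n_ceps)


# One second of a 440 Hz tone as a toy input
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
features = simple_mfcc(tone, sample_rate)
print(features.shape)  # (number of frames, number of coefficients)
```

Each row of `features` corresponds to one frame and each column to one cepstral coefficient, the same layout that `mfcc` from python_speech_features returns.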

Installing Required Libraries

First, install the necessary packages:

pip install python_speech_features matplotlib scipy numpy
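After installing, a quick check like the following can confirm that the packages are importable in the current environment. This is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["python_speech_features", "matplotlib", "scipy", "numpy"]
missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```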

Step-by-Step Implementation

Here's how to extract and plot MFCC features from an audio file:

from python_speech_features import mfcc
import scipy.io.wavfile as wav
import matplotlib.pyplot as plt
import numpy as np

# Set figure size and layout
plt.rcParams["figure.figsize"] = [10, 6]
plt.rcParams["figure.autolayout"] = True

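# Read the audio file (replace "audio_file.wav" with the path to your own WAV file).
# mfcc expects a single-channel (mono) signal, so a stereo file must be reduced to one channel first.
(sample_rate, audio_signal) = wav.read("audio_file.wav")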

# Extract MFCC features
mfcc_features = mfcc(audio_signal, sample_rate)

# Create the plot
fig, ax = plt.subplots()

# Transpose the MFCC data for proper visualization
mfcc_features = np.swapaxes(mfcc_features, 0, 1)

# Display MFCC as a heatmap; aspect='auto' stretches the 13 rows to fill the axes
cax = ax.imshow(mfcc_features, interpolation='nearest', cmap='viridis', origin='lower', aspect='auto')

# Add labels and title
ax.set_xlabel('Time Frames')
ax.set_ylabel('MFCC Coefficients')
ax.set_title('MFCC Features Visualization')

# Add colorbar
plt.colorbar(cax)

plt.show()
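WAV files are often stereo and stored as integer PCM, in which case `wav.read` returns a two-dimensional integer array. The sketch below shows one possible way to prepare such data before feature extraction. The helper name `to_mono_float` is hypothetical, averaging the channels is just one common choice, and unsigned 8-bit data is not rescaled here.

```python
import numpy as np


def to_mono_float(audio):
    """Convert WAV samples to a 1-D float array, roughly in [-1, 1] for signed PCM."""
    audio = np.asarray(audio)
    if np.issubdtype(audio.dtype, np.signedinteger):
        # Scale signed integer PCM by the magnitude of the dtype's minimum (32768 for int16)
        audio = audio.astype(np.float64) / -float(np.iinfo(audio.dtype).min)
    else:
        # Float data (and unsigned data, which this sketch does not rescale) is only cast
        audio = audio.astype(np.float64)
    if audio.ndim == 2:
        # Shape is (samples, channels): average the channels into one
        audio = audio.mean(axis=1)
    return audio


# Toy stereo int16 data: two samples, two channels
stereo = np.array([[32767, -32768], [0, 16384]], dtype=np.int16)
mono = to_mono_float(stereo)
print(mono.shape, mono.dtype)
```

The returned array can then be passed to `mfcc` in place of the raw `audio_signal`.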

Creating Sample Audio for Testing

If you don't have an audio file, you can create a synthetic signal:

import numpy as np
import matplotlib.pyplot as plt
from python_speech_features import mfcc

# Create a synthetic audio signal (a decaying mix of sine waves)
sample_rate = 16000  # Hz
duration = 2.0  # seconds
# Sample times spaced exactly 1/sample_rate apart
t = np.arange(int(sample_rate * duration)) / sample_rate

# A fundamental and two harmonics, as a rough stand-in for a voiced sound
frequencies = [440, 880, 1320]  # Hz
audio_signal = np.zeros_like(t)

for freq in frequencies:
    audio_signal += np.sin(2 * np.pi * freq * t) * np.exp(-t/2)

# Add some noise
audio_signal += 0.1 * np.random.randn(len(t))

# Extract MFCC features
mfcc_features = mfcc(audio_signal, sample_rate)

# Plot the MFCC
plt.figure(figsize=(12, 8))

# Subplot 1: Original signal
plt.subplot(2, 1, 1)
plt.plot(t[:1000], audio_signal[:1000])
plt.title('Original Audio Signal (first 1000 samples)')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')

# Subplot 2: MFCC features
plt.subplot(2, 1, 2)
mfcc_transposed = np.swapaxes(mfcc_features, 0, 1)
plt.imshow(mfcc_transposed, interpolation='nearest', cmap='viridis', origin='lower', aspect='auto')
plt.title('MFCC Features')
plt.xlabel('Time Frames')
plt.ylabel('MFCC Coefficients')
plt.colorbar()

plt.tight_layout()
plt.show()
[Displays a two-panel plot showing the original audio signal and its corresponding MFCC visualization]
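A synthetic signal can also be saved as a WAV file so that it can be loaded with `wav.read` like any other recording. The sketch below uses Python's standard `wave` module together with NumPy. The helper name `write_wav_int16` is hypothetical, and it assumes a mono signal that is peak-normalized and stored as 16-bit PCM; the temporary-directory output path is only for illustration.

```python
import os
import tempfile
import wave

import numpy as np


def write_wav_int16(path, sample_rate, signal):
    """Write a mono float signal to a 16-bit PCM WAV file after peak normalization."""
    signal = np.asarray(signal, dtype=np.float64)
    peak = np.max(np.abs(signal))
    if peak > 0:
        signal = signal / peak
    pcm = np.round(signal * 32767).astype("<i2")  # little-endian int16
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())


# One second of a 440 Hz tone, written to the system temp directory
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
out_path = os.path.join(tempfile.gettempdir(), "audio_file.wav")
write_wav_int16(out_path, sample_rate, tone)
print("Wrote", out_path)
```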

Understanding the Visualization

In the MFCC plot:

  • X-axis: Represents time frames (windows of audio)
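  • Y-axis: Represents the 13 cepstral coefficients (the default numcep); with the library's default appendEnergy=True, coefficient 0 is replaced by the log of the total frame energy
  • Colors: Represent the coefficient values, as shown on the colorbar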
  • Lower coefficients: Capture overall spectral shape
  • Higher coefficients: Capture finer spectral details
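The x-axis in the plots above is a frame index. To read it in seconds instead, frame indices can be converted using the step between windows. The following is a small sketch; the helper names are hypothetical, and it assumes that frame i starts at i × winstep seconds with the default step of 0.01 s.

```python
import numpy as np


def frame_start_times(n_frames, winstep=0.01):
    """Start time in seconds of each frame, assuming frame i begins at i * winstep."""
    return np.arange(n_frames) * winstep


def heatmap_extent(n_frames, n_coeffs, winstep=0.01):
    """[left, right, bottom, top] bounds so a heatmap's x-axis is in seconds."""
    return [0.0, n_frames * winstep, 0.0, float(n_coeffs)]


times = frame_start_times(199)
extent = heatmap_extent(199, 13)
print(times[:3], extent)
```

Passing `extent=heatmap_extent(n_frames, n_coeffs)` to `imshow`, together with `aspect='auto'`, labels the horizontal axis in seconds rather than frame numbers.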

Key Parameters

Parameter   Description                            Default
numcep      Number of cepstral coefficients        13
nfilt       Number of filters in the filterbank    26
winlen      Window length in seconds               0.025
winstep     Step between windows in seconds        0.01
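To see how winlen and winstep determine the size of the feature matrix, the sketch below converts them to sample counts and estimates the number of frames. The helper name `frame_layout` is hypothetical, and the frame count assumes the final partial frame is zero-padded rather than dropped, which is how python_speech_features appears to frame signals; treat the count as an estimate.

```python
import math


def frame_layout(n_samples, sample_rate, winlen=0.025, winstep=0.01):
    """Return (samples per window, samples per step, estimated number of frames)."""
    frame_len = int(round(winlen * sample_rate))
    frame_step = int(round(winstep * sample_rate))
    if n_samples <= frame_len:
        n_frames = 1
    else:
        # Assumes the last partial frame is padded, hence the ceiling
        n_frames = 1 + math.ceil((n_samples - frame_len) / frame_step)
    return frame_len, frame_step, n_frames


# Two seconds of audio at 16 kHz with the default window settings
frame_len, frame_step, n_frames = frame_layout(32000, 16000)
print(frame_len, frame_step, n_frames)
```

With numcep left at 13, the feature matrix for such a signal would have roughly `n_frames` rows and 13 columns. The same names (numcep, nfilt, winlen, winstep) are passed to `mfcc` as keyword arguments to override the defaults.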

Conclusion

MFCCs provide a compact representation of audio signals that is well suited as input to machine learning models. Plotting them as a heatmap shows how the spectral characteristics of your audio evolve over time.

Updated on: 2026-03-25T23:14:16+05:30
