Skip to main content

extractMelSpectrogram

The extractMelSpectrogram function generates a mel spectrogram from an audio file. Mel spectrograms are frequency-domain representations of audio that are particularly useful for machine learning applications and audio visualization.

Syntax

async function extractMelSpectrogram(options: MelSpectrogramOptions): Promise<MelSpectrogramResult>

Parameters

The function accepts a single object with the following properties:

PropertyTypeRequiredDefaultDescription
fileUristringYes-Path to the audio file to analyze
windowSizeMsnumberNo25Window size in milliseconds for the STFT
hopLengthMsnumberNo10Hop length in milliseconds between consecutive frames
nMelsnumberNo40Number of mel bands to generate
fMinnumberNo0Lowest frequency (in Hz)
fMaxnumberNo22050Highest frequency (in Hz). If null, use sampleRate/2
normalizebooleanNotrueWhether to normalize the spectrogram

Return Value

The function returns a Promise that resolves to a MelSpectrogramResult object with the following properties:

PropertyTypeDescription
melSpectrogramnumber[][]2D array containing the mel spectrogram values
timeAxisnumber[]Time values for each frame (in seconds)
freqAxisnumber[]Frequency values for each mel band (in Hz)
durationMsnumberDuration of the audio in milliseconds
sampleRatenumberSample rate of the audio in Hz
windowSizeMsnumberWindow size used for the STFT in milliseconds
hopLengthMsnumberHop length used between consecutive frames in milliseconds
nMelsnumberNumber of mel bands generated

Example

import { extractMelSpectrogram } from '@siteed/expo-audio-studio';

async function generateMelSpectrogram() {
try {
const result = await extractMelSpectrogram({
fileUri: 'path/to/audio.wav',
windowSizeMs: 25,
hopLengthMs: 10,
nMels: 40,
fMin: 20,
fMax: 8000,
normalize: true
});

console.log(`Generated mel spectrogram with ${result.melSpectrogram.length} frames`);
console.log(`Each frame has ${result.melSpectrogram[0].length} mel bands`);
console.log(`Time range: ${result.timeAxis[0]}s to ${result.timeAxis[result.timeAxis.length-1]}s`);
console.log(`Frequency range: ${result.freqAxis[0]}Hz to ${result.freqAxis[result.freqAxis.length-1]}Hz`);

// Use the mel spectrogram data for visualization or machine learning
return result;
} catch (error) {
console.error('Error generating mel spectrogram:', error);
throw error;
}
}

Visualization Example

Here's an example of how to visualize the mel spectrogram using the @siteed/expo-audio-ui package:

import React from 'react';
import { View } from 'react-native';
import { MelSpectrogramVisualizer } from '@siteed/expo-audio-ui';
import { extractMelSpectrogram } from '@siteed/expo-audio-studio';

const SpectrogramView = ({ audioUri }) => {
const [spectrogramData, setSpectrogramData] = React.useState(null);

React.useEffect(() => {
async function loadSpectrogram() {
if (audioUri) {
const data = await extractMelSpectrogram({
fileUri: audioUri,
nMels: 80,
windowSizeMs: 25,
hopLengthMs: 10
});
setSpectrogramData(data);
}
}

loadSpectrogram();
}, [audioUri]);

if (!spectrogramData) {
return <View style={{ height: 200 }} />;
}

return (
<MelSpectrogramVisualizer
data={spectrogramData.melSpectrogram}
height={200}
width="100%"
colorMap="viridis"
/>
);
};

export default SpectrogramView;

Performance Considerations

  • Generating mel spectrograms is computationally intensive, especially for longer audio files
  • Consider using a lower number of mel bands (e.g., 40 instead of 128) for better performance
  • The windowSizeMs and hopLengthMs parameters affect both the resolution and the computation time
  • For real-time applications, process shorter audio segments or use lower resolution parameters