HTDemucs Benchmark Research
Six structured tests on the MUSDB18-7s benchmark dataset, run locally on an Apple M4 using MPS acceleration. Tests 1-5 measure HTDemucs performance under different conditions. Test 6 is a head-to-head comparison between Spleeter (TensorFlow, CPU) and HTDemucs (PyTorch, MPS) that quantifies the quality and speed gap on Apple Silicon.
All raw data is available in the site repository. The benchmark scripts are published at github.com/attackseo/htdemucs-benchmark so results can be independently reproduced. Methodology details are on each individual test page.
Test 4
Stem Reconstruction Fidelity
Splitting into 6 stems vs 4 stems increases reconstruction error by 1.2 dB (-21.6 dB for 4-stem vs -22.8 dB for 6-ste...
View data →
Test 1
Input Format Quality Impact
MP3 128 kbps input reduces mean SDR by 0.24 dB compared to WAV 24-bit (7.8 dB vs 8.04 dB).
View data →
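For readers unfamiliar with the metric, here is a minimal sketch of how a signal-to-distortion ratio (SDR) in dB can be computed from a reference stem and a separated estimate. It uses the plain energy-ratio definition; this is an assumption for illustration, and the benchmark scripts may use a different SDR variant (for example a BSS Eval style metric). The function name `sdr_db` and the sample values are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Plain energy-ratio SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Illustrative only; not necessarily the metric used by the benchmark.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))

    if error_energy == 0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent reference, non-silent error
    return 10 * math.log10(signal_energy / error_energy)


# Hypothetical samples: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr_db(reference, estimate), 2))
```

Higher values mean less distortion relative to the reference, so a 0.24 dB drop in mean SDR corresponds to a small increase in error energy.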
Test 2
HTDemucs Model Comparison
HTDemucs (base) achieves the highest mean SDR at 8.38 dB.
View data →
Test 3
Performance by Track Type
Separation quality varies significantly across track types. Tracks in the top-quality tier average 9.44 dB mean SDR, ...
View data →
Test 5
What Predicts Separation Quality
The strongest predictor of vocal SDR is 'Chroma variance (harmonic complexity)' (r = 0.522). Tracks with higher harmo...
View data →
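The r value quoted for Test 5 is a correlation coefficient. As a minimal sketch, assuming it is a Pearson correlation between a per-track feature and per-track vocal SDR, it could be computed as follows. The function name `pearson_r` and the input values are hypothetical and are not taken from the benchmark data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of centered products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-track values, for illustration only.
feature = [0.10, 0.25, 0.40, 0.55, 0.70]
vocal_sdr = [6.1, 7.4, 6.9, 8.8, 8.2]
print(round(pearson_r(feature, vocal_sdr), 3))
```

A value near 0.5 indicates a moderate positive linear relationship; it describes association across tracks and does not by itself establish that the feature causes the change in SDR.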
Test 6
Spleeter vs HTDemucs on Apple M4
See the results table for the SDR and speed comparison.
View data →