HTDemucs Performance by Track Characteristics

Key finding: Separation quality varies substantially from track to track. Tracks in the top-quality tier average 9.44 dB mean SDR, versus 7.22 dB for the lowest tier, a gap of 2.22 dB.

Methodology

This analysis uses per-track SDR results from the model comparison test (Test 2), grouped two ways.

Quality tiers group tracks by their vocals SDR tertile. The top third of tracks by vocal SDR is labelled “High separation quality,” the middle third “Mid,” and the bottom third “Low.” This quantifies how much variance exists across the test set and which stem types benefit most from easier separation conditions.
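
The tertile split described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name `assign_tiers`, the input mapping of track name to vocals SDR, and the choice of inclusive (linearly interpolated) cut points are all assumptions.

```python
from statistics import quantiles


def assign_tiers(vocals_sdr):
    """Map each track name to "High", "Mid", or "Low" by vocals-SDR tertile.

    `vocals_sdr` is a hypothetical dict of {track_name: vocals SDR in dB}.
    Cut points use the "inclusive" method (linear interpolation between
    sorted values); the original analysis may have used a different rule.
    """
    low_cut, high_cut = quantiles(list(vocals_sdr.values()), n=3, method="inclusive")
    tiers = {}
    for track, sdr in vocals_sdr.items():
        if sdr > high_cut:
            tiers[track] = "High"   # top third
        elif sdr > low_cut:
            tiers[track] = "Mid"    # middle third
        else:
            tiers[track] = "Low"    # bottom third
    return tiers


# Synthetic scores for illustration only; these are not MUSDB18 results.
demo_scores = {f"track_{i}": i * 0.25 for i in range(50)}
demo_tiers = assign_tiers(demo_scores)
```

With 50 distinct scores, these cut points produce tiers of 17, 16, and 17 tracks, which are the tier sizes reported for this test set.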

Genre grouping uses genre metadata from the MUSDB18-7s track files where available. If the dataset sample used does not include genre tags (which varies by download), the genre table will not render and only quality-tier grouping is shown.
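
The fallback behaviour described above could look something like the sketch below. The record layout (a list of dicts with `genre` and `mean_sdr` keys) and the function name `mean_sdr_by_genre` are hypothetical; the actual metadata fields depend on how the dataset was downloaded.

```python
from collections import defaultdict
from statistics import mean


def mean_sdr_by_genre(tracks):
    """Return {genre: mean SDR in dB}, or None when no track has a genre tag.

    `tracks` is a hypothetical list of dicts such as
    {"name": "...", "genre": "Rock", "mean_sdr": 8.5}; the "genre" key
    may be missing or empty for some or all tracks.
    """
    by_genre = defaultdict(list)
    for track in tracks:
        genre = track.get("genre")
        if genre:  # skip tracks without a usable tag
            by_genre[genre].append(track["mean_sdr"])
    if not by_genre:
        return None  # caller skips the genre table and shows tiers only
    return {genre: round(mean(values), 2) for genre, values in by_genre.items()}
```

Returning `None` rather than an empty table lets the report code decide explicitly whether to render the genre section.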

Model: htdemucs_ft. Dataset: MUSDB18-7s (50 test tracks).

Interpretation notes

Large differences between quality tiers indicate that HTDemucs performance is highly track-dependent. A low mean SDR for a specific genre or tier does not necessarily mean that the genre or tier is “hard” in absolute terms: a 7-second clip can happen to fall on a difficult section (a dense chorus, sustained instruments), which biases the sample.

Performance by Quality Tier

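Tracks grouped by vocals SDR tertile, used here as a proxy for overall separation quality. Because the tiers are defined by vocals SDR, the Vocals column differs across tiers by construction; the other columns show how the remaining stems behave on the same tracks. SDR values in dB.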

Tier	Tracks	Vocals SDR	Drums SDR	Bass SDR	Other SDR	Mean SDR
High separation quality (top third)	17	12.55	10.60	10.12	4.51	9.44
Mid separation quality (middle third)	16	8.80	11.28	8.98	5.78	8.71
Low separation quality (bottom third)	17	5.81	10.72	8.76	3.60	7.22
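
The table can be sanity-checked with a few lines of arithmetic. The sketch below assumes that Mean SDR is the unweighted average of the four stem SDRs; the report does not state this, but the tabulated values are consistent with it. The helper names are illustrative.

```python
# Rows copied from the table above: (vocals, drums, bass, other, reported mean), in dB.
TIERS = {
    "High": (12.55, 10.6, 10.12, 4.51, 9.44),
    "Mid": (8.8, 11.28, 8.98, 5.78, 8.71),
    "Low": (5.81, 10.72, 8.76, 3.6, 7.22),
}


def stem_mean(row):
    """Unweighted mean of the four stem SDRs (assumed definition of Mean SDR)."""
    return sum(row[:4]) / 4


def max_discrepancy(tiers):
    """Largest absolute difference between recomputed and reported means."""
    return max(abs(stem_mean(row) - row[4]) for row in tiers.values())


def tier_gap(tiers, column):
    """High-minus-Low difference in dB for one column (0=vocals, 1=drums, 2=bass, 3=other, 4=mean)."""
    return tiers["High"][column] - tiers["Low"][column]
```

Under that assumption the recomputed means agree with the reported ones to within rounding. The High-minus-Low gap is 6.74 dB for vocals, 1.36 dB for bass, 0.91 dB for other, and slightly negative for drums (-0.12 dB), which gives the 2.22 dB gap in mean SDR.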