Use MB lookup to resolve ambiguous artist names by OzGav · Pull Request #3862 · music-assistant/server

OzGav · 2026-05-10T12:57:19Z

Resolving multiple artist names has been a perennial problem. In my most recent adjustment to the logic I moved to using the MBID count to try to identify when the heuristic split did not match the expected number of artists. That didn't solve the problem but at least made it visible. This PR takes the next step: when that count mismatch is detected, use the MBIDs to look up canonical names from MusicBrainz instead of just logging a warning and going with the wrong split.

So this PR adds:

- When MusicBrainz Artist IDs / Album Artist IDs are present in tags and the parsed artist count does not match the MBID count, resolve canonical names from the MusicBrainz API and use those instead of the heuristic split. This applies to both track artists and album artists.
- When the counts already match, the parsed names are trusted as before, so cleanly tagged libraries make no network calls.
- MusicbrainzProvider.get_artist_details is cached for 30 days, so repeat MBIDs across tracks are effectively free.
- Failed individual MBID lookups are dropped rather than being substituted from the tag-parsed names, because matching by position is unsafe when the counts already disagree. If every lookup in a track fails, fall back to the tag-parsed names so the track still gets stored with something.
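The rules above can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the code in this PR: `resolve_artist_names` and its `lookup` callable are hypothetical names, and the real provider methods are presumably async.

```python
from typing import Callable, Optional


def resolve_artist_names(
    parsed_names: list[str],
    mbids: list[str],
    lookup: Callable[[str], Optional[str]],
) -> list[str]:
    """Return artist names, consulting `lookup` only on a count mismatch.

    `lookup` is a hypothetical stand-in for a MusicBrainz name lookup:
    it returns the canonical name for an MBID, or None on failure.
    """
    # No MBIDs, or the counts agree: trust the tags, no network call.
    if not mbids or len(mbids) == len(parsed_names):
        return parsed_names
    # Counts disagree: resolve every MBID and drop the ones that fail.
    # Failed lookups are NOT back-filled by position from parsed_names.
    resolved = [name for mbid in mbids if (name := lookup(mbid)) is not None]
    # If every lookup failed, fall back to the tag-parsed names.
    return resolved or parsed_names


# Placeholder IDs for illustration only (not real MBIDs).
canonical = {"id-1": "Above & Beyond", "id-2": "OceanLab"}
names = resolve_artist_names(
    ["Above", "Beyond", "OceanLab"], ["id-1", "id-2"], canonical.get
)
```

Passing the lookup in as a callable keeps the decision logic testable without touching the network.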

" presents " is added to FEATURING_SPLITTERS to handle "Above & Beyond presents OceanLab" and similar. The current heuristic produces the correct count (2) on that string but on the wrong boundary, so the current count-mismatch check would not catch it and the MB lookup would never fire. Thus the splitter addition is needed independently.
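To illustrate why a count check alone cannot catch this case, here is a toy splitter. It is an assumption-laden sketch, not the project's parser: `split_artists` is hypothetical, and the splitter values below are illustrative rather than the real contents of FEATURING_SPLITTERS.

```python
import re

# Illustrative values only; the project's actual list may differ.
FEATURING_SPLITTERS = (" feat. ", " featuring ", " presents ")


def split_artists(raw: str, featuring: tuple[str, ...] = FEATURING_SPLITTERS) -> list[str]:
    """Split on the first featuring-style splitter, else fall back to ' & '."""
    for splitter in featuring:
        parts = re.split(re.escape(splitter), raw, maxsplit=1, flags=re.IGNORECASE)
        if len(parts) == 2:
            return [part.strip() for part in parts]
    if " & " in raw:
        return [part.strip() for part in raw.split(" & ")]
    return [raw]


tag = "Above & Beyond presents OceanLab"
# Without " presents ", the ampersand fallback fires: 2 names, wrong boundary.
without_presents = split_artists(tag, featuring=(" feat. ", " featuring "))
# With " presents ", the split lands on the right boundary: still 2 names.
with_presents = split_artists(tag)
```

Both results contain two names, so comparing against an MBID count of 2 would pass in either case; only the splitter change fixes the boundary.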

marcelveldt · 2026-05-11T09:38:19Z

We should be careful with this:

- It's potentially going to make a call to MB for each track in a user's library. That is a lot of calls!
- We have always said that the user's tags are always leading; this change adjusts that a bit.

What is the exact issue you are trying to solve here?

OzGav · 2026-05-11T10:15:53Z

Just had another new edge case where a user has the artist "Above & Beyond presents OceanLab". We don't have "presents" in FEATURING_SPLITTERS, so the parsing failed.

While I have improved things by comparing the number of MBIDs to the number of parsed artists, it is still fragile. If the number of MBIDs doesn't equal the number of parsed artists, we currently still pull the incorrect artists into the database and log a warning, which is better but not ideal.

I just feel that if we have the MBIDs, we can guarantee that we get the artist names right and also resolve any naming, spelling, language, or diacritics ambiguities.

I agree that this will increase the number of calls, but only for new additions to people's libraries and only once, when the track is first added to the database. A further mitigation is that the MBID lookup is cached for 30 days, so a user with 50 Beatles tracks does 1 lookup, not 50.
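The caching argument can be made concrete with a small sketch. This is not the server's cache implementation: `CachedLookup` is a hypothetical in-memory wrapper, and only the 30-day lifetime is taken from the description above.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 30 * 24 * 3600  # 30 days, as stated in the PR description


class CachedLookup:
    """Hypothetical per-MBID cache in front of a slow `fetch` callable."""

    def __init__(
        self,
        fetch: Callable[[str], Optional[str]],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._cache: dict[str, tuple[float, Optional[str]]] = {}
        self.fetch_count = 0  # number of real (uncached) lookups made

    def __call__(self, mbid: str) -> Optional[str]:
        now = self._clock()
        hit = self._cache.get(mbid)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no call to the remote service
        self.fetch_count += 1
        name = self._fetch(mbid)
        self._cache[mbid] = (now, name)
        return name


# 50 tracks sharing one artist ID (placeholder, not a real MBID).
lookup = CachedLookup(lambda mbid: "The Beatles")
results = [lookup("mbid-beatles") for _ in range(50)]
```

After the loop, `lookup.fetch_count` is 1: only the first track triggers a real lookup.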

OzGav · 2026-05-11T12:21:27Z

Here is a good example from a classical album I have:

Artist: Pyotr Ilyich Tchaikovsky
Album Artist: Tchaikovsky
Album artist sort order: Tchaikovsky, Pyotr Ilyich
Artist sort order: Tchaikovsky, Pyotr Ilyich
The MB ARTIST ID and MB RELEASE ARTIST ID are the same, though, so using these IDs will result in a consistent artist on this track.

marcelveldt · 2026-05-16T18:13:07Z

We should prevent doing a lookup if the musicbrainz tags are already present.

> I agree that this will increase the number of calls, but only for new additions to people's libraries and only once, when the track is first added to the database.

And this is exactly what worries me. Local libraries are potentially very large, so this may result in tens of thousands of calls when scanning an initial library. That is a lot of stress for a free service.

What we can potentially do is skip the lookup entirely if the artists tag is already present and its count matches the number of MBIDs.

OzGav · 2026-05-17T03:33:20Z

Fair. I have switched it as you suggested, so the lookup now runs only on a mismatch between the number of artist MBIDs and the number of parsed artist names.

There is still the problem of poor artist name tagging, where the first, potentially incorrect, name is persisted when additional tracks are added. I thought we could maybe have it so that an UPDATE METADATA or REFRESH ITEM on an artist triggers the name lookup. That gives the user a built-in path to fix this, and the existing 30-day MB cache means repeat clicks won't make repeated API calls. Thoughts on this idea?

Two parser improvements for multi-artist resolution:

1. Add " presents " to FEATURING_SPLITTERS so single ARTIST tag strings like "Above & Beyond presents OceanLab" split correctly instead of silently being mis-split on the inner ampersand.
2. When the parsed artist count doesn't match the MusicBrainz Artist ID count, the filesystem_local resolver looks up canonical names via the new MusicbrainzProvider.resolve_artists_from_mbids method. Failed individual lookups are dropped rather than mapped back to a tag name by position (unsafe when counts already disagree); if every lookup fails, the resolver falls back to the tag-parsed names so the track still gets stored. When counts already match, no lookup runs.

The mismatch warnings move out of tags.py into the resolver, where they can report what actually happened.

Out of scope for this PR:

- First-write-wins persistence of misspellings ("Tchaikovsky" vs "Pyotr Ilyich Tchaikovsky"). The count-match short-circuit means the mismatch trigger doesn't help here; this needs a separate user-triggered "refresh canonical names" action so the MB load is opt-in.

OzGav added the enhancement label May 10, 2026

This was referenced May 10, 2026

Switch to using MB IDs as the truth source for artist names music-assistant/music-assistant.io#658

Closed

Switch to using MB IDs as the truth source for artist names music-assistant/music-assistant.io#659

Open

OzGav force-pushed the use-mb-for-multiartist-lookup branch 5 times, most recently from abf7d6c to 140ea9b Compare May 17, 2026 03:16

OzGav force-pushed the use-mb-for-multiartist-lookup branch from 140ea9b to 09007cd Compare May 17, 2026 04:09

OzGav changed the title ~~Switch to using MB IDs as the truth source for artist names~~ Use MB lookup to resolve ambiguous artist names May 17, 2026

OzGav added this to the 2.9.0 milestone May 18, 2026

OzGav wants to merge 1 commit into dev from use-mb-for-multiartist-lookup


2 participants
