Album-artist contributors with the same name are split when MBIDs differ, with no opt-out
Summary
Slim::Schema::Contributor::add keys contributors on (name, musicbrainz_id). Two tracks that tag the same ALBUMARTIST (e.g. TPE2 = "Stephen Malkmus") but carry different MusicBrainz Album Artist Id values end up as two separate rows in the contributors table, sharing a display name. In the browse UI this surfaces as a duplicate "Stephen Malkmus" entry, one of which holds a single album while the other holds the rest of the catalog.
There is no server pref to disable the MBID-based disambiguation, and the existing back-compat name-match fallback only fires when the existing row has a NULL MBID — so any user whose canonical contributor row was created from an MBID-tagged file is stuck.
This is related to but distinct from #1545 (multi-value MBID misattribution). That issue concerns a single ID being shared across collaborator rows; this one concerns a single name being split across rows because of a legitimate but unwanted MBID difference (band vs. solo artist credits, collaboration vs. solo, etc.).
Reproduction
Library has an artist with multiple albums. One album is credited on MusicBrainz to a band/collaboration entity that uses the same display string in ALBUMARTIST as the solo artist.
Example from my library (Stephen Malkmus):
- 9 albums tagged TPE2 = "Stephen Malkmus", MBID eaefd603-84c1-4db4-a72b-0cb718a0cc07 (solo) or no MBID.
- 1 album (Wig Out At Jagbags) tagged TPE2 = "Stephen Malkmus", MBID 98e4bf9e-9e33-4f07-9962-f7c1c2d773ba (Stephen Malkmus & The Jicks, the band entity on MusicBrainz, even though TPE2 is the solo name string).
After scan, contributors contains two rows:
    id     name               namesort                         mbid
    47606  'Stephen Malkmus'  'MALKMUS STEPHEN'                eaefd603-…
    49017  'Stephen Malkmus'  'MALKMUS STEPHEN AND THE JICKS'  98e4bf9e-…
The browse UI shows two "Stephen Malkmus" entries; clicking the second shows only Wig Out At Jagbags.
A second case in the same library: Mirah's Songs From The Black Mountain Music Project is credited on MusicBrainz as the collaboration "Mirah Yom Tov Zeitlyn, Ginger Brooks Takahashi, and Friends" (its own MB entity), but its TPE2 is "Mirah", same as the solo records. Same outcome — two "Mirah" rows in contributors.
Root cause
Slim::Schema::Contributor::add (public/9.0):
    if ($mbid) {
        $sth = $dbh->prepare_cached( 'SELECT id, extid, musicbrainz_id FROM contributors WHERE musicbrainz_id = ?' );
        $sth->execute($mbid);
        ($id, $oldExtId, $oldMbid) = $sth->fetchrow_array;

        if ( !$id ) {
            # name fallback ONLY for rows with NULL musicbrainz_id
            $sth = $dbh->prepare_cached( 'SELECT id, extid FROM contributors WHERE name = ? AND musicbrainz_id IS NULL' );
            $sth->execute($name);
            ($id, $oldExtId) = $sth->fetchrow_array;
        }
    }
    else {
        $sth = $dbh->prepare_cached( 'SELECT id, extid, musicbrainz_id FROM contributors WHERE name = ? LIMIT 1' );
        ...
    }
When the incoming track has a non-canonical MBID, the MBID lookup misses, the name-fallback misses (existing row has a non-null MBID), and the function INSERTs a new contributor row with the same display name.
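The miss-miss-insert sequence can be reproduced outside LMS. The following is a minimal Python/sqlite3 model of the lookup order quoted above, not LMS code: the table is reduced to three columns, `add_contributor` is a hypothetical name, and extid handling and any update of an existing row are left out.

```python
import sqlite3

SOLO = "eaefd603-84c1-4db4-a72b-0cb718a0cc07"
BAND = "98e4bf9e-9e33-4f07-9962-f7c1c2d773ba"


def add_contributor(db, name, mbid=None):
    """Hypothetical model of the lookup order; returns the contributor id."""
    if mbid:
        # 1. Exact MBID match.
        row = db.execute(
            "SELECT id FROM contributors WHERE musicbrainz_id = ?", (mbid,)
        ).fetchone()
        if row is None:
            # 2. Name fallback, only for rows that have no MBID.
            row = db.execute(
                "SELECT id FROM contributors"
                " WHERE name = ? AND musicbrainz_id IS NULL",
                (name,),
            ).fetchone()
    else:
        # Track carries no MBID: take the first row with a matching name.
        row = db.execute(
            "SELECT id FROM contributors WHERE name = ? LIMIT 1", (name,)
        ).fetchone()
    if row is not None:
        return row[0]
    # 3. Every lookup missed: create a new row.
    cur = db.execute(
        "INSERT INTO contributors (name, musicbrainz_id) VALUES (?, ?)",
        (name, mbid),
    )
    return cur.lastrowid


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE contributors"
    " (id INTEGER PRIMARY KEY, name TEXT, musicbrainz_id TEXT)"
)

solo_id = add_contributor(db, "Stephen Malkmus", SOLO)
band_id = add_contributor(db, "Stephen Malkmus", BAND)
rows = db.execute(
    "SELECT name, musicbrainz_id FROM contributors ORDER BY id"
).fetchall()
print(rows)  # two rows, same name, different MBIDs
```

The second call misses on the MBID, misses on the NULL-only name fallback because the existing row already carries the solo MBID, and therefore inserts a second "Stephen Malkmus" row.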
Empirical verification
I ran a controlled experiment over both example albums, applying different tag treatments per track and then doing a full clear+rescan. Across both albums the result was identical:
- TPE2 MBID alone determines album-artist contributor. Every treatment that rewrote TXXX:MusicBrainz Album Artist Id to the canonical solo MBID moved the track to the canonical contributor row. Every treatment that touched only TXXX:MusicBrainz Artist Id, TSO2, or TSOP left the track in the duplicate row.
- Namesort is not part of the lookup. Variants that rewrote TSO2 and TSOP to the canonical sort string while leaving MBIDs alone did not merge.
- The "strip MBID" workaround is scan-order dependent. For one of my two duplicates, stripping the TXXX MBID frames moved the track to the canonical row; for the other, it moved it to the duplicate row. The reason is the existing no-MBID path:

    SELECT id ... FROM contributors WHERE name = ? LIMIT 1
No ORDER BY. SQLite returns whichever row is encountered first, which in practice is the lowest id — the contributor that was created first historically. That ordering depends on directory iteration during the initial scan, so the strip workaround is not portable advice.
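The order dependence can be shown with a small Python/sqlite3 sketch. It assumes a reduced three-column table and a hypothetical helper `strip_lookup`. It also relies on SQLite's usual behaviour of scanning a rowid table in id order, which SQL does not guarantee when there is no ORDER BY.

```python
import sqlite3

SOLO = "eaefd603-84c1-4db4-a72b-0cb718a0cc07"
BAND = "98e4bf9e-9e33-4f07-9962-f7c1c2d773ba"


def strip_lookup(creation_order):
    """Create the contributor rows in the given order, then run the
    no-MBID lookup and return the MBID of the row it lands on."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE contributors"
        " (id INTEGER PRIMARY KEY, name TEXT, musicbrainz_id TEXT)"
    )
    for mbid in creation_order:
        db.execute(
            "INSERT INTO contributors (name, musicbrainz_id) VALUES (?, ?)",
            ("Stephen Malkmus", mbid),
        )
    # Same shape as the no-MBID path: name match, LIMIT 1, no ORDER BY.
    row = db.execute(
        "SELECT id, musicbrainz_id FROM contributors WHERE name = ? LIMIT 1",
        ("Stephen Malkmus",),
    ).fetchone()
    return row[1]


solo_first = strip_lookup([SOLO, BAND])  # canonical row created first
band_first = strip_lookup([BAND, SOLO])  # duplicate row created first
print(solo_first == SOLO, band_first == BAND)
```

With identical tags on the stripped track, the row it attaches to changes with the order in which the two contributor rows were created.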
Why this is hard to work around outside LMS
The current options for a user are:
1. Rewrite the file's album-artist MBID (TXXX:MusicBrainz Album Artist Id) to the canonical solo MBID. This falsifies tags relative to MusicBrainz, but works deterministically.
2. Strip the album-artist MBID frames. This is scan-order roulette, per above.
3. Accept the duplicate row.
(1) is the only reliable workaround, and it requires the user to mutate their files to lie about what MusicBrainz actually credits the release to. For users who use Picard/beets and want their tags to round-trip with MB, that's a significant ask.
Proposed fix
Add an opt-in server preference — proposed name mergeContributorsByName (default OFF) — that, when enabled, makes Contributor::add look up by name first and fall back to musicbrainz_id only if no name match exists. With the pref on, the contributor lookup tie-breaks by preferring rows with non-null MBID (richer data) and then by id ascending (oldest first) for determinism.
The pref is OFF by default so existing behavior is preserved, including the legitimate disambiguation case (two artists with the same display name and different MBIDs, e.g. John Williams the composer vs. John Williams the classical guitarist — those users want the current behavior and would leave the pref off).
A PR implementing this is attached.
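For discussion, the pref-on lookup described above could look roughly like this Python/sqlite3 sketch. It is not the code from the attached PR: `find_contributor`, the reduced table, and the `OTHER` placeholder MBID are all assumptions made for illustration.

```python
import sqlite3

SOLO = "eaefd603-84c1-4db4-a72b-0cb718a0cc07"
BAND = "98e4bf9e-9e33-4f07-9962-f7c1c2d773ba"
OTHER = "00000000-0000-0000-0000-000000000000"  # placeholder, not a real MBID


def find_contributor(db, name, mbid=None):
    """Hypothetical pref-on lookup: name first, MBID only as a fallback."""
    # Tie-break: rows with a non-null MBID sort first (IS NULL evaluates
    # to 0 for them), then lowest id for determinism.
    row = db.execute(
        "SELECT id FROM contributors WHERE name = ?"
        " ORDER BY (musicbrainz_id IS NULL), id LIMIT 1",
        (name,),
    ).fetchone()
    if row is None and mbid:
        # No row has this name: fall back to the MBID.
        row = db.execute(
            "SELECT id FROM contributors WHERE musicbrainz_id = ?"
            " ORDER BY id LIMIT 1",
            (mbid,),
        ).fetchone()
    return row[0] if row else None


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE contributors"
    " (id INTEGER PRIMARY KEY, name TEXT, musicbrainz_id TEXT)"
)
db.executemany(
    "INSERT INTO contributors (id, name, musicbrainz_id) VALUES (?, ?, ?)",
    [
        (1, "Stephen Malkmus", None),
        (2, "Stephen Malkmus", SOLO),
        (3, "Stephen Malkmus", BAND),
        (4, "Renamed Artist", OTHER),
    ],
)

merged = find_contributor(db, "Stephen Malkmus", BAND)   # name match wins
by_mbid = find_contributor(db, "Some New Spelling", OTHER)  # MBID fallback
missing = find_contributor(db, "Unknown Artist")          # caller would insert
print(merged, by_mbid, missing)
```

Here the band-MBID track resolves to row 2, the oldest "Stephen Malkmus" row that carries an MBID, rather than to its own row 3. The explicit ORDER BY also removes the scan-order dependence of the current no-MBID path.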
Alternative considered
A per-name allowlist ("force these names to merge regardless of MBID") would be more surgical and could coexist with the legitimate-disambiguation case in the same library, but it requires settings-page UX for managing the list. The single global toggle is the smallest patch surface and covers the common case; the allowlist could be a follow-up.
Workaround for users in the meantime
If you hit this and want a deterministic fix without waiting for the pref to land: rewrite TXXX:MusicBrainz Album Artist Id (and ideally TXXX:MusicBrainz Artist Id too) on all tracks of the affected album to match the canonical contributor's MBID. Don't rely on stripping the frames — that path's outcome depends on the order in which LMS originally created the contributor rows, which is not under your control.