[Fix] Merge class lists when concatenating datasets with different types by crawfordxx · Pull Request #12437 · open-mmlab/mmdetection

crawfordxx · 2026-04-01T15:15:42Z

Motivation

When concatenating datasets with different class sets (e.g. CocoDataset + VOCDataset), ConcatDataset stored _metainfo as a list of dicts. This broke downstream evaluation code that expects metainfo to be a single dict with a classes key, causing IndexError in _det2json when cat_ids[label] was accessed with out-of-range labels.
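The failure mode can be illustrated with a minimal sketch. The names below (`coco_like`, `voc_like`, `lookup_category`) are hypothetical stand-ins, not the actual mmdetection evaluation code; the assumption is only that a category table is built from one class list and then indexed with labels produced against a larger one.

```python
# Hypothetical stand-ins for two sub-datasets with different class sets.
coco_like = {"classes": ("person", "car")}
voc_like = {"classes": ("person", "car", "boat")}

# Category ids derived from the first dataset's classes only.
cat_ids = list(range(len(coco_like["classes"])))


def lookup_category(label: int) -> int:
    """Mimics an evaluator indexing cat_ids with a predicted label."""
    return cat_ids[label]


# A label that is valid for the second dataset ("boat" has index 2)
# falls outside the category table built from the first dataset.
boat_label = voc_like["classes"].index("boat")
try:
    lookup_category(boat_label)
    failed = False
except IndexError:
    failed = True
```

With a single merged class list, every label has an entry in the category table and the lookup stays in range.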

Modification

Added a _merge_metainfo() method to ConcatDataset that:

- Merges class lists from all sub-datasets into a unified ordered set (preserving insertion order, deduplicating shared classes)
- Merges palette colours from each dataset, with a deterministic fallback for classes without a defined colour
- Returns a single merged metainfo dict instead of the list fallback, so evaluation works correctly with heterogeneous dataset concatenation
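The three steps above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in this PR: `merge_metainfo` and `fallback_color` are hypothetical names, each metainfo is assumed to be a dict with a `classes` sequence and an optional `palette` aligned with it, and the fallback colour formula is invented for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Color = Tuple[int, int, int]


def fallback_color(index: int) -> Color:
    """Deterministic colour from the class index (hypothetical scheme)."""
    return ((index * 37) % 256, (index * 67) % 256, (index * 97) % 256)


def merge_metainfo(metainfos: Sequence[dict]) -> dict:
    """Merge per-dataset metainfo dicts into a single dict."""
    # A dict keeps insertion order, so it doubles as an ordered set of classes.
    merged: Dict[str, Optional[Color]] = {}
    for meta in metainfos:
        classes = meta.get("classes", ())
        palette = meta.get("palette") or ()
        for i, name in enumerate(classes):
            color = tuple(palette[i]) if i < len(palette) else None
            if name not in merged:
                merged[name] = color
            elif merged[name] is None and color is not None:
                # A later dataset supplies a colour the earlier one lacked.
                merged[name] = color

    # Classes that never received a colour get the deterministic fallback.
    palette_out: List[Color] = [
        color if color is not None else fallback_color(i)
        for i, color in enumerate(merged.values())
    ]
    return {"classes": tuple(merged), "palette": palette_out}


coco_like = {"classes": ("person", "car"),
             "palette": [(220, 20, 60), (0, 0, 142)]}
voc_like = {"classes": ("car", "boat")}  # no palette defined

merged_meta = merge_metainfo([coco_like, voc_like])
same_meta = merge_metainfo([coco_like, coco_like])
```

In this sketch, merging two datasets with identical classes returns the same class tuple and palette as either input, which matches the unchanged-behaviour case described for homogeneous datasets.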

Also updated the metainfo update logic in full_init to use an isinstance check instead of the removed is_all_same flag.

BC-breaking (Yes/No)

No. When all datasets have the same classes, behaviour is unchanged. When datasets differ, the merged dict is a strict improvement over the previous list-of-dicts fallback, which was unusable by downstream code.

Checklist

Pre-commit hooks pass (pre-commit run --all-files)
Unit test added (tests/test_datasets/test_dataset_wrappers.py)

When concatenating datasets with different class sets (e.g. CocoDataset + VOCDataset), ConcatDataset previously stored metainfo as a list of dicts. This broke downstream evaluation code that expects metainfo to be a single dict with a 'classes' key, causing IndexError in _det2json when cat_ids[label] was accessed with out-of-range labels. This fix adds a _merge_metainfo method that merges class lists from all sub-datasets into a unified set, preserving palette colours from each dataset. The merged metainfo dict is used instead of the list fallback, so evaluation works correctly with heterogeneous dataset concatenation. Fixes open-mmlab#8890

CLAassistant · 2026-04-01T15:16:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

majianhan seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

mm-assistant bot assigned jbwang1997 Apr 1, 2026

crawfordxx wants to merge 1 commit into open-mmlab:main from crawfordxx:fix-concat-different-dataset-types
