Should mappings keep using CopyOnWriteHashMap? #80072

Closed
@jpountz

CopyOnWriteHashMap was introduced for mappings at a time when some logging use cases would introduce many (as in several thousands) fields dynamically, and the time it took to apply each mapping update would become the bottleneck.

However, several things have changed since then:

  • Mapping updates are now applied on the master node before indexing the document, which makes dynamic mapping updates slow by nature. (Mapping updates used to be performed locally first and then asynchronously propagated to the master node, which caused big issues when different shards of the same index made different decisions.)
  • We introduced a soft limit of 1,000 fields per index.
  • ECS introduced standardization of field names, making it less likely to have different documents use different field names for the same information.

CopyOnWriteHashMap optimizes for mapping updates to the detriment of lookups, which can be slower than with regular hash maps, especially in case of hash collisions. We should look into moving back to regular maps that we would fully copy upon mapping updates.
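
To make the proposal concrete, here is a minimal sketch of the "regular map, fully copied on update" approach. It is an illustration only: `FieldLookup`, `withField`, and `typeOf` are hypothetical names, not the actual Elasticsearch mapping classes, and field types are reduced to plain strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration; not the actual Elasticsearch mapping classes. */
final class FieldLookup {
    // A regular HashMap behind an unmodifiable view: lookups go through
    // plain HashMap.get, and a published instance is never mutated.
    private final Map<String, String> fieldTypes;

    // Takes ownership of the map passed in; callers must not keep a reference.
    private FieldLookup(Map<String, String> owned) {
        this.fieldTypes = Collections.unmodifiableMap(owned);
    }

    static FieldLookup empty() {
        return new FieldLookup(new HashMap<>());
    }

    /** Lookup path: returns the field type, or null if the field is unmapped. */
    String typeOf(String fieldName) {
        return fieldTypes.get(fieldName);
    }

    int size() {
        return fieldTypes.size();
    }

    /** Update path: copies the whole map (O(n)) and returns a new instance. */
    FieldLookup withField(String fieldName, String type) {
        Map<String, String> copy = new HashMap<>(fieldTypes);
        copy.put(fieldName, type);
        return new FieldLookup(copy);
    }
}
```

Each update pays for a full copy, while every lookup is a plain `HashMap.get`. That is the trade-off argued for above: updates are already slow because they go through the master node, and the soft field limit keeps the copy small in the common case.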
