_uid should be indexed in Lucene in binary form, not base64 #18154

Closed
@mikemccand

@rmuir had this idea:

Today, when ES auto-generates an ID (`TimeBasedUUIDGenerator.getBase64UUID`), it produces 15 bytes, but we then immediately Base64-encode them to 20 bytes, a 33% "waste".
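To make the arithmetic concrete, here is a minimal Python sketch of the size difference. It is not the ES generator: random bytes stand in for the time-based ID, and URL-safe Base64 is an assumption about the encoding variant. `RAW_ID_LENGTH` and `encoded_sizes` are names made up for this illustration.

```python
import base64
import os

# 15 bytes is the raw ID length quoted in this issue.
RAW_ID_LENGTH = 15


def encoded_sizes(raw_id: bytes) -> tuple[int, int]:
    """Return (raw length, Base64-encoded length), both in bytes."""
    # Assumption: URL-safe Base64. Any standard Base64 variant maps
    # every 3 input bytes to 4 output characters.
    encoded = base64.urlsafe_b64encode(raw_id)
    return len(raw_id), len(encoded)


# Random bytes stand in for a real auto-generated ID.
raw_len, b64_len = encoded_sizes(os.urandom(RAW_ID_LENGTH))
overhead = (b64_len - raw_len) / raw_len
print(raw_len, b64_len, f"{overhead:.0%}")  # prints "15 20 33%"
```

Since 15 is a multiple of 3, the encoded form needs no padding, so the overhead is exactly 5 extra bytes per ID.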

This is really a holdover from the past when Lucene could not index fully binary terms.

I think we should explore passing the raw binary form to Lucene instead. We could implement back-compat based on the version the index was created with.
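A rough sketch of what version-gated back-compat could look like, written in Python for brevity. Everything here is hypothetical: `BINARY_UID_MIN_VERSION` is a placeholder cutoff (this issue does not name a version), and `uid_term` is an illustrative helper, not an existing ES or Lucene API.

```python
import base64

# Placeholder cutoff, NOT a real release: the issue does not say which
# version would switch to binary terms.
BINARY_UID_MIN_VERSION = (99, 0, 0)


def uid_term(raw_id: bytes, index_created_version: tuple[int, int, int]) -> bytes:
    """Return the bytes to index as the ID term for this index."""
    if index_created_version >= BINARY_UID_MIN_VERSION:
        # Newer index: hand the raw binary ID to Lucene unchanged.
        return raw_id
    # Older index: keep the Base64 form so existing terms still match.
    return base64.urlsafe_b64encode(raw_id)
```

Keying the choice on the index-created version rather than the node version would keep an old index consistent across upgrades, because every ID term in a given index uses one encoding.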
