Skip to content

Merge related HashMaps in FieldInfos#FieldNumbers into one map #13460

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 6, 2024

Conversation

iverase
Copy link
Contributor

@iverase iverase commented Jun 6, 2024

This commit merges indexOptions, DocValuesType, FieldDimensions and FieldVectorProperties into one map by introducing the record FieldProperties.

In addition it moves to a record the internal classes FieldDimensions and FieldProperties.

fixes #13459

this.omitNorms = new HashMap<>();
this.fieldProperties = new HashMap<>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add omitNorms and storeTermVectors too? they are only added when they exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add nameToNumber too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like we can add all immutable attributes there indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added those attributes to FieldProperties. I introduced a IndexOptionsProperties record to capture omitNorms and storeTermVectors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should IndexOptionsProperties also capture IndexOptions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, no it should not, because we set it to null when IndexOptions is NONE?

indexOptions.clear();
docValuesType.clear();
dimensions.clear();
fieldProperties.clear();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is maybe a bug that we don't clear vectorProps hashmap?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug indeed. So if you call deleteAll, then you won't be able to index vectors with different options. I don't think it's a common use-case, which is likely why it's never been reported.

@iverase iverase changed the title Merge related HashMaps in FieldInfos into one map Merge related HashMaps in FieldInfos#FieldNumbers into one map Jun 6, 2024
synchronized (this) {
fieldNumber = nameToNumber.get(fieldName);
fieldProperties = this.fieldProperties.get(fieldName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like we might change the behaviour here? Not sure why we are getting the number here on a synchronise block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire class is shared by the IndexWriter so access needs to be protected. The fact that access to docValuesType was not protected is more surprising to me, it seems to rely on subtle happens-before guarantees? Anyway, I like that your change is simplifying this.

synchronized (this) {
fieldNumber = nameToNumber.get(fieldName);
fieldProperties = this.fieldProperties.get(fieldName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire class is shared by the IndexWriter so access needs to be protected. The fact that access to docValuesType was not protected is more surprising to me, it seems to rely on subtle happens-before guarantees? Anyway, I like that your change is simplifying this.

@iverase iverase merged commit 58ab5b7 into apache:main Jun 6, 2024
3 checks passed
@iverase iverase deleted the FieldsInfoMaps branch June 6, 2024 15:08
@iverase iverase added this to the 10.0.0 milestone Jun 7, 2024
@iverase
Copy link
Contributor Author

iverase commented Jun 7, 2024

This class seems to have change quite a bit and some methods did not have a clear path forward for backportso I set the milestone for this change to 10.0.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should FieldInfo#FieldNumbers hold one map with index properties instead of a map for each property?
3 participants