-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OAK-11131 - indexing-job: AOT Blob downloader may download blobs that are not needed for the indexes #1738
Conversation
…nsidered for download by matching only against the indexes for which the feature is enabled. Previously, it was checking against all indexes, which could lead to downloading blobs for nodes that are not indexed by an index that needs the blob. Add tests for AOT blob downloader.
...commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/CompositeIndexer.java
Outdated
Show resolved
Hide resolved
@@ -86,6 +87,13 @@ public Set<String> getRelativeIndexedNodeNames() { | |||
return result; | |||
} | |||
|
|||
@Override | |||
public String getIndexName() { | |||
// This is susceptible to infinite recursion if a composite indexer contains itself, directly or indirectly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// This is susceptible to infinite recursion if a composite indexer contains itself, directly or indirectly. | |
// Many methods are susceptible to infinite recursion if a composite indexer contains itself, directly or indirectly. | |
// In this case, the methods would throw a StackOverflowException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, it should be a warning in the public documentation of the class. The users of this class have to be aware to avoid this issue. I updated the comment.
@@ -126,6 +126,11 @@ public Set<String> getRelativeIndexedNodeNames() { | |||
return definition.getRelativeNodeNames(); | |||
} | |||
|
|||
@Override | |||
public String getIndexName() { | |||
return definition.getIndexName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the usage of getIndexName() is fine. However, just FYI:
For on-prem and managed services, we support the same node name in multiple locations. For example it's possible to have
- /oak:index/damAssetLucene
- /content/dam/oak:index/damAssetLucene
both in the same repository.
It is unlikely that this causes issues for this case here however (in my view).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definition.getIndexName()
returns the full path of the index, so there is no risk of collisions for indexes that are outside /oak:index
.
The AOT Blob downloader makes several checks on each line/node in the FFS before downloading an non-inlined blob. One of these checks is if the line/node is needed by the indexers for which the feature is enabled. The current implementation is checking if any of the indexers requires the line. This is incorrect, because it may download blobs for lines that are not required by the indexers with AOT blob download enabled. In the worst case, this will slow down significantly the downloader, which may make it fall behind the indexer, thereby preventing it from downloading the blobs for the lines that are actually needed by the indexers.
This PR changes the logic so that the AOT blob downloader to only check against the indexes for which AOT blob download is enabled when deciding if a line/node is to be considered for blob download or not.
This PR also adds additional tests for the AOT blob downloader.