Skip to content

Auto-reload analyzers for specific resource #96986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mayya-sharipova
Copy link
Contributor

@mayya-sharipova mayya-sharipova commented Jun 21, 2023

This PR adds a new optional parameter "resource" for ReloadAnalyzersRequest.
If used, only analyzers that use this specific "resource" will be reloaded.
This parameter is not documented, for internal use only.

PR #96886 introduced auto-reload of analyzers on synonyms index change. The problem
was that reloading was applied broadly for all indices that contained any reloadable
analyzers. This PR improves on it, so when a particular synonyms set changes,
only analyzers that use this synonyms set will auto-reloaded. Note that shard
requests will still be sent to all indices shards, as only on a shard we can
decide if analyzers need to be reloaded.

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Jun 21, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@mayya-sharipova mayya-sharipova force-pushed the reload-analyzers-with-resource branch from 3fbe5df to bc2d21b Compare June 21, 2023 17:15
Copy link
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🚀

I think it would be less confusing (and have less code changes) if we don't pass null explicitly. We might create methods / constructors that have the synonymSetId as parameter, or not (and thus invoke internally the other method with null).

*/
public ReloadAnalyzersRequest(String... indices) {
public ReloadAnalyzersRequest(String resource, String... indices) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add another constructor that does not have resource as parameter, to avoid passing null?

Copy link
Contributor Author

@mayya-sharipova mayya-sharipova Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explored this, but this is not possible as indices is varargs of the same type of String.
Making two constructors, and doing ReloadAnalyzersRequest(resource1, index1) throws an error:

reference to ReloadAnalyzersRequest is ambiguous
|    both constructor ReloadAnalyzersRequest(java.lang.String...)..  and constructor ReloadAnalyzersRequest(java.lang.String,java.lang.String...) ... match

match:
my_field:
query: salute
- match: { hits.total.value: 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great test! 💯

Is it too redundant to query the other index to ensure nothing changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in de85b76. But indeed looks redundant as we did not change the synonyms set that other index is based on.

@@ -95,7 +95,7 @@ default NamedAnalyzer getDefaultSearchQuoteAnalyzer() {
/**
* Reload any analyzers that have reloadable components
*/
default List<String> reload(AnalysisRegistry analysisRegistry, IndexSettings indexSettings) throws IOException {
default List<String> reload(AnalysisRegistry analysisRegistry, IndexSettings indexSettings, String resource) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add another method that has no resource on it, to avoid invoking this with explicit nulls?

Copy link
Contributor Author

@mayya-sharipova mayya-sharipova Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never call it with explicit "null", so having a single constructor is enough.

@mayya-sharipova mayya-sharipova force-pushed the reload-analyzers-with-resource branch from bc2d21b to f68df16 Compare June 22, 2023 13:14
@mayya-sharipova
Copy link
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1

@carlosdelest
Copy link
Member

Thanks for the changes, Mayya! LGTM 👍

This PR adds a new optional parameter "resource" for ReloadAnalyzersRequest.
If used, only analyzers that use this specific "resource" will be reload.
This parameter is not documented, for internal use only.

PR elastic#96886 introduced auto-reload of analyzers on synonyms index change. The problem
was that reloading was applied broady for all indices that contained reloadable
analyzers. This PR improves this, so when a particular synonyms set changes,
only analyzers that use this synonyms set  will auto-reloaded. Note that shard
requests will still be sent to all indices shards, as only on a shard we can
decide if analyzers need to be reloaded.
@mayya-sharipova mayya-sharipova force-pushed the reload-analyzers-with-resource branch from f68df16 to 30a2562 Compare June 22, 2023 14:31
@mayya-sharipova mayya-sharipova merged commit 11a3104 into elastic:main Jun 22, 2023
@mayya-sharipova mayya-sharipova deleted the reload-analyzers-with-resource branch June 22, 2023 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search Relevance/Analysis How text is split into tokens Team:Search Meta label for search team v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants