Skip to content

[Query rules] Add exclude query rule type #111420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Aug 2, 2024

Conversation

kderusso
Copy link
Member

@kderusso kderusso commented Jul 29, 2024

Adds a new type of query rule, exclude. Unlike the pinned rule, which pins specified documents to the top of result sets, the exclude query rule will identify documents that we never want returned in the result sets.

Here's an example of a ruleset with both a pinned and an exclude rule:

PUT /_query_rules/test-ruleset
{
  "rules": [
    {
      "rule_id": "1",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query-string",
          "values": [
            "foo"
          ]
        }
      ],
      "actions": {
        "ids": [
          "1"
        ]
      }
    },
    {
      "rule_id": "2",
      "type": "exclude",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query-string",
          "values": [
            "bar"
          ]
        }
      ],
      "actions": {
        "ids": [
          "2"
        ]
      }
    }
  ]
}

@kderusso kderusso added >feature :SearchOrg/Relevance Label for the Search (solution/org) Relevance team labels Jul 29, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @kderusso, I've created a changelog YAML for you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved out of PinnedQueryBuilder without additional change.

@kderusso kderusso force-pushed the kderusso/query-rule-exclude branch from 10bc011 to d63b5e3 Compare July 29, 2024 18:51
@kderusso kderusso force-pushed the kderusso/query-rule-exclude branch from 315df19 to dc7186c Compare July 29, 2024 19:18
@kderusso kderusso marked this pull request as ready for review July 29, 2024 20:24
@kderusso kderusso requested a review from a team July 29, 2024 20:24
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ent-search-eng (Team:SearchOrg)

@@ -102,9 +105,9 @@ If multiple matching rules pin more than 100 documents, only the first 100 docum
[[put-query-rule-example]]
==== {api-examples-title}

The following example creates a new query rule with the ID `my-rule1` in a query ruleset called `my-ruleset`.
The following example creates a new query rule in a query ruleset called `my-ruleset`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still mention something regarding my-rule1 and where that comes from (I assume it's an arbitrary ID for the name of the rule within the my-ruleset ruleset?)

Only one of `ids` or `docs` may be specified, and at least one must be specified.
- `docs` (Optional, array of objects) The documents to pin.
- `docs` (Optional, array of objects) The specified documents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the max # of documents always hardcoded to 100? If so, I'd mention that here too

throw new ElasticsearchParseException("pinned hits cannot exceed " + MAX_NUM_PINNED_HITS);
throw new ElasticsearchParseException(type + " query rule actions cannot be empty");
} else if (((List<?>) action).size() > MAX_NUM_DOCS_IN_RULE) {
throw new ElasticsearchParseException(type + " documents cannot exceed " + MAX_NUM_DOCS_IN_RULE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be worth also validating to ensure the doc IDs are all unique in the list? Or, keep that onus on the user?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good callout. My only concern is that technically that might be considered a breaking change, because existing rulesets with duplicate rules could now break. We do dedup the rules later on when they are applied. I'm inclined to leave it as is just to avoid breaking existing rules, but let me know if you'd like to chat about this more.

@kderusso
Copy link
Member Author

@elasticmachine update branch

Copy link
Contributor

@Mikep86 Mikep86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Mostly small stuff, except I have a sneaking suspicion there's a bug with multiple exclude rules.

@kderusso kderusso requested review from markjhoy and Mikep86 August 1, 2024 15:43
@kderusso
Copy link
Member Author

kderusso commented Aug 1, 2024

@elasticmachine update branch

Comment on lines +182 to +184
List<SpecifiedDocument> pinnedDocs = pinnedDocsSupplier != null ? pinnedDocsSupplier.get() : null;
if (pinnedDocs != null && pinnedDocs.isEmpty() == false) {
PinnedQueryBuilder pinnedQueryBuilder = new PinnedQueryBuilder(organicQuery, pinnedDocs.toArray(new SpecifiedDocument[0]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I figured out the issue:

As of the rule query rename (#108831), RuleQueryBuilder deserialization was changed to throw away pinnedIds and pinnedDocs values. If I'm reading the code correctly, I think BwC with 8.10.0 - 8.12.1 coordinator nodes was broken here, but it wasn't spotted at the time.

These changes further break BwC (although if you consider "broken" as a binary state, it was already broken) because pinnedDocsSupplier will always be null. The end effect is that any rule query from an 8.10.0 - 8.12.1 coordinator node will be handled as if none of the criteria matched (i.e. just run the organic query).

Copy link
Contributor

@Mikep86 Mikep86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple small follow-ups, plus maybe BwC is broken? Hope i'm wrong on that.

@kderusso kderusso requested a review from Mikep86 August 1, 2024 19:56
Copy link
Contributor

@Mikep86 Mikep86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the iterations!

@kderusso kderusso merged commit 02c4949 into elastic:main Aug 2, 2024
15 checks passed
@kderusso kderusso deleted the kderusso/query-rule-exclude branch January 24, 2025 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :SearchOrg/Relevance Label for the Search (solution/org) Relevance team v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants