Skip to content

Wrap scoped search autocomplete values in quotes when they contain a pipe character#1083

Merged
jeremylenz merged 1 commit intotheforeman:developfrom
jeremylenz:scoped-search-tokenizer-pipe
Sep 10, 2025
Merged

Wrap scoped search autocomplete values in quotes when they contain a pipe character#1083
jeremylenz merged 1 commit intotheforeman:developfrom
jeremylenz:scoped-search-tokenizer-pipe

Conversation

@jeremylenz
Copy link
Collaborator

@jeremylenz jeremylenz commented Sep 2, 2025

What are the changes introduced in this pull request?

Insights rule IDs frequently (almost always) contain the pipe character, |. This causes problems when using scoped search to perform a search like

rule_id = dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO 

because scoped search would take the pipe character as an "OR", and do separate searches for the keywords on each side of the pipe. This means that you don't get any results unless you put the keyword in quotes:

rule_id = "dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO"

Which is fine, except that scoped search autocomplete suggests the search without the quotes.

# Without quotes around the rule id
[1] pry(main)> puts InsightsHit.search_for("rule_id = dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO").to_sql
SELECT ... FROM "insights_hits" ...  WHERE ((("insights_hits"."rule_id" = 'dnf_install_update_issue') OR ("insights_hits"."title" ILIKE '%DNF_INSTALL_UPDATE_ISSUE_INFO%' OR "hosts"."name" ILIKE '%DNF_INSTALL_UPDATE_ISSUE_INFO%')))
=> nil

# With quotes around the rule id
[2] pry(main)> puts InsightsHit.search_for("rule_id = \"dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO\"").to_sql
SELECT "insights_hits".* FROM "insights_hits" WHERE (("insights_hits"."rule_id" = 'dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO'))
=> nil

I tried various ways of modifying scoped_search to add the quotes, including external search methods and value_translation. But the problem there was that these receive the search keyword values after scoped search already split up the values on either side of the pipe.

The solution here is to have the scoped search autocomplete builder suggest the quotes. This requires redefining a method.

Considerations taken when implementing this change?

This will affect the behavior of scoped_search application-wide, not just for InsightsHit or foreman_rh_cloud. Ideally scoped_search itself can be updated with this change eventually, and the change would apply to all special operators and not just pipe. For now, I have it only affecting pipe.

What are the testing steps for this pull request?

Non-IoP
get a host with recommendations
Insights > Recommendations
Perform a search for "rule_id = "
Accept one of the autocomplete suggestions

Before: You get 0 search results. If you put the rule ID in quotes, it works.
Now: Autocomplete suggested values are already in quotes, so it works.

Summary by Sourcery

Fix scoped_search to correctly handle Insights rule IDs containing '|' and update the InsightsHit host association to use an explicit class name

Bug Fixes:

  • Remove '|' from scoped_search tokenizer regex to prevent splitting Insights rule IDs on pipe symbol

Enhancements:

  • Specify class_name 'Host::Base' for the host association in InsightsHit model

@sourcery-ai
Copy link

sourcery-ai bot commented Sep 2, 2025

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

This PR overrides the scoped_search tokenizer to remove the pipe character from its operator regex—allowing rule IDs containing '|' to be treated as single keywords—and updates the host association to explicitly reference Host::Base.

Sequence diagram for scoped search keyword tokenization with pipe character

sequenceDiagram
actor User
participant InsightsHit
participant ScopedSearchTokenizer
User->>InsightsHit: Search for rule_id with pipe (e.g. dnf_install_update_issue|DNF_INSTALL_UPDATE_ISSUE_INFO)
InsightsHit->>ScopedSearchTokenizer: Tokenize keyword
ScopedSearchTokenizer-->>InsightsHit: Treat entire rule_id (with '|') as single keyword
InsightsHit-->>User: Return correct search results
Loading

File-Level Changes

Change Details Files
Override scoped_search keyword tokenizer to exclude ' ' from splitting regex
  • Define a module ScopedSearch::QueryLanguage::Tokenizer with custom tokenize_keyword method
  • Adjust regex in tokenize_keyword to remove '
Explicitly set class_name for host association
  • Modify belongs_to :host to include class_name 'Host::Base'
app/models/insights_hit.rb

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@adamruzicka
Copy link
Contributor

adamruzicka commented Sep 3, 2025

The catch with this is that this applies application-wide, which means searches will behave differently based on whether rh_cloud is enabled or not.

Query Before (and without rh cloud) After Same
foo | bar foo OR bar foo OR bar yes
foo| bar foo OR bar foo| AND bar no
foo |bar foo OR bar foo OR bar yes
foo|bar foo OR bar foo|bar no

scoped_search already wraps autocompleted values in quotes in case they contain whitespace (link). Wouldn't it be better to wrap the autocompleted values in quotes if they contain any special operators as well?

@jeremylenz jeremylenz force-pushed the scoped-search-tokenizer-pipe branch from 6ca72dd to 8674ca0 Compare September 3, 2025 12:48
@jeremylenz jeremylenz changed the title Remove pipe character from scoped search keyword tokenizer Wrap scoped search autocomplete values in quotes when they contain a pipe character Sep 3, 2025
@jeremylenz
Copy link
Collaborator Author

@adamruzicka updated 👍

@adamruzicka
Copy link
Contributor

Why carry the fix here? Why not send it to https://github.com/wvanbergen/scoped_search ?

@jeremylenz
Copy link
Collaborator Author

I would also like to have it fixed in scoped_search. This PR can be a temporary change until that is in place.

@chris1984
Copy link
Member

Why carry the fix here? Why not send it to https://github.com/wvanbergen/scoped_search ?

I am fine with a fix here for the time being, looking at pull requests and activity in that repo, it seems a fix would sit for a while before being merged.

@jeremylenz
Copy link
Collaborator Author

Opened wvanbergen/scoped_search#230

  for values containing the pipe character
@jeremylenz jeremylenz force-pushed the scoped-search-tokenizer-pipe branch from 8674ca0 to 355fdbf Compare September 3, 2025 17:28
@chris1984 chris1984 self-assigned this Sep 8, 2025
Copy link
Member

@chris1984 chris1984 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, had to use Gemini to figure out the regex 🤣 will test it tomorrow.

Copy link
Collaborator

@nofaralfasi nofaralfasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the PR and can confirm it works as expected.

Copy link
Member

@chris1984 chris1984 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine and @nofaralfasi tested it

@jeremylenz
Copy link
Collaborator Author

jeremylenz commented Sep 9, 2025

Thanks for the reviews all! I am going to merge this one unless I hear any objections. Will need to revert it once the scoped_search PR is fully in place.

@jeremylenz jeremylenz merged commit b2ffab5 into theforeman:develop Sep 10, 2025
23 checks passed
chris1984 pushed a commit to chris1984/foreman_rh_cloud that referenced this pull request Sep 15, 2025
for values containing the pipe character

(cherry picked from commit b2ffab5)
chris1984 pushed a commit that referenced this pull request Sep 15, 2025
for values containing the pipe character

(cherry picked from commit b2ffab5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants