Patch in exact search for meilisearch #29671

Merged
merged 16 commits into from
Mar 9, 2024

Conversation

6543
Member

@6543 6543 commented Mar 8, 2024

Meilisearch does not currently have a search option to control fuzziness per query.

So we have to work around this by post-filtering the search results in Gitea until this is addressed.

For future work, I added an option (backend only for now) to enable fuzziness for the issue indexer too.
I also refactored the code so the fuzzy option follows the same logic as in the code indexer.


Sponsored by Kithara Software GmbH

6543 added 3 commits March 8, 2024 02:31
because then we can use same name for issue search where default should be false
@6543 6543 added the type/bug label Mar 8, 2024
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 8, 2024
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 8, 2024
@6543 6543 added this to the 1.22.0 milestone Mar 8, 2024
@6543 6543 requested a review from delvh March 8, 2024 14:22
@6543
Member Author

6543 commented Mar 8, 2024

@delvh this is the issue we talked about at the TOC meeting ...

The problem is within Meilisearch, so I could only come up with a workaround until Meilisearch addresses it on their side :/

@delvh delvh changed the title have issue indexer non-fuzzy with meilisearch too Patch in exact search for meilisearch Mar 8, 2024
@6543 6543 requested review from silverwind and delvh March 8, 2024 19:52
@GiteaBot GiteaBot removed the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 8, 2024
@GiteaBot GiteaBot added the lgtm/need 1 This PR needs approval from one additional maintainer to be merged. label Mar 8, 2024
@6543
Member Author

6543 commented Mar 9, 2024

@silverwind I optimized and simplified the idea ... now it should be flat, and regular maintainers should be able to understand it

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Mar 9, 2024
@6543 6543 enabled auto-merge (squash) March 9, 2024 01:16
@6543 6543 merged commit 7fdc048 into go-gitea:main Mar 9, 2024
26 checks passed
@6543 6543 deleted the meilisearch_non-fuzzy branch March 9, 2024 01:51
@6543
Member Author

6543 commented Mar 9, 2024

Created a follow-up issue to expose fuzziness via the WebUI -> #29685

zjjhot added a commit to zjjhot/gitea that referenced this pull request Mar 9, 2024
* upstream/main:
  Patch in exact search for meilisearch (go-gitea#29671)
  Use more specific selector for `name` links (go-gitea#29679)
  Replace more gt- with tw- (go-gitea#29678)

# Conflicts:
#	templates/user/dashboard/issues.tmpl
@Kerollmops

Hey! I just realized that Gitea is using Meilisearch. However, I am not sure about the missing feature here. Can you explain your use case in great detail, please? It seems that reducing the typo tolerance is possible. It's also possible to disallow typos on specific fields. But it doesn't seem to fit your needs.

@6543
Member Author

6543 commented Mar 12, 2024

@Kerollmops hi 👋

We have two modes: "fuzzy" search and non-fuzzy "match".

The code indexer already exposes this in the UI; for issue/pull search we do not yet, and use non-fuzzy search by default.

Meilisearch can currently only be used for issue/pull search, but code search will be added ... (#25976).

So the problem here is that, in non-fuzzy mode, Meilisearch returns hits that contain no direct match.

Let's say I'm searching for abcode: I expect to get issues returned that contain that exact word (case-insensitive) ... but without this patch I also get a-code ...
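
To illustrate the abcode / a-code case, here is a minimal sketch of such a post-filter. It is not the code merged in this PR: the `Hit` type, its fields, and the `FilterExact` helper are hypothetical names chosen for the example, and it assumes each hit carries the title, content, and comments as plain strings.

```go
package main

import (
	"fmt"
	"strings"
)

// Hit is a hypothetical, simplified search hit (an assumption for this sketch);
// real indexer documents carry more fields.
type Hit struct {
	ID       int64
	Title    string
	Content  string
	Comments []string
}

// containsKeyword reports whether the title, the content, or at least one
// comment contains lowerKeyword. lowerKeyword must already be lower-cased.
func containsKeyword(h Hit, lowerKeyword string) bool {
	if strings.Contains(strings.ToLower(h.Title), lowerKeyword) ||
		strings.Contains(strings.ToLower(h.Content), lowerKeyword) {
		return true
	}
	for _, c := range h.Comments {
		if strings.Contains(strings.ToLower(c), lowerKeyword) {
			return true
		}
	}
	return false
}

// FilterExact drops hits that matched only fuzzily.
// When fuzzy is true, the hits are returned unchanged.
func FilterExact(hits []Hit, keyword string, fuzzy bool) []Hit {
	if fuzzy {
		return hits
	}
	lower := strings.ToLower(keyword)
	out := make([]Hit, 0, len(hits))
	for _, h := range hits {
		if containsKeyword(h, lower) {
			out = append(out, h)
		}
	}
	return out
}

func main() {
	hits := []Hit{
		{ID: 1, Title: "Fix ABCode parser"},
		{ID: 2, Title: "Update a-code handling"},
		{ID: 3, Title: "Misc", Comments: []string{"related to abcode"}},
	}
	for _, h := range FilterExact(hits, "abcode", false) {
		fmt.Println(h.ID, h.Title)
	}
}
```

With the sample data, hits 1 and 3 are kept and the a-code hit is dropped. Note that this is a substring check, so it is looser than a whole-word match, and filtering after the search can leave a page with fewer hits than were requested.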

If I understand the Meilisearch docs correctly, you can adjust that behavior per index. So for now we either have to keep two clones of the index, one with fuzziness and one without, or post-filter the results.

In the long run there should just be a query option like the one we use, e.g., with Elasticsearch:

searchType := esMultiMatchTypePhrasePrefix
if options.IsFuzzyKeyword {
	searchType = esMultiMatchTypeBestFields
}
query.Must(elastic.NewMultiMatchQuery(options.Keyword, "title", "content", "comments").Type(searchType))

@6543
Member Author

6543 commented Mar 12, 2024

or as with bleve ... (#29706)

@Kerollmops

Hey @6543,

Thank you for the clear explanation. So, exposing the typoTolerance.enabled setting on the search route could fix the issue?

Another quick fix for your issue could be to double-quote every query word. Doing this disallows typos on those words and forces the documents to have all the double-quoted words. However, it's not that easy to put in place on your side.
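
A rough sketch of the double-quoting idea, under stated assumptions: `QuoteWords` is a hypothetical helper, not part of Gitea or of any Meilisearch client, and it naively treats every whitespace-separated token as one word. Double quotes already present in the input are dropped, so a phrase the user quoted on purpose is split into separately quoted words.

```go
package main

import (
	"fmt"
	"strings"
)

// QuoteWords wraps every whitespace-separated word of query in double quotes.
// Existing double quotes are replaced by spaces first, so the output always
// has balanced quotes (at the cost of losing user-written phrases).
func QuoteWords(query string) string {
	words := strings.Fields(strings.ReplaceAll(query, `"`, " "))
	for i, w := range words {
		words[i] = `"` + w + `"`
	}
	return strings.Join(words, " ")
}

func main() {
	fmt.Println(QuoteWords("abcode  parser")) // prints "abcode" "parser"
}
```

The rewritten string would then be sent as the search query in non-fuzzy mode, while fuzzy mode would pass the user's input through unchanged.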

@6543
Member Author

6543 commented Mar 12, 2024

@Kerollmops would that work -> #29740 ?

PS: let's continue the discussion in #29740 ... chatting in merged/closed issues/pulls is discouraged :)

@6543 6543 added the skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. label Mar 16, 2024
silverwind pushed a commit that referenced this pull request Mar 16, 2024
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Jun 7, 2024
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. type/bug

6 participants