fix: languages with combining characters cannot be searched #3385

Merged
SychO9 merged 3 commits into main from sm/fix-searching-with-other-languages
Apr 9, 2022

SychO9
Member

@SychO9 SychO9 commented Apr 9, 2022

Changes proposed in this pull request:
The regular expression used in the fulltext search gambit strips non-word characters so that they cannot trigger MySQL boolean-mode operators. However, it also strips the combining marks that attach to letters in scripts such as Telugu and Devanagari.

This pull request adjusts the regular expression to preserve those marks, which makes searching in those scripts possible. (Before this fix, searching for नागरी, for example, actually searched for न गर .)
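A minimal sketch of the effect described above, written in JavaScript rather than the PHP used by Flarum's backend. Both patterns and the `sanitizeSearchTerm` helper are assumptions made for illustration; the exact expressions in Flarum core may differ. The only difference between the two patterns is whether the Unicode Mark category (`\p{M}`) is preserved.

```javascript
// Hypothetical patterns for illustration; not copied from Flarum core.
// Each one replaces runs of "non-word" characters with a single space.
const WITHOUT_MARKS = /[^\p{L}\p{N}_]+/gu;   // keeps letters, numbers, underscore
const WITH_MARKS = /[^\p{L}\p{N}\p{M}_]+/gu; // additionally keeps combining marks

// Hypothetical helper: sanitize a search term with the given pattern.
function sanitizeSearchTerm(term, pattern) {
  return term.replace(pattern, ' ');
}

const query = 'नागरी';

// The vowel signs are in category M, so the first pattern splits the word.
console.log(JSON.stringify(sanitizeSearchTerm(query, WITHOUT_MARKS))); // "न गर "

// With \p{M} allowed, the word survives intact.
console.log(JSON.stringify(sanitizeSearchTerm(query, WITH_MARKS))); // "नागरी"

// Boolean-mode operators such as + and - are still stripped.
console.log(JSON.stringify(sanitizeSearchTerm('+foo -bar', WITH_MARKS))); // " foo bar"
```

The first output reproduces the broken query quoted in the description, since the two vowel signs in नागरी are marks rather than letters.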

Necessity

  • Has the problem that is being solved here been clearly explained?
  • If applicable, have various options for solving this problem been considered?
  • For core PRs, does this need to be in core, or could it be in an extension?
  • Are we willing to maintain this for years / potentially forever?

Confirmed

  • Frontend changes: tested on a local Flarum installation.
  • Backend changes: tests are green (run composer test).
  • Core developer confirmed locally this works as intended.
  • Tests have been added, or are not appropriate here.

@SychO9
Member Author

SychO9 commented Apr 9, 2022

For details on \p{M}, see https://www.regular-expressions.info/unicode.html
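As a rough illustration of what `\p{M}` matches, the sketch below lists each code point of a string and reports whether it belongs to the Unicode Mark category. The `describeMarks` helper is hypothetical and exists only for this example.

```javascript
// Hypothetical helper: report, per code point, whether it is a Unicode Mark.
function describeMarks(text) {
  return Array.from(text).map((ch) => ({
    char: ch,
    codePoint: 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'),
    isMark: /^\p{M}$/u.test(ch), // true for combining marks such as vowel signs
  }));
}

// In नागरी, the second and fifth code points are vowel signs (marks).
for (const entry of describeMarks('नागरी')) {
  console.log(entry.codePoint, entry.isMark ? 'mark' : 'not a mark');
}
```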

Member

@askvortsov1 askvortsov1 left a comment

Wow... This is kind of staggering, not gonna lie: if this has been the main reason completely breaking UTF-8 search, this might genuinely be one of the most impactful PRs in Flarum history.

Member

@askvortsov1 askvortsov1 left a comment

Only thought: perhaps we should also test that we can search titles for other languages?

Member

@luceos luceos left a comment

At least there's tests now 🙈

@SychO9 SychO9 added this to the 1.3 milestone Apr 9, 2022
@SychO9 SychO9 merged commit 6de1ea0 into main Apr 9, 2022
@SychO9 SychO9 deleted the sm/fix-searching-with-other-languages branch April 9, 2022 22:04
@justoverclockl

Will this also affect filter options? Currently, filtering doesn't work with other languages:

 app.store
      .find('discussions', {
        'filter[q]':  ' తెలుగు భాష',
      });

@SychO9
Member Author

SychO9 commented Apr 11, 2022

when you use filter[q] you're performing a search, so yes.
