Fixed incorrect Telugu normalization of vu వు to ma మ #14699

Open. praveen-d291 wants to merge 1 commit into main.


@praveen-d291 praveen-d291 commented May 22, 2025

Fixes: #14659

Remove the incorrect Telugu వు/మ conflation in Indic normalization. The two look similar, but they are distinct and carry different meanings.

Currently, "వు" is mapped to "మ" in the decompositions in IndicNormalizer. As a result, a search for "వెంకటరావు" also matches "వెంకటరామ", even though they are different names.
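To make the effect concrete, here is a minimal sketch of how such a mapping collapses the two names into one search key. It is a toy model, not Lucene's `IndicNormalizer`: the function names `normalize_with_conflation` and `normalize_fixed` are hypothetical, and the single string replacement only stands in for the decomposition rule described above.

```python
# Toy illustration only. This is NOT Lucene's IndicNormalizer; the function
# names and the single replacement rule are assumptions made for the example.

VU = "\u0c35\u0c41"  # వ (va) followed by ు (vowel sign u), rendered as వు
MA = "\u0c2e"        # మ (ma)

# The two names from the description, spelled out by code point.
RAO = "\u0c35\u0c46\u0c02\u0c15\u0c1f\u0c30\u0c3e\u0c35\u0c41"   # వెంకటరావు
RAMA = "\u0c35\u0c46\u0c02\u0c15\u0c1f\u0c30\u0c3e\u0c2e"        # వెంకటరామ


def normalize_with_conflation(text: str) -> str:
    """Hypothetical stand-in for the old behaviour: rewrite వు as మ."""
    return text.replace(VU, MA)


def normalize_fixed(text: str) -> str:
    """Hypothetical stand-in for the proposed behaviour: leave వు alone."""
    return text


if __name__ == "__main__":
    # With the conflation, both names normalize to the same key, so a
    # search for one also matches the other.
    print(normalize_with_conflation(RAO) == normalize_with_conflation(RAMA))
    # Without it, the names stay distinct.
    print(normalize_fixed(RAO) == normalize_fixed(RAMA))
```

Under these assumptions, the first comparison is true and the second is false, which mirrors the search behaviour reported above.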

I am a native speaker of Telugu.

Currently, a search for "వెంకటరావు" also lists "వెంకటరామ", even though they are different names. This commit fixes that.

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

github-actions bot commented Jun 6, 2025

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Jun 6, 2025
Development

Successfully merging this pull request may close these issues.

Remove Telugu normalization of vu వు to ma మ from IndicNormalizer