Do the Heuristic Query Matching Crawling Strategies work for languages other than English? #1742
Unanswered
Pencoding1
asked this question in
Forums - Q&A
Replies: 1 comment
To be clear, I know that for queries this complex I should use embedding strategies. However, I would like to explore light-weight approaches first, because my priority is for the crawler to run as fast as possible.
Hello everyone,
I am wondering whether standard query matching strategies are effective for languages other than English and the others commonly discussed here, specifically Vietnamese.
I am investigating heuristic-based crawling strategies, as my research suggests that current models struggle with Vietnamese due to its inherent complexity (e.g., diacritics/accents and compound word structures where multiple syllables form a single semantic unit).
For example:
Could you suggest alternative ways to improve query scoring algorithms? Additionally, am I mistaken in my assumptions? Does BM25 actually perform well for Vietnamese in practice?
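To make the question concrete, here is a minimal sketch of a light-weight baseline: plain BM25 with two Vietnamese-oriented preprocessing steps. It is not the crawler library's scorer; the names `tokenize` and `bm25_scores` are hypothetical, and the adjacent-syllable bigrams are only a crude stand-in for a real Vietnamese word segmenter. NFC normalization is there so that precomposed and decomposed diacritics compare equal, and the bigrams let a two-syllable compound such as "đại học" match as one unit.

```python
import math
import unicodedata
from collections import Counter


def tokenize(text):
    """Hypothetical helper: NFC-normalize, lowercase, split on whitespace,
    then emit syllables plus adjacent-syllable bigrams (joined with '_')."""
    text = unicodedata.normalize("NFC", text).lower()
    syllables = [s.strip(".,;:!?()\"'") for s in text.split()]
    syllables = [s for s in syllables if s]
    # Bigrams approximate compound words without a word segmenter.
    bigrams = [a + "_" + b for a, b in zip(syllables, syllables[1:])]
    return syllables + bigrams


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with standard BM25."""
    if not docs:
        return []
    doc_tokens = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in doc_tokens) / n or 1.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in doc_tokens:
        df.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for toks in doc_tokens:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "Học sinh đi học ở trường đại học",
    "Sinh viên học máy tính",
    "Thời tiết hôm nay đẹp",
]
scores = bm25_scores("đại học", docs)
```

With these three toy documents, the first one should rank highest because it matches both syllables and the `đại_học` bigram, while the second only matches the single syllable "học". Whether this holds up on real Vietnamese pages is exactly what I am unsure about.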