[Bug]: Text match has issues with the Chinese tokenizer, where punctuation marks and spaces are also treated as separate words #37419
Labels
2.5-features
feature/text match
kind/bug
Issues or changes related to a bug
severity/critical
Critical: leads to a crash, data loss, wrong results, or a completely broken function.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
Milestone
Is there an existing issue for this?
Environment
Current Behavior
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
Failed CI:
https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus%20HA%20CI/detail/PR-37119/17/pipeline/

The query '公司 自己 数据 上海 电影 今年 朋友 增加 日期 一起'
matches the document '有限电脑名称作者. 资源次数过程安全参加详细. 合作各种中国. 开始就是虽然表示. 留言游戏一直加入. 方式完成只有经济加入这是. 你好 生产当前完全留言日本支持决定.'
even though none of the query terms appear in the document, because the spaces and punctuation marks shared by both texts are themselves treated as matching tokens.
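As a minimal sketch of the suspected failure mode (not Milvus's actual tokenizer; `buggy_tokenize` and `text_match` below are hypothetical stand-ins), treating whitespace and punctuation as ordinary tokens makes any two texts that both contain a space "match":

```python
import re

def buggy_tokenize(text):
    # Hypothetical reproduction of the reported behavior: spaces and
    # punctuation marks are emitted as tokens alongside word runs.
    return re.findall(r"\s|[^\w\s]|\w+", text)

def text_match(query, doc):
    # A document "matches" if it shares at least one token with the query.
    return bool(set(buggy_tokenize(query)) & set(buggy_tokenize(doc)))

query = "公司 自己 数据"
doc = "你好 生产当前"  # contains none of the query words
# The only shared token is the space character, yet the match succeeds:
print(text_match(query, doc))  # True
```

Filtering out whitespace and punctuation tokens before the set intersection (or in the analyzer itself) would make this example correctly return `False`.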
Anything else?
No response