Modifying token weights using IDF #354
Unanswered
LiquidGunay
asked this question in
Ideas
Replies: 0 comments
When calculating MaxSim, we could modify the similarity score using the inverse document frequency (IDF) of the word that each token is part of, giving it a BM25-esque flavour and trying to combine the strengths of bi-encoders and keyword search. Has anyone tried this approach? I was thinking of this because I haven't found an approach that works reliably on data that is very wordy but also contains many numbers.
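To make the idea concrete, here is a rough sketch of what I have in mind, not a tested implementation. It assumes each query token already carries the IDF of the word it belongs to, uses the non-negative BM25-style IDF formula, and scores with plain dot products over toy 2-d embeddings. The names `bm25_idf` and `idf_weighted_maxsim` and all the numbers are made up for illustration and are not part of any library.

```python
import math


def bm25_idf(doc_freq, n_docs):
    """BM25-style IDF (non-negative variant): rare words get larger weights."""
    return math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def idf_weighted_maxsim(query_vecs, doc_vecs, token_idf):
    """Sum over query tokens of (IDF weight * best match among doc tokens).

    token_idf[i] is assumed to be the IDF of the word that query token i
    is part of; with all weights equal to 1.0 this is plain MaxSim.
    """
    score = 0.0
    for q, w in zip(query_vecs, token_idf):
        score += w * max(dot(q, d) for d in doc_vecs)
    return score


# Toy data (hypothetical): two query tokens, two document tokens.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.6, 0.8]]

# Hypothetical corpus statistics: first word is common, second is a rare number.
n_docs = 1000
token_idf = [bm25_idf(900, n_docs), bm25_idf(5, n_docs)]

plain = idf_weighted_maxsim(query_vecs, doc_vecs, [1.0, 1.0])
weighted = idf_weighted_maxsim(query_vecs, doc_vecs, token_idf)
print(plain, weighted)
```

With these weights, the match on the rare token dominates the score, which is the behaviour I would want for number-heavy text. Whether the weights should be normalised, or applied on the document side as well, is an open question.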