Modifying token weights using IDF #354

LiquidGunay · 2024-07-03T13:22:32Z

When calculating MaxSim, we could modify the similarity score using the inverse document frequency (IDF) of the word that each token is part of, giving it a BM25-esque flavour and trying to combine the strengths of bi-encoders and keyword search. Has anyone tried this approach? I was thinking about this because I haven't found an approach that works reliably on data that is very wordy but also contains many numbers.
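
To make the idea concrete, here is a minimal sketch of one way it could look, not a tested method or part of any library. It assumes each query token already has an embedding plus a weight taken from the IDF of the word it belongs to (the subword-to-word mapping and the document-frequency counts are assumed to come from elsewhere). The names `bm25_idf` and `idf_weighted_maxsim` are hypothetical, and the IDF formula is the common non-negative BM25 variant.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def bm25_idf(doc_freq: int, n_docs: int) -> float:
    """Non-negative BM25-style IDF: rare words get larger weights."""
    return math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def dot(a: Vector, b: Vector) -> float:
    """Dot product; equals cosine similarity if both vectors are unit-length."""
    return sum(x * y for x, y in zip(a, b))


def idf_weighted_maxsim(
    query_vecs: Sequence[Vector],
    doc_vecs: Sequence[Vector],
    query_weights: Sequence[float],
) -> float:
    """Sum over query tokens of weight * (best similarity to any document token).

    With all weights equal to 1.0 this reduces to plain MaxSim.
    Assumes the document has at least one token.
    """
    if len(query_vecs) != len(query_weights):
        raise ValueError("need exactly one weight per query token")
    score = 0.0
    for q, w in zip(query_vecs, query_weights):
        # MaxSim step: best-matching document token for this query token.
        best = max(dot(q, d) for d in doc_vecs)
        # Hypothetical IDF step: scale the match by the word's rarity.
        score += w * best
    return score


# Toy usage with made-up 2-D embeddings and made-up document frequencies:
# a very common word (df=900 of 1000 docs) and a rare number (df=3 of 1000 docs).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[1.0, 0.0], [0.6, 0.8]]
weights = [bm25_idf(900, 1000), bm25_idf(3, 1000)]
print(idf_weighted_maxsim(query_vecs, doc_vecs, weights))
```

Whether the weighting should be multiplicative like this, or whether the weights should be normalised per query, is an open design choice; the sketch only shows where an IDF term could enter the MaxSim sum.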


Replies: 0 comments
