Closed as not planned
Description
From an old discussion in our forums I just learned about another interesting-looking ranking evaluation metric called "bpref", which is used in some TREC competitions and is advertised to work well with incomplete data.
I'm opening this issue to do some more investigation into this and other evaluation metrics that we haven't considered yet.
Regarding bpref, it is currently unclear to me:
- how widely used it is
- in which use cases it might perform better than the metric we currently offer
- whether we can implement it with our current API, which is based on msearch, or whether we would need to change something to make it work
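
For reference while investigating, here is a minimal sketch of how bpref is commonly described: each retrieved relevant document is penalized by the number of judged non-relevant documents ranked above it, and unjudged documents are ignored entirely. This follows the trec_eval-style variant as I understand it (normalizing by min(R, N)); the function name `bpref` and its parameters are hypothetical and not part of our API, so treat it as an illustration rather than a definitive implementation.

```python
def bpref(ranking, relevant, nonrelevant):
    """Sketch of bpref for one query (assumed trec_eval-style variant).

    ranking:     list of document ids in ranked order
    relevant:    set of ids judged relevant
    nonrelevant: set of ids judged non-relevant
    Ids in neither set are unjudged and do not affect the score.
    """
    num_rel = len(relevant)
    num_nonrel = len(nonrelevant)
    if num_rel == 0:
        return 0.0

    # Assumption: normalize by min(R, N) so each penalty stays in [0, 1].
    denom = min(num_rel, num_nonrel)
    nonrel_seen = 0
    score = 0.0
    for doc in ranking:
        if doc in relevant:
            if denom == 0:
                # No judged non-relevant documents: no penalty possible.
                score += 1.0
            else:
                score += 1.0 - min(nonrel_seen, num_rel) / denom
        elif doc in nonrelevant:
            nonrel_seen += 1
        # Unjudged documents are skipped on purpose.
    return score / num_rel


ranking = ["d1", "u1", "d2", "d3", "d4"]  # "u1" has no judgment
print(bpref(ranking, relevant={"d1", "d3"}, nonrelevant={"d2", "d4"}))
```

Since the computation only needs the ranked document ids and the two judged sets, this suggests the inputs are similar to what a rated-documents approach already collects, but that would still need to be verified against the msearch-based API.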