Open
Description
If we can find a standardized way for images to report their expected effectiveness metrics (AP, NDCG, etc.), the jig could run each image and automatically verify that the scores it produces match.
That way, we'd be able to do end-to-end regression testing on entire IR systems!
I think this would be a pretty cool feature...
Thoughts?
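To make the idea concrete, here is a rough sketch of what the check could look like. Everything in it is an assumption for illustration: the `expected_metrics` manifest key, the metric names, the `check_scores` helper, and the tolerance are all hypothetical, and nothing here reflects how the jig works today.

```python
import json
import math


def check_scores(expected, measured, rel_tol=1e-4):
    """Return (metric, expected, measured) tuples for every metric that is
    missing from `measured` or differs from `expected` beyond `rel_tol`."""
    failures = []
    for metric, want in expected.items():
        got = measured.get(metric)
        if got is None or not math.isclose(got, want, rel_tol=rel_tol):
            failures.append((metric, want, got))
    return failures


# Hypothetical manifest an image might ship alongside its run
# (the key name and the values are made up for this example).
manifest = json.loads('{"expected_metrics": {"map": 0.2531, "ndcg": 0.4204}}')

# Scores the jig would compute after evaluating the run (also made up).
measured = {"map": 0.2531, "ndcg": 0.4150}

failures = check_scores(manifest["expected_metrics"], measured)
for metric, want, got in failures:
    print(f"{metric}: expected {want}, got {got}")
```

An empty list would mean the image reproduced its reported numbers. A relative tolerance is used rather than exact equality because of rounding in reported scores, though the right threshold is an open question.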