Closed
Description
openedon Aug 27, 2018
When calculating the top-k accuracy for multi-class classifiers the current evaluation code is pretty slow. Especially if multiple Top-K accuracies are desired (this will re-do a lot of the work unnecessarily).
Current implementation will first sort (N logN
):
Then get the index:
A more efficient algorithm would be to just calculate the rank (O(N)
and no memory needed) of the correct label and keeping track of the seen ranks (0 being the best-case, correct prediction). Then the Top-k accuracy can be easily returned for any k
. Possibly changing the API to returning a vector of top-k predictions.
One issue to discuss: What happens when the score is equal for multiple values? There would be a "best-case" and "worst-case" top-k accuracy.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment