Popular repositories
-
ERR-EVAL (Public)
🔍 Evaluate AI models' ability to detect ambiguity and manage uncertainty with the ERR-EVAL benchmark for reliable epistemic reasoning.
Python
-
prorok9898.github.io (Public)
🔍 Evaluate AI models' reliability against ambiguity and uncertainty with the ERR-EVAL benchmark, ensuring accurate and calibrated responses in challenging scenarios.