Open
Labels
enhancement: New feature or request
Description
Initial Checks
- I have searched the existing issues and this feature has not been requested
Problem Description
The scripts used to evaluate models for GDPval-AA would be a valuable practical example of how to use Stirrup, particularly for users who want to evaluate open models on GDPval.
Proposed Solution
See https://artificialanalysis.ai/methodology/intelligence-benchmarking#gdpval-aa
Alternatives Considered
No response