Description
Tell us about the task you want to perform and are unable to do so because the feature is not available
Develop an end-to-end AI API eval framework and integrate it into API Dash. This framework should (list is suggestive, not exhaustive):
- Provide an intuitive interface for configuring API requests, where users can input test datasets, configure request parameters, and send queries to various AI API services
- Support evaluation of AI APIs (text, multimedia, etc.) across various industry task benchmarks
- Allow users to add custom datasets/benchmarks & criteria for evaluation. These custom scoring mechanisms allow tailored evaluations based on specific project needs
- Visualize the results of API eval via tables, charts, and graphs, making it easy to identify trends, outliers, and performance variations
- Allow execution of batch evaluations
- Work with both offline & online models and datasets
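
To make the batch evaluation and custom scoring requirements above more concrete, here is a minimal sketch in Python. It is only an illustration, not a proposed design for API Dash: the names `Example`, `run_batch_eval`, `exact_match`, and `offline_stub_model` are all hypothetical, and the "model" is a local stub standing in for an offline model or an online AI API call.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable


@dataclass
class Example:
    """One row of a (hypothetical) evaluation dataset."""
    prompt: str
    expected: str


# Assumed shapes: a model maps a prompt to a text output,
# a scorer maps (output, expected) to a numeric score.
Model = Callable[[str], str]
Scorer = Callable[[str, str], float]


def exact_match(output: str, expected: str) -> float:
    """Example custom criterion: case-insensitive exact match."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_batch_eval(model: Model, dataset: Iterable[Example], scorer: Scorer) -> dict:
    """Run every example through the model and score each output."""
    rows = []
    for ex in dataset:
        output = model(ex.prompt)
        rows.append({
            "prompt": ex.prompt,
            "output": output,
            "expected": ex.expected,
            "score": scorer(output, ex.expected),
        })
    # Per-row results feed tables/charts; the mean is a simple summary.
    return {
        "rows": rows,
        "mean_score": mean(r["score"] for r in rows) if rows else 0.0,
    }


def offline_stub_model(prompt: str) -> str:
    """Stand-in for an offline model or an online API request."""
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "")


dataset = [
    Example("2+2", "4"),
    Example("capital of France", "paris"),
    Example("largest planet", "Jupiter"),
]
report = run_batch_eval(offline_stub_model, dataset, exact_match)
```

Because the model and the scorer are plain callables in this sketch, swapping in a different AI API, dataset, or scoring criterion does not require changes to the batch runner; the per-row results could then be passed to whatever table or chart view is chosen.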