-
Notifications
You must be signed in to change notification settings - Fork 10
Validate feature: setting baseline models #266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Running into when setting up a test through the API with
I guess this is some Pydantic issue, right? |
If you have difficulty testing this PR (getting 400 errors), you are probably hitting the production scaleapi server instead of the local feature branch. you would initialize the client like this: To run pytest against the feature branch, do: |
We also need to update the I tried Let's also rename it to |
Renamed add_criterion to add_eval_function |
Tested locally and interaction with the backend works! My bad, forgot this earlier, let's also rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's test after the backend deployed, but lgtm after the rename suggestion
add new set model as baseline functions to client, remove add_criteria in favor of add_eval_function, bump version number and changelog
Changes the interface on Scenario Test Creation to not require setting thresholds up front. Instead of passing in a test criteria, a user instead initializes the test with a list of evaluation functions, from which criteria are created with threshold
null
.Users can later customize this threshold with
metric.set_threshold
This PR also introduces the API interface for setting a model as a baseline for a whole unit test.
Implemented pytest coverage for both.
Shortcut ticket: https://app.shortcut.com/scaleai/story/399936/remove-manual-thresholding-from-api-interface-for-validate
Corresponding scaleapi side PR: https://github.com/scaleapi/scaleapi/pull/37635
Note: build won't pass until scaleapi side is deployed, but we expect pytest to pass when run against those changes locally