
Validate feature: setting baseline models #266


Merged: 11 commits into master from validate/set_model_baseline, Mar 29, 2022

Conversation

@sasha-scale (Contributor) commented Mar 24, 2022

Changes the interface for Scenario Test creation so that thresholds no longer need to be set up front. Instead of passing in test criteria, the user initializes the test with a list of evaluation functions, from which criteria are created with a null threshold.

Users can later customize this threshold with metric.set_threshold.
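
A rough sketch of the new flow (the create_scenario_test call mirrors the repro later in this thread; the metric accessor and threshold value are illustrative assumptions, not a definitive API):

```python
import os

import nucleus

client = nucleus.NucleusClient(os.environ["API_KEY"])

# Criteria are no longer passed up front: the test is created from a list of
# evaluation functions, and each resulting criterion starts with a null threshold.
scenario_test = client.validate.create_scenario_test(
    "baseline test images",                      # test name (placeholder)
    slice_id="slc_c81fxadwzftg238jb5zg",         # slice ID reused from the repro below
    evaluation_functions=[client.validate.eval_functions.bbox_iou()],
)

# Later, customize the threshold on an individual metric; the accessor name is
# assumed from the get_eval_functions rename agreed on later in this thread.
metric = scenario_test.get_eval_functions()[0]
metric.set_threshold(0.5)  # placeholder threshold value
```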

This PR also introduces the API interface for setting a model as a baseline for a whole unit test.
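
A hedged sketch of that baseline interface as well; the PR text only says a model can be set as the baseline for a whole test, so set_baseline_model is a hypothetical method name:

```python
# Hypothetical call: mark an existing model as the baseline for this scenario test.
scenario_test.set_baseline_model("model_id_placeholder")
```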

Implemented pytest coverage for both.

Shortcut ticket: https://app.shortcut.com/scaleai/story/399936/remove-manual-thresholding-from-api-interface-for-validate

Corresponding scaleapi side PR: https://github.com/scaleapi/scaleapi/pull/37635

Note: the build won't pass until the scaleapi side is deployed, but we expect pytest to pass when run locally against those changes.

@pfmark (Contributor) commented Mar 25, 2022

Running into
{"status_code":400,"error":"Input schema validation failed: \"evaluationCriteria\" is required When reporting an issue add the request_id:'040ed360-593d-9316-b99a-142b41d3f4c1'"}

when setting up a test through the API with

client.validate.create_scenario_test("new baseline test images", slice_id="slc_c81fxadwzftg238jb5zg", evaluation_functions=[client.validate.eval_functions.bbox_iou()])

I guess this is some Pydantic issue, right?

@sasha-scale (Contributor, Author)

If you have difficulty testing this PR (getting 400 errors), you are probably hitting the production scaleapi server instead of the local feature branch.

You would initialize the client like this: client = nucleus.NucleusClient(os.environ.get("API_KEY"), endpoint='http://localhost:3000/v1/nucleus'), where localhost:3000 is serving the other feature branch.

To run pytest against the feature branch, run export NUCLEUS_ENDPOINT='http://localhost:3000/v1/nucleus' in the shell where you're running pytest.
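
Restated as a minimal, runnable sketch (assuming the scaleapi feature branch is serving on localhost:3000):

```python
import os

import nucleus

# Point the client at the local scaleapi feature branch instead of production.
client = nucleus.NucleusClient(
    os.environ.get("API_KEY"),
    endpoint="http://localhost:3000/v1/nucleus",
)
```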

@sasha-scale requested a review from a team March 25, 2022 23:17
@pfmark (Contributor) commented Mar 28, 2022

We also need to update the scenario_test.add_criterion(...) method.

I tried st.add_criterion(client.validate.eval_functions.bbox_map()), which doesn't work.

Let's also rename it to add_eval_function(...) in order to avoid confusion.

@sasha-scale (Contributor, Author)

Renamed add_criterion to add_eval_function
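
With the rename, the call from the comment above becomes (st and bbox_map are reused from the earlier repro; a sketch, not a definitive signature):

```python
# Attach an additional evaluation function to an existing scenario test.
st.add_eval_function(client.validate.eval_functions.bbox_map())
```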

@pfmark (Contributor) commented Mar 28, 2022

Tested locally and interaction with the backend works!

My bad, I forgot this earlier: let's also rename metric.get_criteria() to metric.get_eval_functions() to keep it consistent.

@pfmark self-requested a review March 28, 2022 20:43
@pfmark (Contributor) left a comment

Let's test after the backend is deployed, but LGTM after the rename suggestion.

@sasha-scale merged commit c9f1f59 into master Mar 29, 2022
@sasha-scale deleted the validate/set_model_baseline branch March 29, 2022 23:57
gatli pushed a commit that referenced this pull request Mar 30, 2022
add new set model as baseline functions to client, remove add_criteria in favor of add_eval_function, bump version number and changelog