Config driven detectors - part 2 #462
Conversation
Looks good overall, just wanted some clarification on the conventions for the `pydantic` models, specifically wrt `Optional` values and `None` defaults.
I also saw a few `TODO`s wrt conditional checking of required parameters given some other parameters. Is this fairly easy to add (I'm assuming `pydantic` validators)?
I believe this is possible, but tbh I haven't looked into it too deeply as I wanted to get the basic functionality in first.
@jklaise I've resolved a few of your comments, feel free to unresolve if you disagree. I've responded to the remaining larger comments, would you be able to take a look and mark them as resolved (or not) please?
LGTM
Addition of pydantic validation of detector configs
This is the second in a series of PRs for the config-driven detector functionality. The original PR (#389) has been split into a number of smaller PRs to aid the review process.
Summary of PR
This PR introduces validation of the detector config files using pydantic. Pydantic models are defined for each detector in `alibi_detect.utils.schemas`. These are then used in `alibi_detect.utils.loading.validate_config`, a public function to validate a detector's config dictionary.
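Roughly speaking, the idea can be sketched as follows (illustrative only, not the actual implementation; the model and field names below are hypothetical, whereas the real models live in `alibi_detect.utils.schemas`):

```python
# Hypothetical sketch of the validate_config idea (pydantic v1-style syntax).
# The real models and registry live in alibi_detect.utils.schemas /
# alibi_detect.utils.loading; names and fields here are illustrative only.
from typing import Optional
from pydantic import BaseModel


class KSDriftConfig(BaseModel):
    """Simplified, hypothetical model for an unresolved KSDrift config."""
    name: str
    p_val: float = .05
    preprocess_fn: Optional[dict] = None


# mapping from detector name to its (unresolved) config model
DETECTOR_CONFIGS = {'KSDrift': KSDriftConfig}


def validate_config(cfg: dict, resolved: bool = False) -> dict:
    """Validate a detector config dict and return it with defaults filled in."""
    model = DETECTOR_CONFIGS[cfg['name']]  # a separate resolved registry would be used if resolved=True
    return model(**cfg).dict()


print(validate_config({'name': 'KSDrift'}))
# {'name': 'KSDrift', 'p_val': 0.05, 'preprocess_fn': None}
```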
Use cases
The `validate_config` function will be used in a number of ways. Consider the `load_detector` workflow (functionality to be introduced in part III), where `validate_config` will be used:

- On the config dict before artefacts are resolved, e.g. `validate_config(cfg)`.
- After `resolve_config` has been called (so artefacts are now runtime objects such as ndarray's), e.g. `validate_config(cfg, resolved=True)`.
- Directly by the user, who has loaded a config dict with `read_config` and perhaps manipulated it (see the sketch below). It is for this use case that `validate_config` has been made public.
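For the last (user-facing) use case, the flow might look something like this (a sketch only; `read_config` lands in part III, its import location is assumed here, and exact signatures may differ):

```python
# Sketch of the user-facing use case. Import locations and read_config's
# signature are assumptions; only validate_config(cfg) is confirmed above.
from alibi_detect.utils.loading import read_config, validate_config

cfg = read_config('config.toml')   # load the config dict from file
cfg['p_val'] = 0.01                # user manipulates the dict (hypothetical field)
cfg = validate_config(cfg)         # validate and fill in defaults before further use
```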
Validate config return
A small (but important!) detail: `validate_config` returns a `dict`, which is the output of the pydantic model's `.dict()` method. This is the same config dict that was passed in, but with missing fields populated by their default values. This is a somewhat philosophical design choice (I think), in that it means internally we always deal with fully populated config dicts. We can set additional metadata etc. in `schemas.py` at a later date, and can set defaults for artefact dictionaries etc. (see below). A downside is that we will have to ensure the kwarg defaults set in `schemas.py` are kept in sync with the detector kwargs themselves (though this could also be a positive if we wanted divergence in some cases).
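To illustrate the return-value behaviour with plain pydantic, independent of the actual alibi-detect models (field names here are made up):

```python
# Plain pydantic (v1-style) demo: unset fields come back populated with
# defaults via .dict(), and invalid values raise a clear validation error.
from typing import Optional
from pydantic import BaseModel, ValidationError


class ExampleConfig(BaseModel):
    name: str
    p_val: float = .05                     # must be kept in sync with the detector kwarg default
    preprocess_fn: Optional[dict] = None


print(ExampleConfig(name='MMDDrift').dict())
# {'name': 'MMDDrift', 'p_val': 0.05, 'preprocess_fn': None}

try:
    ExampleConfig(name='MMDDrift', p_val='not-a-float')
except ValidationError as e:
    print(e)  # p_val: value is not a valid float
```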
Pydantic models for artefacts
Artefacts are specified in a config with dictionaries, which may then have further artefact dictionaries within them (see `preprocess_fn`, `model` etc. in the `config.toml` example in the schematic above). These artefact dictionaries are also validated, e.g. see `PreprocessConfig` and `PreprocessConfigResolved` in `schemas.py`.
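A rough sketch of this nesting, with hypothetical fields (see `PreprocessConfig` and `PreprocessConfigResolved` in `schemas.py` for the real definitions):

```python
# Hypothetical sketch of nested artefact validation (pydantic v1-style).
# Field names are illustrative; the real models are PreprocessConfig and
# PreprocessConfigResolved in alibi_detect.utils.schemas.
from typing import Optional
from pydantic import BaseModel


class PreprocessConfig(BaseModel):
    src: str                  # e.g. a filepath or registry reference for the preprocessing function
    kwargs: dict = {}         # kwargs passed to the preprocessing function


class DriftDetectorConfig(BaseModel):
    name: str
    preprocess_fn: Optional[PreprocessConfig] = None   # nested artefact dictionary


# the nested preprocess_fn dict is validated against PreprocessConfig automatically
cfg = DriftDetectorConfig(name='KSDrift', preprocess_fn={'src': 'preprocess_fn.dill'})
print(cfg.dict())
```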
Example
Example notebook:
https://gist.github.com/ascillitoe/8ea123776151368a7a6a91688a1809cb