Although designed with genericity in mind, active_flow has to make assumptions about the objects it manipulates.
When one calls
python aflow.[active|interactive] create_model {external_module.model_class} ...
active_flow will create the model from the create_model arguments, given the dependencies available in the current Python environment.
This instance (the model) has to pass certain checks. If it does not, active_flow will fail gracefully and report the actions that can be taken for the model to be accepted by the CLI.
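
To make the idea concrete, here is a minimal sketch of how a dotted path such as {external_module.model_class} could be resolved to a class, instantiated, and checked. This is not active_flow's actual implementation: load_class and report_missing are hypothetical helpers, and collections.Counter is only a stand-in for a user-supplied class that fails the check.

```python
import importlib


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def report_missing(model, any_of):
    """Return a list of suggested actions, empty if the model is accepted.

    `any_of` lists method names of which at least one must be callable.
    Hypothetical helper, not part of active_flow.
    """
    if any(callable(getattr(model, name, None)) for name in any_of):
        return []
    options = " or ".join(f"{name}()" for name in any_of)
    return [f"{type(model).__name__} should implement {options}"]


# Stand-in for {external_module.model_class}; any importable class works.
model_class = load_class("collections.Counter")
model = model_class()

# Instead of raising deep inside the loop, collect actionable messages up front.
actions = report_missing(model, any_of=("fit", "partial_fit"))
for action in actions:
    print(action)
```

Collecting messages rather than raising on the first problem is one way to "fail gracefully": the user sees everything that needs fixing in a single run.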
An estimator submitted to the aflow.active CLI must implement a fit method, as defined in the scikit-learn API, or a partial_fit method, so that the model can be updated incrementally once targets have been queried from the user for a given batch.
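
As an illustration, the sketch below shows a toy estimator exposing both methods, and a loop that updates it batch by batch. MajorityClassifier is a hypothetical class written for this example. It follows the scikit-learn conventions that fitting methods return self and that learned attributes carry a trailing underscore, but it is not taken from active_flow or scikit-learn.

```python
from collections import Counter


class MajorityClassifier:
    """Toy estimator (hypothetical): predicts the most frequent label seen so far."""

    def partial_fit(self, X, y):
        # Incremental update: add this batch's labels to the running counts.
        if not hasattr(self, "counts_"):
            self.counts_ = Counter()
        self.counts_.update(y)
        return self  # scikit-learn convention: fitting methods return self

    def fit(self, X, y):
        # Full fit: discard any previous state, then learn from scratch.
        self.counts_ = Counter()
        return self.partial_fit(X, y)

    def predict(self, X):
        label = self.counts_.most_common(1)[0][0]
        return [label for _ in X]


model = MajorityClassifier()

# Each batch stands for samples whose targets were just queried from the user.
batches = [
    ([[0], [1]], ["a", "b"]),
    ([[2], [3], [4]], ["b", "b", "a"]),
]
for X_batch, y_batch in batches:
    model.partial_fit(X_batch, y_batch)

predictions = model.predict([[5], [6]])
```

With partial_fit the model keeps what it learned from earlier batches, whereas calling fit on each new batch would start over every time.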
The API for data mining is still a work in progress. For now, the main design requires a miner to provide:
- a prefit method, to be called on the input data before the main (interactive) active_flow loop
- a generate_candidates method that takes no argument and returns an iterator (generators included)
- an update method to update the model given a candidate generated by generate_candidates and validated by the oracle

Methods like evaluate are not required, as evaluation is done by the oracle in our typical workflow.
"any_time" methods like skmine.itemsets.SLIM generate a new candidate given the freshly updated model, which results in a kind of "online" interactive pattern mining workflow.
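
The miner interface described above can be sketched as follows. Since the API is still being designed, everything here is an assumption made for illustration: FrequentItemMiner, its attributes, and the oracle function are hypothetical, and only the three method names (prefit, generate_candidates, update) come from the description.

```python
from collections import Counter


class FrequentItemMiner:
    """Toy miner (hypothetical): proposes single items, most frequent first."""

    def prefit(self, D):
        # Called once on the input data, before the interactive loop.
        self.item_counts_ = Counter(
            item for transaction in D for item in transaction
        )
        self.accepted_ = []
        return self

    def generate_candidates(self):
        # Takes no argument and returns an iterator (here, a generator).
        for item, _ in self.item_counts_.most_common():
            if item not in self.accepted_:
                yield item

    def update(self, candidate):
        # Called only for candidates that the oracle has validated.
        self.accepted_.append(candidate)
        return self


def oracle(candidate):
    # Stand-in for the user: accepts every candidate except "beer".
    return candidate != "beer"


D = [["bread", "beer"], ["bread", "milk"], ["bread", "beer", "milk"]]
miner = FrequentItemMiner().prefit(D)

# Interactive loop: the oracle, not the miner, evaluates each candidate.
for candidate in miner.generate_candidates():
    if oracle(candidate):
        miner.update(candidate)
```

Because generate_candidates is a lazy generator, each candidate is produced only after the previous one has been accepted or rejected, so the generator can take the updated model state into account.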