Description
System Information (please complete the following information):
$ mlnet --version
16.2.0+511cc1082bef7a4bbb2f83ad88c1c425c932079c
$ dotnet --version
3.1.402
$ lsb_release -d
Description: Ubuntu 19.10
Describe the bug
I've trained a regression model using AutoML on Linux, and am now trying to run Permutation Feature Importance (PFI) on it by following the documentation. [URLs in "Additional context", below]
Things that are hard to follow:
- Model Builder gave me a ModelInput class, but it's unclear how to use a ModelInput as the data: parameter. An elegant way to turn a ModelInput, or a collection of them, into an IDataView would be nice. Currently the only path I've found is to instantiate a fresh IDataView from my original CSV data using mlContext.Data.LoadFromTextFile [code that I lifted from the generated ModelBuilder.cs]
- Getting the predictionTransformer parameter currently takes an awkward cast [which I was only able to figure out by opening a debugger and checking the types] followed by .LastTransformer. Presumably this won't be copy-pastable to the next model if Model Builder picks a different best model.
- I still haven't figured out how to map the PFI results back to the original columns in the IDataView [or the members of ModelInput]
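For reference, here is a rough sketch of the workflow the three points above are about, written as one generic helper. It is an untested sketch under stated assumptions, not a confirmed recipe: PfiSketch and PrintFeatureImportance are hypothetical names, the prediction transformer is passed in already cast (so the awkward cast is not solved here), and it assumes the feature column carries slot-name annotations, which may not hold for every pipeline AutoML produces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper; not part of ML.NET or of the generated sample project.
public static class PfiSketch
{
    public static void PrintFeatureImportance<TRow, TModel>(
        MLContext mlContext,
        ITransformer fullModel,                                 // the whole loaded/trained chain
        ISingleFeaturePredictionTransformer<TModel> predictor,  // already extracted and cast by the caller
        IEnumerable<TRow> rows,                                 // e.g. a collection of ModelInput
        string labelColumnName)
        where TRow : class
        where TModel : class
    {
        // Turn in-memory objects into an IDataView without re-reading the CSV.
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Run the chain so that the feature vector column the predictor reads exists.
        IDataView transformed = fullModel.Transform(data);

        // One RegressionMetricsStatistics entry per slot of the feature vector.
        var pfi = mlContext.Regression.PermutationFeatureImportance(
            predictor, transformed, labelColumnName: labelColumnName, permutationCount: 10);

        // Slot names, when present, label each index of the feature vector.
        // Assumption: the column has the annotation; otherwise fall back to indices.
        DataViewSchema.Column features = transformed.Schema[predictor.FeatureColumnName];
        string[] names = null;
        if (features.HasSlotNames())
        {
            VBuffer<ReadOnlyMemory<char>> slotNames = default;
            features.GetSlotNames(ref slotNames);
            names = slotNames.DenseValues().Select(n => n.ToString()).ToArray();
        }

        // Sort by mean change in R-squared; the most negative change comes first.
        var ranked = Enumerable.Range(0, pfi.Length)
            .Select(i => new
            {
                Name = names != null && i < names.Length ? names[i] : $"slot {i}",
                Change = pfi[i].RSquared.Mean
            })
            .OrderBy(x => x.Change);

        foreach (var x in ranked)
            Console.WriteLine($"{x.Name}\t{x.Change:F4}");
    }
}
```

Even if this works, the slot names are names of feature vector slots (for example one slot per one-hot encoded category), so they do not necessarily map one-to-one onto ModelInput members.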
Expected behavior
Personally I think the perfect product here would be one or more of a few things, any of which would improve my ability to access the functionality:
- An additional .cs file in the wizard-created Sample that can do PFI on the created model [this would be ideal, and probably the least invasive change!]
- Additional overloads of PermutationFeatureImportance that could take a TransformerChain instead of just the single last transformer [the current workflow also assumes that LastTransformer is the important part of the chain]
- A way to turn an auto-generated ModelInput [or a collection of them] into an IDataView.
- A way to map ModelInput members to the feature indices in the results returned by PFI.
Additional context
https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/explain-machine-learning-model-permutation-feature-importance-ml-net
and
https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.permutationfeatureimportanceextensions.permutationfeatureimportance?view=ml-dotnet#Microsoft_ML_PermutationFeatureImportanceExtensions_PermutationFeatureImportance__1_Microsoft_ML_RegressionCatalog_Microsoft_ML_ISingleFeaturePredictionTransformer___0__Microsoft_ML_IDataView_System_String_System_Boolean_System_Nullable_System_Int32__System_Int32_