Using PlattCalibratorTransformer with custom name for Score Column #4700
Comments
The exception is thrown after training because, it seems, the PlattCalibratorEstimator actually trains correctly, but it's the … This happens because when passing the desired … (machinelearning/src/Microsoft.ML.Data/Prediction/CalibratorCatalog.cs, lines 143 to 145 in 2267f8d)
The problem is that, when creating the transformer, it doesn't get to know what was the …
Then, when a mapper is created from the transformer, it doesn't know what the scoreColumnName was, and it's hardcoded to use the default name (which in this case is "Score") (machinelearning/src/Microsoft.ML.Data/Prediction/CalibratorCatalog.cs, lines 236 to 238 in 2267f8d)
It's in that line of code that the exception gets thrown. To fix this issue, I think it would be necessary to pass the scoreColumnName to the Create method that creates the transformer, and to add a field in the transformer that holds it so the mapper can use it. I am not sure whether this should also be done for LabelColumnName and WeightColumnName, but my guess is that it isn't necessary, since the mapper only needs the score column to work. These changes should also be checked against the other CalibratorTransformers (Isotonic, Naïve and Fixed).
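As a rough illustration of the fix proposed above, here is a minimal sketch in which the transformer keeps the score column name it was created with and resolves that name when mapping. The type `SketchCalibratorTransformer`, its `ScoreColumnName` property, and `ResolveScoreColumn` are hypothetical stand-ins, not the actual ML.NET classes or members; the schema is modelled as a plain list of column names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a calibrator transformer; NOT the ML.NET type.
public sealed class SketchCalibratorTransformer
{
    // Score column name chosen when the estimator was configured (assumed field).
    public string ScoreColumnName { get; }

    public SketchCalibratorTransformer(string scoreColumnName = "Score")
    {
        ScoreColumnName = scoreColumnName
            ?? throw new ArgumentNullException(nameof(scoreColumnName));
    }

    // Looks up the stored name in the schema instead of a hardcoded "Score".
    public int ResolveScoreColumn(IReadOnlyList<string> schema)
    {
        for (int i = 0; i < schema.Count; i++)
        {
            if (schema[i] == ScoreColumnName)
                return i;
        }
        throw new InvalidOperationException(
            $"The data to calibrate contains no '{ScoreColumnName}' column");
    }
}
```

With this shape, a transformer created with "ScoreX" resolves the "ScoreX" column, and one created with the default still looks for "Score".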
Another minor bug I found in the PlattCalibratorTransformer is that even when using a score column named "Score", if the column is the first one in the schema, then the Transformer throws this exception when transforming the input data view:
To reproduce it, take the sample code I left in the first post of this issue, and simply change the ModelInput class as follows:
class ModelInput
{
    public float Score { get; set; } // If this is declared first, it becomes the first column in the Schema
    public bool Label { get; set; }
}
By simply making that minor change (i.e. declaring Score before Label), EXAMPLE 1 in my sample also throws the exception I've mentioned. This happens because of the check in CalibratorTransformer's Mapper:
For some reason, the index is required to be greater than 0. Since index 0 refers to the first column in the Schema, the check doesn't allow the Score column to be the first one, and the exception is thrown.
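The effect of that comparison can be shown with a small self-contained sketch. The class `ScoreIndexCheck` and its methods are hypothetical and only model the behaviour described in this comment; the schema is a plain array of column names, with -1 meaning the column was not found.

```csharp
using System;

// Hypothetical model of the index check; NOT the ML.NET code.
public static class ScoreIndexCheck
{
    // Position of "Score" in the schema, or -1 when it is absent.
    public static int FindScoreIndex(string[] schema) =>
        Array.IndexOf(schema, "Score");

    // The check as described above: also rejects a valid index of 0.
    public static bool StrictCheck(int index) => index > 0;

    // A check that rejects only the "not found" value.
    public static bool InclusiveCheck(int index) => index >= 0;
}
```

For a schema declared as Score, Label, `FindScoreIndex` returns 0, so `StrictCheck` fails even though the column exists, while `InclusiveCheck` passes. Both checks still reject a schema with no Score column.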
Issue
What did you do?
I tried to create a model with a PlattCalibratorEstimator that uses a scoreColumnName with a name different from "Score" (as done through this API)
What happened?
After fitting the estimator, and while trying to transform the input dataview, the following exception is thrown:
System.InvalidOperationException: 'The data to calibrate contains no 'Score' column'
What did you expect?
The model to work the same way as if I had used the name "Score" for my score column
Furthermore, I couldn't find any sample or test that actually used the optional parameter scoreColumnName of PlattCalibratorEstimator, or the other parameters (such as labelColumnName). So adding such tests might also be necessary (if my PR #4700 gets in, then fixing this issue would also require adding ONNX tests to check that a PlattCalibrator with a custom scoreColumnName is saved correctly to ONNX). Checking whether this problem also occurs in the other CalibratorTransformers would also be relevant.
Notice that a simple workaround for this is to copy the column that holds the score into a new column called Score, and specify Score as the scoreColumnName.
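The workaround could look roughly like the sketch below. The `ModelInput` shape, the column name "ScoreX", and the sample values are assumptions made for illustration and are not the sample code from this issue. Label is declared first so that no score column ends up at index 0 of the schema.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class PlattWorkaround
{
    // Assumed input type: the score lives in a column that is not named "Score".
    public class ModelInput
    {
        public bool Label { get; set; }
        public float ScoreX { get; set; }
    }

    public static float[] CalibratedProbabilities()
    {
        var mlContext = new MLContext(seed: 0);

        // Made-up scores, only there so the calibrator has something to fit.
        var data = mlContext.Data.LoadFromEnumerable(new[]
        {
            new ModelInput { Label = true,  ScoreX = 2.1f },
            new ModelInput { Label = true,  ScoreX = 1.3f },
            new ModelInput { Label = true,  ScoreX = -0.2f },
            new ModelInput { Label = false, ScoreX = 0.4f },
            new ModelInput { Label = false, ScoreX = -1.5f },
            new ModelInput { Label = false, ScoreX = -2.2f },
        });

        // Copy "ScoreX" into a column named "Score", then calibrate on "Score".
        var pipeline = mlContext.Transforms
            .CopyColumns(outputColumnName: "Score", inputColumnName: "ScoreX")
            .Append(mlContext.BinaryClassification.Calibrators.Platt(
                labelColumnName: "Label", scoreColumnName: "Score"));

        var transformed = pipeline.Fit(data).Transform(data);

        // The calibrator writes its output to the "Probability" column.
        return transformed.GetColumn<float>("Probability").ToArray();
    }
}
```

The copy adds a redundant column to the data view, but it keeps the transformer on the default name it looks up when transforming.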
Source code / logs
In EXAMPLE 1 I show that it works if my score column is named "Score". But if I change the name, then it doesn't work.