Description
System information
- OS version/distro: Windows 10
- .NET Version (e.g., dotnet --info): 4.6
Issue
What did you do?
I am trying to follow the sample code from the ML.NET Cookbook on GitHub:
https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-train-my-model-on-categorical-data
My goal is to understand how to get one-hot encoding to work.
What happened?
Runtime error message:
Could not find input column 'CategoricalOneHot'
Parameter name: inputSchema
What did you expect?
I expected the code to run without errors, so that I could then examine how the data had been transformed.
Source code / logs
// Build several alternative featurization pipelines.
var pipeline =
    // Convert each categorical feature into one-hot encoding independently.
    mlContext.Transforms.Categorical.OneHotEncoding("CategoricalFeatures", "CategoricalOneHot")
    // Convert all categorical features into indices, and build a 'word bag' of these.
    .Append(mlContext.Transforms.Categorical.OneHotEncoding("CategoricalFeatures", "CategoricalBag",
        Microsoft.ML.Transforms.Categorical.OneHotEncodingTransformer.OutputKind.Bag))
    // One-hot encode the workclass column, then drop all the categories that have fewer than 10 instances in the train set.
    .Append(mlContext.Transforms.Categorical.OneHotEncoding("Workclass", "WorkclassOneHot"))
    .Append(mlContext.Transforms.FeatureSelection.SelectFeaturesBasedOnCount("WorkclassOneHot", "WorkclassOneHotTrimmed", count: 10));
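A possible explanation, offered as an assumption rather than a confirmed diagnosis: the error names 'CategoricalOneHot', the second argument, as the missing input column. That suggests the installed ML.NET version reads the first string as the output column name and the second as the input column name, the reverse of the order used in the snippet above. A minimal sketch of the same pipeline with the two names swapped is shown here. It assumes that output-first order applies to both OneHotEncoding and SelectFeaturesBasedOnCount, that `mlContext` already exists, and that the input data has `CategoricalFeatures` and `Workclass` columns; it has not been checked against a specific release.

```csharp
// Sketch only: assumes overloads of the form (outputColumnName, inputColumnName)
// and an MLContext instance named mlContext created earlier.
var pipeline =
    // Output "CategoricalOneHot" is computed from input "CategoricalFeatures".
    mlContext.Transforms.Categorical.OneHotEncoding("CategoricalOneHot", "CategoricalFeatures")
    // Output "CategoricalBag" is computed from the same input, using the Bag output kind.
    .Append(mlContext.Transforms.Categorical.OneHotEncoding("CategoricalBag", "CategoricalFeatures",
        Microsoft.ML.Transforms.Categorical.OneHotEncodingTransformer.OutputKind.Bag))
    // Output "WorkclassOneHot" is computed from input "Workclass".
    .Append(mlContext.Transforms.Categorical.OneHotEncoding("WorkclassOneHot", "Workclass"))
    // Output "WorkclassOneHotTrimmed" keeps only the slots that meet the count threshold.
    .Append(mlContext.Transforms.FeatureSelection.SelectFeaturesBasedOnCount("WorkclassOneHotTrimmed", "WorkclassOneHot", count: 10));
```

If the swapped order compiles and fits without the 'Could not find input column' error, the cookbook sample and the installed package are likely from different ML.NET versions.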