Closed
Description
The KeyType attribute has a public parameterless constructor that initializes Count to int.MaxValue.
When we process KeyTypes with KeyToMappingTransformer, we check that the Count of the KeyType is less than int.MaxValue.
The error message says that the 'counts exceeds int.MaxValue', so the case described (a user annotating a column with the parameterless KeyType) fails with this error.
Either don't expose the parameterless KeyType constructor, initialize Count to something else (int.MaxValue - 1? Is such a large number even practical?), or accept int.MaxValue as a valid value.
@TomFinley @eerhardt @glebuk for suggestions.
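To make the mismatch concrete, here is a minimal standalone sketch, assuming the constructor and the check behave as described above. `KeyTypeSketch`, `PassesStrictCheck`, and `PassesInclusiveCheck` are hypothetical stand-ins written for illustration; they are not ML.NET types or methods.

```csharp
using System;

// Hypothetical stand-in for the KeyType attribute; NOT the ML.NET type.
sealed class KeyTypeSketch
{
    public ulong Count { get; }

    // Assumption: the parameterless constructor defaults Count to int.MaxValue,
    // as described in this issue.
    public KeyTypeSketch() { Count = (ulong)int.MaxValue; }

    public KeyTypeSketch(ulong count) { Count = count; }
}

static class Program
{
    // Hypothetical check mirroring the described behaviour:
    // Count must be strictly less than int.MaxValue.
    public static bool PassesStrictCheck(KeyTypeSketch key) =>
        key.Count < (ulong)int.MaxValue;

    // One of the proposed fixes: accept int.MaxValue as a valid value.
    public static bool PassesInclusiveCheck(KeyTypeSketch key) =>
        key.Count <= (ulong)int.MaxValue;

    static void Main()
    {
        var defaultKey = new KeyTypeSketch();

        Console.WriteLine(PassesStrictCheck(defaultKey));           // False
        Console.WriteLine(PassesInclusiveCheck(defaultKey));        // True
        Console.WriteLine(PassesStrictCheck(new KeyTypeSketch(6))); // True
    }
}
```

Under these assumptions, the default-constructed key is the only value that lands exactly on the boundary, which is why either changing the default or relaxing the comparison would resolve it.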
Code to repro
using Microsoft.ML;
using Microsoft.ML.Data;

class MapKeyToVector
{
    /// This example demonstrates the use of MapKeyToVector on columns annotated with the parameterless KeyType attribute.
    /// Fitting the pipeline triggers the error described above.
    public static void Example()
    {
        // Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging,
        // as well as the source of randomness.
        var mlContext = new MLContext();

        // Get a small dataset as an IEnumerable.
        var rawData = new[] {
            new DataPoint() { Timeframe = 45, Category = 5 },
            new DataPoint() { Timeframe = 15, Category = 4 },
            new DataPoint() { Timeframe = 65, Category = 4 },
            new DataPoint() { Timeframe = 25, Category = 3 },
            new DataPoint() { Timeframe = 45, Category = 3 },
            new DataPoint() { Timeframe = 45, Category = 5 }
        };
        var data = mlContext.Data.LoadFromEnumerable(rawData);

        // Construct the ML.NET pipeline.
        var pipeline = mlContext.Transforms.Conversion.MapKeyToVector("TimeframeVector", "Timeframe")
            .Append(mlContext.Transforms.Conversion.MapKeyToVector("CategoryVector", "Category", outputCountVector: true));

        // Fit the pipeline to the data and transform it.
        IDataView transformedData = pipeline.Fit(data).Transform(data);
    }

    private class DataPoint
    {
        // The parameterless KeyType attribute is what leads to the error.
        [KeyType]
        public uint Timeframe { get; set; }

        [KeyType]
        public uint Category { get; set; }
    }
}