
Error for KeyType attribute without initializing the Count  #3207

Closed
@sfilipi

The KeyType attribute has a public parameterless constructor that initializes Count to int.MaxValue.

When we process KeyTypes with KeyToMappingTransformer, we check that the Count of the KeyType is less than int.MaxValue.

The error message says the 'counts exceeds int.MaxValue', so a user who annotates a column with the parameterless KeyType hits this error without ever having set a count.

Either don't expose the parameterless KeyType constructor, initialize Count to something else (int.MaxValue - 1? Is such a large number even practical?), or accept int.MaxValue as a valid value.

@TomFinley @eerhardt @glebuk for suggestions.

Code to repro

    using Microsoft.ML;
    using Microsoft.ML.Data;

    class MapKeyToVector
    {
        /// Repro: applies MapKeyToVector to columns annotated with the parameterless [KeyType] attribute.
        /// Running it triggers the count error described above.
        public static void Example()
        {
            // Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging, 
            // as well as the source of randomness.
            var mlContext = new MLContext();

            // Get a small dataset as an IEnumerable.
            var rawData = new[] {
                new DataPoint() { Timeframe = 45, Category = 5 },
                new DataPoint() { Timeframe = 15, Category = 4 },
                new DataPoint() { Timeframe = 65, Category = 4 },
                new DataPoint() { Timeframe = 25, Category = 3 },
                new DataPoint() { Timeframe = 45, Category = 3 },
                new DataPoint() { Timeframe = 45, Category = 5 }
            };

            var data = mlContext.Data.LoadFromEnumerable(rawData);

            // Construct the ML.NET pipeline.
            var pipeline = mlContext.Transforms.Conversion.MapKeyToVector("TimeframeVector", "Timeframe")
                           .Append(mlContext.Transforms.Conversion.MapKeyToVector("CategoryVector", "Category", outputCountVector: true));

            // Fits the pipeline to the data.
            IDataView transformedData = pipeline.Fit(data).Transform(data);
        }

        private class DataPoint
        {
            [KeyType]
            public uint Timeframe { get; set; }

            [KeyType]
            public uint Category { get; set; }

        }
    }
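
Until one of the options above is settled, a possible workaround is to give each key column an explicit cardinality instead of relying on the parameterless constructor. This is only a minimal sketch, assuming the KeyType attribute also offers a constructor that takes a ulong count; the class name DataPointWithCounts and the count values are hypothetical and not part of the original report.

```csharp
using Microsoft.ML.Data;

// Hypothetical variant of the repro's DataPoint class.
// Assumption: KeyType has a constructor accepting the key cardinality (ulong count).
// The counts 100 and 10 are illustrative, chosen only to exceed the largest
// Timeframe (65) and Category (5) values in the sample data.
class DataPointWithCounts
{
    [KeyType(100)]
    public uint Timeframe { get; set; }

    [KeyType(10)]
    public uint Category { get; set; }
}
```

With explicit counts, Count no longer defaults to int.MaxValue, so the check described above should not be hit. Whether these particular counts suit a real dataset depends on the range of key values it contains.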

Labels: P0 (priority of the issue for triage purposes: IMPORTANT, needs to be fixed right away), bug (something isn't working)
