NumericColumnNames won't return more than 1 column #6165

Open
@Phoenix-313


Hi,

CategoricalColumnNames returns the correct number of categorical columns along with their correct names. NumericColumnNames, on the other hand, is only correct when the dataset has a single numerical column. If the dataset has more than one numerical column, it always returns a count of 1, and the single column name is always "Features".

For example, imagine the following dataset:

x1, x2, x3, x4
1, T, 3, A
2, T, 4, A
3, L, 4, A
4, L, 4, B

CategoricalColumnNames will return a count of 2 categorical columns with the names x2 and x4. However, NumericColumnNames will return a count of 1 instead of 2, and the only column name it returns is "Features" instead of x1 and x3.

This is how I call them:

ColumnInferenceResults columnInference = MLContext.Auto().InferColumns(TrainingDataPath, labelColumnIndex: 4, hasHeader: true);

ColumnInformation columnInformation = columnInference.ColumnInformation;

ICollection<string> CatCols = columnInformation.CategoricalColumnNames;

ICollection<string> NumCols = columnInformation.NumericColumnNames;
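
A self-contained sketch for reproducing this with the example dataset above, assuming the Microsoft.ML and Microsoft.ML.AutoML packages are referenced. The file name, the added label column y, and the helper Print are hypothetical and only there to make the example runnable. The second call passes groupColumns: false, on the assumption (not confirmed in this issue) that the grouping of columns into vectors is what collapses the numeric columns into "Features".

```csharp
using System;
using System.IO;
using Microsoft.ML;
using Microsoft.ML.AutoML;

class Program
{
    // Hypothetical helper: prints both column-name collections for one inference result.
    static void Print(string title, ColumnInformation info)
    {
        Console.WriteLine(title);
        Console.WriteLine("  Categorical: " + string.Join(", ", info.CategoricalColumnNames));
        Console.WriteLine("  Numeric:     " + string.Join(", ", info.NumericColumnNames));
    }

    static void Main()
    {
        // Example dataset from this issue, plus an assumed label column "y".
        string path = Path.Combine(Path.GetTempPath(), "numeric-columns-repro.csv");
        File.WriteAllLines(path, new[]
        {
            "x1,x2,x3,x4,y",
            "1,T,3,A,0",
            "2,T,4,A,1",
            "3,L,4,A,0",
            "4,L,4,B,1",
        });

        var mlContext = new MLContext();

        // Same call shape as in the issue (default arguments otherwise).
        ColumnInferenceResults grouped = mlContext.Auto().InferColumns(
            path, labelColumnIndex: 4, hasHeader: true, separatorChar: ',');
        Print("Default InferColumns:", grouped.ColumnInformation);

        // Assumption: disabling grouping keeps the original numeric column names.
        ColumnInferenceResults ungrouped = mlContext.Auto().InferColumns(
            path, labelColumnIndex: 4, hasHeader: true, separatorChar: ',',
            groupColumns: false);
        Print("InferColumns with groupColumns: false:", ungrouped.ColumnInformation);
    }
}
```

With only four rows, the inferred column types may differ from those of a real dataset, so the printed lists are meant for comparing the two calls rather than as a reference result.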

Please help. Thanks.


Labels: AutoML.NET (Automating various steps of the machine learning process), P2 (Priority of the issue for triage purpose: Needs to be fixed at some point.)
