
Improve usability of AutoML column not found error #5574

Closed
@justinormont

Let's make the error message more actionable.

Error the user sees:
[image: screenshot of the error message]

I would recommend suggesting similarly named column(s) in the message:

- $"Provided {columnPurpose} column '{columnName}' not found in training data."
+ $"Provided {columnPurpose} column '{columnName}' not found in training data. Did you mean '{closestNamed}'."

For my current example, this would print: Provided ignored column 'tagMaxTotalItem' not found in training data. Did you mean 'tagMaxTotalItems'.

I'd recommend using Levenshtein distance to find the closest named column (code).
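
As a rough illustration of the suggestion above, here is a minimal sketch of a Levenshtein-based lookup. The class `ColumnNameSuggester` and its methods are hypothetical names made up for this example, not existing ML.NET APIs, and the sketch assumes the schema's column names are available as a plain sequence of strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class ColumnNameSuggester
{
    // Dynamic-programming Levenshtein distance, keeping only two rows in memory.
    public static int LevenshteinDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty prefix of a into a prefix of b costs one insert per character.
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // delete all i characters of a's prefix
            for (int j = 1; j <= b.Length; j++)
            {
                int substitutionCost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(previous[j] + 1, current[j - 1] + 1), // deletion, insertion
                    previous[j - 1] + substitutionCost);           // substitution or match
            }
            (previous, current) = (current, previous); // the finished row becomes "previous"
        }
        return previous[b.Length];
    }

    // Returns the candidate with the smallest distance (first one wins on ties),
    // or null when there are no candidates.
    public static string FindClosest(string columnName, IEnumerable<string> candidates)
    {
        return candidates
            .OrderBy(c => LevenshteinDistance(columnName, c))
            .FirstOrDefault();
    }

    // Builds the proposed message; falls back to the current text if nothing can be suggested.
    public static string BuildMessage(string columnPurpose, string columnName, IEnumerable<string> schemaColumnNames)
    {
        var message = $"Provided {columnPurpose} column '{columnName}' not found in training data.";
        var closestNamed = FindClosest(columnName, schemaColumnNames);
        return closestNamed == null ? message : $"{message} Did you mean '{closestNamed}'.";
    }
}
```

A real change would probably also cap the distance (for example, only suggest when it is small relative to the name length) so that unrelated columns are not offered as suggestions.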

Code location:

var nullableColumn = trainData.Schema.GetColumnOrNull(columnName);
if (nullableColumn == null)
{
    throw new ArgumentException($"Provided {columnPurpose} column '{columnName}' not found in training data.");
}

Background:
It took me ~20 minutes to debug why this error was occurring (obvious in retrospect). My column existed in the dataset, in my loader function, and in my IDataView; it was simply misspelt ("tagMaxTotalItem" instead of "tagMaxTotalItems").

Improving the usability of this error message will save future users' time.

Metadata

Assignees

No one assigned

Labels

AutoML.NET: Automating various steps of the machine learning process
good first issue: Good for newcomers
up-for-grabs: A good issue to fix if you are trying to contribute to the project
usability: Smoothing user interaction or experience
