Improve error messaging for non-parsable datasets #5129

Closed
@justinormont

Description

I'd recommend improving the current error message:

if (!splitInference.IsSuccess)
{
    throw new InferenceException(InferenceExceptionType.ColumnSplit, "Unable to split the file provided into multiple, consistent columns.");
}

It currently says, "Unable to split the file provided into multiple, consistent columns.", which tells the user neither what went wrong nor what to do about it.

Perhaps, as I think @briacht is suggesting, have it list the file formats we can parse: "Unable to split the file provided into multiple, consistent columns. Readable formats include delimited files such as CSV/TSV. Check for a consistent number of columns and proper escaping and quoting."

This message now includes both the problem and the next steps for the user.
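As a rough sketch of how the check could look with the proposed wording, the snippet below reuses the names from the code above. The `InferenceExceptionType` and `InferenceException` definitions here are simplified stand-ins so the example compiles on its own (the real types live in the AutoML code and may differ), and `ColumnSplitGuard`, `Message`, and `ThrowIfSplitFailed` are hypothetical names used only for illustration.

```csharp
using System;

// Stand-ins for the AutoML types named in the snippet above. Their real
// definitions may differ from these assumed shapes.
public enum InferenceExceptionType { ColumnSplit }

public sealed class InferenceException : Exception
{
    public InferenceExceptionType Type { get; }

    public InferenceException(InferenceExceptionType type, string message)
        : base(message)
    {
        Type = type;
    }
}

// Hypothetical holder for the proposed message and the check.
public static class ColumnSplitGuard
{
    // Proposed wording from this issue: the problem, the readable formats,
    // and the next steps for the user.
    public const string Message =
        "Unable to split the file provided into multiple, consistent columns. " +
        "Readable formats include delimited files such as CSV/TSV. " +
        "Check for a consistent number of columns and proper escaping and quoting.";

    // isSuccess plays the role of splitInference.IsSuccess.
    public static void ThrowIfSplitFailed(bool isSuccess)
    {
        if (!isSuccess)
        {
            throw new InferenceException(InferenceExceptionType.ColumnSplit, Message);
        }
    }
}
```

Keeping the text in a single constant is just one option; it would make the wording easy to extend if more separators are supported later.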

I say "delimited" because AutoML supports more than CSV/TSV: it tries tab, comma, space, and semicolon as the separator (src). If we run into other common separators, we can trivially add them to this list. One candidate is the vertical bar |.
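To illustrate the idea of trying a list of candidate separators, here is a minimal sketch. It is not AutoML's actual implementation: `SeparatorSketch`, `Candidates`, and `FindConsistentSeparator` are hypothetical names, the candidate order is an assumption, the vertical bar is included only as the proposed addition, and quoting and escaping are ignored for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not the AutoML column-split inference code.
public static class SeparatorSketch
{
    // Separators mentioned in this issue, plus the suggested vertical bar.
    public static readonly char[] Candidates = { '\t', ',', ' ', ';', '|' };

    // Returns the first candidate that splits every non-empty line into the
    // same number of columns (more than one), or null if no candidate does.
    // Simplification: quoted fields and escaped separators are not handled.
    public static char? FindConsistentSeparator(IEnumerable<string> lines)
    {
        List<string> nonEmpty = lines.Where(l => l.Length > 0).ToList();
        if (nonEmpty.Count == 0)
        {
            return null;
        }

        foreach (char sep in Candidates)
        {
            List<int> counts = nonEmpty
                .Select(l => l.Split(sep).Length)
                .Distinct()
                .ToList();

            // Consistent means one distinct column count, and more than one column.
            if (counts.Count == 1 && counts[0] > 1)
            {
                return sep;
            }
        }

        return null;
    }
}
```

With this shape, supporting another separator is a one-element change to `Candidates`, which matches the "trivially add them" point above. A `null` result corresponds to the case where the improved error message would be shown.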

Metadata

Assignees

No one assigned

    Labels

    AutoML.NET (Automating various steps of the machine learning process)
    P3 (Doc bugs, questions, minor issues, etc.)
