Description
I'd recommend improving the current error message:
machinelearning/src/Microsoft.ML.AutoML/ColumnInference/ColumnInferenceApi.cs
Lines 120 to 123 in e5a19af
It currently says "Unable to split the file provided into multiple, consistent columns.", which is rather uninformative and non-actionable.
Perhaps, as I think @briacht is suggesting, have it list the acceptable file formats we can parse: "Unable to split the file provided into multiple, consistent columns. Readable formats include delimited files such as CSV/TSV. Check for a consistent number of columns and proper escaping and quoting."
This message now states the problem and gives the user next steps.
I say "delimited" because AutoML supports more than CSV/TSV: it tries tab, comma, space, and semicolon as the separator (src). If we run into other common separators, we can trivially add them to this list. One candidate is the vertical bar, |.
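To make the proposal concrete, here is a minimal, language-neutral sketch in Python of the separator trial described above. It is not the actual `ColumnInferenceApi.cs` logic: the names `infer_separator`, `CANDIDATE_SEPARATORS`, and `PROPOSED_MESSAGE` are hypothetical, and the "first separator that gives more than one column on every row" rule is an assumption made for illustration.

```python
import csv
import io

# Separators named in this issue (tab, comma, space, semicolon),
# plus the vertical bar suggested as a possible addition.
CANDIDATE_SEPARATORS = ["\t", ",", " ", ";", "|"]

# The error text proposed in this issue.
PROPOSED_MESSAGE = (
    "Unable to split the file provided into multiple, consistent columns. "
    "Readable formats include delimited files such as CSV/TSV. "
    "Check for a consistent number of columns and proper escaping and quoting."
)


def infer_separator(text, candidates=CANDIDATE_SEPARATORS):
    """Return the first candidate that splits every row into the same
    number of columns, with more than one column per row.

    Hypothetical helper: the real inference in AutoML may rank
    candidates differently.
    """
    for sep in candidates:
        # csv.reader honors quoting, so a quoted field may contain the separator.
        reader = csv.reader(io.StringIO(text), delimiter=sep)
        rows = [row for row in reader if row]  # skip blank lines
        widths = {len(row) for row in rows}
        if rows and len(widths) == 1 and widths.pop() > 1:
            return sep
    # No candidate gave multiple, consistent columns: fail with the
    # actionable message instead of the current terse one.
    raise ValueError(PROPOSED_MESSAGE)
```

With this sketch, a file whose rows have differing column counts fails for every candidate, so the user sees the longer message and knows to check column counts and quoting.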