[ML] More tolerant delimited file parsing when structure is overridden #38890

@droberts195

Inspired by elastic/kibana#31065.

At present the file structure finder only detects a delimited file if every row has the same number of columns. This is sensible when determining the structure from scratch. However, when the structure has been explicitly specified as delimited using an override, and the exact delimiter is also supplied, it makes more sense to believe the user and try to create a structure using the specified format, even if that means rows have different numbers of columns.
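
To illustrate the distinction, here is a minimal Python sketch rather than the actual file structure finder code. The function name `parse_delimited` and the `lenient` flag are hypothetical; `lenient` stands in for the case where the user has overridden both the format and the delimiter. Padding short rows with empty strings is just one possible way to handle them and is an assumption of this sketch.

```python
import csv
import io


def parse_delimited(sample, delimiter=",", lenient=False):
    """Parse a text sample as delimited rows.

    Strict mode (the default) mirrors detection from scratch: the sample is
    rejected (None) unless every row has the same number of columns.
    Lenient mode (hypothetical, for the overridden case) trusts the caller
    and pads short rows with empty strings up to the widest row.
    """
    reader = csv.reader(io.StringIO(sample), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return None

    if not lenient:
        expected = len(rows[0])
        if any(len(row) != expected for row in rows):
            return None  # inconsistent column counts: not treated as delimited
        return rows

    width = max(len(row) for row in rows)
    return [row + [""] * (width - len(row)) for row in rows]
```

With a sample such as `"a,b,c\n1,2,3\n4,5\n"`, strict mode rejects the input because the last row has only two columns, while lenient mode keeps all three rows and pads the last one.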

Additionally, when determining the timestamp format for delimited files, it would be nice to have an option to detect a timestamp field even when a small percentage of rows do not match. We could still default to requiring 100% matches but offer the option to reduce this to, say, 95%.
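
The tolerance idea could look roughly like the following Python sketch. The function name `is_timestamp_column` and the `min_match_fraction` parameter are hypothetical and are not part of any existing API; the sketch also assumes a single candidate format checked with `datetime.strptime`, which is much simpler than real timestamp format determination.

```python
from datetime import datetime


def is_timestamp_column(values, fmt, min_match_fraction=1.0):
    """Return True if enough values in a column parse with the given format.

    min_match_fraction is a hypothetical option: 1.0 (the default) requires
    every value to match, while a lower value tolerates a few bad rows.
    """
    if not values:
        return False

    matches = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            matches += 1
        except ValueError:
            pass  # this row does not match the candidate format

    return matches / len(values) >= min_match_fraction
```

Keeping the default at 1.0 preserves the current behaviour, so a column with even one unparseable value is only accepted as a timestamp field when the caller explicitly lowers the threshold.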
