[DataFrame] can't handle separators in data #5647

Closed
@terrajobst

Description

Related to dotnet/corefxlab#2968

It looks like DataFrame.LoadCsv can't handle a CSV file where the separator appears inside a quoted field.

Repro

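using System;
using Microsoft.Data.Analysis;

// fileName is the path to a file with the CSV contents shown below.
var frame = DataFrame.LoadCsv(fileName);

// Print the three fields of each row, followed by a blank line.
foreach (var row in frame.Rows)
{
    Console.WriteLine(row[0]);
    Console.WriteLine(row[1]);
    Console.WriteLine(row[2]);
    Console.WriteLine();
}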

CSV contents:

Name,Age,Description
Paul,34,"Paul lives in Vermont, VA."
Victor,29,"Victor: Funny guy"
Maria,31,

Expected behavior

The program prints the three fields of every row, with each quoted description kept as a single field.
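
To make the expected parsing concrete, here is a minimal sketch of a quote-aware line splitter. It is not how Microsoft.Data.Analysis reads CSV; CsvLineSplitter and SplitLine are hypothetical names introduced only for illustration, and the sketch assumes double-quoted fields on a single line, with a doubled quote standing for a literal quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of Microsoft.Data.Analysis.
public static class CsvLineSplitter
{
    // Splits one CSV line on `separator`, treating separators that occur
    // inside double-quoted fields as data. A doubled quote ("") inside a
    // quoted field is read as one literal quote. Fields that span several
    // lines are not handled.
    public static List<string> SplitLine(string line, char separator = ',')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        // Escaped quote: keep one quote, skip the second.
                        current.Append('"');
                        i++;
                    }
                    else
                    {
                        // Closing quote ends the quoted section.
                        inQuotes = false;
                    }
                }
                else
                {
                    // Anything else inside quotes, separators included, is data.
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;
            }
            else if (c == separator)
            {
                // An unquoted separator ends the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last field has no trailing separator.
        fields.Add(current.ToString());
        return fields;
    }
}
```

With this sketch, the "Paul" row from the CSV contents above splits into three fields, the last being the full description with its comma, whereas a plain string.Split(',') on the same row yields four pieces. The "Maria" row also yields three fields, the last one empty.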

Actual behavior

Exception:

Unhandled exception. System.FormatException: Line 2 has less columns than expected
   at Microsoft.Data.Analysis.DataFrame.GuessKind(Int32 col, List`1 read)
   at Microsoft.Data.Analysis.DataFrame.LoadCsv(Stream csvStream, Char separator, Boolean header, String[] columnNames, Type[] dataTypes, Int64 numberOfRowsToRead, Int32 guessRows, Boolean addIndexColumn, Encoding encoding)
   at Microsoft.Data.Analysis.DataFrame.LoadCsv(String filename, Char separator, Boolean header, String[] columnNames, Type[] dataTypes, Int32 numRows, Int32 guessRows, Boolean addIndexColumn, Encoding encoding)
   at ConsoleApp49.Program.Main(String[] args)
