csv_reader with a limited number of columns should completely disregard the unused fields #8985

Closed
@cordeiro

xref #6710

I have a CSV whose lines may have 11 or 18 fields. I only need to read the first 6 fields, so I use "usecols=range(6)". Even with the limited number of columns, I get the exception:

ValueError: Expected 11 fields in line 776483, saw 18

The csv_reader should completely disregard the unused fields.

Small test case:

import io
import pandas as pd
csv = '19,29,39\n'*2 + '10,20,30,40\n'
df = pd.read_csv(io.StringIO(csv), engine='python', header=None, usecols=list(range(3)))

It also affects the C engine.

Discussed on the users mailing list at https://groups.google.com/d/topic/pydata/vjhFpHtgnvw/discussion
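
Until the parser ignores the unused fields, one possible workaround is to trim every row to the wanted number of fields before pandas sees it. The following is only a minimal sketch under that assumption: `truncate_fields` is a hypothetical helper (not part of pandas), it loads the whole input into memory, and it relies on the standard-library `csv` module for splitting rows.

```python
import csv
import io

import pandas as pd


def truncate_fields(text, n_cols):
    """Return a text buffer in which every row keeps only its first n_cols fields.

    Hypothetical helper for illustration; not a pandas API.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in csv.reader(io.StringIO(text)):
        # Slicing drops any trailing fields beyond the ones we need.
        writer.writerow(row[:n_cols])
    out.seek(0)  # rewind so read_csv starts at the first row
    return out


data = '19,29,39\n' * 2 + '10,20,30,40\n'
df = pd.read_csv(truncate_fields(data, 3), header=None)
```

Because every row reaches `read_csv` with at most three fields, the four-field last line no longer triggers the field-count check. For very large files, the same trimming could be done line by line to a temporary file instead of an in-memory buffer.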

Labels: Bug; Error Reporting (Incorrect or improved errors from pandas); IO CSV (read_csv, to_csv)