read_csv problem with delim_whitespace, skiprows and trailing spaces in skipped rows

Given this input file, with linefeeds indicated by `<lf>`:

```
skip1<lf>
skip2<lf>
0    1    2<lf>
3    4    5<lf>
```
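Because trailing spaces are invisible in most editors, it may help to generate the file programmatically. The following is a minimal sketch that uses only the standard library; the helper name `write_sample` and its `trailing` parameter are hypothetical and simply append the given characters to the two lines that are meant to be skipped.

```python
import os
import tempfile


def write_sample(path, trailing=("", "")):
    """Write the four-line sample file.

    `trailing` (hypothetical parameter) holds the characters appended
    to the skip1 and skip2 lines, e.g. (" ", "") for a space after skip1.
    """
    lines = [
        "skip1" + trailing[0],
        "skip2" + trailing[1],
        "0    1    2",
        "3    4    5",
    ]
    # newline="\n" keeps plain linefeeds even on Windows
    with open(path, "w", newline="\n") as f:
        f.write("\n".join(lines) + "\n")


path = os.path.join(tempfile.mkdtemp(), "test.txt")
write_sample(path, trailing=(" ", ""))  # space after skip1 only
with open(path, "rb") as f:
    print(f.read())  # raw bytes make the trailing space visible
```

Passing `trailing=("", "")` gives the file exactly as shown above.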

Reading with read_csv() in pandas 0.15.0-42-g20be789 and Python 3.4.2 works:

```
>>> import pandas as pd
>>> df = pd.read_csv('test.txt', skiprows=2, delim_whitespace=True, header=None)
>>> df
   0  1  2
0  0  1  2
1  3  4  5
```

If I add a space after skip1, so that the skipped lines are:

```
skip1 <lf>
skip2<lf>
```

then read_csv() throws an error:
`CParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 3`

Adding 1 to skiprows:
`df = pd.read_csv('test.txt', skiprows=3, delim_whitespace=True, header=None)`
does not throw an exception and gives the expected DataFrame.

Reading with skiprows=2 but without header=None does not throw an exception, but produces a DataFrame with a MultiIndex:

```
     skip2
0 1      2
3 4      5
```

If there is a space after skip2 instead, so that the skipped lines are:

```
skip1<lf>
skip2 <lf>
```

then
`df = pd.read_csv('test.txt', skiprows=2, delim_whitespace=True, header=None)`
does not throw an exception, but the `0 1 2` row is missing from the DataFrame.

If there are spaces after both skip1 and skip2, so that the skipped lines are:

```
skip1 <lf>
skip2 <lf>
```

then
`df = pd.read_csv('test.txt', skiprows=2, delim_whitespace=True, header=None)`
throws the CParserError exception, but
`df = pd.read_csv('test.txt', skiprows=3, delim_whitespace=True, header=None)`
does not and returns the expected DataFrame.

I would expect skiprows to skip the number of lines specified whether or not there are trailing spaces in those lines.
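In the meantime, one possible workaround is to drop the leading lines by hand, counting physical lines only, and then give the parser the remainder. This is only a sketch under that assumption, not a pandas feature; `drop_leading_lines` is a hypothetical helper name.

```python
import io


def drop_leading_lines(text, n):
    """Return a file-like object holding `text` minus its first `n` lines.

    Lines are counted purely by linefeed, so trailing spaces in the
    dropped lines cannot affect how many lines are removed.
    """
    src = io.StringIO(text)
    for _ in range(n):
        src.readline()  # discard one physical line, whatever it contains
    return io.StringIO(src.read())


sample = "skip1 \nskip2 \n0    1    2\n3    4    5\n"
buf = drop_leading_lines(sample, 2)
# Splitting on whitespace here stands in for the whitespace-delimited parse.
print([line.split() for line in buf])
```

The returned buffer could then be passed to `pd.read_csv(buf, delim_whitespace=True, header=None)` without `skiprows`, since read_csv accepts file-like objects.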


read_csv problem with delim_whitespace, skiprows and trailing spaces in skipped rows #8661
