read_csv clobbers values of columns with duplicate names #9424

xref #10577 (has test for duplicates with empty data)

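I don't think this is the intended behavior, although it's always possible I'm doing something wrong. When `read_csv` is given duplicate column names through the `names` keyword, it clobbers the values of those columns. In the example below, both `field` columns end up with the second column's values, whether `mangle_dupe_cols` is `True` or `False`: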

``` python
from StringIO import StringIO
import pandas as pd

data = """a,1
b,2
c,3"""
names = ['field', 'field']

print pd.read_csv(StringIO(data), names=names, mangle_dupe_cols=True)
print pd.read_csv(StringIO(data), names=names, mangle_dupe_cols=False)
```

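Both calls print the same frame, with the first column's values (`a`, `b`, `c`) replaced by those of the second: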

```
   field  field
0      1      1
1      2      2
2      3      3
   field  field
0      1      1
1      2      2
2      3      3
```

However, reading the data with `header=None` and assigning the column names afterwards produces the correct result:

``` python
df = pd.read_csv(StringIO(data), header=None)
df.columns = names
print df
```

```
   field  field
0      a      1
1      b      2
2      c      3
```

Interestingly, it works if the duplicate field names come from the header row instead. The second column is renamed to `field.1` and keeps its own values:

``` python
data_with_header = "field,field\n" + data
print pd.read_csv(StringIO(data_with_header))
```

```
  field  field.1
0     a        1
1     b        2
2     c        3
```
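As a possible stopgap, the same `field`, `field.1` naming can be applied to the list before it is passed to `names`. The sketch below is only an illustration of that idea: `mangle_duplicate_names` is a hypothetical helper, not a pandas function, and it is not meant to mirror pandas' internal implementation. It also does not guard against a generated name such as `field.1` colliding with a name already in the list.

``` python
def mangle_duplicate_names(names):
    """Return a copy of `names` where repeated names get a numeric suffix.

    Hypothetical helper for illustration; not part of pandas.
    """
    seen = {}  # name -> number of times it has appeared so far
    result = []
    for name in names:
        count = seen.get(name, 0)
        # The first occurrence keeps its name; later ones become
        # name.1, name.2, ...
        result.append(name if count == 0 else '%s.%d' % (name, count))
        seen[name] = count + 1
    return result


names = ['field', 'field']
unique_names = mangle_duplicate_names(names)
# unique_names == ['field', 'field.1']
```

Passing `unique_names` instead of `names` to `pd.read_csv(StringIO(data), names=unique_names)` sidesteps the duplicate-name path entirely, at the cost of not keeping the literal duplicate labels.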

Is this a bug or am I doing something wrong?

