Description
read_csv behaves inconsistently when na_values contains non-string (numeric) values: sometimes
it replaces the given number with NaN, and sometimes it doesn't. In the examples below, only one of the two -999 entries is ever converted. Note in particular that the last two statements differ only in the order of the values in na_values, yet give different results:
Create file
from pandas import DataFrame, read_csv
df = DataFrame({'A': [-999, 2, 3], 'B': [1.2, -999, 4.5]})
df.to_csv('test2.csv', sep=' ', index=False)
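One way to compare the read_csv calls in this report is to list which cells come back as NaN for each na_values setting. The sketch below is only an illustration: `nan_cells` is a hypothetical helper (not part of pandas), and it keeps the CSV text in memory instead of writing test2.csv, on the assumption that this parses the same way as the file. For the numeric settings the result depends on the pandas version, so no output is shown. String spellings are matched against the text in the file, so listing both spellings should catch both cells.

```python
import io

import pandas as pd

# Same data and separator as the report, kept in memory instead of test2.csv.
df = pd.DataFrame({'A': [-999, 2, 3], 'B': [1.2, -999, 4.5]})
csv_text = df.to_csv(sep=' ', index=False)


def nan_cells(na_values):
    """Return the (row, column) pairs that read_csv turned into NaN.

    Hypothetical helper for this report, not a pandas function.
    """
    result = pd.read_csv(io.StringIO(csv_text), sep=' ', header=0,
                         na_values=na_values)
    return [(int(row), col)
            for col in result.columns
            for row in result.index[result[col].isna()]]


# Numeric settings from the report; the outcome depends on the pandas version.
for candidate in ([-999], [-999.0], [-999.0, -999], [-999, -999.0]):
    print(candidate, nan_cells(candidate))

# String spellings are compared with the text in the file ("-999" in column A,
# "-999.0" in column B), so listing both marks both cells.
print(nan_cells(['-999', '-999.0']))
```

If the last line reports both `(0, 'A')` and `(1, 'B')` while a numeric setting reports only one of them, the installed version still shows the inconsistency described here.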
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999]))
     A       B
0  NaN     1.2
1    2  -999.0
2    3     4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999.0]))
      A    B
0  -999  1.2
1     2  NaN
2     3  4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999.0, -999]))
      A    B
0  -999  1.2
1     2  NaN
2     3  4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999, -999.0]))
     A       B
0  NaN     1.2
1    2  -999.0