numpy error using read_csv with parse_dates=[...] and index_col=[...] #10245

Closed
@cmeeren

Consider a file of the following format:

week,sow,prn,rxstatus,az,elv,l1_cno,s4,s4_cor,secsigma1,secsigma3,secsigma10,secsigma30,secsigma60,code_carrier,c_cstdev,tec45,tecrate45,tec30,tecrate30,tec15,tecrate15,tec00,tecrate00,l1_loctime,chanstatus,l2_locktime,l2_cno
1765,68460.00,126,00E80000,0.00,0.00,39.38,0.118447,0.107595,0.252663,0.532384,0.600540,0.603073,0.603309,-13.255543,0.114,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,1692.182,8C023D84,0.000,0.00
1765,68460.00,23,00E80000,0.00,0.00,53.48,0.034255,0.021177,0.035187,0.042985,0.061142,0.061738,0.061801,-22.760003,0.015,24.955111,0.112239,25.115330,-0.119774,25.146603,-0.065852,24.747576,-0.243804,10426.426,08109CC4,10409.660,44.52
1765,68460.00,13,00E80000,0.00,0.00,54.28,0.046218,0.019314,0.037818,0.056421,0.060602,0.060698,0.060735,-20.679035,0.090,25.670250,-0.070761,25.752224,-0.055089,26.045048,-0.180056,25.360369,-0.062119,7553.020,18109CA4,7202.660,47.27

I try to read it with the following code:

data = pd.read_csv(FILE, date_parser=GPStime2datetime,
                   parse_dates={'datetime': ['week', 'sow']},
                   index_col=['datetime', 'prn'])

Here I parse week and sow into a single datetime column with a custom function (this part works properly) and use datetime together with the prn column as a MultiIndex. The file is read successfully with index_col='datetime', but not with index_col=['datetime', 'prn'] (nor when I pass column numbers instead of names). I get the following traceback:

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 474, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 260, in _read
    return parser.read()

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 721, in read
    ret = self._engine.read(nrows)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1223, in read
    index, names = self._make_index(data, alldata, names)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 898, in _make_index
    index = self._agg_index(index, try_parse_dates=False)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 984, in _agg_index
    index = MultiIndex.from_arrays(arrays, names=self.index_names)

  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 4410, in from_arrays
    cats = [Categorical.from_array(arr, ordered=True) for arr in arrays]

  File "C:\Anaconda\lib\site-packages\pandas\core\categorical.py", line 355, in from_array
    return Categorical(data, **kwargs)

  File "C:\Anaconda\lib\site-packages\pandas\core\categorical.py", line 271, in __init__
    codes, categories = factorize(values, sort=False)

  File "C:\Anaconda\lib\site-packages\pandas\core\algorithms.py", line 131, in factorize
    (hash_klass, vec_klass), vals = _get_data_algo(vals, _hashtables)

  File "C:\Anaconda\lib\site-packages\pandas\core\algorithms.py", line 412, in _get_data_algo
    mask = com.isnull(values)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 230, in isnull
    return _isnull(obj)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 240, in _isnull_new
    return _isnull_ndarraylike(obj)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 330, in _isnull_ndarraylike
    result = np.isnan(values)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
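
The last frame of the traceback calls np.isnan directly, and NumPy raises this kind of TypeError when isnan is given an object-dtype array. The snippet below is a minimal sketch of that behaviour in isolation. It assumes (this is not confirmed by the report) that the combined datetime column reached this point as an object array of Python datetime values; the array contents are arbitrary and purely illustrative.

```python
from datetime import datetime

import numpy as np

# Assumption: the parsed dates arrive as an object-dtype array of
# datetime.datetime values (arbitrary illustrative timestamp below).
values = np.array([datetime(2013, 1, 1, 12, 0, 0)], dtype=object)

try:
    np.isnan(values)
    raised = False
except TypeError as exc:
    # np.isnan has no implementation for object arrays, so it raises
    # instead of returning a boolean mask.
    raised = True
    message = str(exc)

print(raised)
```

If that assumption holds, the failure would come from the null check on the index level rather than from the date parsing itself.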

I am using Python 2.7, pandas 0.16.1 and NumPy 1.9.2.
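
One possible way around the error is to avoid building the MultiIndex inside read_csv altogether: read the file plainly, derive the datetime column afterwards, and call set_index on both columns. The sketch below swaps in that technique; it is not the original call. It is written against a recent pandas rather than 0.16.1, the conversion is a hypothetical stand-in for GPStime2datetime (whose implementation is not shown in the report) that assumes the GPS epoch of 1980-01-06 and ignores leap seconds, and the inline CSV is a four-column subset of the three sample rows above.

```python
import io

import pandas as pd

# Assumption: GPS time counts weeks and seconds-of-week from 1980-01-06
# 00:00:00; leap seconds are deliberately ignored in this sketch.
GPS_EPOCH = pd.Timestamp("1980-01-06")

# Subset of the sample rows from the report (only four columns kept).
csv_text = """week,sow,prn,l1_cno
1765,68460.00,126,39.38
1765,68460.00,23,53.48
1765,68460.00,13,54.28
"""

# Read without parse_dates / index_col, so no index is built by the parser.
data = pd.read_csv(io.StringIO(csv_text))

# Hypothetical replacement for GPStime2datetime: epoch + weeks + seconds.
data["datetime"] = (
    GPS_EPOCH
    + pd.to_timedelta(data["week"] * 7, unit="D")
    + pd.to_timedelta(data["sow"], unit="s")
)

# Build the (datetime, prn) MultiIndex after parsing.
data = data.drop(columns=["week", "sow"]).set_index(["datetime", "prn"])

print(data.index.names)
```

Since the report states that index_col='datetime' on its own works, keeping the original call with that setting and then appending prn with set_index('prn', append=True) may be another option on the reported versions; that variant is untested here.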

Labels: Bug, Datetime (Datetime data dtype), IO CSV (read_csv, to_csv)
