Description
The pandas df.to_feather() method works well with smaller datasets, but with the larger dataset I'm trying to write (~210GB) I get several errors.
Error 1.) Columns that contain only np.nan values raise ValueError: cannot serialize column named col_name with dtype floating
I searched for this error and found issue #295, which appeared to resolve the same problem with the update to feather 0.4. However, I am already using 0.4 and the problem still occurs, though only with large datasets.
As a temporary workaround, I check each column and drop those that are all np.nan before attempting to feather. I then run into Error 2:
Error 2.) Large datasets cannot be written to feather. The call raises TypeError: Cannot convert pyarrow.lib.ChunkedArray to pyarrow.lib.Array. The exception is raised in File "pyarrow/feather.pxi", line 62, in pyarrow.lib.FeatherWriter.write_array
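
For reference, the workaround for Error 1 described above (dropping all-np.nan columns before writing) can be sketched roughly as follows. This is a minimal illustration, not the exact code from my pipeline: the helper name drop_all_nan_columns, the small sample frame, and the output path are made up for the example.

```python
import numpy as np
import pandas as pd


def drop_all_nan_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without the columns whose values are all NaN."""
    # isna().all(axis=0) yields one boolean per column: True if every row is NaN.
    all_nan = df.columns[df.isna().all(axis=0)]
    return df.drop(columns=all_nan)


# Tiny stand-in for the real (~210GB) dataset.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0],
        "empty": [np.nan, np.nan],  # entirely NaN, so it is dropped
        "b": [np.nan, 3.0],         # partially NaN, so it is kept
    }
)

cleaned = drop_all_nan_columns(df)

# The cleaned frame is what then gets written, e.g.:
# cleaned.to_feather("data.feather")  # hypothetical output path
```

This only sidesteps Error 1. It does not address the ChunkedArray TypeError in Error 2.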