Cannot convert pyarrow.lib.ChunkedArray to pyarrow.lib.Array #325

@mark-hoffmann

The pandas df.to_feather() method works well with smaller datasets, but with the larger dataset I'm trying to write (~210GB) I get several errors.

Error 1.) Columns that contain only np.nan values raise an exception: ValueError: cannot serialize column named col_name with dtype floating

I looked into this error and found issue #295, which appears to have closed the same problem with the update to feather 0.4. However, I am using 0.4 right now and the problem still occurs, but only with large datasets.
As a temporary workaround, I check each column and drop those that are all np.nan before writing to feather. Then I run into error 2:

Error 2.) Large datasets cannot be written to feather. The exception is TypeError: Cannot convert pyarrow.lib.ChunkedArray to pyarrow.lib.Array, raised in File "pyarrow/feather.pxi", line 62, in pyarrow.lib.FeatherWriter.write_array
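
For anyone trying to reproduce this, here is a minimal sketch of the two steps involved. The first function is the drop-all-NaN workaround for error 1 described above. The second part is only a possible workaround for error 2, not something confirmed in this issue: it assumes the error appears because a very large column gets split into several chunks, and that writing the data as several smaller files by row range avoids that. The function names, the rows_per_chunk parameter, and the output file naming are all hypothetical.

```python
import numpy as np
import pandas as pd


def drop_all_nan_columns(df):
    """Drop every column whose values are all missing (workaround for error 1)."""
    return df.dropna(axis="columns", how="all")


def iter_row_chunks(df, rows_per_chunk):
    """Yield consecutive row slices of df, each with a fresh default index."""
    for start in range(0, len(df), rows_per_chunk):
        # reset_index gives each slice a default RangeIndex before it is written.
        yield df.iloc[start:start + rows_per_chunk].reset_index(drop=True)


def write_feather_parts(df, prefix, rows_per_chunk):
    """Write each row slice to its own file (hypothetical naming scheme)."""
    paths = []
    for i, part in enumerate(iter_row_chunks(df, rows_per_chunk)):
        path = f"{prefix}-{i:05d}.feather"
        part.to_feather(path)  # requires pyarrow to be installed
        paths.append(path)
    return paths


# Small in-memory demonstration of the cleaning and splitting steps.
demo = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0, 5.0], "empty": [np.nan] * 5})
cleaned = drop_all_nan_columns(demo)
parts = list(iter_row_chunks(cleaned, 2))
```

In real use, rows_per_chunk would need to be small enough that no single column in a slice gets too large; the right value depends on the data, and I have not measured it for the dataset in this issue.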
