ERR: HDF5 serialization of datelike-object dtypes should raise #8887
Comments
see here: http://pandas.pydata.org/pandas-docs/stable/gotchas.html#minimum-and-maximum-timestamps these are out of range of the high-performance datetime implementation, so they revert to object dtype. An alternative is to use Periods (though there is an open issue with storing these in HDF5; it's not difficult, just needs a bit of work, see here). So this should raise ATM in HDF5. These cannot be serialized in table format at all (the Object block is restricted to actual strings). I think fixed format might work. That said, if you would like to work on the period repr, that would be great.
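To illustrate the range limit mentioned above, here is a minimal standard-library sketch. It assumes the high-performance datetime type stores a signed 64-bit count of nanoseconds since 1970-01-01; the helper names (`ns_to_datetime`, `fits_in_int64_ns`) are hypothetical and not part of pandas. Python's `datetime` only resolves microseconds, so the bounds are approximate to within a microsecond.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest nanosecond count a signed 64-bit integer can hold


def ns_to_datetime(ns):
    """Convert nanoseconds since the epoch to a datetime (sub-microsecond part dropped)."""
    return EPOCH + timedelta(microseconds=ns // 1000)


upper_bound = ns_to_datetime(INT64_MAX)
lower_bound = ns_to_datetime(-INT64_MAX)


def fits_in_int64_ns(value):
    """Return True if the datetime lies inside the representable range."""
    return lower_bound <= value <= upper_bound


print(lower_bound.date(), upper_bound.date())
print(fits_in_int64_ns(datetime(2014, 11, 24)))  # an ordinary date
print(fits_in_int64_ns(datetime(2900, 1, 1)))    # a date like the ones in this issue
```

Under that assumption the upper bound falls in the year 2262, so a date in 2900 cannot be represented and the column falls back to object dtype.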
Just curious, what are you doing that you need dates out to 2900?
OK, so I'm thinking that the first problem is that to_datetime ignores errors by default, and I'll put in a pull request to fix that. I might look closer at the Periods thing later this week.
oh, and @rockg it's a database of time travelers
UPDATE:
It seems this is the source of the problem. I think there may be other dates in my dataset that are breaking the to_datetime method.
UPDATE 2:
It seems that maybe it's that the date is later than 2900 that's causing the problem?
original issue:
The column in question came from a read_sql query, and the column has datetimes. It consists solely of pandas datetime objects and NoneType objects. I have iterated over the Series to be sure. The column has 11 million rows.
I've tried casting with to_datetime (and the dtype remains object--shouldn't the dtype change after that call?), to no avail.
Here's some stuff I get from poking around after sticking an import pdb; pdb.set_trace() into line 3329 of pytables.py (after except (NotImplementedError, ValueError, TypeError) as e:):
My debugging kind of hits a wall here, because infer_dtype seems to be throwing the error, and it lives in lib.so, which is a compiled binary, so I'm not sure how to look into it to figure out what's going on. I would love a suggestion about how to deal with that in the future, in addition to some answers about what's going on in this case.
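One way to investigate a column like this without stepping into compiled code is to summarize it at the Python level. The sketch below is only an illustration, not what infer_dtype does internally: it counts the Python type of every value and reports the earliest and latest datetimes, which would surface an outlier such as a year-2900 date. The function name `summarize_object_column` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def summarize_object_column(values):
    """Count the Python type of each value and report the datetime extremes."""
    type_counts = Counter(type(v).__name__ for v in values)
    datetimes = [v for v in values if isinstance(v, datetime)]
    extremes = (min(datetimes), max(datetimes)) if datetimes else None
    return type_counts, extremes


# Hypothetical sample standing in for the column returned by read_sql
sample = [datetime(2014, 11, 24), None, datetime(2900, 1, 1), datetime(1999, 5, 17)]

type_counts, extremes = summarize_object_column(sample)
print(dict(type_counts))
print(extremes)
```

The same function accepts any iterable, so a Series of objects can be passed directly; on 11 million rows it is a single linear pass.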