I recently got bitten again by the mixed-types DtypeWarning while processing a CSV file. I assume that at some point, when StringDtype is no longer experimental, read_csv() will use it and won't need object dtype anymore, so this potential source of problems will go away.
In the meantime though, would it be possible to have an option in read_csv() to use StringDtype instead of the object dtype? Both for early adopters and people who want to try it out... and it would also be a nice migration path for when StringDtype is ready. Then, it would only be a matter of flipping the default for this switch. And for people who need to revert to object dtype for some reason, that would provide them a way to do that too at that time. Thoughts? Or is it still too soon for even experimental usage of StringDtype in read_csv()?
I'd be willing to create a pull request for this (at least for the Python version of the CSV parser).
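For context, here is a minimal sketch of the per-column workaround that I believe is already available, assuming pandas 1.0 or later, where the "string" alias for StringDtype can be passed through the existing dtype argument. The sample data and column names are made up for illustration. The request above is for a single switch that would apply this to every column that currently falls back to object dtype, without having to list the columns.

```python
import io

import pandas as pd

# Hypothetical sample data: "code" holds text that partly looks numeric,
# and the last row has a missing value.
csv_text = "id,code\n1,007\n2,abc\n3,\n"

# Ask for StringDtype on one column via the existing dtype argument.
# Assumption: pandas >= 1.0, where the "string" alias exists.
df = pd.read_csv(io.StringIO(csv_text), dtype={"code": "string"})

print(df.dtypes)
print(df)
```

With this, the leading zeros in "007" are kept and the empty field becomes a missing value, but every affected column has to be named by hand.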
Ugh. Apologies, @TomAugspurger. I did try to search for a duplicate issue, but I didn't find that one. Maybe because I'm focussed more on "don't use object dtype" instead of "use types that support NA well". But yes, this discussion should go there. I'll read through the (long-ish) existing issue, and see if I have something to add. Thanks!