Skip to content

BUG: using dtype='int64' argument of Series causes ValueError: values cannot be losslessly cast to int64 for integer strings #45017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from
Closed
3 changes: 3 additions & 0 deletions pandas/core/dtypes/cast.py
Original file line number Diff line number Diff line change
Expand Up @@ -2096,6 +2096,9 @@ def maybe_cast_to_integer_array(
)
return casted

if lib.infer_dtype(casted) == "integer":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this always holds. The condition we're interested in is whether the casting was lossy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but this test currently passes the extra test added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the added tests pass w/o this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, because this condition always holds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But before adding this check the testcase in the issue was not passing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But before adding this check the testcase in the issue was not passing.

That's a reason to add some check for this, but this particular check is not the right check. ATM this is equivalent to (but slower than) if True:

return casted

# No known cases that get here, but raising explicitly to cover our bases.
raise ValueError(f"values cannot be losslessly cast to {dtype}")

Expand Down
13 changes: 13 additions & 0 deletions pandas/tests/series/test_constructors.py
Original file line number Diff line number Diff line change
Expand Up @@ -1810,6 +1810,19 @@ def test_constructor_bool_dtype_missing_values(self):
expected = Series(True, index=[0], dtype="bool")
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("int_dtype", ["int64"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use this fixture instead: any_int_dtype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def test_constructor_int64_dtype(self, int_dtype):
# GH-44923
result = Series(["-1", "0", "1", "2"], dtype=int_dtype)
expected = Series([-1, 0, 1, 2])
tm.assert_series_equal(result, expected)

def test_constructor_float64_dtype(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use any_float_dtype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

# GH-44923
result = Series(["0", "1", "2"], dtype="float64")
expected = Series([0.0, 1.0, 2.0])
tm.assert_series_equal(result, expected)

@pytest.mark.filterwarnings(
"ignore:elementwise comparison failed:DeprecationWarning"
)
Expand Down