BUG: concat losing columns dtypes for join=outer #47586
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,7 @@ | |
) | ||
from pandas.errors import InvalidIndexError | ||
|
||
from pandas.core.dtypes.cast import find_common_type | ||
from pandas.core.dtypes.common import is_dtype_equal | ||
|
||
from pandas.core.algorithms import safe_sort | ||
|
@@ -223,7 +224,7 @@ def union_indexes(indexes, sort: bool | None = True) -> Index: | |
|
||
indexes, kind = _sanitize_and_check(indexes) | ||
|
||
def _unique_indices(inds) -> Index: | ||
def _unique_indices(inds, dtype) -> Index: | ||
Review comment: the docstring also needs updating at some point. |
||
""" | ||
Convert indexes to lists and concatenate them, removing duplicates. | ||
|
||
|
@@ -243,7 +244,30 @@ def conv(i): | |
i = i.tolist() | ||
return i | ||
|
||
return Index(lib.fast_unique_multiple_list([conv(i) for i in inds], sort=sort)) | ||
return Index( | ||
lib.fast_unique_multiple_list([conv(i) for i in inds], sort=sort), | ||
dtype=dtype, | ||
) | ||
|
||
def _find_common_index_dtype(inds): | ||
""" | ||
Finds a common type for the indexes to pass through to resulting index. | ||
|
||
Parameters | ||
---------- | ||
inds: list of Index or list objects | ||
|
||
Returns | ||
------- | ||
The common type or None if no indexes were given | ||
""" | ||
dtypes = [idx.dtype for idx in indexes if isinstance(idx, Index)] | ||
if dtypes: | ||
dtype = find_common_type(dtypes) | ||
Review comment: I assume we never pass a mixed list of Indexes and lists? Could add type annotations here for clarity.
Reply: A parent function uses
Reply: I assumed that if we could pass a mixed list, then the logic added here would not account for the types in a list and would only use the Indexes to find the common dtype. We could therefore expect this to raise in those cases?
Reply: we may need to change to |
||
else: | ||
dtype = None | ||
|
||
return dtype | ||
|
||
if kind == "special": | ||
result = indexes[0] | ||
|
@@ -283,16 +307,18 @@ def conv(i): | |
return result | ||
|
||
elif kind == "array": | ||
dtype = _find_common_index_dtype(indexes) | ||
index = indexes[0] | ||
if not all(index.equals(other) for other in indexes[1:]): | ||
index = _unique_indices(indexes) | ||
index = _unique_indices(indexes, dtype) | ||
|
||
name = get_unanimous_names(*indexes)[0] | ||
if name != index.name: | ||
index = index.rename(name) | ||
return index | ||
else: # kind='list' | ||
return _unique_indices(indexes) | ||
dtype = _find_common_index_dtype(indexes) | ||
return _unique_indices(indexes, dtype) | ||
|
||
|
||
def _sanitize_and_check(indexes): | ||
|
The issue is marked as 1.4.4. OK to leave this in 1.5, though (if so, just change the issue and ignore this comment).
Changed the milestones.
Initially we thought this happens only for EA dtypes, but this was wrong. It also occurs for NumPy dtypes.
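The mixed-list question raised in review can be illustrated with a small pandas-free toy model. This is only a sketch of the control flow in the diff, not the pandas implementation: `ToyIndex`, `toy_find_common_type`, and the `_RANK` promotion order are hypothetical stand-ins, and the real `find_common_type` follows NumPy and extension-dtype rules instead.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyIndex:
    """Hypothetical stand-in for pandas.Index: values plus a dtype label."""

    values: list
    dtype: str


# Toy promotion order (an assumption for illustration only).
_RANK = {"int32": 0, "int64": 1, "float64": 2, "object": 3}


def toy_find_common_type(dtypes: list[str]) -> str:
    """Return the widest dtype label under the toy ordering above."""
    return max(dtypes, key=_RANK.__getitem__)


def find_common_index_dtype(inds: list) -> str | None:
    """Mirror the diff's helper: only ToyIndex entries contribute a dtype."""
    dtypes = [idx.dtype for idx in inds if isinstance(idx, ToyIndex)]
    return toy_find_common_type(dtypes) if dtypes else None


def unique_indices(inds: list, dtype: str | None) -> ToyIndex:
    """Concatenate, drop duplicates in first-seen order, attach the dtype."""
    seen: dict = {}
    for ind in inds:
        values = ind.values if isinstance(ind, ToyIndex) else ind
        for value in values:
            seen.setdefault(value, None)
    # Without a dtype hint, fall back to a generic label in this toy model.
    return ToyIndex(list(seen), dtype if dtype is not None else "object")


# A mixed input: one ToyIndex and one plain list containing a float.
mixed = [ToyIndex([1, 2], "int32"), [2, 3.5]]
common = find_common_index_dtype(mixed)
result = unique_indices(mixed, common)
print(common, result.values)
```

In this toy model the plain list contributes values but no dtype, so the label passed to the constructor can disagree with the merged values. Whether pandas raises or casts in that situation is not something this sketch demonstrates.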