ENH - Index set operation modifications to address issue #23525 #23538

Merged
merged 75 commits into from
May 21, 2019
Changes from 14 commits
c2cf269
ENH - first pass at modifying set operations on indexes. Dont ignore …
sds9995 Nov 7, 2018
435e50f
Merge branch 'master' into enh/index_setops
sds9995 Nov 8, 2018
4922fd3
BUG - account for empty index + non-monotonic index, and dont try to …
sds9995 Nov 8, 2018
5e528a1
TST - update existing tests to account for cross type index joins bei…
sds9995 Nov 8, 2018
cdaa5b0
ENH - incompatibility checks and incompatible type unions
sds9995 Nov 9, 2018
40d57ec
TST - update datetime union tests, add tests for inconsistent unions
sds9995 Nov 9, 2018
11fd041
CLN - refactor union -> _union
sds9995 Nov 11, 2018
8f0ace3
TST - add tests for categorical index, and compatible inconsistent p…
sds9995 Nov 11, 2018
8364c2e
BUG - union -> _union in overridden _union methods
sds9995 Nov 11, 2018
ab329a9
TST - update test_operator raised exception
sds9995 Nov 11, 2018
93486ad
CLN - pep8 line adherence
sds9995 Nov 11, 2018
e435e4c
ENH - reverse polarity of compatibility check and add docstrings
sds9995 Nov 13, 2018
b9787b8
Merge branch 'master' into enh/index_setops
sds9995 Nov 13, 2018
2241b65
TST - add test fixture for index factories and use in test_setops
sds9995 Nov 13, 2018
4daf360
ENH - cast difference result to original dtype to match other index b…
sds9995 Nov 14, 2018
6e5a52b
TST - update interval setop test to account for difference now return…
sds9995 Nov 14, 2018
d344e11
CLN - remove unnecessary code from test
sds9995 Nov 14, 2018
b339bd1
CLN - reorganize some code to make it more readable
sds9995 Nov 14, 2018
85e2db7
CLN - pep8 adherence
sds9995 Nov 29, 2018
cf34960
CLN - pep8 adherence
sds9995 Nov 29, 2018
7150c22
BUG - fix function name
sds9995 Nov 29, 2018
fbb3743
Merge branch 'master' into enh/index_setops
sds9995 Dec 1, 2018
5aa41f6
BUG - fix numeric index compatibility
sds9995 Dec 1, 2018
02d7a3b
BUG - actually fix numeric compatibility check, with passing index tests
sds9995 Dec 1, 2018
558e182
DOC - initial whatsnew
sds9995 Dec 2, 2018
706f973
ENH - no longer consider category indexes containing different catego…
sds9995 Dec 4, 2018
2ccab59
TST/CLN - no longer need new index_factory fixture and make code more…
sds9995 Dec 4, 2018
c70f1c0
CLN - make code more readable
sds9995 Dec 5, 2018
edb7e9c
CLN - pep8 adherence
sds9995 Dec 5, 2018
84bfbda
Merge branch 'master' into enh/index_setops
sds9995 Dec 5, 2018
aba75fe
DOC - fix whatsnew entry
sds9995 Dec 5, 2018
fc9f138
BUG - change object dtype index construction
sds9995 Dec 5, 2018
69cce99
Merge branch 'master' into enh/index_setops
sds9995 Dec 6, 2018
fdfc7d7
CLN/BUG - clean according to failed pandas-dev style checks
sds9995 Dec 6, 2018
42ca70e
CLN - fix imports with isort
sds9995 Dec 7, 2018
5b25645
CLN - refactor tests and remove overridden public union methods
sds9995 Dec 8, 2018
9b1ee7f
Merge branch 'master' into enh/index_setops
sds9995 Dec 8, 2018
fdf9b71
CLN - make code more efficient and cleanup whatsnew
sds9995 Dec 8, 2018
1de3cc8
Merge branch 'master' into enh/index_setops
sds9995 Jan 1, 2019
8ed1093
DOC - fix ipython code block
sds9995 Jan 1, 2019
77ca3a3
DOC - fix whatsnew code blocks again
sds9995 Jan 2, 2019
5921038
CLN - clean up some code, tests and docs
sds9995 Jan 3, 2019
3b94e3b
CLN - reorganize some code and add TODOs
sds9995 Jan 9, 2019
fd4510e
CLN - remove trailing whitespace
ms7463 Jan 14, 2019
345eec1
Merge branch 'master' into enh/index_setops
sds9995 Jan 14, 2019
265a7ee
Merge branch 'enh/index_setops' of https://github.com/ArtinSarraf/pan…
sds9995 Jan 14, 2019
5de3d57
CLN - fix import order
sds9995 Jan 15, 2019
6d82621
CLN - code cleanup, remove unnecessary operations
sds9995 Jan 17, 2019
0af8a24
Merge branch 'master' into enh/index_setops
sds9995 Jan 21, 2019
5a87715
CLN - apply error messages to both statements
sds9995 Jan 21, 2019
a4f9e78
TST - add regex queries
sds9995 Jan 23, 2019
c3c0caa
Merge branch 'master' into enh/index_setops
sds9995 Feb 11, 2019
0bcbdf4
BUG - fix default sort arg
sds9995 Feb 11, 2019
c410625
BUG - remove print
sds9995 Feb 11, 2019
6bb054f
TST/DOC - move to new whatsnew and use local fixture for tests
sds9995 Feb 12, 2019
aea731c
DOC - minor update to get tests to rerun
ms7463 Feb 13, 2019
b5938fc
Merge branch 'master' into enh/index_setops
sds9995 Feb 28, 2019
25452fc
Merge branch 'enh/index_setops' of https://github.com/ArtinSarraf/pan…
sds9995 Feb 28, 2019
0b97a79
Merge branch 'master' into enh/index_setops
ms7463 Mar 1, 2019
bf11c6f
Merge branch 'master' into enh/index_setops
sds9995 Mar 1, 2019
6fd941d
Merge branch 'enh/index_setops' of https://github.com/ArtinSarraf/pan…
sds9995 Mar 1, 2019
32037b5
DOC - fix docstrings and whatsnew
sds9995 Mar 2, 2019
8870006
Merge branch 'master' into enh/index_setops
sds9995 Mar 11, 2019
1d12bc9
DOC - update docstring
sds9995 Mar 11, 2019
92f6707
TST - use tm.assert_index_equal
sds9995 Mar 12, 2019
fbf3242
Merge branch 'master' into enh/index_setops
sds9995 Mar 20, 2019
38d9f74
TST - parametrize union tests
sds9995 Mar 21, 2019
b9e7b18
Merge branch 'master' into enh/index_setops
sds9995 Mar 21, 2019
69aaa93
DOC - add docstring
sds9995 Mar 21, 2019
b57160a
Merge branch 'master' into enh/index_setops
sds9995 Mar 28, 2019
daa1287
Merge branch 'master' into enh/index_setops
sds9995 May 15, 2019
54898c1
CLN/TST - fix super method calls and add error msg
sds9995 May 15, 2019
fa839a9
TST - add Timestamp to regex and fix import sorting
sds9995 May 15, 2019
a36f475
CLN - minor style updates
sds9995 May 16, 2019
b840f49
Merge branch 'master' into enh/index_setops
sds9995 May 21, 2019
45 changes: 35 additions & 10 deletions pandas/core/indexes/base.py
Original file line number Diff line number Diff line change
@@ -2756,6 +2756,25 @@ def _get_reconciled_name_object(self, other):
return self._shallow_copy(name=name)
return self

def _union_incompatible_dtypes(self, other):
"""
Casts this and other index to object dtype to allow the formation
of a union between incompatible types.
"""
this = self.astype('O')
# call Index for when `other` is list-like
other = Index(other).astype('O')
return Index.union(this, other).astype('O')

def _is_compatible_with_other(self, other):
"""
Check whether this and the other dtype are compatible with each other.
Meaning a union can be formed between them without needing to be cast
to dtype object.
"""
return (type(self) is type(other)
and is_dtype_equal(self.dtype, other.dtype))

def union(self, other):
"""
Form the union of two Index objects and sorts if possible.
@@ -2778,22 +2797,28 @@ def union(self, other):

"""
self._assert_can_do_setop(other)

if not self._is_compatible_with_other(other):
return self._union_incompatible_dtypes(other)

# This line needs to be after _union_incompatible_dtypes to ensure
# the original type of other is not lost after being cast to Index
other = ensure_index(other)
return self._union(other)

if len(other) == 0 or self.equals(other):
return self._get_reconciled_name_object(other)
def _union(self, other):
"""
Specific union logic should go here. In subclasses union behavior
should be overwritten here rather than in `self.union`
"""

if len(self) == 0:
return other._get_reconciled_name_object(self)
elif len(other) == 0:
return self._get_reconciled_name_object(other)

# TODO: is_dtype_union_equal is a hack around
# 1. buggy set ops with duplicates (GH #13432)
# 2. CategoricalIndex lacking setops (GH #10186)
# Once those are fixed, this workaround can be removed
if not is_dtype_union_equal(self.dtype, other.dtype):
this = self.astype('O')
other = other.astype('O')
return this.union(other)
if self.equals(other):
return self._get_reconciled_name_object(other)

# TODO(EA): setops-refactor, clean all this up
if is_period_dtype(self) or is_datetime64tz_dtype(self):
3 changes: 3 additions & 0 deletions pandas/core/indexes/category.py
@@ -872,6 +872,9 @@ def _delegate_method(self, name, *args, **kwargs):
return res
return CategoricalIndex(res, name=self.name)

def _is_compatible_with_other(self, other):
Contributor

does the super method not already do this? if not, can you show the case where it fails (which would be a bug)

Contributor Author

This is due to the fact that the Categorical Dtype is specific to the values in the Categorical.
e.g.

pd.CategoricalIndex(['a','b','c']).dtype != pd.CategoricalIndex(['x','y','z']).dtype

That is why the check is checking the type of the dtype, which would not work for other cases since for example:

type(np.dtype('O')) is type(np.dtype(np.int64))

If this is a bug I would imagine that it would be outside the scope of this PR.

Contributor

My point is: is the super method not enough here? The dtypes must match exactly or we go to object.

Contributor Author

Ah I see what you're saying. I assumed we wanted to consider categorical indexes with different categories compatible. If not then yes the super method should be sufficient, I will make that change.
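
The distinction discussed in this thread, comparing dtypes for equality versus comparing only the type of the dtype, can be sketched with a toy model. The classes and helpers below (`ToyCategoricalDtype`, `ToyPlainDtype`, `strict_compatible`, `loose_compatible`) are hypothetical stand-ins, not pandas or NumPy APIs. The sketch only assumes what the thread states: a categorical dtype carries its categories, so two categorical dtypes with different categories are unequal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCategoricalDtype:
    # Hypothetical stand-in: the dtype is specific to its categories.
    categories: tuple


@dataclass(frozen=True)
class ToyPlainDtype:
    # Hypothetical stand-in for a non-parametrized dtype such as object or int64.
    name: str


def strict_compatible(left, right):
    """Dtypes must match exactly (the behaviour the reviewer asks for)."""
    return left == right


def loose_compatible(left, right):
    """Only the *type* of the dtype must match (the original override)."""
    return type(left) is type(right)


abc = ToyCategoricalDtype(("a", "b", "c"))
xyz = ToyCategoricalDtype(("x", "y", "z"))
obj = ToyPlainDtype("object")
i64 = ToyPlainDtype("int64")

# Different categories: unequal dtypes, but the same dtype class.
print(strict_compatible(abc, xyz), loose_compatible(abc, xyz))
# The loose check cannot be used generally: it would also accept object vs int64.
print(strict_compatible(obj, i64), loose_compatible(obj, i64))
```

Under the strict check, categorical indexes with different categories fall through to the object-dtype union, which is the outcome agreed on at the end of the thread.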

return type(self) is type(other) and type(self.dtype) is type(other.dtype)


CategoricalIndex._add_numeric_methods_add_sub_disabled()
CategoricalIndex._add_numeric_methods_disabled()
14 changes: 11 additions & 3 deletions pandas/core/indexes/datetimes.py
@@ -562,6 +562,13 @@ def unique(self, level=None):
result = super(DatetimeIndex, naive).unique(level=level)
return self._shallow_copy(result.values)

def _is_compatible_with_other(self, other):
is_compat = super(DatetimeIndex, self)._is_compatible_with_other(other)
if not is_compat:
is_compat = (hasattr(other, 'dtype')
Contributor

why are you checking .base here? again why is the super method not sufficient here?

Contributor Author

This is so that tz-aware dtypes can be considered compatible, since a union between mismatched tz dtypes defaults to taking the union of the arrays as UTC.

pd.date_range('19910905', periods=10, tz='US/Eastern').dtype.base == pd.date_range('19910905', periods=10, tz='US/Central').dtype.base
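
The idea of unioning values from mismatched time zones "as UTC" can be illustrated with the standard library alone. This is a minimal sketch, not the pandas implementation: `union_as_utc` is a hypothetical helper, and fixed UTC offsets stand in for the named zones in the comment above.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used as simplified stand-ins for two different time zones.
ZONE_A = timezone(timedelta(hours=-5))
ZONE_B = timezone(timedelta(hours=-6))


def union_as_utc(left, right):
    """Union two lists of tz-aware datetimes, returning sorted UTC values."""
    instants = {ts.astimezone(timezone.utc) for ts in left}
    instants |= {ts.astimezone(timezone.utc) for ts in right}
    return sorted(instants)


left = [
    datetime(1991, 9, 5, 12, tzinfo=ZONE_A),
    datetime(1991, 9, 6, 12, tzinfo=ZONE_A),
]
right = [
    # 11:00 at UTC-6 is the same instant as 12:00 at UTC-5.
    datetime(1991, 9, 5, 11, tzinfo=ZONE_B),
    datetime(1991, 9, 7, 11, tzinfo=ZONE_B),
]

result = union_as_utc(left, right)
for ts in result:
    print(ts.isoformat())
```

Because both inputs describe instants on the same timeline, the shared instant appears only once in the result, and every element is expressed in UTC.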

and self.dtype.base == other.dtype.base)
return is_compat

def union(self, other):
"""
Specialized union for DatetimeIndex objects. If combine
@@ -576,10 +583,11 @@ def union(self, other):
-------
y : Index or DatetimeIndex
"""
self._assert_can_do_setop(other)
return super(DatetimeIndex, self).union(other)
Contributor Author

Since all the logic for custom unions is in the _union method, the overrides of union (as opposed to _union) in the Index subclasses exist solely to override the publicly exposed docstrings. There's probably a better way to do this; I will come back to that.

Contributor

I think you can remove this, no?
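
The split between a public `union` and an overridable `_union` described in this thread can be sketched in plain Python. The classes below (`ToyIndex`, `ToySortedIndex`) and the helper `_ordered_union` are hypothetical and much simpler than the pandas code in the diff; in particular, the incompatible-type fallback here just returns a base `ToyIndex` rather than casting to object dtype.

```python
def _ordered_union(left, right):
    """Concatenate and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys([*left, *right]))


class ToyIndex:
    def __init__(self, values):
        self.values = list(values)

    def _is_compatible_with_other(self, other):
        return type(self) is type(other)

    def union(self, other):
        """Public entry point: shared checks first, then dispatch."""
        if not self._is_compatible_with_other(other):
            return self._union_incompatible(other)
        return self._union(other)

    def _union_incompatible(self, other):
        # Stand-in for the object-dtype fallback: accept any list-like.
        other_values = getattr(other, "values", other)
        return ToyIndex(_ordered_union(self.values, other_values))

    def _union(self, other):
        # Subclasses override this method, not `union`.
        return type(self)(_ordered_union(self.values, other.values))


class ToySortedIndex(ToyIndex):
    def _union(self, other):
        return ToySortedIndex(sorted(set(self.values) | set(other.values)))


same_type = ToySortedIndex([3, 1]).union(ToySortedIndex([2, 1]))
mixed = ToySortedIndex([3, 1]).union(["a"])
print(type(same_type).__name__, same_type.values)
print(type(mixed).__name__, mixed.values)
```

With this layout a subclass never needs its own `union`, so the compatibility check cannot be skipped by accident.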


def _union(self, other):
if len(other) == 0 or self.equals(other) or len(self) == 0:
return super(DatetimeIndex, self).union(other)
return super(DatetimeIndex, self)._union(other)

if not isinstance(other, DatetimeIndex):
try:
@@ -592,7 +600,7 @@ def union(self, other):
if this._can_fast_union(other):
return this._fast_union(other)
else:
result = Index.union(this, other)
result = Index._union(this, other)
if isinstance(result, DatetimeIndex):
result._tz = timezones.tz_standardize(this.tz)
if (result.freq is None and
12 changes: 11 additions & 1 deletion pandas/core/indexes/interval.py
@@ -1038,7 +1038,16 @@ def overlaps(self, other):

def _setop(op_name):
def func(self, other):
other = self._as_like_interval_index(other)
try:
other = self._as_like_interval_index(other)
Member

It looks like _as_like_interval_index is only used here. Perhaps just including that logic here could help clean things up? Not entirely sure though.

Contributor Author

Agreed and done. This also lets me get rid of the try/except, which really should have just been an if condition.

# allow ValueError from this method to raise to catch mixed closed
# except only Non-Interval index mismatches.
except TypeError:
# Currently this will cause difference operations to return
# object dtype as opposed to IntervalIndex, unlike other Index
# objects that return the same type when using `difference` on
# mismatched types
return getattr(self.astype('O'), op_name)(other)

# GH 19016: ensure set op will not return a prohibited dtype
subtypes = [self.dtype.subtype, other.dtype.subtype]
@@ -1059,6 +1068,7 @@ def func(self, other):

return type(self).from_tuples(result, closed=self.closed,
name=result_name)

return func

@property
9 changes: 9 additions & 0 deletions pandas/core/indexes/numeric.py
@@ -228,6 +228,15 @@ def _assert_safe_casting(cls, data, subarr):
raise TypeError('Unsafe NumPy casting, you must '
'explicitly cast')

def _is_compatible_with_other(self, other):
from pandas.core.dtypes.generic import ABCRangeIndex
is_compat = super(Int64Index, self)._is_compatible_with_other(other)
if not is_compat:
is_compat = (type(self) is Int64Index
and isinstance(other, ABCRangeIndex))
return is_compat



Int64Index._add_numeric_methods()
Int64Index._add_logical_methods()
12 changes: 8 additions & 4 deletions pandas/core/indexes/period.py
@@ -833,6 +833,11 @@ def join(self, other, how='left', level=None, return_indexers=False,
"""
self._assert_can_do_setop(other)

if not isinstance(other, PeriodIndex):
return self.astype('O').join(other, how=how, level=level,
return_indexers=return_indexers,
sort=sort)

result = Int64Index.join(self, other, how=how, level=level,
return_indexers=return_indexers,
sort=sort)
@@ -845,10 +850,9 @@ def join(self, other, how='left', level=None, return_indexers=False,
def _assert_can_do_setop(self, other):
super(PeriodIndex, self)._assert_can_do_setop(other)

if not isinstance(other, PeriodIndex):
raise ValueError('can only call with other PeriodIndex-ed objects')

if self.freq != other.freq:
# *Can't* use PeriodIndexes of different freqs
# *Can* use PeriodIndex/DatetimeIndex
if isinstance(other, PeriodIndex) and self.freq != other.freq:
msg = DIFFERENT_FREQ_INDEX.format(self.freqstr, other.freqstr)
raise IncompatibleFrequency(msg)

15 changes: 12 additions & 3 deletions pandas/core/indexes/range.py
@@ -416,6 +416,13 @@ def _extended_gcd(self, a, b):
old_t, t = t, old_t - quotient * t
return old_r, old_s, old_t


def _is_compatible_with_other(self, other):
is_compat = super(RangeIndex, self)._is_compatible_with_other(other)
if not is_compat:
is_compat = type(other) is Int64Index
return is_compat

def union(self, other):
"""
Form the union of two Index objects and sorts if possible
@@ -428,9 +435,11 @@ def union(self, other):
-------
union : Index
"""
Contributor

is this needed any longer?

Contributor Author

see related comment in DatetimeIndex module

self._assert_can_do_setop(other)
return super(RangeIndex, self).union(other)
Contributor

why is this needed?

Contributor Author

This is another case where I was preserving the original public docstring, which differed from the super docstring. Should I just go ahead and remove all the overridden public union methods and allow the base docstring to propagate for all instances?
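
The trade-off raised here, overriding a public method only to change its docstring versus letting the base docstring propagate, can be seen in a few lines of plain Python. The class names below are hypothetical and the docstrings are shortened; this is a sketch of the general language behaviour, not of the pandas classes.

```python
class BaseIndex:
    def __init__(self, values):
        self.values = list(values)

    def union(self, other):
        """Form the union of two Index objects."""
        return self._union(other)

    def _union(self, other):
        return type(self)(sorted(set(self.values) | set(other.values)))


class WithOverride(BaseIndex):
    def union(self, other):
        """Specialized union for this subclass."""
        # The override adds no behaviour; it exists only for the docstring.
        return super().union(other)


class WithoutOverride(BaseIndex):
    # No override: the public method and its docstring come from BaseIndex.
    pass


print(WithOverride.union.__doc__)
print(WithoutOverride.union.__doc__)
print(WithOverride([2, 1]).union(WithOverride([3])).values)
print(WithoutOverride([2, 1]).union(WithoutOverride([3])).values)
```

Both subclasses compute the same result; removing the override only changes which docstring users see.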


def _union(self, other):
if len(other) == 0 or self.equals(other) or len(self) == 0:
Contributor

same comment as above

return super(RangeIndex, self).union(other)
return super(RangeIndex, self)._union(other)

if isinstance(other, RangeIndex):
start_s, step_s = self._start, self._step
@@ -469,7 +478,7 @@ def union(self, other):
(end_s - step_o <= end_o)):
return RangeIndex(start_r, end_r + step_o, step_o)

return self._int64index.union(other)
return self._int64index._union(other)

@Appender(_index_shared_docs['join'])
def join(self, other, how='left', level=None, return_indexers=False,
7 changes: 4 additions & 3 deletions pandas/core/indexes/timedeltas.py
@@ -263,10 +263,11 @@ def union(self, other):
-------
y : Index or TimedeltaIndex
"""
self._assert_can_do_setop(other)
return super(TimedeltaIndex, self).union(other)

def _union(self, other):
if len(other) == 0 or self.equals(other) or len(self) == 0:
return super(TimedeltaIndex, self).union(other)
return super(TimedeltaIndex, self)._union(other)

if not isinstance(other, TimedeltaIndex):
try:
@@ -278,7 +279,7 @@ def union(self, other):
if this._can_fast_union(other):
return this._fast_union(other)
else:
result = Index.union(this, other)
result = Index._union(this, other)
if isinstance(result, TimedeltaIndex):
if result.freq is None:
result.freq = to_offset(result.inferred_freq)
24 changes: 4 additions & 20 deletions pandas/tests/indexes/common.py
@@ -624,11 +624,7 @@ def test_intersection_base(self):
cases = [klass(second.values)
for klass in [np.array, Series, list]]
for case in cases:
if isinstance(idx, PeriodIndex):
msg = "can only call with other PeriodIndex-ed objects"
with pytest.raises(ValueError, match=msg):
first.intersection(case)
elif isinstance(idx, CategoricalIndex):
if isinstance(idx, CategoricalIndex):
pass
else:
result = first.intersection(case)
@@ -651,11 +647,7 @@ def test_union_base(self):
cases = [klass(second.values)
for klass in [np.array, Series, list]]
for case in cases:
if isinstance(idx, PeriodIndex):
msg = "can only call with other PeriodIndex-ed objects"
with pytest.raises(ValueError, match=msg):
first.union(case)
elif isinstance(idx, CategoricalIndex):
if isinstance(idx, CategoricalIndex):
pass
else:
result = first.union(case)
@@ -682,11 +674,7 @@ def test_difference_base(self):
cases = [klass(second.values)
for klass in [np.array, Series, list]]
for case in cases:
if isinstance(idx, PeriodIndex):
msg = "can only call with other PeriodIndex-ed objects"
with pytest.raises(ValueError, match=msg):
first.difference(case)
elif isinstance(idx, CategoricalIndex):
if isinstance(idx, CategoricalIndex):
pass
elif isinstance(idx, (DatetimeIndex, TimedeltaIndex)):
assert result.__class__ == answer.__class__
@@ -716,11 +704,7 @@ def test_symmetric_difference(self):
cases = [klass(second.values)
for klass in [np.array, Series, list]]
for case in cases:
if isinstance(idx, PeriodIndex):
msg = "can only call with other PeriodIndex-ed objects"
with pytest.raises(ValueError, match=msg):
first.symmetric_difference(case)
elif isinstance(idx, CategoricalIndex):
if isinstance(idx, CategoricalIndex):
pass
else:
result = first.symmetric_difference(case)
26 changes: 26 additions & 0 deletions pandas/tests/indexes/conftest.py
@@ -8,6 +8,7 @@
import pandas.util.testing as tm


# add interval index?
@pytest.fixture(params=[tm.makeUnicodeIndex(100),
tm.makeStringIndex(100),
tm.makeDateIndex(100),
@@ -28,6 +29,31 @@ def indices(request):
return request.param


def _make_repeating_index(x=10):
# x should be > 1
return Index(sorted([i for i in range(x//2 + 1)] * 2)[:x])


@pytest.fixture(params=[tm.makeUnicodeIndex,
Contributor

doesn't the indices fixture work?

Contributor Author

@ms7463 ms7463 Dec 2, 2018

I can't remember the exact reasoning at this point, I have to go back and check. But I think it mostly had to do with the fact that since the Indices are pre-instantiated, you would get pairs like this:

    >>> idx1 = tm.makePeriodIndex()
    >>> idx2 = tm.makeDateIndex()
    >>> idx1.equals(idx2)
    True

And when the indexes are equal a different logic path is taken. However, now that I'm testing it out, I think that some things have changed (like doing the incompatibility check before checking if the indexes are equal) that might make this not an issue anymore.

I remember there being more examples of this (these are probably bugs that I should have reported, will see if I can find more examples). And this was causing the tests to not be as thorough as I wanted, so being able to instantiate them with different sizes ensured that this would never happen.

In any case I will open an issue for the equals bug and try out the original fixture again.

Contributor Author

moved everything to the existing indices fixture and removed the index_factory fixture. Also added IntervalIndex to the indices fixture (everything in the tests/indexes directory ran fine with this)

tm.makeStringIndex,
tm.makeDateIndex,
tm.makePeriodIndex,
tm.makeTimedeltaIndex,
tm.makeIntIndex,
tm.makeUIntIndex,
tm.makeRangeIndex,
tm.makeFloatIndex,
lambda x=10: Index(np.random.choice([True, False], x)),
tm.makeCategoricalIndex,
lambda x=None: Index([]),
tm.makeMultiIndex,
_make_repeating_index,
tm.makeIntervalIndex],
ids=lambda x: type(x).__name__)
def index_factory(request):
return request.param
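
The motivation given in the review thread for a factory-based fixture, namely that factories can be called with different sizes so a test pair is never accidentally equal, can be sketched without pandas or pytest. The helpers below return plain lists; `make_repeating_values` reuses the arithmetic of `_make_repeating_index` from the diff, and `make_int_values` is a hypothetical second factory.

```python
def make_repeating_values(x=10):
    # Same arithmetic as _make_repeating_index in the diff, minus the Index wrapper.
    return sorted(list(range(x // 2 + 1)) * 2)[:x]


def make_int_values(x=10):
    # Hypothetical second factory producing distinct integers.
    return list(range(x))


factories = [make_repeating_values, make_int_values]

# Calling each factory with a different size guarantees the two operands
# of a pair differ, even when both come from the same factory.
pairs = [(first(10), second(7)) for first in factories for second in factories]

print(make_repeating_values(10))
print(all(left != right for left, right in pairs))
```

Pre-instantiated fixtures cannot offer this guarantee, since every test receives the same object of the same length.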


@pytest.fixture(params=[1, np.array(1, dtype=np.int64)])
def one(request):
# zero-dim integer array behaves like an integer
6 changes: 3 additions & 3 deletions pandas/tests/indexes/datetimes/test_datetime.py
@@ -303,9 +303,9 @@ def test_join_with_period_index(self, join_type):
c_idx_type='p', r_idx_type='dt')
s = df.iloc[:5, 0]

msg = 'can only call with other PeriodIndex-ed objects'
with pytest.raises(ValueError, match=msg):
df.columns.join(s.index, how=join_type)
expected = df.columns.astype('O').join(s.index, how=join_type)
result = df.columns.join(s.index, how=join_type)
tm.assert_index_equal(expected, result)

def test_factorize(self):
idx1 = DatetimeIndex(['2014-01', '2014-01', '2014-02', '2014-02',
9 changes: 6 additions & 3 deletions pandas/tests/indexes/datetimes/test_setops.py
@@ -29,10 +29,13 @@ def test_union2(self):
assert tm.equalContents(union, everything)

# GH 10149
expected = first.astype('O').union(
pd.Index(second.values, dtype='O')
).astype('O')
cases = [klass(second.values) for klass in [np.array, Series, list]]
Contributor

ideally can parametrize this here

Contributor Author

done

for case in cases:
result = first.union(case)
assert tm.equalContents(result, everything)
assert tm.equalContents(result, expected)

@pytest.mark.parametrize("tz", tz)
def test_union(self, tz):
@@ -256,11 +259,11 @@ def test_datetimeindex_union_join_empty(self):
empty = Index([])

result = dti.union(empty)
assert isinstance(result, DatetimeIndex)
assert result is result
tm.assert_index_equal(result, dti.astype('O'))

result = dti.join(empty)
assert isinstance(result, DatetimeIndex)
tm.assert_index_equal(result, dti)

def test_join_nonunique(self):
idx1 = to_datetime(['2012-11-06 16:00:11.477563',
13 changes: 7 additions & 6 deletions pandas/tests/indexes/interval/test_interval.py
@@ -835,15 +835,16 @@ def test_symmetric_difference(self, closed):

@pytest.mark.parametrize('op_name', [
'union', 'intersection', 'difference', 'symmetric_difference'])
def test_set_operation_errors(self, closed, op_name):
def test_set_incompatible_types(self, closed, op_name):
index = self.create_index(closed=closed)
set_op = getattr(index, op_name)

# non-IntervalIndex
msg = ('the other index needs to be an IntervalIndex too, but '
'was type Int64Index')
with pytest.raises(TypeError, match=msg):
set_op(Index([1, 2, 3]))
# non-IntervalIndex
expected = getattr(index.astype('O'), op_name)(Index([1, 2, 3]))
result = set_op(Index([1, 2, 3]))
tm.assert_index_equal(result, expected)

# Come back to mixed interval types

# mixed closed
msg = ('can only do set operations between two IntervalIndex objects '