Description
In [27]: period_index=pd.PeriodIndex(start='2015-01-01',end='2015-03',freq='B')
period_index
Out[27]: <class 'pandas.tseries.period.PeriodIndex'>
[2015-01-01, ..., 2015-03-02]
Length: 43, Freq: B
In [28]: period_index.difference(period_index)
Out[28]: Index([], dtype='object')
I think this should return an empty PeriodIndex object, not an empty Index object.
This happens because, if the result of difference is an empty set, the object doesn't check its type before creating an empty version of itself: https://github.com/pydata/pandas/blob/v0.16.0/pandas/core/index.py#L1360. Generally I've seen a better construction for that line be type(self)([]).
I'm happy to make this PR, although I'm not sure whether I'm missing something about the intention. If I'm not, should this be implemented by adding something to Index's difference method, or by overriding that method in PeriodIndex?
A method that removed items from the index would avoid any subclass-specific code, but the drop method also has some odd behavior:
In [25]: period_index.drop(period_index)
Out[25]: Int64Index([], dtype='int64')
So if you're creating a new object, you'd need to check the freq of the PeriodIndex too, given that an empty PeriodIndex constructor needs a freq. Something like type(self)([], freq=self.freq, name=self.name). Are there cases for other subclasses of Index?