Series.reindex() does nothing #17132
Comments
@Pastafarianist : Thanks for the report! Unfortunately, we can't run this code because In addition, if you could copy + paste the output (both actual and expected), that would be useful for anyone who's reading these issues. |
Apologies; please set it to any integer like 10. I've updated the code in the issue description. |
Could you move whichever is your expected output under your sentence regarding the "second print" ? |
Done. |
Did you reverse the two? The expected output should be at the bottom, but I think you put it second-to-last if I understand the issue properly. |
No, I did not. The issue description reads correctly right now. |
Ah, actually, yes I see that now. One other thing: you're missing a definition for |
So sorry for that. I was copy-pasting from a Jupyter Notebook. I've updated the description and made sure that the code runs as-is. |
No worries! Can confirm the code is runnable standalone. That's odd...not sure why you can't reindex when the |
@Pastafarianist not sure what you think this should do. do you simply want
This is reindexing.
|
@jreback : Ah, I see. I also had the implementation mixed up in my mind, but given the behavior, why don't we just get a |
IMO this is certainly a bug. It should never return the original series: it should either do an actual reindex (if we decide that e.g. 1 matches Interval(0, 2)) or, if no matches are found, indeed return a Series of all NaN. |
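For readers following along, the two outcomes named above (an actual reindex, or all NaN when nothing matches) can be sketched on a plain integer index. This is only an illustration under assumed data: the series `s` and the label lists are hypothetical and are not the `IntervalIndex` case from the report.

```python
import pandas as pd

# Hypothetical series with a plain integer index (not the data from the report).
s = pd.Series([1.0, 2.0, 3.0], index=[0, 1, 2])

# reindex aligns by label: labels 1 and 2 exist, label 5 does not and becomes NaN.
aligned = s.reindex([1, 2, 5])
print(aligned)

# When none of the requested labels exist, every value is NaN.
no_match = s.reindex([7, 8, 9])
print(no_match)

# Replacing the index keeps the values positionally and only relabels them.
relabelled = pd.Series(s.values, index=[10, 20, 30])
print(relabelled)
```

The last construction mirrors the shape of the `pd.Series(discretization.values, index=mids)` call in the issue description: it relabels the values rather than looking them up by label.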
Actually, on second look, it is reindexing; it just looks identical because you get the same values back (it is just doing the same as |
Not at all. This is a correct result. The points happen to be a reindexer of the intervals. E.g. Not very useful, but the first 2 intervals are picked out because they contain the points 20 and 200. The 2000 gets NaN because it's not found.
|
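The lookup described in the comment above can be sketched as follows, assuming a recent pandas release (the thread concerns 0.20.3, where the index of the result was reported to differ). The breaks `[0, 100, 1000, 1500]` and the values are hypothetical, chosen only so that 20 and 200 fall inside intervals and 2000 does not.

```python
import pandas as pd

# Hypothetical intervals (0, 100], (100, 1000], (1000, 1500]; not the data from the report.
intervals = pd.IntervalIndex.from_breaks([0, 100, 1000, 1500])
s = pd.Series([1, 2, 3], index=intervals)

points = [20, 200, 2000]

# Position of the interval containing each point; -1 means "not found".
positions = intervals.get_indexer(points)
print(positions)

# reindex relies on the same lookup, so the unmatched point gets NaN.
reindexed = s.reindex(points)
print(reindexed)
```

Which labels the result carries (the points or the matching intervals) is the part that depends on the pandas version, so it is worth printing `reindexed.index` on your own installation.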
xref #16386 |
@jreback can you then explain the docstring? (emphasis mine)
(the docstring could also have been incorrect for a long time of course) I would argue that the IntervalIndex and the integers (the mids) are not equivalent in this case |
what part is not correct? seems ok to me.
This is correct based on what Interval selection does. Selection in an interval selects an interval. Hence you get back the original index. |
But they are not identical though, which is @jorisvandenbossche's point, and if your explanation is the expected behavior, I don't see that in the docs. |
Sorry for the late reply, but I seem to have the same problem with int indexes.
import math
from IPython import display
from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset
california_housing_dataframe = pd.read_csv("https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv", sep=",")
print(california_housing_dataframe.head())
california_housing_dataframe = california_housing_dataframe.reindex(np.random.permutation(california_housing_dataframe.index).tolist())
print(california_housing_dataframe.head())
Whenever I use the NumPy permutation, the result goes wrong. I've tried the shell, a Jupyter notebook, and a script.
Configs
Python version: 2.7 and 3.6 |
@ceciliassis It's not directly clear what is wrong in your output, or how it is related to this issue. If you think there is a bug, please open a new issue. |
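For reference, here is a small self-contained sketch of what reindexing with a permuted integer index does. It assumes a tiny hypothetical frame instead of the California housing CSV, and a fixed seed so the run is repeatable. Rows are reordered, but each row keeps its original label and values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the housing data: label i holds the value (i + 1) * 10.
df = pd.DataFrame({"a": [10, 20, 30, 40]})

np.random.seed(0)  # fixed seed, only so the example is repeatable
shuffled_index = np.random.permutation(df.index)

# reindex looks each label up in the original frame, so rows move together with their labels.
shuffled = df.reindex(shuffled_index)
print(shuffled)

# Sorting by label recovers the original row order.
print(shuffled.sort_index())
```

If the first rows printed by `head()` change after the shuffle, that is the expected effect of the permutation and not a sign that values were lost.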
Code Sample, a copy-pastable example if possible
Output:
Problem description
Series.reindex() returns the original Series even though the index is changed.
Expected Output
print(pd.Series(discretization.values, index=mids)) produces:
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.39-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.3
pytest: 3.1.3
pip: 9.0.1
setuptools: 36.2.2
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0