[Bug] pd.DataFrames modified during ingest/outgest round-trips #2829
Comments
Our existing unit-test suite is our bulwark against all bugs in this area that have been reported and fixed. As long as we can add new unit-test cases for new fixes, without any old unit-test cases breaking, we should be good.
@ryan-williams, given the merged PRs here, are we good to close this, or are there more PRs upcoming?
None of the "examples" above have been addressed. They apply to DataFrames in
#2874 fixed two separate issues that were specific to
These applied to all
Thanks @ryan-williams!
(Factored out of #2804)
Describe the bug
Certain pd.DataFrame column/index names can be shuffled or dropped during an ingest/outgest round-trip (e.g. {from,to}_anndata). In some cases, a column or df.index can be dropped.
To Reproduce
See test_dataframe_io_roundtrips.py.
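As a rough illustration of what a round-trip check could report (this is not the code in test_dataframe_io_roundtrips.py), here is a minimal pandas-only sketch. The helper name `describe_changes` is hypothetical, and the "lossy" frame is a hand-made stand-in for a real ingest/outgest result, not output from the library.

```python
import pandas as pd


def describe_changes(before: pd.DataFrame, after: pd.DataFrame) -> list:
    """List index-name and column differences between two DataFrames."""
    changes = []
    if before.index.name != after.index.name:
        changes.append(f"index name: {before.index.name!r} -> {after.index.name!r}")
    dropped = [c for c in before.columns if c not in after.columns]
    added = [c for c in after.columns if c not in before.columns]
    if dropped:
        changes.append(f"columns dropped: {dropped}")
    if added:
        changes.append(f"columns added: {added}")
    if not dropped and not added and list(before.columns) != list(after.columns):
        changes.append("column order changed")
    return changes


# Input whose index is named "index".
df = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["x", "y"], name="index"))

# Hypothetical stand-in for a lossy round-trip: the index name is dropped.
lossy = df.rename_axis(None)

print(describe_changes(df, df.copy()))  # []
print(describe_changes(df, lossy))      # ["index name: 'index' -> None"]
```

In a real test, `lossy` would presumably be replaced by the DataFrame read back after ingest and outgest.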
Versions (please complete the following information):
Examples
(see test_dataframe_io_roundtrips.py; obs-roundtrip.py does similar and is used in the code snippets below)
1. df.index is named "index":
- df.index.name is dropped (index becomes unnamed)
2. DataFrame has a column named obs_id:
- Original df.index dropped
- obs_id column becomes unnamed index
DataFrame has a column named "index"
3. df.index is unnamed:
- "index" column is promoted to df.index (unnamed)
- Original (unnamed) index becomes a column named level_0
4. df.index is named id_column_name (default obs_id):
- "index" column is dropped
5. df.index has another name:
- "index" column renamed to obs_id
DataFrame also has a column named id_column_name (default: obs_id)
6. df.index is unnamed:
- unnamed index → column named level_0
- "index" column dropped
- obs_id column → unnamed index
7. df.index has a name:
- "index" column is dropped
It's not clear what backwards-compatibility concerns might exist from fixing each case.
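For anyone wanting to poke at these cases without the full test file, the seven inputs listed above can be sketched as pandas fixtures. This is a guess at the shapes, not the fixtures from the test suite: `make_df`, the `value` column, and the index name "other" are hypothetical, case 2 is assumed to have an unnamed index, and cases 6 and 7 are assumed to carry both an "index" column and an obs_id column. The last lines show a plain pandas behavior that may be where level_0 comes from; the issue itself does not confirm that.

```python
import pandas as pd

ID_COLUMN_NAME = "obs_id"  # default id_column_name, per the issue text


def make_df(index_name=None, extra_columns=()):
    """Build a two-row DataFrame with an optional index name and extra string columns."""
    data = {"value": [1, 2]}  # "value" is a hypothetical payload column
    for col in extra_columns:
        data[col] = [f"{col}-0", f"{col}-1"]
    return pd.DataFrame(data, index=pd.Index(["a", "b"], name=index_name))


# One input per numbered case (index names and column sets are assumptions).
cases = {
    1: make_df(index_name="index"),
    2: make_df(extra_columns=[ID_COLUMN_NAME]),
    3: make_df(extra_columns=["index"]),
    4: make_df(index_name=ID_COLUMN_NAME, extra_columns=["index"]),
    5: make_df(index_name="other", extra_columns=["index"]),
    6: make_df(extra_columns=["index", ID_COLUMN_NAME]),
    7: make_df(index_name="other", extra_columns=["index", ID_COLUMN_NAME]),
}

# pandas itself falls back to "level_0" when resetting an unnamed index
# on a frame that already has a column named "index".
reset = cases[3].reset_index()
print(list(reset.columns))  # ['level_0', 'value', 'index']
```

Each frame in `cases` could then be passed through the round-trip under test and compared with its original index name and columns.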