Skip to content

Conversation

@jorisvandenbossche
Copy link
Member

xref #39146

This fastpath calls self.values and thus is only a fastpath for BlockManager (we only get here with all columns having the same dtype).

@jorisvandenbossche jorisvandenbossche added Performance Memory or execution speed performance ArrayManager labels Dec 7, 2021
@jorisvandenbossche jorisvandenbossche added this to the 1.4 milestone Dec 7, 2021
@jbrockmendel
Copy link
Member

I've been looking at this code path as, for reasons related to the CI failures, it appears fragile. e.g. in TestDataFrameSelectReindex.test_reindex_datetimelike_to_object setting df["b"] = 9 (following df = ser.unstack()) breaks similarly. Another degree of separation xref #42921.

@jorisvandenbossche
Copy link
Member Author

jorisvandenbossche commented Dec 7, 2021

It also seems that our deprecation is not working if you only reindex the rows (and not the rows/columns together):

In [1]: arr = date_range("2016-01-01", periods=6).values.reshape(3, 2)
   ...: df = DataFrame(arr, columns=["A", "B"], index=range(3))
   ...: ts = df.iloc[0, 0]
   ...: fv = ts.date()

In [2]: df.reindex(index=range(4), columns=["A", "B", "C"], fill_value=fv)
<ipython-input-2-fdba941b8324>:1: FutureWarning: Using a `date` object for fill_value with `datetime64[ns]` dtype is deprecated. In a future version, this will be cast to object dtype. Pass `fill_value=Timestamp(date_obj)` instead.
  df.reindex(index=range(4), columns=["A", "B", "C"], fill_value=fv)
Out[2]: 
           A          B          C
0 2016-01-01 2016-01-02 2016-01-01
1 2016-01-03 2016-01-04 2016-01-01
2 2016-01-05 2016-01-06 2016-01-01
3 2016-01-01 2016-01-01 2016-01-01

In [3]: df.reindex(index=range(4), fill_value=fv)
...
TypeError: value should be a 'Timestamp' or 'NaT'. Got 'date' instead.

The second should also work and trigger the deprecation warning (the tests here are running into this because I am now basically taking the reindex path in separate steps for AM).

Which is probably indeed the same as #42921

@jbrockmendel
Copy link
Member

It also seems that our deprecation is not working

IIRC this code path goes through maybe_promote, which doesn't quite match the DTA behavior. There was some work a few months ago that brought them closer into alignment, but off the top of my head I don't remember how far that got.

@jreback
Copy link
Contributor

jreback commented Dec 23, 2021

seems reasonable can you merge master

@jreback jreback removed this from the 1.4 milestone Dec 27, 2021
if (
row_indexer is not None
and col_indexer is not None
and not isinstance(self._mgr, ArrayManager)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think a more general condition would be self._can_fast_transpose. The thing we care about is self.values being cheap.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche can you address comments and rebase

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2022

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@simonjayhawkins
Copy link
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Performance Memory or execution speed performance Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants