Description
As a follow-up to my question, I believe the current approach is that pandas assumes that the programmer intends assignment to a child DF/S to be propagated to the parent DF/S as long as the parent has a non-zero reference count.
Problems like the ones I described would be avoided if a child DataFrame always warned on assignment regardless of the reference count of the parent object, UNLESS a copy() method or DataFrame constructor was explicitly applied to the slice. (If desired, query and/or filter can be documented as copy methods as well.)
(Incrementing the ref count when the child is a view is not enough, because code that works normally might one day stop working simply due to a change in the input data. It would emit SettingWithCopyWarning, but that might be ignored if it appears at the client site.)
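To illustrate the point about input data, here is a minimal sketch. The helper name parent_changes_after_child_write and the two sample frames are made up for this example. It takes the same column slice from a single-dtype parent and from a mixed-dtype parent, writes to the child, and reports whether the parent changed. The results depend on the pandas version (with Copy-on-Write enabled, neither parent should change), so no expected output is shown.

```python
import warnings

import pandas as pd


def parent_changes_after_child_write(parent: pd.DataFrame) -> bool:
    """Write to a column slice of `parent` and report whether `parent` changed."""
    before = parent.copy()
    child = parent.loc[:, "a":"b"]  # may be a view or a copy
    with warnings.catch_warnings():
        # Silence any SettingWithCopyWarning so the demo output stays readable.
        warnings.simplefilter("ignore")
        child.iloc[0, 0] = -1
    return not parent.equals(before)


# Hypothetical inputs: identical code, only the dtype of column "b" differs.
same_dtype = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
mixed_dtype = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

print(parent_changes_after_child_write(same_dtype))
print(parent_changes_after_child_write(mixed_dtype))
```

If the two calls print different values on your pandas version, that is exactly the situation where a change in the input data silently changes what an assignment does.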
If this approach is followed, the documentation can also be clarified as follows:
When applying indexing, filter (???) and some other operations (???) to a parent DF/S, pandas may create a view or a copy of the parent depending on conditions that are too complicated to describe. This creates the possibility of a subtle bug in code that assigns to the child DF/S. To help catch such potential bugs, pandas assumes that the programmer intends assignment to a child DF/S to be propagated to the parent DF/S unless it is explicitly copied (using .copy(), .query(), DataFrame constructor, etc. ???); when pandas cannot guarantee such behavior, it generates SettingWithCopyWarning. Note that this happens regardless of whether the parent DF/S is reachable in the current context or even whether it still exists as an object.
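For reference, a small sketch of what "explicitly copied" could look like in user code. The frame and variable names are made up, and passing copy=True to the constructor is this example's choice, since the plain constructor is not guaranteed to copy on every pandas version. In both cases the child is independent of the parent, so assignment to it cannot propagate back.

```python
import pandas as pd

parent = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Explicit .copy() on a boolean-mask selection: the child owns its data.
child = parent[parent["a"] > 1].copy()
child["b"] = 0

# Constructor with copy=True applied to a column selection.
rebuilt = pd.DataFrame(parent[["a"]], copy=True)
rebuilt.loc[0, "a"] = 100

# The parent keeps its original values after both assignments.
print(parent)
```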