Request: setting for raising error when Loc inputs are not present in indexes/column #16630
Description
Hello
I have a dataframe named signalperstation:
            signal_int
Entry name
Station1           998
Station2           837
Station3           870
signalperstation.loc[["Station1","Station3"]] yields
            signal_int
Entry name
Station1           998
Station3           870
but
signalperstation.loc[["Station1","Station4"]] yields
            signal_int
Entry name
Station1           998
Station4           NaN
However, it is easy to miss that Station4 is absent from the dataframe when the selection feeds straight into an expression such as
signalperstation.loc[["Station1","Station4"],:].sum(axis=0), where sum skips NaN silently by default.
The sum is then computed over fewer rows than intended, and nothing in the output tells you so.
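To make the problem concrete, here is a minimal sketch of the silent sum. It uses reindex instead of loc as a stand-in, on the assumption that reindex inserts a NaN row for an absent label regardless of how loc behaves in the installed pandas version. The dataframe is rebuilt from the values shown above.

```python
import pandas as pd

# Rebuild the example dataframe from the issue.
signalperstation = pd.DataFrame(
    {"signal_int": [998, 837, 870]},
    index=pd.Index(["Station1", "Station2", "Station3"], name="Entry name"),
)

# Stand-in for the loc call: reindex adds an all-NaN row for "Station4".
selected = signalperstation.reindex(["Station1", "Station4"])

# Default sum skips NaN (skipna=True), so the missing station goes unnoticed.
silent_total = selected.sum(axis=0)

# With skipna=False the NaN propagates, which at least exposes the gap.
strict_total = selected.sum(axis=0, skipna=False)

print(silent_total)
print(strict_total)
```

Passing skipna=False is only a partial workaround: it flags that something is missing, but not which label.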
loc raises an error (KeyError) only when none of the elements of the list is in the index or columns.
Is there a setting that makes loc raise an error whenever any requested label is missing?
Silently accepting a missing label and returning a dataframe with NaN rows by default can be troublesome. It also goes against the Python philosophy of EAFP (easier to ask forgiveness than permission), since it forces an LBYL (look before you leap) check whenever you need to be sure a calculation is correct.
In addition, an error that reports the missing labels would allow more finely tuned error handling afterwards.
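For illustration, this is roughly the LBYL check I currently have to write by hand, wrapped so that the error carries the missing labels. MissingLabelsError and select_rows_strict are hypothetical names of my own, not pandas API.

```python
import pandas as pd


class MissingLabelsError(KeyError):
    """Hypothetical error type (not part of pandas) that records absent labels."""

    def __init__(self, missing):
        super().__init__(f"labels not found: {missing}")
        self.missing = missing


def select_rows_strict(df, labels):
    """Select rows by label, raising if any requested label is absent."""
    missing = [label for label in labels if label not in df.index]
    if missing:
        raise MissingLabelsError(missing)
    return df.loc[labels]


signalperstation = pd.DataFrame(
    {"signal_int": [998, 837, 870]},
    index=pd.Index(["Station1", "Station2", "Station3"], name="Entry name"),
)

wanted = ["Station1", "Station4"]
skipped = []
try:
    total = select_rows_strict(signalperstation, wanted).sum(axis=0)
except MissingLabelsError as err:
    # Post-error processing: record what was absent, then retry without it.
    skipped = err.missing
    present = [label for label in wanted if label not in skipped]
    total = select_rows_strict(signalperstation, present).sum(axis=0)
```

Here the caller decides what a missing station means (skip it, log it, abort), instead of getting a NaN row that sum quietly ignores.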
My suggestion is that this should be possible. Since bracket/slicing notation does not lend itself to passing an extra argument, we could add a parameter on the dataframe or on the _LocIndexer object that is set beforehand, with the current behavior remaining the default.
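As a rough sketch of what I mean, the flag could behave like the wrapper below. CheckedLoc and raise_on_missing are hypothetical names used only for illustration; this is not how _LocIndexer is implemented, and it only checks list selections.

```python
import pandas as pd


class CheckedLoc:
    """Hypothetical wrapper around df.loc; not part of pandas."""

    def __init__(self, df, raise_on_missing=False):
        self._df = df
        # False leaves the behavior of df.loc untouched (the default).
        self.raise_on_missing = raise_on_missing

    def __getitem__(self, key):
        # Simplification: a tuple key is read as (rows, columns),
        # so MultiIndex tuple labels are not supported here.
        rows, cols = key if isinstance(key, tuple) else (key, slice(None))
        if self.raise_on_missing:
            self._check(rows, self._df.index, "index")
            self._check(cols, self._df.columns, "columns")
        return self._df.loc[rows, cols]

    @staticmethod
    def _check(labels, axis, axis_name):
        # Only list selections are checked; slices and scalars pass through.
        if isinstance(labels, list):
            missing = [label for label in labels if label not in axis]
            if missing:
                raise KeyError(f"{missing} not found in {axis_name}")


signalperstation = pd.DataFrame(
    {"signal_int": [998, 837, 870]},
    index=pd.Index(["Station1", "Station2", "Station3"], name="Entry name"),
)

checked = CheckedLoc(signalperstation, raise_on_missing=True)

# Both labels exist, so this behaves like a normal loc selection.
total = checked[["Station1", "Station3"], :].sum(axis=0)

# One label is absent, so the wrapper raises instead of returning a NaN row.
try:
    checked[["Station1", "Station4"], :]
    raised = False
except KeyError:
    raised = True
```

The point of setting the flag beforehand is that the indexing expression itself stays unchanged, so existing code using bracket notation would not need to be rewritten.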
I don't think it would be too difficult, since inputs that are not in the index are already handled specially by creating NaN rows or columns.
I would be interested in trying to implement this myself if someone could point me to the parts of the code to change, provided they are not in C or Cython, which I am not yet comfortable with.