Description
I am currently working with this package in combination with pandas on an assignment for study exercises in particle physics and have encountered unexpected behavior. To keep my data structure homogeneous, I use `np.nan` to fill some missing values (particles). Consequently, it often happens that a vector object/array is initialized with `np.nan` components. This works perfectly fine, including the property queries, except for `pseudorapidity`, `rapidity`, and other related quantities.
A small example:

```python
import vector
import numpy as np

# initially, this part is stored inside a pandas DataFrame
v = np.array([[4.0, 3.0, 2.0, 1.0], [np.nan, np.nan, np.nan, np.nan]]).view(
    [("E", float), ("px", float), ("py", float), ("pz", float)]
).view(vector.MomentumNumpy4D)

# this creates a [True, False] mask, as expected
mask1 = np.abs(v.rapidity) < 2.5

# this creates a [True, True] mask, since the second value is not np.nan but 0.0
mask2 = np.abs(v.pseudorapidity) < 2.5
```
My expectation would be that any quantity of an object initialized with (only) `np.nan` components should also return `np.nan` rather than an actual number. Is this behavior, especially in the case described above, intended?
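For completeness, here is the workaround I am using in the meantime: computing the cut mask with an explicit validity check instead of relying on the property to propagate `np.nan`. This is only a sketch in plain NumPy (the variable names are illustrative, and the pseudorapidity formula is computed by hand rather than via the vector API):

```python
import numpy as np

# Illustrative components; the second "particle" is a NaN filler row.
E = np.array([4.0, np.nan])
px = np.array([3.0, np.nan])
py = np.array([2.0, np.nan])
pz = np.array([1.0, np.nan])

# Pseudorapidity computed directly with NumPy: eta = arcsinh(pz / pt).
# NaN inputs propagate to NaN here, as one would expect.
pt = np.hypot(px, py)
eta = np.arcsinh(pz / pt)

# Explicit validity mask; NaN comparisons evaluate to False anyway,
# but the check documents the intent and guards against silent zeros.
valid = ~np.isnan(E)
mask = valid & (np.abs(eta) < 2.5)  # [True, False]
```

This keeps the filler rows excluded even if a property were to silently map `np.nan` inputs to an actual number.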
P.S.: I know that `vector.awk` makes it possible to work with inhomogeneous data structures, but our workgroup decided against that in favor of a pandas-based approach.