
Unexpected behavior with vectors initialized only with np.nan #138

Closed
@a-monsch

Description


I am currently working with this package in combination with pandas on an assignment for study exercises in particle physics and have encountered unexpected behavior. To keep my data in a homogeneous structure, I use np.nan to fill in missing values (particles). As a result, it frequently happens that a vector object/array is initialized entirely with np.nan. This works perfectly fine, in particular the property queries, except for pseudorapidity, rapidity, and other related quantities.

A small example:

import vector
import numpy as np

# initially, this part is stored inside a pandas DataFrame
v = np.array(
    [[4.0, 3.0, 2.0, 1.0], [np.nan, np.nan, np.nan, np.nan]]
).view(
    [("E", float), ("px", float), ("py", float), ("pz", float)]
).view(vector.MomentumNumpy4D)


# This creates a [True, False] mask, as expected
mask1 = np.abs(v.rapidity) < 2.5

# This creates a [True, True] mask, since the second value is not np.nan but 0.0
mask2 = np.abs(v.pseudorapidity) < 2.5  
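
For illustration, here is a minimal NumPy-only sketch of how an all-np.nan input can end up as 0.0: if the pseudorapidity computation clamps non-finite intermediate results (for example with np.nan_to_num, to guard against pt == 0), a genuine np.nan input is silently mapped to 0.0 as well. This is only my guess at the mechanism, not a statement about the actual implementation:

import numpy as np

pt = np.array([np.hypot(3.0, 2.0), np.nan])  # transverse momenta of the two rows above
pz = np.array([1.0, np.nan])                 # longitudinal momenta

# plain formula: eta = arcsinh(pz / pt); nan inputs propagate to nan
eta_plain = np.arcsinh(pz / pt)
print(eta_plain)                  # [0.274...   nan]

# if non-finite values are clamped, the nan row silently becomes 0.0
eta_clamped = np.nan_to_num(eta_plain, nan=0.0)
print(eta_clamped)                # [0.274... 0.   ]
print(np.abs(eta_clamped) < 2.5)  # [ True  True]  <- the surprising mask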

My expectation would be that, in this case, any quantity of an object initialized with (only) np.nan should also return np.nan and not an actual number.

Is this behavior, especially for the case described above, intended?
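
In the meantime, one possible workaround is to carry the np.nan information in a separate validity mask built from a field that does propagate np.nan (here the energy) and combine it with the kinematic cut. This is just a sketch and assumes that v.E returns the plain energy array:

# build a validity mask from the energy column, which still holds its nan values
valid = ~np.isnan(v.E)

# combine it with the kinematic cut so that all-nan rows can never pass
mask2 = valid & (np.abs(v.pseudorapidity) < 2.5)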

P.S.: I know that vector.awk offers a way to work with inhomogeneous data structures, but in our workgroup we decided against that and in favor of an approach based on pandas.
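
For completeness, the awkward-array route we decided against would look roughly like this, as far as I understand the documentation (untested; the field names and the .eta access are my assumptions):

import vector

# jagged structure: one particle in the first event, none in the second
events = vector.awk([
    [{"E": 4.0, "px": 3.0, "py": 2.0, "pz": 1.0}],
    [],
])

# properties are computed per particle; empty events simply stay empty
print(events.eta)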
