Issue with explanation of shared memory in 01_numpy_performance #2

Closed
@mdboom

Thanks for writing the cookbook and sharing it online. This is certain to become a great resource for our users.

In cookbook 01_numpy_performance, it is recommended to use x.__array_interface__['data'][0] to determine whether an array shares data with another array. This only works if both arrays start at the same offset, not if one array is a subarray/slice/view of another.

For example, here, two arrays are sharing the same data but they have different starting pointers.

In [1]: import numpy as np

In [2]: x = np.arange(10)

In [3]: y = x[1::2]

In [4]: x.__array_interface__['data'][0]
Out[4]: 46090608

In [5]: y.__array_interface__['data'][0]
Out[5]: 46090616

You could probably figure out that their data areas are overlapping, but that’s kind of expensive/complex.
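
For reference, NumPy provides its own overlap checks, so the pointer arithmetic does not have to be done by hand. The sketch below assumes a NumPy version that has both np.may_share_memory (a cheap check on memory bounds that can give false positives) and np.shares_memory (an exact check that can be slow in hard cases). The array z is only added here for contrast.

```python
import numpy as np

x = np.arange(10)
y = x[1::2]    # view into x, starts one element later
z = x.copy()   # independent buffer, added here for contrast

# Bounds-only check: cheap, but may report True for arrays
# whose memory ranges interleave without actually overlapping.
print(np.may_share_memory(x, y))

# Exact check: True only if at least one byte is common to both.
print(np.shares_memory(x, y))
print(np.shares_memory(x, z))
```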

The best way I’ve found to find out if two arrays share the same data:

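def get_data_base(arr):
    # Walk up the chain of views to the array that owns the data.
    # Start from arr itself so an array with no base is returned as-is.
    base = arr
    while isinstance(base.base, np.ndarray):
        base = base.base
    return base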

>>> get_data_base(x) is get_data_base(y)
True
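
One caveat worth spelling out: comparing base arrays answers "do these arrays live in the same buffer?", which is not quite the same as "do they overlap?". A minimal sketch of the difference, using a hypothetical helper named root_base (not part of NumPy) and two interleaved views that share a buffer without sharing any element:

```python
import numpy as np

def root_base(arr):
    """Hypothetical helper: follow .base up to the array that owns the buffer."""
    base = arr
    while isinstance(base.base, np.ndarray):
        base = base.base
    return base

x = np.arange(10)
evens = x[::2]    # elements 0, 2, 4, ...
odds = x[1::2]    # elements 1, 3, 5, ...

# Both views are backed by x, so the base comparison says "same data".
same_buffer = root_base(evens) is root_base(odds)

# The exact overlap check says no element is common to both views.
overlap = np.shares_memory(evens, odds)

print(same_buffer, overlap)
```

Which answer is the right one depends on the question being asked: the base comparison is the one that matters for "did this operation make a copy?", while the overlap check matters for "can writing to one array change the other?".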
