Slow performance in arrayUtil functions with variable length data #188

Open

Description

The functions that convert numpy arrays of variable-length elements to a byte buffer and back can be quite slow.
Running the test program https://github.com/HDFGroup/hsds/blob/master/tests/perf/arrayperf/bytes_to_array.py with a million-element array gave this output:

```
$ python bytes_to_array.py
getByteArraySize - elapsed: 0.3334 for 1000000 elements, returned 7888327
arrayToBytes - elapsed: 3.1166 for 1000000 elements
bytesToArray - elapsed: 1.1793
```

Not surprising, since these functions iterate over each element in a Python loop.
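
For illustration, here is a minimal Python sketch of that per-element pattern, assuming a 4-byte length prefix before each element's bytes; the function names and exact buffer layout are assumptions for the example, not HSDS's actual arrayUtil code:

```python
import struct
import numpy as np

def array_to_bytes(arr):
    # Encode an object-dtype array: one struct.pack and two list appends
    # per element, all in interpreted Python.
    chunks = []
    for item in arr:
        data = bytes(item)
        chunks.append(struct.pack("<I", len(data)))  # 4-byte length prefix
        chunks.append(data)
    return b"".join(chunks)

def bytes_to_array(buf, count):
    # Decode: one struct.unpack_from and one slice per element, so the
    # interpreter overhead scales linearly with the element count.
    arr = np.empty(count, dtype=object)
    pos = 0
    for i in range(count):
        (nbytes,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        arr[i] = buf[pos:pos + nbytes]
        pos += nbytes
    return arr
```

With a million elements, the per-element Python calls dominate the runtime, which matches the timings above.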

I looked into using numba, but numba doesn't support numpy arrays of object dtype.
Cython version of arrayUtil?
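
For reference, a minimal reproduction of the numba limitation (an illustrative snippet, not from the issue):

```python
import numpy as np
from numba import njit

@njit
def total_size(arr):
    # numba's nopython mode cannot type object-dtype arrays,
    # so the first call triggers a TypingError during compilation.
    total = 0
    for item in arr:
        total += len(item)
    return total

arr = np.array([b"a", b"bb", b"ccc"], dtype=object)
try:
    total_size(arr)
except Exception as exc:
    print(type(exc).__name__)  # TypingError
```

A Cython rewrite would keep the same per-element loop but run it in compiled code rather than the interpreter, which is the appeal of a Cython arrayUtil.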
