Using custom tensor type #36

@ramcdona

I've recently started working with libnpy. Thanks for making it available.

In my use case, I am only reading some *.npz files. I do not need to write any files.

I would like to use my own tensor type so I can read the data directly into the data structures that will be used in my program.

Unfortunately, I don't see how this is possible. The documentation states that all you need to implement in your tensor class are the methods:

data
shape
size
dtype
fortran_order

I can imagine that this would suffice if you were writing your data to a file. However, without a constructor or some other way to specify the size of the data structure, I can't see how the library could read into a custom tensor type. Likewise, it seems that you must provide a tensor type that supports both row-major and column-major ordering. That is a lot of requirements, and it largely makes it impractical to use your own tensor type for reading.

Perhaps it is impossible to use a custom tensor type for reading -- and I should just stick to the supplied tensor for read, copying the data out as soon as the read is complete.
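
A minimal sketch of that copy-out fallback, assuming the data arrives as a flat buffer of doubles plus a flag saying whether it is column-major. The helper `copy_to_row_major` and its parameters are hypothetical and not part of libnpy; the source pointer would come from whatever the supplied tensor exposes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a libnpy function): copies a rows x cols matrix
// from a flat source buffer into a row-major destination buffer.
// src_fortran_order == true means the source is stored column-major.
void copy_to_row_major(const double* src, std::size_t rows, std::size_t cols,
                       bool src_fortran_order, double* dst)
{
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Flat offset of element (i, j) in the source layout.
            const std::size_t s = src_fortran_order ? j * rows + i
                                                    : i * cols + j;
            dst[i * cols + j] = src[s];
        }
    }
}
```

With this approach the destination only ever needs one ordering, since the conversion happens once during the copy.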

I initially implemented a test program using the default tensor class. The test program read a large matrix from file and then performed a simple matrix/vector multiplication. Profiling this test application revealed that (i,j) indexing into the tensor was by far the most expensive part of the operation -- copying the data to a vector<vector> improved the matrix/vector multiply time from 0.05 seconds to 0.0005 seconds. Clearly, using a custom tensor class is worth doing.
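
For illustration, a minimal sketch of the kind of multiply loop involved, written against a flat row-major buffer of doubles. The name `matvec_row_major` is hypothetical and not part of libnpy, and this is not the exact code that was profiled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: computes y = A * x, where A is a rows x cols matrix
// stored row-major in a flat buffer, x has cols entries, y has rows entries.
void matvec_row_major(const double* a, std::size_t rows, std::size_t cols,
                      const double* x, double* y)
{
    assert(a != nullptr && x != nullptr && y != nullptr);
    for (std::size_t i = 0; i < rows; ++i) {
        const double* row = a + i * cols;  // start of row i
        double sum = 0.0;
        for (std::size_t j = 0; j < cols; ++j) {
            sum += row[j] * x[j];
        }
        y[i] = sum;
    }
}
```

The inner loop does plain pointer arithmetic, with no per-element call into a generic indexing routine.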

Another challenge is the signature of shape():

const std::vector<size_t> &shape() const

Since this returns a reference to a std::vector<size_t>, I think this effectively forces you to store the shape of your tensor in a std::vector<size_t> in your class.

I am developing a library (that uses libnpy) and some of my expected users prefer to avoid using the STL. So, my preference is to implement a tensor class using only raw C++ data types -- i.e. raw C-style arrays.

If libnpy is going to force the STL, one solution would be for me to create the required shape vector on the fly -- if it were returned by copy, or passed as an output reference argument. However, since it is returned as a reference, I believe I would be forced to keep it around as a member variable of the class (so it remains valid after the call to shape()).
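
To make the constraint concrete, here is a minimal sketch of a class that keeps its data in a raw array but still has to carry a std::vector member purely to satisfy the reference-returning shape(). The class `RawMatrix` and all of its members are hypothetical. Only the shape() signature is taken from above; the signatures of data(), size() and fortran_order() are assumptions, and dtype() is omitted because I don't know its expected return type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tensor-like class (not a libnpy type): raw array storage,
// plus a cached std::vector that exists only so shape() can return a
// reference that stays valid after the call.
class RawMatrix {
public:
    RawMatrix(std::size_t rows, std::size_t cols)
        : m_rows(rows), m_cols(cols),
          m_data(new double[rows * cols]()),  // zero-initialized
          m_shape_cache{rows, cols}
    {}
    ~RawMatrix() { delete[] m_data; }
    RawMatrix(const RawMatrix&) = delete;
    RawMatrix& operator=(const RawMatrix&) = delete;

    // Assumed signatures for the documented method names.
    double* data() { return m_data; }
    const double* data() const { return m_data; }
    std::size_t size() const { return m_rows * m_cols; }
    bool fortran_order() const { return false; }  // always row-major here

    // Signature as quoted above: forces a std::vector member.
    const std::vector<std::size_t>& shape() const { return m_shape_cache; }

    double& operator()(std::size_t i, std::size_t j)
    {
        assert(i < m_rows && j < m_cols);
        return m_data[i * m_cols + j];
    }

private:
    std::size_t m_rows;
    std::size_t m_cols;
    double* m_data;
    std::vector<std::size_t> m_shape_cache;  // the unwanted STL member
};
```

If shape() returned by value, or filled a caller-supplied buffer, the cached vector member could be dropped entirely.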

Any suggestions are appreciated.
