I've recently started working with libpny. Thanks for making it available.
In my use case, I am only reading some *.npz files. I do not need to write any files.
I would like to use my own tensor type so I can read the data directly into the data structures that will be used in my program.
Unfortunately, I don't see how this is possible. The documentation states that all you need to implement in your tensor class are the methods:
data
shape
size
dtype
fortran_order
I can imagine that this would suffice if you were writing your data to a file. However, without a constructor or some other way to specify the size of the data structure, I can't see how the library could read into a custom tensor type. Likewise, it seems that you must provide a tensor type that supports both row-major and column-major ordering. That is a lot of requirements, and it largely makes it impractical to use your own tensor type for reading.
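To make the concern concrete, here is a rough sketch of what I would expect a read-capable custom tensor to need. The class name `ReadableMatrix`, its constructor, and the `at()` helper are my own assumptions, not part of the documented interface; only the method names `data`, `shape`, `size`, and `fortran_order` come from the documentation, and the return types other than that of `shape()` are guesses. I left out `dtype` because I don't want to guess at its type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: a 2-D tensor that a reader could construct from a
// shape and an ordering flag. The class name and constructor are assumptions
// made for illustration, not something the library is known to call.
class ReadableMatrix {
public:
    ReadableMatrix(std::size_t rows, std::size_t cols, bool fortran_order)
        : shape_{rows, cols},
          fortran_order_(fortran_order),
          values_(rows * cols, 0.0) {}

    double *data() { return values_.data(); }
    const std::vector<std::size_t> &shape() const { return shape_; }
    std::size_t size() const { return values_.size(); }
    bool fortran_order() const { return fortran_order_; }

    // Element access has to branch on the ordering found in the file.
    double &at(std::size_t i, std::size_t j) {
        const std::size_t rows = shape_[0];
        const std::size_t cols = shape_[1];
        return fortran_order_ ? values_[j * rows + i]   // column-major
                              : values_[i * cols + j];  // row-major
    }

private:
    std::vector<std::size_t> shape_;
    bool fortran_order_;
    std::vector<double> values_;
};
```

The point is that the shape and the ordering both have to be known before the buffer can be allocated and indexed, and the five documented methods alone give the reader no way to supply either.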
Perhaps it is impossible to use a custom tensor type for reading -- and I should just stick to the supplied tensor for read, copying the data out as soon as the read is complete.
I initially implemented a test program using the default tensor class. The test program read a large matrix from file and then performed a simple matrix/vector multiplication. Profiling this test application revealed that (i, j) indexing into the tensor was by far the most expensive part of the operation: copying the data to a vector<vector> improved the matrix/vector multiply time from 0.05 seconds to 0.0005 seconds. Clearly, using a custom tensor class is worth doing.
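For context, the copy-out workaround amounts to something like the sketch below. `copy_to_rows` and `mat_vec` are names I made up for this example, and the source buffer is assumed to be a contiguous row-major array of doubles, which may not match what the default tensor actually hands back.

```cpp
#include <cstddef>
#include <vector>

// Copy a contiguous row-major buffer into a vector of rows.
// Assumes src points at rows * cols doubles.
std::vector<std::vector<double>> copy_to_rows(const double *src,
                                              std::size_t rows,
                                              std::size_t cols) {
    std::vector<std::vector<double>> out(rows, std::vector<double>(cols));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            out[i][j] = src[i * cols + j];
        }
    }
    return out;
}

// Compute y = A * x using plain operator[] on each row.
// Assumes every row of a has x.size() entries.
std::vector<double> mat_vec(const std::vector<std::vector<double>> &a,
                            const std::vector<double> &x) {
    std::vector<double> y(a.size(), 0.0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::vector<double> &row = a[i];
        double sum = 0.0;
        for (std::size_t j = 0; j < row.size(); ++j) {
            sum += row[j] * x[j];
        }
        y[i] = sum;
    }
    return y;
}
```

The copy costs one extra pass over the data, but after that the inner loop only touches plain contiguous rows.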
Another challenge is the signature of shape():
const std::vector<size_t> &shape() const
Since this returns a reference to a std::vector<size_t>, I think this effectively forces you to store the shape of your tensor in a std::vector<size_t> in your class.
I am developing a library (that uses libpny), and some of my expected users prefer to avoid the STL. So my preference is to implement a tensor class using only raw C++ data types, i.e. raw C-style arrays.
If libpny is going to force the STL, one solution would be for me to create the required shape vector on the fly. That would work if it were returned by copy, or passed in as an output reference argument. However, since it is returned by reference, I believe I would be forced to keep it around as a member variable of the class (so it remains valid after the call to shape()).
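Here is a minimal sketch of what I mean. `RawShapeTensor`, `kMaxDims`, `dims()`, `ndims()`, and `shape_copy()` are hypothetical names of my own; only the `shape()` signature is taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Arbitrary upper bound on dimensions, chosen only for this sketch.
constexpr std::size_t kMaxDims = 4;

// Hypothetical tensor that keeps its dimensions in a raw C-style array.
class RawShapeTensor {
public:
    RawShapeTensor(const std::size_t *dims, std::size_t ndims)
        : dims_{}, ndims_(ndims) {
        if (ndims_ > kMaxDims) {
            ndims_ = kMaxDims;  // clamp instead of overrunning the array
        }
        for (std::size_t i = 0; i < ndims_; ++i) {
            dims_[i] = dims[i];
        }
        // Duplicate copy of the dimensions, kept only to satisfy shape().
        shape_cache_.assign(dims_, dims_ + ndims_);
    }

    // The interface I would like to expose: no STL types.
    const std::size_t *dims() const { return dims_; }
    std::size_t ndims() const { return ndims_; }

    // A reference return means the vector must outlive the call,
    // so it has to be stored as a member.
    const std::vector<std::size_t> &shape() const { return shape_cache_; }

    // If a by-value return were acceptable, the vector could be built
    // on demand and no member would be needed.
    std::vector<std::size_t> shape_copy() const {
        return std::vector<std::size_t>(dims_, dims_ + ndims_);
    }

private:
    std::size_t dims_[kMaxDims];
    std::size_t ndims_;
    std::vector<std::size_t> shape_cache_;
};
```

With the reference-returning signature, `shape_cache_` is pure overhead from my point of view: it duplicates `dims_` and pulls `std::vector` into a class that otherwise would not need it.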
Any suggestions are appreciated.