I want to use this issue to share a heads-up on a big refactoring that I plan for the `linalg` module.
Over the last couple of months I have repeatedly run into limitations and shortcomings imposed by the current design of `BaseVector` and `BaseMatrix`. To mention a few:
- It is not possible to define an instance of `BaseMatrix` that holds string or integer values. `BaseMatrix` is not designed to hold values that belong to multiple types.
- Some algorithms, e.g. RandomForest, do not use most of the methods defined in `BaseMatrix` and `BaseVector`. Some preprocessing methods that we plan for the future, like LabelEncoder, will not need the linear algebra routines defined for both classes.
- Some basic operations, like get row or get column, perform an unnecessary copy. This problem stems from the fact that neither struct provides views or iterators that let the developer access the internal structure of the data.
- All operations are defined as functions. While this is not a big deal, it leads to clumsy-looking code. Instead it would be nice to use more of the traits defined in `std::ops`.
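
To make the last two points concrete, here is a minimal sketch, not a proposal for the final API. The `Matrix` struct and its `from_vec`, `row` and `col_iter` methods are hypothetical names invented for this example, and row-major storage is an assumption. It shows a row borrowed as a slice, a column walked with an iterator, and element-wise addition expressed through `std::ops::Add`.

```rust
use std::ops::Add;

/// Hypothetical row-major dense matrix, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix<T> {
    nrows: usize,
    ncols: usize,
    data: Vec<T>,
}

impl<T> Matrix<T> {
    pub fn from_vec(nrows: usize, ncols: usize, data: Vec<T>) -> Self {
        assert_eq!(data.len(), nrows * ncols, "shape does not match data length");
        Matrix { nrows, ncols, data }
    }

    /// Borrow a row as a slice: no allocation, no copy.
    pub fn row(&self, i: usize) -> &[T] {
        &self.data[i * self.ncols..(i + 1) * self.ncols]
    }

    /// Iterate over a column by striding through the backing storage.
    pub fn col_iter(&self, j: usize) -> impl Iterator<Item = &T> + '_ {
        assert!(j < self.ncols, "column index out of bounds");
        self.data.iter().skip(j).step_by(self.ncols)
    }
}

/// Element-wise addition through `std::ops::Add`, so callers can write `&a + &b`.
impl<'a, T> Add for &'a Matrix<T>
where
    T: Add<Output = T> + Copy,
{
    type Output = Matrix<T>;

    fn add(self, rhs: Self) -> Matrix<T> {
        assert_eq!(
            (self.nrows, self.ncols),
            (rhs.nrows, rhs.ncols),
            "shape mismatch"
        );
        let data = self
            .data
            .iter()
            .zip(&rhs.data)
            .map(|(&x, &y)| x + y)
            .collect();
        Matrix { nrows: self.nrows, ncols: self.ncols, data }
    }
}
```

With something like this, `a.row(1)` hands back a borrowed slice instead of a freshly allocated vector, and `&a + &b` reads like ordinary arithmetic. Implementing `Add` on references rather than on owned values is one possible choice; it keeps both operands usable after the operation.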
As a result, I'd like to see how we can use Rust's type system to design a better data container that addresses all of these shortcomings.
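
One possible direction, sketched here only to start the discussion: split storage from arithmetic with trait bounds, so that a container of strings is valid but simply does not get linear algebra methods. The names `Container`, `LinAlg`, `Table` and `row_dot` are hypothetical, and using `Default` as a stand-in for zero is an assumption that holds for the built-in numeric types.

```rust
use std::ops::{Add, Mul};

/// Hypothetical base trait: storage and access only, valid for any element type.
pub trait Container<T> {
    fn shape(&self) -> (usize, usize);
    fn get(&self, pos: (usize, usize)) -> &T;
}

/// Hypothetical numeric layer, available only when elements support arithmetic.
pub trait LinAlg<T>: Container<T>
where
    T: Copy + Default + Add<Output = T> + Mul<Output = T>,
{
    /// Dot product of row `i` with `v`, built purely on top of `get`.
    /// Assumes `T::default()` is the additive identity (true for built-in numbers).
    fn row_dot(&self, i: usize, v: &[T]) -> T {
        let (_, ncols) = self.shape();
        assert_eq!(v.len(), ncols, "vector length must equal column count");
        (0..ncols).fold(T::default(), |acc, j| acc + *self.get((i, j)) * v[j])
    }
}

/// Hypothetical row-major table that can hold any element type.
pub struct Table<T> {
    nrows: usize,
    ncols: usize,
    data: Vec<T>,
}

impl<T> Table<T> {
    pub fn new(nrows: usize, ncols: usize, data: Vec<T>) -> Self {
        assert_eq!(data.len(), nrows * ncols, "shape does not match data length");
        Table { nrows, ncols, data }
    }
}

impl<T> Container<T> for Table<T> {
    fn shape(&self) -> (usize, usize) {
        (self.nrows, self.ncols)
    }

    fn get(&self, pos: (usize, usize)) -> &T {
        assert!(pos.0 < self.nrows && pos.1 < self.ncols, "index out of bounds");
        &self.data[pos.0 * self.ncols + pos.1]
    }
}

// Numeric tables pick up `LinAlg`; `Table<String>` does not satisfy the bounds.
impl<T> LinAlg<T> for Table<T> where T: Copy + Default + Add<Output = T> + Mul<Output = T> {}
```

Under these assumptions, a `Table<String>` compiles and supports `get`, which is all that something like LabelEncoder would need, while calling `row_dot` on it is rejected at compile time.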
I am open to any suggestions you have. Feel free to post your ideas here.