Closed
Description
Proposal: introduce a 2D linear layer to be reused as a building block
It is crucial to have a simple 2D matmul layer in order to be able to build more complex algorithms such as transformers without the need to copypaste back propagation for each matrix multiplication.
Suggestion: I port my implementation here.
Back Story
I have recently started developing ML algorithms in Fortran as a hobby project. My goal is transformers. I have already started to code a bit of MultiHead Attention for neural-fortran
. But decided to stop and first think about implementing a linear layer separately, not as a part of MultiHead Attention. Should I a create also a separate issue for Attention?
Discussion: Maybe make it batched right away?
Metadata
Metadata
Assignees
Labels
No labels