Multihead attention #199


Merged 73 commits on Feb 21, 2025
Changes from 1 commit (of 73 commits)
49e8507
linear2d_layer forward implementation
OneAdder Feb 2, 2025
feb7112
linear2d_layer: temporarily remove api
OneAdder Feb 14, 2025
8f320f0
Don't expose the concrete layer type via nf
milancurcic Feb 16, 2025
af4a5d7
Plumbing of linear2d with input2d and linear2d
milancurcic Feb 16, 2025
549d4e6
linear2d_layer: add flatten2d layer
OneAdder Feb 16, 2025
3218be0
linear2d_layer: make linear2d layer work with input2d and flatten2d
OneAdder Feb 16, 2025
39636f4
update cmake
OneAdder Feb 16, 2025
4cc7d1d
linear2d_layer: remove flatten2d layer
OneAdder Feb 16, 2025
d863ce7
linear2d_layer: remove public api
OneAdder Feb 16, 2025
78eb17a
linear2d_layer: update cmakelists
OneAdder Feb 16, 2025
567abc4
Add linear2d example
milancurcic Feb 17, 2025
32ac10d
linear2d_layer: remove redundant constructor args
OneAdder Feb 17, 2025
edd169d
linear2d_layer: make example converge
OneAdder Feb 17, 2025
aa5b83f
linear2d_layer: add loss stopping and more iterations
OneAdder Feb 17, 2025
dd3ce33
start implementing MultiHeadAttention
OneAdder Jan 31, 2025
0ed77e8
scaled dot product attention
OneAdder Jan 31, 2025
d6e6f3e
combine attention heads
OneAdder Jan 31, 2025
eb58006
forward (not working)
OneAdder Jan 31, 2025
452032e
rearrange attention dimensions in more efficient way
OneAdder Feb 5, 2025
e06d39b
initial forward implementation for multi-head attention
OneAdder Feb 5, 2025
519a6c8
tests for multihead_attention%forward
OneAdder Feb 6, 2025
9fdc7ae
multihead_attention: move most logic to subroutines (performance)
OneAdder Feb 6, 2025
bc67331
multihead_attention: update tests
OneAdder Feb 6, 2025
a0a6fc4
multihead_attention: concurrency
OneAdder Feb 6, 2025
f8101af
multihead_attention: proof of concept backward (works, but not mathem…
OneAdder Feb 8, 2025
63cce11
multihead_attention: fix minor scaling issue
OneAdder Feb 9, 2025
dfb8842
multihead_attention: complete backward implementation
OneAdder Feb 9, 2025
adcf5e6
multihead_attention: add comments for forward prop
OneAdder Feb 9, 2025
650e47c
multihead_attention: add tests for backward
OneAdder Feb 9, 2025
3d16161
multihead_attention: adjust expected test values for updated scaling
OneAdder Feb 9, 2025
dcae5d6
multihead_attention: calculate scaling factor only once
OneAdder Feb 9, 2025
9fceae7
multihead_attention: use heap-allocated arrays during back prop
OneAdder Feb 9, 2025
248e124
multihead_attention: use heap-allocated arrays in forward
OneAdder Feb 9, 2025
4693028
multihead_attention: set values from correct shape to tests
OneAdder Feb 9, 2025
32dd628
multihead_attention: fix issues with shapes (softmax prime became eve…
OneAdder Feb 9, 2025
33c33b9
multihead_attention: minor refactoring and optimization
OneAdder Feb 9, 2025
40c3f2b
multihead_attention: fix comments
OneAdder Feb 9, 2025
6a607b0
multihead_attention: tests, add checks for attention weights
OneAdder Feb 9, 2025
5fc5a5b
multihead_attention: remove some of the copypaste comments
OneAdder Feb 9, 2025
65fd88d
multihead_attention: optimize shapes
OneAdder Feb 12, 2025
fbc132d
multihead_attention: params api
OneAdder Feb 14, 2025
5422e4c
multihead_attention: fix incorrect dw bug
OneAdder Feb 14, 2025
39637e7
multihead_attention: tests for updated parameters
OneAdder Feb 14, 2025
60a49db
multihead_attention: remove reshape crutches
OneAdder Feb 16, 2025
7ab7769
multihead_attention: rename common forward and backward calls
OneAdder Feb 16, 2025
20c5eb0
multihead_attention: tidy mha up
OneAdder Feb 16, 2025
6098533
multihead_attention: self attention
OneAdder Feb 16, 2025
66b5023
multihead_attention: add cross attention
OneAdder Feb 16, 2025
ac813aa
multihead_attention: add more comments
OneAdder Feb 16, 2025
6b70f6b
multihead_attention: arrange attention into submodule
OneAdder Feb 16, 2025
b622d55
multihead_attention: update cmakelists
OneAdder Feb 16, 2025
ce03b39
multihead_attention: update attention in accordance with linear2d
OneAdder Feb 17, 2025
41a80cd
multihead_attention: remove redundant constructor args for attention …
OneAdder Feb 17, 2025
a84efd3
multihead_attention: use pure and elemental where necessary
OneAdder Feb 17, 2025
52c94c4
multihead_attention: plumbing
OneAdder Feb 17, 2025
66b539b
multihead_attention: add reference
OneAdder Feb 17, 2025
992da67
multihead_attention: remove rebase artifact
OneAdder Feb 17, 2025
d93be41
multihead_attention: remove redundant args
OneAdder Feb 19, 2025
70272cb
multihead_attention: update tests
OneAdder Feb 19, 2025
cb717f5
multihead_attention: add the most important lines to tests
OneAdder Feb 19, 2025
b7a6d06
multihead_attention: simple MHA example
OneAdder Feb 19, 2025
cb26afb
multihead_attention: update cmake
OneAdder Feb 19, 2025
4c92e9c
multihead_attention: remove debug line from tests
OneAdder Feb 19, 2025
df5f4cf
multihead_attention: set slightly higher margin for fp imprecision (d…
OneAdder Feb 19, 2025
46786d6
Merge upstream/main
milancurcic Feb 21, 2025
6162783
Rename mha_simple example
milancurcic Feb 21, 2025
89abf22
Update src/nf/nf_multihead_attention.f90
milancurcic Feb 21, 2025
29b7d2e
Update src/nf/nf_multihead_attention.f90
milancurcic Feb 21, 2025
e901479
Update src/nf/nf_multihead_attention.f90
milancurcic Feb 21, 2025
1eaee95
Update src/nf/nf_multihead_attention.f90
milancurcic Feb 21, 2025
588ecb1
Tidy up
milancurcic Feb 21, 2025
20ffe05
Add self_attention to the layers table
milancurcic Feb 21, 2025
e4c6548
Merge branch 'multihead_attention' of github.com:OneAdder/neural-fort…
milancurcic Feb 21, 2025
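
The commits above build up a multi-head attention layer whose core operation is scaled dot-product attention, softmax(QKᵀ/√d_k)·V. As a rough illustration of that formula only (a plain-Python sketch with made-up function names, not the library's Fortran implementation):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V are lists of row vectors (seq_len x d).
    # Computes softmax(Q K^T / sqrt(d_k)) V, one output row per query.
    d_k = len(K[0])
    scale = 1.0 / math.sqrt(d_k)
    output = []
    for q in Q:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        weights = softmax(scores)  # one attention distribution per query
        row = [sum(w * v[j] for w, v in zip(weights, V))
               for j in range(len(V[0]))]
        output.append(row)
    return output

# Tiny demo: each output row is a convex combination of the rows of V.
out = scaled_dot_product_attention(
    [[1.0, 0.0], [0.0, 1.0]],   # Q
    [[1.0, 0.0], [0.0, 1.0]],   # K
    [[1.0, 2.0], [3.0, 4.0]],   # V
)
```

Multi-head attention then applies this per head to learned linear projections of the inputs and concatenates the per-head results; the per-head scaling by 1/√d_k is the factor adjusted in the "fix minor scaling issue" and "calculate scaling factor only once" commits above.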
Rename mha_simple example
milancurcic committed Feb 21, 2025
commit 6162783ffdb5380cd94633b46a3c97acbd758caa
4 changes: 2 additions & 2 deletions example/mha_simple.f90
@@ -1,4 +1,4 @@
-program simple
+program mha_simple
 use nf, only: dense, input, network, sgd, self_attention, flatten
 implicit none
 type(network) :: net
@@ -34,4 +34,4 @@ program simple

 end do

-end program simple
+end program mha_simple