[docs] Adding some new content, better credits, missing authors #17
Nice!! Looks really thorough and detailed to me!
HOWTO.md
Outdated
lower_tril_attention = BlockSparseAttention(layout=layout, block_size=BLOCK_SIZE, dropout=0.1)
causal_mask = torch.tril(torch.ones(SEQ, SEQ)).bool().cuda()
...
att = lower_tril_attention(k, q, v, att_mask=causal_mask)
Worth showing that a key padding mask can also be passed in, separately? Maybe not necessary for just this quick example.
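For reference, a minimal sketch of what combining the two masks could look like in plain PyTorch. It does not use `BlockSparseAttention` and makes no claim about how that class accepts a key padding mask; `lengths`, `key_padding_mask`, `combined_mask` and the tensor sizes are illustrative assumptions, and everything stays on the CPU.

```python
import torch

BATCH, SEQ, EMB = 2, 8, 4  # illustrative sizes, not taken from the PR

# Causal mask as in the snippet under review (kept on CPU here); True = may attend.
causal_mask = torch.tril(torch.ones(SEQ, SEQ)).bool()

# Hypothetical key padding mask built from per-sequence lengths; True = real token.
lengths = torch.tensor([8, 5])
key_padding_mask = torch.arange(SEQ)[None, :] < lengths[:, None]  # (BATCH, SEQ)

# Broadcast (1, SEQ, SEQ) & (BATCH, 1, SEQ) -> (BATCH, SEQ, SEQ):
# a query may attend to a key only if it is not in the future AND not padding.
combined_mask = causal_mask[None, :, :] & key_padding_mask[:, None, :]

torch.manual_seed(0)
q = torch.randn(BATCH, SEQ, EMB)
k = torch.randn(BATCH, SEQ, EMB)
v = torch.randn(BATCH, SEQ, EMB)

# Plain dense scaled dot-product attention with the combined mask applied.
scores = q @ k.transpose(-2, -1) / EMB**0.5
scores = scores.masked_fill(~combined_mask, float("-inf"))
weights = torch.softmax(scores, dim=-1)
att = weights @ v  # (BATCH, SEQ, EMB)
```

Keeping the two masks separate until the last step means the causal mask can be built once and reused, while the padding mask changes per batch.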
I think this could do with a second and third pass :) Would you mind adding that in another PR, for instance by re-reading some of this from an NLP perspective? I must have had quite a few blind spots.
Yes, I can do another pass :)
[refactor] 3 lanes: `factory` for CI or programmatic model building, `models` for presets and `components` block zoo. Dummy "Linformer" model example
Apply the existing linters (1/n)
What does this PR do?
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.