

@Mr-Grin
Contributor

Mr-Grin commented Mar 24, 2025

#23

@lucidrains
Owner

cool! tests are not passing though 🤔

was this LLM assisted? some of the code looks strange

@Mr-Grin
Contributor Author

Mr-Grin commented Mar 25, 2025

Hi, I think this is because I didn't update the test for the alternate MLP, so it is missing the newly added compress block sliding stride argument. The NSA test passed, though.

Part of the code was written by an LLM. I think I shouldn't have used it, because the LLM missed some parts of the code that needed changing, so I spent some time debugging it.

I will add the argument for the alternate MLP later today.

@lucidrains
Owner

@Mr-Grin it is close! as long as tests pass and train.py script works, we can merge!

@Mr-Grin
Contributor Author

Mr-Grin commented Mar 25, 2025

@lucidrains it passed the GitHub automated check. train.py takes 5 hours to run on my computer; should I run it all the way through?

@lucidrains
Owner

@Mr-Grin no just the first 1k steps will do! this is great, thank you so much!

@lucidrains lucidrains merged commit 832050a into lucidrains:main Mar 25, 2025
1 check passed
