@aviralgarg05

Previously, MHAEinsum initialized its weight matrices with shape (d_out, d_in) and used incorrect einsum notation, which caused failures whenever d_in and d_out differed. This commit changes the weight initialization to shape (d_in, d_out), updates the einsum notation to 'bnd,do->bno', and adds three unit tests that verify parity across different d_in and d_out settings. All tests pass.
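
For readers following along, here is a minimal sketch of why the weight layout matters for the 'bnd,do->bno' subscripts. It uses NumPy's `einsum` as a stand-in for the notebook's PyTorch code, and the tensor sizes and variable names are illustrative assumptions, not taken from the notebook. The old einsum string is not shown in this thread, so the failing case below only pairs the old (d_out, d_in) layout with the new subscripts to illustrate the shape mismatch.

```python
import numpy as np

# Illustrative sizes (assumed); d_in != d_out is the case that used to fail.
b, n, d_in, d_out = 2, 3, 4, 6
rng = np.random.default_rng(0)
x = rng.standard_normal((b, n, d_in))

# Corrected layout: the shared index 'd' runs over d_in in both operands.
W = rng.standard_normal((d_in, d_out))
out = np.einsum("bnd,do->bno", x, W)
print(out.shape)  # (2, 3, 6)

# A (d_out, d_in) layout with the same subscripts asks 'd' to be 4 and 6 at once.
W_old = rng.standard_normal((d_out, d_in))
try:
    np.einsum("bnd,do->bno", x, W_old)
    old_layout_failed = False
except ValueError:
    old_layout_failed = True
print(old_layout_failed)  # True
```

When d_in equals d_out, both layouts have the same shape and the call runs without an error, which is likely why the problem only surfaced for non-square settings.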

Fixes issue #857.


@rasbt
Owner

rasbt commented Oct 27, 2025

Thanks a lot for the PR, and sorry about the late response; I was out of town last week. I'll have a look soon.

@rasbt
Owner

rasbt commented Oct 27, 2025

The fix looks great, and thanks for adding those tests. I just moved the tests over to a separate Python script for pytest, similar to what I've done with some other notebooks here. This way, it's easier to run them via the CI runners, and it keeps the code notebook more readable.
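
As a rough idea of what a standalone pytest script for this kind of parity check could look like, here is a sketch. It is not the test file from this PR: the helper names `project_einsum` and `check_parity`, the dimension pairs, and the use of NumPy instead of PyTorch are all assumptions made for illustration.

```python
import numpy as np


def project_einsum(x, W):
    """Hypothetical helper: map (b, n, d_in) inputs through a (d_in, d_out) weight."""
    return np.einsum("bnd,do->bno", x, W)


def check_parity(d_in, d_out):
    """Compare the einsum projection against a plain batched matmul reference."""
    rng = np.random.default_rng(123)
    x = rng.standard_normal((2, 5, d_in))
    W = rng.standard_normal((d_in, d_out))
    expected = x @ W  # reference result with shape (2, 5, d_out)
    return np.allclose(project_einsum(x, W), expected)


def test_parity_across_dims():
    # pytest collects functions named test_*; the pairs below are assumed
    # examples covering square, expanding, and shrinking projections.
    for d_in, d_out in [(8, 8), (8, 16), (16, 8)]:
        assert check_parity(d_in, d_out)
```

Saved under a name such as `test_mha_einsum.py` (a hypothetical filename), the script can be run with `pytest` locally or from a CI job without opening the notebook.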
