Conversation

gpolovets1
Collaborator

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

  • why is this change being made,
  • the problem being solved and any relevant context,
  • why this is a good solution,
  • some information about the specific implementation,
  • shortcomings of the solution and possible future improvements.

If the change fixes a bug or a GitHub issue, please include a link, e.g.:
FIXES: b/123456
FIXES: #123456

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

Signed-off-by: George Polovets <gpolovets@gmail.com>
…y are downloaded remotely. Also adding llama4_attention.py, which I previously forgot to include.

Signed-off-by: George Polovets <gpolovets@gmail.com>
…g. Also fixed RMSNorm to divide by the std instead of multiplying by it.
…ply router scores before inputting to experts. Still getting gibberish output, though.

Signed-off-by: George Polovets <gpolovets@gmail.com>
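
For reference, the RMSNorm fix mentioned in the commit above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `rms_norm`, the `eps` default, and the plain-list inputs are assumptions made for illustration. RMSNorm conventionally divides by the root mean square of the activations (which coincides with the std only for zero-mean inputs), so the normalized output has an RMS of roughly 1.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Hypothetical helper: scale x by 1/RMS(x), then by a learned per-element weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    rms = math.sqrt(mean_sq + eps)
    # Divide by the RMS; multiplying here would amplify large activations
    # instead of normalizing them.
    return [w * v / rms for v, w in zip(x, weight)]


x = [3.0, 4.0]
y = rms_norm(x, [1.0, 1.0], eps=0.0)
# RMS of the normalized output, expected to be close to 1.
rms_y = math.sqrt(sum(v * v for v in y) / len(y))
```

With unit weights, checking that `rms_y` is close to 1 is a quick way to catch the multiply-instead-of-divide mistake.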
… to reshape the correct HF axis (it is the inverse of the QKV shapes). The output is no longer gibberish, but I still need to confirm that decode and MMLU quality are maintained.

Signed-off-by: George Polovets <gpolovets@gmail.com>
…Llama3, and adding printing of the model architecture.
…rings from the latest mainline changes and brought back forking into model_loader to support random weights. Accuracy/perf is roughly the same as before the rebase. (The code also only works with a recent pinned vLLM checkpoint.)
…tting in half. Now the logits match the PyTorch implementation.
…ed by both systems are similar but usually not exact.

Signed-off-by: George Polovets <gpolovets@gmail.com>
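
A rough sketch of how the "similar but usually not exact" logits comparison from the commits above could be checked. The helper names, tolerances, and logit values below are all hypothetical and are not taken from this PR; the idea is simply to compare two implementations within a tolerance and confirm that they agree on the top-1 token.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two logit vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def logits_close(a, b, rtol=1e-2, atol=1e-2):
    """True if every element of a is within atol + rtol * |b| of b."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


def argmax(v):
    """Index of the largest logit, i.e. the greedy next-token choice."""
    return max(range(len(v)), key=v.__getitem__)


# Made-up values standing in for reference and ported-model logits.
ref_logits = [2.031, -0.517, 0.249]
test_logits = [2.028, -0.520, 0.251]

close = logits_close(test_logits, ref_logits)
same_top1 = argmax(test_logits) == argmax(ref_logits)
worst = max_abs_diff(test_logits, ref_logits)
```

Tolerance-based checks like this only establish numerical agreement; end-to-end quality (for example decode output and MMLU) still needs to be verified separately.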