Fix llama rotary pos emb issue for transformers 4.38 #813

Merged
merged 4 commits into from
Mar 18, 2024

libinta
Collaborator

@libinta libinta commented Mar 18, 2024

Transformers 4.38 changed the rotary embedding implementation. We had already changed the rotary embedding class to match Transformers 4.37, but not apply_rotary_pos_emb. apply_rotary_pos_emb should follow the 4.37 implementation too.
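
For context, here is a minimal, framework-free sketch of what a 4.37-style apply_rotary_pos_emb does. It is not the code in this PR. It assumes (from our reading of upstream, not from this diff) that the 4.37-style function receives a full cos/sin cache and indexes it with position_ids itself, whereas the 4.38 version expects cos/sin that are already position-specific. Mixing a 4.37-style embedding class with a 4.38-style apply function would then skip that indexing step. `build_cos_sin_cache` and `_rotate` are hypothetical helpers for illustration, and plain Python lists stand in for tensors of a single attention head.

```python
import math


def rotate_half(x):
    """Swap the two halves of a vector and negate the new first half."""
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def build_cos_sin_cache(max_positions, dim, base=10000.0):
    """Hypothetical helper: one cos row and one sin row per position."""
    inv_freq = [base ** (-2.0 * i / dim) for i in range(dim // 2)]
    cos_cache, sin_cache = [], []
    for pos in range(max_positions):
        # Each frequency is repeated so both halves of the vector share it.
        angles = [pos * f for f in inv_freq] * 2
        cos_cache.append([math.cos(a) for a in angles])
        sin_cache.append([math.sin(a) for a in angles])
    return cos_cache, sin_cache


def _rotate(vec, cos, sin):
    """Hypothetical helper: vec * cos + rotate_half(vec) * sin, elementwise."""
    return [v * c + r * s for v, r, c, s in zip(vec, rotate_half(vec), cos, sin)]


def apply_rotary_pos_emb(q, k, cos_cache, sin_cache, position_ids):
    """4.37-style sketch: the cache is indexed by position_ids in here.

    q and k are lists of per-token vectors (seq_len x head_dim).
    """
    q_embed, k_embed = [], []
    for q_vec, k_vec, pos in zip(q, k, position_ids):
        cos, sin = cos_cache[pos], sin_cache[pos]  # the indexing step
        q_embed.append(_rotate(q_vec, cos, sin))
        k_embed.append(_rotate(k_vec, cos, sin))
    return q_embed, k_embed
```

With this layout, position 0 leaves the vectors unchanged, and the dot product between a rotated query and a rotated key depends only on the distance between their positions.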

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

libinta added 2 commits March 17, 2024 05:46
…JSON serializable"

Transformers 4.38 logs grad_norm in log_history, but FSDP doesn't have a global grad norm function.
When a non-scalar tensor is logged, the .item() call fails. The solution for now is not to log grad_norm in
log_history for FSDP.
@libinta libinta requested a review from mandy-li as a code owner March 18, 2024 06:34
@libinta libinta requested a review from a user March 18, 2024 06:34
@libinta libinta requested a review from regisss as a code owner March 18, 2024 06:34
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@regisss regisss left a comment

LGTM

@regisss regisss merged commit dcb2ffc into main Mar 18, 2024
9 checks passed
@regisss regisss deleted the fix_llama_rotary_pos_emb branch March 18, 2024 09:06

3 participants