
lora : raise error if lm_head is ignored #9103

Merged
merged 3 commits into master from xsn/lora_convert_ignore_lm_head
Sep 12, 2024

Conversation

ngxson (Collaborator) commented Aug 20, 2024

Resolve #9065

We will now raise an error if lm_head is ignored in the base model.
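Roughly, the guard behaves like the sketch below (a minimal, standalone illustration with made-up helper names, not the exact code added to the conversion script): if the base conversion maps an adapter tensor to nothing, the weight would be silently dropped, which for lm_head happens on archs that reuse tok_embd as the output layer, so conversion should fail loudly instead.

# Illustrative sketch only; names are hypothetical, not the converter's API.
def map_adapter_tensor(name: str, base_maps_output_separately: bool) -> list[str]:
    if name == "lm_head.weight" and not base_maps_output_separately:
        return []  # base model ignores this tensor (tok_embd and output are tied)
    return [name]

def convert_adapter_tensor(name: str, base_maps_output_separately: bool) -> list[str]:
    dest = map_adapter_tensor(name, base_maps_output_separately)
    if len(dest) == 0:
        # Instead of silently dropping the adapter weight, stop the conversion.
        raise ValueError(f"{name} is present in the adapter but ignored by the base model")
    return dest

For example, with a tied-embedding base (as in Gemma2), convert_adapter_tensor("lm_head.weight", base_maps_output_separately=False) raises, which is the behavior confirmed in the review below.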

TODO: @ltoniazzi can you test this? Thanks.


ngxson requested a review from compilade on August 20, 2024 13:07
github-actions bot added the python (python script changes) label on Aug 20, 2024
ltoniazzi (Contributor) left a comment

Looks good!

Tested it on Gemma2 both with lm_head in the adapter (raises the error) and without it (conversion succeeds).

@@ -363,7 +363,11 @@ def get_tensors(self) -> Iterator[tuple[str, Tensor]]:
         yield (name, cast(torch.Tensor, LoraTorchTensor(tensor.A, tensor.B)))

     def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
-        dest = super().modify_tensors(data_torch, name, bid)
+        dest = list(super().modify_tensors(data_torch, name, bid))
ltoniazzi (Contributor) commented:
Is it needed to recast super().modify_tensors(data_torch, name, bid) to a list?


compilade (Collaborator) commented Aug 20, 2024
Yes, this is needed because it can also be a Generator (when modify_tensors uses yield).
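A small standalone example of the distinction (toy stand-ins, not the converter code): a generator has no len() and can only be consumed once, so the caller materializes it with list() before inspecting or extending the result.

from typing import Iterable, Iterator

def returns_list(name: str) -> Iterable[tuple[str, str]]:
    return [(name, "tensor")]

def returns_generator(name: str) -> Iterator[tuple[str, str]]:
    yield (name, "tensor")

for impl in (returns_list, returns_generator):
    # list() normalizes both return styles, so the caller can safely
    # call len(), index, or append regardless of how the method is written.
    dest = list(impl("blk.0.attn_q.weight"))
    assert len(dest) == 1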

@@ -363,7 +363,11 @@ def get_tensors(self) -> Iterator[tuple[str, Tensor]]:
         yield (name, cast(torch.Tensor, LoraTorchTensor(tensor.A, tensor.B)))

     def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
-        dest = super().modify_tensors(data_torch, name, bid)
+        dest = list(super().modify_tensors(data_torch, name, bid))
+        # for now, we cannot convert archs that use the same tensor for tok_embd and output
ltoniazzi (Contributor) commented:

Suggested change:
-        # for now, we cannot convert archs that use the same tensor for tok_embd and output
+        # for now, we cannot convert adapters with lm_head for archs that use the same tensor for tok_embd and output

ngxson (Collaborator Author) commented:

I'll rephrase this to avoid mixing naming schemes.

ggerganov merged commit d4c3c10 into master on Sep 12, 2024
12 checks passed
ggerganov deleted the xsn/lora_convert_ignore_lm_head branch on September 12, 2024 11:33
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024

* lora : raise error if lm_head is ignored
* fix style
* clarify comment

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request Feb 25, 2025
Labels: python (python script changes)

Successfully merging this pull request may close these issues.

Bug: Gemma2 adapter weights lm_head skipped on gguf conversion
4 participants