lora : raise error if lm_head is ignored #9103
Conversation
Looks good!
Tested it on Gemma2 with lm_head in the adapter (raises) and without (succeeds).
```diff
@@ -363,7 +363,11 @@ def get_tensors(self) -> Iterator[tuple[str, Tensor]]:
                 yield (name, cast(torch.Tensor, LoraTorchTensor(tensor.A, tensor.B)))

         def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
-            dest = super().modify_tensors(data_torch, name, bid)
+            dest = list(super().modify_tensors(data_torch, name, bid))
```
Is it needed to recast `super().modify_tensors(data_torch, name, bid)` to a list?
Without this, pyright will not pass: https://github.com/ggerganov/llama.cpp/actions/runs/10472198558/job/29001100132
Yes, this is needed because it can also be a Generator (when `modify_tensors` uses `yield`).
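A minimal standalone sketch of the distinction (hypothetical function names, not the converter's actual code): a method that uses `yield` returns a generator even when annotated as returning `Iterable`, so the result must be materialized with `list()` before it can be length-checked or indexed.

```python
from typing import Iterable


def returns_list(name: str) -> Iterable[tuple[str, int]]:
    # explicit list: supports len() and indexing directly
    return [(name, 0)]


def returns_generator(name: str) -> Iterable[tuple[str, int]]:
    # same annotation, but `yield` makes this a generator function
    yield (name, 0)


gen = returns_generator("output.weight")
# len(gen)  # would raise: TypeError: object of type 'generator' has no len()

dest = list(returns_generator("output.weight"))  # materialize, as the patch does
assert len(dest) == 1 and dest[0][0] == "output.weight"
```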
convert_lora_to_gguf.py (outdated)
```diff
@@ -363,7 +363,11 @@ def get_tensors(self) -> Iterator[tuple[str, Tensor]]:
                 yield (name, cast(torch.Tensor, LoraTorchTensor(tensor.A, tensor.B)))

         def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
-            dest = super().modify_tensors(data_torch, name, bid)
+            dest = list(super().modify_tensors(data_torch, name, bid))
+            # for now, we cannot convert archs that use the same tensor for tok_embd and output
```
Suggested change:

```diff
-            # for now, we cannot convert archs that use the same tensor for tok_embd and output
+            # for now, we cannot convert adapters with lm_head for archs that use the same tensor for tok_embd and output
```
I'll rephrase this to avoid mixing naming schemes.
* lora : raise error if lm_head is ignored
* fix style
* clarify comment
Resolve #9065

We will now raise an error if `lm_head` is ignored in the base model.

TODO: @ltoniazzi can you test this? Thanks.
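For context, a minimal sketch of the behaviour this PR describes, under my own naming (`check_lm_head_supported`, `UnsupportedLoraTensorError`, and the exact condition are illustrative, not the code in convert_lora_to_gguf.py): when the adapter trains lm_head but the architecture ties tok_embd and output, conversion aborts with an error instead of silently ignoring the tensor.

```python
class UnsupportedLoraTensorError(ValueError):
    """Raised when an adapter trains a tensor the converter cannot map."""


def check_lm_head_supported(adapter_tensor_names: list[str], output_tied_to_tok_embd: bool) -> None:
    # Architectures that reuse the token embedding as the output head have no
    # separate output tensor to apply an lm_head LoRA to; converting such an
    # adapter would silently drop part of it, so fail loudly instead.
    has_lm_head = any(name.startswith("lm_head.") for name in adapter_tensor_names)
    if has_lm_head and output_tied_to_tok_embd:
        raise UnsupportedLoraTensorError(
            "cannot convert adapters with lm_head for archs that use the same "
            "tensor for tok_embd and output"
        )


# An adapter without lm_head converts fine; one with lm_head on a tied-embedding
# architecture (e.g. Gemma2) is rejected instead of being silently truncated.
check_lm_head_supported(["blk.0.attn_q.weight.lora_a"], output_tied_to_tok_embd=True)
try:
    check_lm_head_supported(["lm_head.weight.lora_a"], output_tied_to_tok_embd=True)
except UnsupportedLoraTensorError as err:
    print("conversion aborted:", err)
```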