
Conversation

@vvvdwbvvv (Contributor)

Summary

This PR adds support for GLM4.1V (GLM-4 Vision) models to the Liger Kernel (#855).
https://huggingface.co/zai-org/GLM-4.5
These models have been merged upstream in huggingface/transformers#39805.

Testing Done

Running python3 -m pytest test/convergence/bf16/test_mini_models.py -k 'glm4v_moe' -rF initially failed with:

Test result
AssertionError: [Loss]Number of mismatched elements: 14
Mismatch at index (0, 5): tensor1[(0, 5)] = 8.733983993530273, tensor2[(0, 5)] = 8.52511215209961
Mismatch at index (0, 8): tensor1[(0, 8)] = 7.2776618003845215, tensor2[(0, 8)] = 7.524500846862793
Mismatch at index (0, 9): tensor1[(0, 9)] = 6.917590618133545, tensor2[(0, 9)] = 7.175967216491699
Mismatch at index (0, 13): tensor1[(0, 13)] = 5.685216426849365, tensor2[(0, 13)] = 5.427236557006836
Mismatch at index (0, 14): tensor1[(0, 14)] = 5.337466239929199, tensor2[(0, 14)] = 5.049449443817139
... and 9 more mismatched elements.
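The report above comes from an element-wise tolerance check over the two loss tensors. A minimal sketch of such a comparison, assuming a simple atol/rtol rule over nested lists of floats (a simplified stand-in, not the actual torch-based test helper in the repo):

```python
def assert_verbose_allclose(t1, t2, atol=1e-5, rtol=1e-5, max_print=5):
    """Compare two 2-D nested lists of floats element-wise.

    Collects every element whose difference exceeds atol + rtol * |b|,
    then raises a single AssertionError listing the mismatches, similar
    in spirit to the convergence test output shown above.
    """
    mismatches = []
    for i, (row1, row2) in enumerate(zip(t1, t2)):
        for j, (a, b) in enumerate(zip(row1, row2)):
            if abs(a - b) > atol + rtol * abs(b):
                mismatches.append(((i, j), a, b))
    if mismatches:
        lines = [f"[Loss]Number of mismatched elements: {len(mismatches)}"]
        for idx, a, b in mismatches[:max_print]:
            lines.append(f"Mismatch at index {idx}: tensor1[{idx}] = {a}, tensor2[{idx}] = {b}")
        if len(mismatches) > max_print:
            lines.append(f"... and {len(mismatches) - max_print} more mismatched elements.")
        raise AssertionError("\n".join(lines))
```

With the default tolerances, losses that differ by ~0.2 (as in the failure above) are reported as mismatches, while differences within atol + rtol * |b| pass silently.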

vvvdwbvvv and others added 20 commits August 21, 2025 02:40
…_for_glm4v_moe for decoder and vision blocks
…erts in Glm4vMoe and enhance test assertions
…for_glm4v_moe to reference decoder layer attributes
…for_glm4v_moe to reference post_attention_layernorm
@vvvdwbvvv (Contributor, Author)

@Tcc0403 Modified loss_atol in test_mini_model of convergence/bf16/test_mini_models.py to avoid spurious failures from small differences under the MoE structure with bf16.
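Loosening the loss tolerance for bf16 MoE runs can be expressed as a per-model, per-dtype tolerance table. A hypothetical sketch (the names and values below are illustrative, not the actual Liger Kernel test configuration):

```python
import math

# Hypothetical tolerance table: MoE expert routing under bf16 accumulates
# more numerical error than a dense model in fp32, so its atol is larger.
LOSS_ATOL = {
    ("glm4v_moe", "bf16"): 1e-2,
    ("glm4v_moe", "fp32"): 1e-5,
}

def losses_match(expected, actual, model="glm4v_moe", dtype="bf16"):
    """Return True if every loss pair agrees within the configured atol."""
    atol = LOSS_ATOL[(model, dtype)]
    return all(math.isclose(e, a, abs_tol=atol) for e, a in zip(expected, actual))
```

The design point is that the tolerance lives in test configuration rather than in the assertion itself, so loosening it for one model/dtype combination does not weaken the other convergence tests.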

@vvvdwbvvv vvvdwbvvv mentioned this pull request Sep 2, 2025
@shimizust (Collaborator)

@vvvdwbvvv lgtm, can you fix checkstyle first?

@vvvdwbvvv (Contributor, Author)

@shimizust Thank you, fixed in ae6aac5

@shimizust shimizust merged commit 454e3d2 into linkedin:main Sep 5, 2025
3 of 7 checks passed