4.1V Model and GLM-4.5V Model Conversion Code Updates #41784
Conversation
zRzRzRzRzRzRzR
commented
Oct 22, 2025
- Fixed weight-conversion issues for some model providers and removed some debug logs
- Simplified some functions
Hi, is there a reference somewhere for the issues this is fixing?

No, but I encountered some bugs when converting the open-source model at https://github.com/thu-coai/Glyph, so I fixed them here together.
Let's delete all the test files
"n_shared_experts": text_config.get("n_shared_experts", 1),
"norm_topk_prob": text_config.get("norm_topk_prob", True),
"num_experts_per_tok": text_config.get("num_experts_per_tok", 8),
"rope_scaling": {"type": "default", "mrope_section": [8, 12, 12]},
In the main branch this is now called rope_parameters, and it also includes the theta inside the dict. So maybe:
"rope_scaling": {"rope_type": "default", "rope_theta": 10000.0, "mrope_section": [8, 12, 12]},
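The difference between the two formats can be sketched as follows. This is an illustrative snippet, not the actual conversion-script code; the helper name `build_rope_entry` and the surrounding dicts are assumptions based on the review comment.

```python
# Sketch of the config-format change discussed above: the theta value moves
# from a sibling top-level key into the rope dict itself, and the inner
# "type" key becomes "rope_type". Helper name is illustrative.

def build_rope_entry(rope_theta=10000.0, mrope_section=(8, 12, 12)):
    """Build the rope entry in the newer, self-contained format."""
    return {
        "rope_type": "default",
        "rope_theta": rope_theta,
        "mrope_section": list(mrope_section),
    }

# Older layout: theta lives next to "rope_scaling", not inside it.
old_style = {
    "rope_theta": 10000.0,
    "rope_scaling": {"type": "default", "mrope_section": [8, 12, 12]},
}

# Newer layout: everything the rope needs sits under one key.
new_style = {"rope_scaling": build_rope_entry()}
```

Folding the theta into the dict means downstream code can read one entry instead of reassembling the rope configuration from two top-level keys.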
I've pushed the updates; is this what you had in mind?
zucchini-nlp
left a comment
Thanks, I didn't review in detail and will trust that it converts correctly from the original format. Just left a few nits about config attribute naming.
@@ -1,4 +1,3 @@
# coding=utf-8
accidental deletion?
"rope_theta": model_config.get("rotary_base", 10000.0),
"image_token_id": model_config.get("image_token_id", 151363),
"video_token_id": model_config.get("video_token_id", 151364),
"tie_word_embeddings": False,
This one is also part of the text config.
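The nesting the reviewer is asking for can be sketched like this. The helper `nest_text_keys` and the key set are illustrative assumptions, not the conversion script's actual code.

```python
# Sketch: move text-model keys such as tie_word_embeddings out of the flat
# top-level config and into a nested text_config, as the review suggests.
# Function name and the key set are illustrative.

TEXT_KEYS = {"tie_word_embeddings", "rope_theta", "vocab_size"}

def nest_text_keys(flat):
    """Return a config dict with text-model keys nested under text_config."""
    text_config = {k: v for k, v in flat.items() if k in TEXT_KEYS}
    top_level = {k: v for k, v in flat.items() if k not in TEXT_KEYS}
    top_level["text_config"] = text_config
    return top_level

flat = {
    "image_token_id": 151363,
    "video_token_id": 151364,
    "tie_word_embeddings": False,
    "vocab_size": 151552,
}
nested = nest_text_keys(flat)
```

Keeping text-model attributes inside `text_config` matters for multimodal configs, where the top level also carries vision and token-id settings that do not belong to the language model.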
"use_cache": text_config.get("use_cache", True),
"vocab_size": text_config.get("vocab_size", 151552),
"partial_rotary_factor": 0.5,
"rope_scaling": {"rope_type": "default", "rope_theta": 10000.0, "mrope_section": [8, 12, 12]},
Let's call this key rope_parameters to align with recent changes.
"video_token_id": model_config.get("video_token_id", 151344),
"image_token_id": model_config.get("image_token_id", 151363),
"video_token_id": model_config.get("video_token_id", 151364),
"tie_word_embeddings": False,
Same here for tie_word_embeddings and rope_parameters.
[For maintainers] Suggested jobs to run (before merge): run-slow: glm4v, glm4v_moe
Thanks, let's merge!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Commits:
- update for new model convert
- Update convert_glm4v_moe_mgt_weights_to_hf.py
- restore
- Update convert_glm4v_mgt_weights_to_hf.py
- update
- 1
- Update convert_glm4v_moe_mgt_weights_to_hf.py
- Update convert_glm4v_mgt_weights_to_hf.py
- finish
- update
- 2
- 2
- 1
- Update convert_glm4v_moe_mgt_weights_to_hf.py
- update
- update with tie_word_embeddings place