These are not on the current leaderboard, but things have already changed since last quarter:

- [ ] Qwen3 Coder series
- [ ] Qwen3 dense and MoE models
- [ ] Qwen3 distilled models
- [ ] GLM 4.5, both the original and Air models
- [ ] DeepSeek v3.1 base
- [ ] DeepSeek-R1-0528

Other suggestions:

- [ ] Kimi K2 and Kimi Coder are probably worth it even if they fail on some tasks
- [ ] The GPT-OSS model series is good to compare as US representation
- [ ] Maybe LG's Exaone, Meta's Llama 4, and MiniMax if there is enough time for this