Add test results for GLM 4.5 (thinking & non-thinking modes) model to polyglot leaderboard #4413

Oct4Pie · 2025-08-06T06:38:40Z

The 2 commits add the evaluation results for the GLM 4.5 model in both thinking and non-thinking modes.

The tests were run using the diff edit format. In thinking mode there were many tool output formatting errors (due to poor structured output adherence). These should not affect the test results, since the failed requests were retried, but the reported costs are not accurate.

Model settings used for non-thinking mode:

- name: openrouter/z-ai/glm-4.5
  extra_params:
    extra_body:
      reasoning:
        enabled: false
      provider:
        ignore:
          - novita
    max_tokens: 96000

For thinking mode, reasoning.enabled was set to true.
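
For reference, a sketch of what the thinking-mode entry would look like, assuming it is identical to the non-thinking settings above apart from the reasoning.enabled flag (the exact file used for the run is not included here):

```yaml
# Assumed thinking-mode settings: same as the non-thinking entry,
# with only reasoning.enabled flipped to true.
- name: openrouter/z-ai/glm-4.5
  extra_params:
    extra_body:
      reasoning:
        enabled: true   # the only change from non-thinking mode
      provider:
        ignore:
          - novita      # excluded, as in the non-thinking run
    max_tokens: 96000
```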

Note: the novita provider was excluded from the OpenRouter providers for reproducibility, as it did not explicitly state the quantization of the model. All other providers used the fp8 variant.

The benchmark exercise folders are attached for reference:

2025-08-03-11-33-59--glm-4.5-thinking-polyglot.zip
2025-08-03-13-07-25--glm-4.5-polyglot.zip

kneelesh48 · 2025-09-28T17:08:43Z

Please use the official API, as providers on OpenRouter often quantize the model.

nuireprog · 2025-10-05T14:42:50Z

It works through the OpenAI-compatible API endpoint:

model: openai/glm-4.6

# Redirect the request to the z.ai endpoint

openai-api-base: "https://api.z.ai/api/coding/paas/v4"

# REPLACE THIS with your real API key obtained from https://z.ai/

openai-api-key: "YOURAPIKEY"

Kreijstal · 2025-10-05T18:47:48Z

What are the results for glm-4.6?

Oct4Pie added 2 commits August 3, 2025 06:41

add test results for GLM 4.5 (non-thinking) model to polyglot leaderboard

209d239

add test results for GLM 4.5 (thinking) model to polyglot leaderboard

fb39341
