Replies: 1 comment
Here is another example I got today. If it writes a test and knows how to run that specific test, I would expect it to verify the test result at least once before yielding control.
This is the result of me telling it to add a test to the code base (Bazel.git, completely open-source). As you can see, it wrote the test but did not run it itself, despite AGENTS.md having examples showing how to run tests. It did not even run the test after I told it the explicit command that failed.

This is on gpt-5.2-codex with high model_reasoning_effort. Something is just off about the eagerness (or lack thereof) to validate the tests.
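For what it's worth, the behavior I'd expect is simple: after writing the test, run it once and only yield control once the result is known. Here is a minimal sketch of that check, assuming a `bazel test` invocation like the ones documented in AGENTS.md (the target name below is a placeholder for illustration, not an actual target from the repo):

```python
import subprocess

def verify_bazel_test(target: str) -> bool:
    """Run a single Bazel test target and report whether it passed."""
    result = subprocess.run(
        ["bazel", "test", target, "--test_output=errors"],
        capture_output=True,
        text=True,
    )
    # Bazel exits with 0 only when the build succeeds and all requested tests pass.
    return result.returncode == 0

if __name__ == "__main__":
    # Hypothetical target; in practice it would be the test the agent just wrote.
    print(verify_bazel_test("//src/test/java/com/example:NewTest"))
```

Running exactly this kind of command once before handing control back would have surfaced the failure immediately.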