Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) #12881
Conversation
@eavanvalkenburg @markwallace-microsoft this is the PR for reasoning support for the ResponsesAgent that I mentioned yesterday in Office Hours. Looking forward to your feedback.
GPT-5 reasoning implementation tested and working flawlessly!
Result: the reasoning feature is ready for OpenAI's latest models and gracefully handles both current O-series and future reasoning-capable models.
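The "gracefully handles both current O-series and future reasoning-capable models" behavior could be sketched as a simple prefix check. This is a minimal sketch, not the PR's actual code: `supports_reasoning` and the prefix list are assumptions based only on the models named in this PR.

```python
def supports_reasoning(ai_model_id: str) -> bool:
    """Hypothetical check for reasoning-capable models (assumption, not the PR's API).

    Prefix matching rather than an exact allow-list is what lets future
    variants (e.g. "o4-mini" or new gpt-5 snapshots) pass without a code change.
    """
    reasoning_prefixes = ("o1", "o3", "o4", "gpt-5")  # assumed family list
    return ai_model_id.lower().startswith(reasoning_prefixes)
```

Callers could then reject the `reasoning` parameter up front for models such as `gpt-4o`, instead of surfacing an API error from the service.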
Pull Request Overview
This PR adds reasoning support to the Python OpenAI ResponsesAgent to achieve parity with the C# implementation. It enables fine-grained control over reasoning effort for O-series models that support the reasoning parameter.
- Adds constructor-level and per-invocation reasoning effort configuration with priority hierarchy
- Implements comprehensive reasoning content handling and metadata extraction
- Provides extensive test coverage and practical usage examples
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `test_openai_responses_agent_reasoning.py` | Comprehensive unit tests for reasoning functionality validation |
| `reasoning_content.py` | New content type for handling reasoning output from O-series models |
| `const.py` | Added reasoning content type constant |
| `chat_message_content.py` | Updated to support reasoning content in messages |
| `__init__.py` | Exposed ReasoningContent in public API |
| `responses_agent_thread_actions.py` | Core reasoning logic implementation and API integration |
| `openai_responses_agent.py` | Agent-level reasoning configuration and validation |
| `responses_agent_reasoning.py` | Demonstration sample showing reasoning capabilities |
@angangwa yes, I'm just working on this!
This commit adds complete reasoning functionality to the OpenAI ResponsesAgent:

Core Features:
- Add ReasoningContent and StreamingReasoningContent classes with proper SK conventions
- Implement reasoning callback mechanism with on_intermediate_message parameter
- Support streaming reasoning events (delta and done) in invoke_stream
- Add reasoning item extraction and yield pattern (False for intermediate, True for final)
- Export reasoning content types in contents package

Implementation Details:
- Fix metadata merging bug in StreamingReasoningContent addition
- Follow SK patterns with StreamingContentMixin + BaseContent inheritance
- Maintain vendor neutrality without OpenAI-specific dependencies
- Add reasoning configuration with priority hierarchy (per-invocation > constructor)
- Support reasoning-capable models (gpt-5, o3, o1-mini) with proper error handling

Testing & Examples:
- Add comprehensive test coverage (31 tests) for all reasoning functionality
- Create clean sample demonstrating reasoning with dual OpenAI/Azure support
- Test content creation, streaming, callbacks, error conditions, and integration flows
- Validate reasoning configuration priority, multi-agent isolation, and edge cases

API Enhancements:
- Extend invoke() and invoke_stream() methods with reasoning parameters
- Add reasoning item processing in ResponsesAgentThreadActions
- Support reasoning effort configuration and summary options
- Implement proper reasoning content extraction from OpenAI responses
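The "metadata merging bug in StreamingReasoningContent addition" mentioned in the commit message can be illustrated with a simplified stand-in class. This is hypothetical: the real SK streaming content types carry more state, and `StreamingReasoningChunk` is not the PR's actual class name.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingReasoningChunk:
    """Hypothetical, simplified stand-in for StreamingReasoningContent."""

    text: str
    metadata: dict = field(default_factory=dict)

    def __add__(self, other: "StreamingReasoningChunk") -> "StreamingReasoningChunk":
        # The fix: merge metadata from BOTH chunks when concatenating.
        # Keeping only self.metadata would drop whatever the final delta
        # carries (e.g. token counts or a completion marker).
        return StreamingReasoningChunk(
            text=self.text + other.text,
            metadata={**self.metadata, **other.metadata},
        )
```

With this merge, accumulating deltas with `+` across a stream preserves metadata attached to any chunk, not just the first one.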
Hi @moonbox3 @eavanvalkenburg, I've refactored the code and implemented the suggested changes, with one exception: I've kept the constructor-level reasoning parameter. I believe this is useful for multi-agent collaboration scenarios where an orchestrator invokes agents automatically. I've also removed the "minimal" effort option since I'm now using OpenAI types directly; "minimal" will become available automatically once we update to the latest OpenAI SDK. I've tested this implementation with O-series models and GPT-5. After this PR is merged, I plan to bump the OpenAI version to the latest and then submit another PR adding GPT-5 verbosity support. Could you please review the changes when you have a chance?
@ltwlf It looks like there are some code quality check failures in CI:
…tion

- Improve ReasoningContent docstring for better user understanding
- Make text property optional (`str | None = None`) for better API consistency
- Restore `@override` decorator on the `_create` method, as required by the base class
- Refactor complex sample into focused, smaller examples:
  - responses_agent_reasoning.py: basic non-streaming reasoning examples
  - responses_agent_reasoning_streaming.py: streaming-specific examples
- Remove unnecessary complexity from samples while maintaining functionality
- Maintain backward compatibility and OpenAI API compliance

Addresses feedback from moonbox3, dmytrostruk, and eavanvalkenburg in PR microsoft#12881
@dmytrostruk not sure what was wrong; the pre-commit checks were successful on my box. I've resolved the review comments. Hope it will be fine now 😅
@ltwlf we want to help get this PR across the finish line; however, we need the CI/CD code quality checks to pass. CI/CD is installing
Mypy isn't run as part of pre-commit; it runs separately in CI/CD. That's why you need to run the mypy task manually on your machine.
@ltwlf I checked out your branch and am able to see the mypy errors:

@ltwlf do you want me to fix these mypy issues?

Thanks for your support on this, @ltwlf, and for seeing it through.
…mini, o3) (microsoft#12881)

## Summary

Adds reasoning capabilities to the Python OpenAI ResponsesAgent, bringing parity with the C# implementation. This enables fine-grained control over reasoning effort for O-series models (o1, o3-mini, o4-mini) and gpt-5 that support the reasoning parameter.

Fixes microsoft#12843

## Changes

### Core Implementation

- **`openai_responses_agent.py`**: Added reasoning configuration support
  - Constructor-level reasoning effort setting
  - Per-invocation reasoning effort override capability
  - Proper parameter validation and model compatibility checks
- **`responses_agent_thread_actions.py`**: Extended thread actions to support reasoning parameters
  - Reasoning effort propagation through thread operations
  - Metadata preservation for reasoning tokens and summaries

### Sample and Tests

- **`responses_agent_reasoning.py`**: Comprehensive demonstration sample
  - Constructor vs. per-invocation reasoning configuration
  - Function calling integration with reasoning
  - Reasoning comparison scenarios (low/medium/high effort)
  - Error handling and troubleshooting guidance
- **`test_openai_responses_agent_reasoning.py`**: Full unit test coverage
  - Parameter validation tests
  - Integration scenarios with function calling
  - Edge cases and error conditions

## Features

✅ **Constructor-Level Reasoning**: Set default reasoning effort when creating agents
✅ **Per-Invocation Override**: Override reasoning effort per request
✅ **Priority Hierarchy**: per-invocation > constructor > model default
✅ **Function Calling Compatible**: Works seamlessly with existing plugin system
✅ **Azure OpenAI & OpenAI Support**: Compatible with both service providers
✅ **Model Validation**: Automatic compatibility checks for O-series models and GPT-5
✅ **Metadata Access**: Reasoning tokens and summaries available in response metadata

## Usage Example

```python
# Constructor-level reasoning configuration
agent = AzureResponsesAgent(
    ai_model_id="gpt-5",
    client=client,
    reasoning={"effort": "low"},  # Default reasoning for all requests
)

# Per-invocation override
response = await agent.invoke(
    "Solve this complex problem step by step",
    reasoning={"effort": "high"},
)

# Invoke with reasoning callback to capture intermediate thoughts
response = await agent.invoke(
    "Analyze this data step by step",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message,
)

# Streaming with reasoning
async for response in agent.invoke_stream(
    "Explain quantum computing in detail",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message,
):
    print(response.content, end="", flush=True)
```

---------

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>