Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) #12881
Conversation
@eavanvalkenburg @markwallace-microsoft this is the PR for reasoning support for the ResponsesAgent that I mentioned yesterday in Office Hours. Looking forward to your feedback.
GPT-5 reasoning implementation tested and working flawlessly!
Result: the reasoning feature is ready for OpenAI's latest models and gracefully handles both current O-series and future reasoning-capable models.
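The "gracefully handles both current O-series and future reasoning-capable models" behavior could be sketched as a simple prefix check. This is a minimal sketch, not the PR's actual code: `supports_reasoning` and the prefix list are assumptions based only on the models named in this PR.

```python
def supports_reasoning(ai_model_id: str) -> bool:
    """Hypothetical check for reasoning-capable models (assumption, not the PR's API).

    Prefix matching rather than an exact allow-list is what lets future
    variants (e.g. "o4-mini" or new gpt-5 snapshots) pass without a code change.
    """
    reasoning_prefixes = ("o1", "o3", "o4", "gpt-5")  # assumed family list
    return ai_model_id.lower().startswith(reasoning_prefixes)
```

Callers could then reject the `reasoning` parameter up front for models such as `gpt-4o`, instead of surfacing an API error from the service.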
Pull Request Overview
This PR adds reasoning support to the Python OpenAI ResponsesAgent to achieve parity with the C# implementation. It enables fine-grained control over reasoning effort for O-series models that support the reasoning parameter.
- Adds constructor-level and per-invocation reasoning effort configuration with priority hierarchy
- Implements comprehensive reasoning content handling and metadata extraction
- Provides extensive test coverage and practical usage examples
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `test_openai_responses_agent_reasoning.py` | Comprehensive unit tests for reasoning functionality validation |
| `reasoning_content.py` | New content type for handling reasoning output from O-series models |
| `const.py` | Added reasoning content type constant |
| `chat_message_content.py` | Updated to support reasoning content in messages |
| `__init__.py` | Exposed ReasoningContent in public API |
| `responses_agent_thread_actions.py` | Core reasoning logic implementation and API integration |
| `openai_responses_agent.py` | Agent-level reasoning configuration and validation |
| `responses_agent_reasoning.py` | Demonstration sample showing reasoning capabilities |
@angangwa yes, I'm just working on this!
This commit adds complete reasoning functionality to the OpenAI ResponsesAgent:

Core Features:
- Add ReasoningContent and StreamingReasoningContent classes with proper SK conventions
- Implement reasoning callback mechanism with on_intermediate_message parameter
- Support streaming reasoning events (delta and done) in invoke_stream
- Add reasoning item extraction and yield pattern (False for intermediate, True for final)
- Export reasoning content types in contents package

Implementation Details:
- Fix metadata merging bug in StreamingReasoningContent addition
- Follow SK patterns with StreamingContentMixin + BaseContent inheritance
- Maintain vendor neutrality without OpenAI-specific dependencies
- Add reasoning configuration with priority hierarchy (per-invocation > constructor)
- Support reasoning-capable models (gpt-5, o3, o1-mini) with proper error handling

Testing & Examples:
- Add comprehensive test coverage (31 tests) for all reasoning functionality
- Create clean sample demonstrating reasoning with dual OpenAI/Azure support
- Test content creation, streaming, callbacks, error conditions, and integration flows
- Validate reasoning configuration priority, multi-agent isolation, and edge cases

API Enhancements:
- Extend invoke() and invoke_stream() methods with reasoning parameters
- Add reasoning item processing in ResponsesAgentThreadActions
- Support reasoning effort configuration and summary options
- Implement proper reasoning content extraction from OpenAI responses
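The "metadata merging bug in StreamingReasoningContent addition" mentioned in the commit message can be illustrated with a simplified stand-in class. This is hypothetical: the real SK streaming content types carry more state, and `StreamingReasoningChunk` is not the PR's actual class name.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingReasoningChunk:
    """Hypothetical, simplified stand-in for StreamingReasoningContent."""

    text: str
    metadata: dict = field(default_factory=dict)

    def __add__(self, other: "StreamingReasoningChunk") -> "StreamingReasoningChunk":
        # The fix: merge metadata from BOTH chunks when concatenating.
        # Keeping only self.metadata would drop whatever the final delta
        # carries (e.g. token counts or a completion marker).
        return StreamingReasoningChunk(
            text=self.text + other.text,
            metadata={**self.metadata, **other.metadata},
        )
```

With this merge, accumulating deltas with `+` across a stream preserves metadata attached to any chunk, not just the first one.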
Hi @moonbox3 @eavanvalkenburg, I've refactored the code and implemented the suggested changes, with one exception: I've kept the constructor-level reasoning parameter. I believe this is useful for multi-agent collaboration scenarios where an orchestrator invokes agents automatically. I've also removed the "minimal" effort option since I'm now using OpenAI types directly; "minimal" will become available automatically once we update to the latest OpenAI SDK. I've tested this implementation with O-series models and GPT-5. After this PR is merged, I plan to bump the OpenAI version to the latest and then submit another PR adding GPT-5 verbosity support. Could you please review the changes when you have a chance?
@ltwlf It looks like there are some code quality check failures in CI:
…tion

- Improve ReasoningContent docstring for better user understanding
- Make text property optional (`str | None = None`) for better API consistency
- Restore `@override` decorator on the `_create` method, as required by the base class
- Refactor complex sample into focused, smaller examples:
  - responses_agent_reasoning.py: basic non-streaming reasoning examples
  - responses_agent_reasoning_streaming.py: streaming-specific examples
- Remove unnecessary complexity from samples while maintaining functionality
- Maintain backward compatibility and OpenAI API compliance

Addresses feedback from moonbox3, dmytrostruk, and eavanvalkenburg in PR microsoft#12881
@dmytrostruk not sure what was wrong; the pre-commit checks were successful on my box. I've resolved the review comments. Hope it will be fine now 😅
@ltwlf we want to help get this PR across the finish line; however, we need the CI/CD code quality checks to pass. CI/CD is installing
Mypy isn't run as part of pre-commit; it runs separately in CI/CD. That's why you need to run the mypy task manually on your machine.
@ltwlf I checked out your branch and am able to see the mypy errors:

@ltwlf do you want me to fix these mypy issues?

Thanks for your support on this, @ltwlf, and for seeing it through.
…mini, o3) (microsoft#12881)

## Summary

Adds reasoning capabilities to the Python OpenAI ResponsesAgent, bringing parity with the C# implementation. This enables fine-grained control over reasoning effort for O-series models (o1, o3-mini, o4-mini) and gpt-5 that support the reasoning parameter.

Fixes microsoft#12843

## Changes

### Core Implementation

- **`openai_responses_agent.py`**: Added reasoning configuration support
  - Constructor-level reasoning effort setting
  - Per-invocation reasoning effort override capability
  - Proper parameter validation and model compatibility checks
- **`responses_agent_thread_actions.py`**: Extended thread actions to support reasoning parameters
  - Reasoning effort propagation through thread operations
  - Metadata preservation for reasoning tokens and summaries

### Sample and Tests

- **`responses_agent_reasoning.py`**: Comprehensive demonstration sample
  - Constructor vs. per-invocation reasoning configuration
  - Function calling integration with reasoning
  - Reasoning comparison scenarios (low/medium/high effort)
  - Error handling and troubleshooting guidance
- **`test_openai_responses_agent_reasoning.py`**: Full unit test coverage
  - Parameter validation tests
  - Integration scenarios with function calling
  - Edge cases and error conditions

## Features

✅ **Constructor-Level Reasoning**: Set default reasoning effort when creating agents
✅ **Per-Invocation Override**: Override reasoning effort per request
✅ **Priority Hierarchy**: per-invocation > constructor > model default
✅ **Function Calling Compatible**: Works seamlessly with existing plugin system
✅ **Azure OpenAI & OpenAI Support**: Compatible with both service providers
✅ **Model Validation**: Automatic compatibility checks for O-series models and GPT-5
✅ **Metadata Access**: Reasoning tokens and summaries available in response metadata

## Usage Example

```python
# Constructor-level reasoning configuration
agent = AzureResponsesAgent(
    ai_model_id="gpt-5",
    client=client,
    reasoning={"effort": "low"},  # Default reasoning for all requests
)

# Per-invocation override
response = await agent.invoke(
    "Solve this complex problem step by step",
    reasoning={"effort": "high"},
)

# Invoke with reasoning callback to capture intermediate thoughts
response = await agent.invoke(
    "Analyze this data step by step",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message,
)

# Streaming with reasoning
async for response in agent.invoke_stream(
    "Explain quantum computing in detail",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message,
):
    print(response.content, end="", flush=True)
```

---------

Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>
Co-authored-by: Evan Mattson <evan.mattson@microsoft.com>