matdev83
diff --git a/QWEN_REASONING_EFFORT_FEATURE.md b/QWEN_REASONING_EFFORT_FEATURE.md
Lines changed: 0 additions & 56 deletions
diff --git a/data/test_suite_state.json b/data/test_suite_state.json
Lines changed: 1 addition & 1 deletion
diff --git a/docs/QWEN_REASONING_EFFORT_FEATURE.md b/docs/QWEN_REASONING_EFFORT_FEATURE.md
Lines changed: 90 additions & 0 deletions
diff --git a/scripts/proxy_test.py b/scripts/proxy_test.py
Lines changed: 41 additions & 0 deletions
diff --git a/scripts/zai_direct_test.py b/scripts/zai_direct_test.py
Lines changed: 38 additions & 0 deletions
diff --git a/src/connectors/qwen_oauth.py b/src/connectors/qwen_oauth.py
Lines changed: 15 additions & 12 deletions
@@ -1,4 +1,4 @@
 {
-  "test_count": 4528,
+  "test_count": 4543,
   "last_updated": "1761604243.5415785"
 }
@@ -0,0 +1,90 @@
+# Qwen OAuth Reasoning Effort Feature
+
+## Overview
+Enhanced the Qwen OAuth connector to append " /think" to the last client message by default, which triggers Qwen's extended reasoning mode. The suffix is skipped only when `reasoning_effort` is explicitly set to "low".
+
+## Implementation Details
+
+### Changes Made
+1. **Modified `src/connectors/qwen_oauth.py`**:
+   - Updated `chat_completions()` method to detect `reasoning_effort` parameter
+   - **By default**, appends " /think" to the last client message
+   - Only skips appending when `reasoning_effort` is explicitly set to "low"
+   - Only appends to user or system messages, not tool responses
+   - Handles both Pydantic models and dict message formats
+
+### How It Works
+- The connector checks if `reasoning_effort` is explicitly set to "low"
+- If NOT "low" (including None, empty string, or any other value), it appends " /think"
+- It finds the last client message (user or system role, skipping tool responses)
+- Appends " /think" to the content of that message
+- This triggers Qwen's extended reasoning mode for more thoughtful responses
+
+### Usage Examples
+
+**Default behavior (appends " /think"):**
+```python
+request = ChatRequest(
+    model="qwen-turbo",
+    messages=[
+        ChatMessage(role="user", content="What is 2+2?")
+    ]
+    # No reasoning_effort specified - will append " /think"
+)
+```
+Result: "What is 2+2? /think"
+
+**Explicitly disable reasoning mode:**
+```python
+request = ChatRequest(
+    model="qwen-turbo",
+    messages=[
+        ChatMessage(role="user", content="Simple question")
+    ],
+    reasoning_effort="low"  # Only "low" prevents appending
+)
+```
+Result: "Simple question" (no modification)
+
+**Explicit reasoning modes (also append):**
+```python
+request = ChatRequest(
+    model="qwen-turbo",
+    messages=[
+        ChatMessage(role="user", content="Complex problem")
+    ],
+    reasoning_effort="high"  # or "medium"
+)
+```
+Result: "Complex problem /think"
+
+### Test Coverage
+Added a unit test suite in `tests/unit/test_qwen_oauth_reasoning_effort.py` covering these cases:
+- ✅ Test default (no reasoning_effort) appends " /think"
+- ✅ Test reasoning_effort="medium" appends " /think"
+- ✅ Test reasoning_effort="high" appends " /think"
+- ✅ Test reasoning_effort="low" does NOT append
+- ✅ Test reasoning_effort=None appends " /think"
+- ✅ Test reasoning_effort="" (empty string) appends " /think"
+- ✅ Test skips tool response messages
+- ✅ Test works with system messages
+- ✅ Test works with multiple messages (only last user message modified)
+- ✅ Test works with Pydantic ChatMessage objects
+
+All qwen-related tests pass, including the 10 new tests.
+
+## Behavior Summary
+- **Default (no reasoning_effort)**: Appends " /think" ✅
+- **reasoning_effort=None**: Appends " /think" ✅
+- **reasoning_effort=""**: Appends " /think" ✅
+- **reasoning_effort="low"**: Does NOT append ❌
+- **reasoning_effort="medium"**: Appends " /think" ✅
+- **reasoning_effort="high"**: Appends " /think" ✅
+- **Any other value**: Appends " /think" ✅
+
+## Notes
+- The " /think" suffix is only appended to regular messages, not tool call responses
+- The modification happens before the message is sent to the Qwen API
+- This feature is specific to the Qwen OAuth connector and leverages Qwen's native reasoning capabilities
+- The default behavior enables extended reasoning for better response quality
+- Users can opt out by explicitly setting `reasoning_effort="low"`
@@ -0,0 +1,41 @@
+#!/usr/bin/env python
+"""
+Quick check of the local proxy with OpenAI-compatible client.
+"""
+
+from __future__ import annotations
+
+import sys
+
+from openai import OpenAI
+
+
+def main() -> None:
+    sys.stdout.reconfigure(encoding="utf-8")
+    client = OpenAI(
+        api_key="test-placeholder",
+        base_url="http://127.0.0.1:8000/v1",
+    )
+    import httpx
+
+    request_payload = {
+        "model": "glm-4.6",
+        "messages": [
+            {"role": "system", "content": "You are a concise assistant."},
+            {"role": "user", "content": "Return the string `ok` and nothing else."},
+        ],
+        "stream": False,
+    }
+    with httpx.Client(base_url="http://127.0.0.1:8000") as client_raw:
+        resp = client_raw.post(
+            "/v1/chat/completions",
+            json=request_payload,
+            headers={"Authorization": "Bearer test-placeholder"},
+        )
+        print("status", resp.status_code)
+        print("headers", resp.headers)
+        print("body bytes", resp.content[:200])
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,38 @@
+#!/usr/bin/env python
+"""
+Minimal OpenAI-compatible client for testing ZAI Coding Plan access.
+
+This script sends a simple non-streaming chat completion request directly
+to https://api.z.ai/api/coding/paas/v4 using the provided API key.
+"""
+
+from __future__ import annotations
+
+import sys
+
+from openai import OpenAI
+
+
+def main() -> None:
+    sys.stdout.reconfigure(encoding="utf-8")
+
+    client = OpenAI(
+        api_key="your-zai-api-key-here",
+        base_url="https://api.z.ai/api/coding/paas/v4",
+    )
+
+    response = client.chat.completions.create(
+        model="glm-4.6",
+        messages=[
+            {"role": "system", "content": "You are a concise assistant."},
+            {"role": "user", "content": "Return the string `ok` and nothing else."},
+        ],
+        max_tokens=64,
+        stream=False,
+    )
+
+    print(response.model_dump_json(indent=2))
+
+
+if __name__ == "__main__":
+    main()
@@ -1013,10 +1013,10 @@ async def chat_completions(
         """Handle chat completions using Qwen OAuth API.
 
         This overrides the parent class method to ensure credentials are valid before API call.
-        
+
         Special handling for reasoning_effort:
-        - When reasoning_effort is set to "medium" or "high", this method appends " /think"
-          to the last client message (user or system role, not tool responses).
+        - By default, this method appends " /think" to the last client message (user or system role).
+        - The suffix is NOT appended only when reasoning_effort is explicitly set to "low".
         - This triggers Qwen's extended reasoning mode for more thoughtful responses.
         - The " /think" suffix is only appended to regular messages, not tool call responses.
         """
@@ -1041,13 +1041,17 @@ async def chat_completions(
             )
 
         # Handle reasoning_effort by appending " /think" to the last user message
+        # Append by default unless explicitly set to "low"
         reasoning_effort = None
         if hasattr(request_data, "reasoning_effort"):
             reasoning_effort = request_data.reasoning_effort
         elif isinstance(request_data, dict):
             reasoning_effort = request_data.get("reasoning_effort")
 
-        if reasoning_effort in ("medium", "high") and processed_messages:
+        # Append " /think" unless reasoning_effort is explicitly "low"
+        should_append_think = reasoning_effort != "low"
+
+        if should_append_think and processed_messages:
             # Find the last message from the client (user or system role, not tool responses)
             last_client_message_idx = None
             for idx in range(len(processed_messages) - 1, -1, -1):
@@ -1057,24 +1061,24 @@ async def chat_completions(
                     role = msg.role
                 elif isinstance(msg, dict):
                     role = msg.get("role")
-                
+
                 # Skip tool response messages
                 if role in ("user", "system"):
                     last_client_message_idx = idx
                     break
-            
+
             if last_client_message_idx is not None:
                 # Append " /think" to the content of the last client message
                 msg = processed_messages[last_client_message_idx]
-                
+
                 # Handle different message formats
                 if hasattr(msg, "content"):
                     content = msg.content
                     if isinstance(content, str):
                         # Create a modified copy of the message
                         if hasattr(msg, "model_copy"):
-                            processed_messages[last_client_message_idx] = msg.model_copy(
-                                update={"content": content + " /think"}
+                            processed_messages[last_client_message_idx] = (
+                                msg.model_copy(update={"content": content + " /think"})
                             )
                         elif hasattr(msg, "copy"):
                             modified_msg = msg.copy()
@@ -1084,7 +1088,7 @@ async def chat_completions(
                             # Fallback: modify in place
                             msg.content = content + " /think"
                         logger.info(
-                            f"Appended ' /think' to last client message due to reasoning_effort={reasoning_effort}"
+                            f"Appended ' /think' to last client message (reasoning_effort={reasoning_effort or 'default'})"
                         )
                 elif isinstance(msg, dict):
                     content = msg.get("content")
@@ -1094,10 +1098,9 @@ async def chat_completions(
                         modified_msg["content"] = content + " /think"
                         processed_messages[last_client_message_idx] = modified_msg
                         logger.info(
-                            f"Appended ' /think' to last client message due to reasoning_effort={reasoning_effort}"
+                            f"Appended ' /think' to last client message (reasoning_effort={reasoning_effort or 'default'})"
                         )
 
-
         try:
             # Use the effective model and properly extract just the model name part
             # Strip any backend prefix (like "qwen-oauth:", "gemini-cli-oauth-personal:", etc.)
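
Note: the suffix rule in the `src/connectors/qwen_oauth.py` hunks above can be condensed into a standalone sketch for dict-based messages. This is only an illustration under stated assumptions, not the connector's actual code: the function name `append_think_suffix` and the constant `THINK_SUFFIX` are hypothetical, and the Pydantic branch (`model_copy` / `copy`) is left out.

```python
from __future__ import annotations

from typing import Any

THINK_SUFFIX = " /think"  # hypothetical constant name; the diff uses the literal


def append_think_suffix(
    messages: list[dict[str, Any]], reasoning_effort: str | None = None
) -> list[dict[str, Any]]:
    """Return a copy of `messages` with the suffix on the last user/system message."""
    # Only an explicit "low" opts out; None, "" and any other value append.
    if reasoning_effort == "low":
        return list(messages)

    result = list(messages)
    # Walk backwards; any message whose role is not user or system
    # (for example a tool response) is skipped.
    for idx in range(len(result) - 1, -1, -1):
        msg = result[idx]
        if msg.get("role") in ("user", "system"):
            content = msg.get("content")
            # Non-string content is left untouched, as in the diff.
            if isinstance(content, str):
                result[idx] = {**msg, "content": content + THINK_SUFFIX}
            break
    return result


messages = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "tool", "content": "4"},
]
print(append_think_suffix(messages)[0]["content"])         # What is 2+2? /think
print(append_think_suffix(messages, "low")[0]["content"])  # What is 2+2?
```

The sketch copies the target message instead of mutating it, which mirrors the copy-first approach the diff takes for both Pydantic and dict messages.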