
Commit 8ac8f02

Author: matdev83
Commit message: WIP; fixes
Parent: f6beb2e

21 files changed: +1604 −288 lines

DEBUGGING_EMPTY_RESPONSES.md

Lines changed: 101 additions & 0 deletions
# Debugging Empty Streaming Responses from ZAI

## Current Issue

Client reports: "The model's response ended unexpectedly (no assistant messages). This may be a sign of rate limiting."

## Evidence from Logs

### Wire Capture (`logs/wire_capture.log`)

Shows that streaming responses contain only empty deltas:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "created": 1761854814,
  "model": "claude-3-opus-20240229",
  "choices": [{
    "index": 0,
    "delta": {},  // ← EMPTY!
    "finish_reason": null
  }]
}
```

Each request receives exactly two chunks, both with empty deltas, and then the stream ends.

### Proxy Log (`logs/proxy.log`)

Shows that the request is processed and a streaming response is returned, but no errors are logged.

## Root Cause Analysis

The translation is working (chunks are in the correct OpenAI format), but the chunks contain no content. This means one of the following:

1. **Either**: the ZAI backend is sending events without content (ping events, metadata events, etc.)
2. **Or**: our translation function is not extracting content from the ZAI response format
3. **Or**: the ZAI backend is ending the stream prematurely without sending actual content

## Debugging Steps Added

Added logging to `src/connectors/anthropic.py` (see the sketch below) to capture:
- Raw chunks from the ZAI backend before translation
- Translated chunk deltas after translation
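
A minimal, self-contained sketch of the before/after logging described above. The helper names (`translate_chunk`, `log_chunk_translation`) are illustrative assumptions, not the actual code added to `src/connectors/anthropic.py`.

```python
import json
import logging

logger = logging.getLogger("anthropic_connector.debug")

def translate_chunk(raw_sse: str) -> dict:
    """Stand-in for the real Anthropic-to-domain translation call (assumed shape)."""
    return {"choices": [{"index": 0, "delta": {}, "finish_reason": None}]}

def log_chunk_translation(raw_sse: str) -> dict:
    # Raw chunk from the ZAI backend, before any translation
    logger.debug("ZAI raw chunk: %r", raw_sse)
    translated = translate_chunk(raw_sse)
    # Translated delta, after conversion to the internal OpenAI-style format
    logger.debug("Translated delta: %s", json.dumps(translated["choices"][0]["delta"]))
    return translated
```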
## Resolution

**FOUND THE ISSUE**: The ZAI backend is returning error events instead of content:

```
event: error
data: {"type": "error", "error": {"type": "1113", "message": "Insufficient balance or no resource package. Please recharge."}, "request_id": "..."}
```

### Root Cause

The ZAI API account has insufficient balance or no resource package. This is a **billing/account issue**, not a code issue.

### Why the Client Shows "No Assistant Messages"

1. ZAI returns error events instead of content events
2. Our translation correctly converts error events to empty deltas (no content)
3. The client receives only empty chunks and reports "no assistant messages"

### Solution

**Recharge the ZAI API account** or ensure it has an active resource package.

### Code Status

The streaming translation fix is working correctly. The translation properly handles:
- ✅ Anthropic SSE format parsing
- ✅ Error event handling (now raises `BackendError` with a clear message)
- ✅ Content event handling (would extract text if present)

### Improvements Made

Added proper error handling in `src/connectors/anthropic.py` (see the sketch below):
- Detects error events in streaming responses
- Extracts the error message and type from error events
- Raises `BackendError` with a clear error message instead of silently returning empty responses
- Includes error details for debugging

Now when ZAI returns an error such as "Insufficient balance", the client receives a proper error message instead of "no assistant messages".
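
A minimal sketch of the error-event detection described above. `BackendError` is named in the text; the parsing helper and its exact behavior here are illustrative, not the actual implementation in the connector.

```python
import json

class BackendError(Exception):
    """Stand-in for the project's BackendError exception."""

def raise_on_error_event(sse_event: str) -> None:
    """Detect an `event: error` SSE block and raise with its type and message."""
    lines = sse_event.strip().splitlines()
    if not any(line.strip() == "event: error" for line in lines):
        return
    for line in lines:
        if line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            err = payload.get("error", {})
            raise BackendError(f"ZAI backend error {err.get('type')}: {err.get('message')}")

# Example: the error captured in the wire logs above
try:
    raise_on_error_event(
        'event: error\n'
        'data: {"type": "error", "error": {"type": "1113", '
        '"message": "Insufficient balance or no resource package. Please recharge."}}'
    )
except BackendError as exc:
    print(exc)  # ZAI backend error 1113: Insufficient balance or no resource package. Please recharge.
```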
### Tests Added

Created `tests/unit/connectors/test_anthropic_error_handling.py` with 3 tests:
- ✅ Test error event handling in the Anthropic connector
- ✅ Test generic error handling
- ✅ Test that zai-coding-plan inherits error handling

All tests pass.

## Expected Anthropic SSE Format

Standard Anthropic streaming should include events like:
```
event: message_start
data: {"type":"message_start","message":{"role":"assistant"}}

event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}

event: message_stop
data: {"type":"message_stop"}
```

If ZAI is sending a different format, we need to adjust the translation accordingly.

STREAMING_TRANSLATION_FIX.md

Lines changed: 118 additions & 0 deletions
# Anthropic/ZAI Streaming Translation Fix

## Problem

The Anthropic connector (and by inheritance, the zai-coding-plan connector) was not translating streaming chunks to the internal domain format. This caused Anthropic-formatted SSE chunks to flow through the system untranslated, breaking cross-API compatibility.

### Root Cause

**OpenAI Connector** (`src/connectors/openai.py` lines 629-637):
- ✅ Translates each streaming chunk using `translation_service.to_domain_stream_chunk()`
- Converts OpenAI/Responses API format → domain format (OpenAI-compatible)

**Anthropic Connector** (`src/connectors/anthropic.py` lines 530-540):
- ❌ Did NOT translate streaming chunks
- Just passed through raw Anthropic SSE chunks wrapped in `ProcessedResponse`

**zai-coding-plan Connector**:
- Inherits from `AnthropicBackend`
- Does not override `_handle_streaming_response()`
- Therefore inherited the broken streaming behavior
## Solution

### 1. Fixed Anthropic Connector Streaming (`src/connectors/anthropic.py`)

Updated the `event_stream()` function to translate each chunk:

```python
async def event_stream() -> AsyncGenerator[ProcessedResponse, None]:
    try:
        async for chunk in response.aiter_text():
            _capture_message_id(chunk)

            # Translate Anthropic SSE chunk to domain format
            domain_chunk = self.translation_service.to_domain_stream_chunk(
                chunk, "anthropic"
            )
            yield ProcessedResponse(content=domain_chunk)

        # Translate final [DONE] marker
        done_chunk = self.translation_service.to_domain_stream_chunk(
            "data: [DONE]\n\n", "anthropic"
        )
        yield ProcessedResponse(content=done_chunk)
    # (except/finally clauses omitted from this excerpt)
```
### 2. Enhanced Translation Function (`src/core/domain/translation.py`)

Updated `anthropic_to_domain_stream_chunk()` to handle SSE format:

**Before**: Only accepted parsed JSON dicts
**After**: Accepts both SSE-formatted strings and JSON dicts

Key improvements (see the sketch below):
- Parses multi-line SSE events (with `event:` and `data:` lines)
- Extracts JSON from `data:` lines
- Handles all Anthropic event types:
  - `message_start` → sets role
  - `content_block_delta` → extracts text content
  - `message_delta` → maps stop_reason to finish_reason
  - `message_stop` → marks completion
- Maps Anthropic stop reasons to OpenAI equivalents:
  - `end_turn` → `stop`
  - `max_tokens` → `length`
  - `tool_use` → `tool_calls`
- Handles `[DONE]` markers
- Backward compatible with dict format
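
A minimal sketch of the SSE-to-domain mapping described above, assuming one SSE event per chunk. The function name and structure are illustrative, not the actual implementation of `anthropic_to_domain_stream_chunk()`.

```python
import json

# Stop-reason mapping from Anthropic values to OpenAI-style finish_reason values
STOP_REASON_MAP = {"end_turn": "stop", "max_tokens": "length", "tool_use": "tool_calls"}

def anthropic_sse_to_openai_delta(sse_chunk: str) -> dict | None:
    """Translate one Anthropic SSE event (or a [DONE] marker) into an OpenAI-style chunk."""
    data_line = next(
        (ln[len("data:"):].strip() for ln in sse_chunk.splitlines() if ln.startswith("data:")),
        None,
    )
    if data_line is None or data_line == "[DONE]":
        return None  # nothing to emit, or end-of-stream marker

    event = json.loads(data_line)
    delta: dict = {}
    finish_reason = None

    if event.get("type") == "message_start":
        delta["role"] = event.get("message", {}).get("role", "assistant")
    elif event.get("type") == "content_block_delta":
        delta["content"] = event.get("delta", {}).get("text", "")
    elif event.get("type") == "message_delta":
        finish_reason = STOP_REASON_MAP.get(event.get("delta", {}).get("stop_reason"))

    return {"choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}]}

# Example: a content delta in the expected SSE format
print(anthropic_sse_to_openai_delta(
    'event: content_block_delta\n'
    'data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"Hello"}}'
))
```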
## Tests Created

### Translation Layer Tests (`tests/unit/core/domain/test_translation_anthropic_streaming.py`)

16 comprehensive tests covering:
- SSE content deltas
- Message start/stop events
- Stop reason mapping
- `[DONE]` marker handling
- Event line parsing
- Multi-line SSE format
- Invalid JSON handling
- Backward compatibility with dict format
- OpenAI structure preservation

### Connector Tests (`tests/unit/connectors/test_anthropic_streaming_translation.py`)

4 integration tests covering:
- End-to-end Anthropic streaming translation
- SSE format handling in the connector
- `[DONE]` marker translation
- zai-coding-plan inheritance verification

## Impact

### Fixed
- ✅ Anthropic connector now emits domain-formatted chunks
- ✅ zai-coding-plan connector inherits the fix automatically
- ✅ Cross-API translation works correctly for streaming
- ✅ Downstream processors receive a consistent OpenAI-style format

### Verified
- ✅ All 20 new tests pass
- ✅ All 15 existing translation tests still pass
- ✅ Backward compatibility maintained

## Why Tests Didn't Catch This

The existing tests mocked the translation service or didn't verify the actual format of streaming chunks. The new tests:
1. Test the actual translation function with SSE input
2. Test the connector's streaming handler end-to-end
3. Verify the output format matches the OpenAI structure
4. Ensure zai-coding-plan inherits the correct behavior

## Files Modified

1. `src/connectors/anthropic.py` - Added streaming translation
2. `src/core/domain/translation.py` - Enhanced SSE parsing
3. `tests/unit/connectors/test_anthropic_streaming_translation.py` - New connector tests
4. `tests/unit/core/domain/test_translation_anthropic_streaming.py` - New translation tests

data/test_suite_state.json

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 {
-  "test_count": 4494,
+  "test_count": 4502,
   "last_updated": "1761604243.5415785"
 }
Lines changed: 97 additions & 0 deletions
# ZAI Backend Max Tokens Implementation

## Overview

Both ZAI connectors (`zai` and `zai-coding-plan`) now enforce a 128K (131,072 tokens) maximum output limit, as specified by the ZAI API provider.

## Implementation Details

### Default Behavior
- **Default max_tokens**: 131,072 (128K)
- This is the maximum supported by ZAI's backend models
- Used when the client doesn't explicitly specify max_tokens or provides invalid values (None, 0, negative)

### Client Override Rules
Clients can override the default by explicitly setting `max_tokens` in their request (see the sketch below):

1. **Valid Range**: 1,024 to 131,072 tokens
   - Values below 1,024 are clamped to 1,024
   - Values above 131,072 are clamped to 131,072
   - Values within range are preserved as-is

2. **Invalid Values**: None, 0, or negative numbers
   - Automatically use the 128K default
   - Ensures requests never fail due to missing/invalid max_tokens
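
A minimal sketch of the default and clamping rules above. The constant and function names are illustrative, not the actual helpers used in the two connectors.

```python
ZAI_MAX_OUTPUT_TOKENS = 131_072  # 128K cap specified by the ZAI API provider
ZAI_MIN_OUTPUT_TOKENS = 1_024

def resolve_max_tokens(requested: int | None) -> int:
    """Apply the default/clamping rules described above (illustrative helper)."""
    if requested is None or requested <= 0:
        # Missing or invalid value: fall back to the 128K default
        return ZAI_MAX_OUTPUT_TOKENS
    # Explicit value: clamp into the supported range
    return max(ZAI_MIN_OUTPUT_TOKENS, min(requested, ZAI_MAX_OUTPUT_TOKENS))

assert resolve_max_tokens(None) == 131_072     # default
assert resolve_max_tokens(4096) == 4096        # preserved
assert resolve_max_tokens(512) == 1_024        # clamped to minimum
assert resolve_max_tokens(200_000) == 131_072  # clamped to maximum
```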
### Code Locations

#### ZaiCodingPlanBackend
- File: `src/connectors/zai_coding_plan.py`
- Method: `_prepare_anthropic_payload()`
- Inherits from: `AnthropicBackend`

#### ZAIConnector
- File: `src/connectors/zai.py`
- Method: `_prepare_payload()`
- Inherits from: `OpenAIConnector`

## Examples

### Example 1: No max_tokens specified
```python
request = {
    "model": "zai-coding-plan:claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}],
    # max_tokens not specified
}
# Result: max_tokens = 131072 (128K)
```

### Example 2: Explicit valid value
```python
request = {
    "model": "zai-coding-plan:claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
}
# Result: max_tokens = 4096 (preserved)
```

### Example 3: Value below minimum
```python
request = {
    "model": "zai-coding-plan:claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 512
}
# Result: max_tokens = 1024 (clamped to minimum)
```

### Example 4: Value above maximum
```python
request = {
    "model": "zai-coding-plan:claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 200000
}
# Result: max_tokens = 131072 (clamped to maximum)
```

## Testing

Comprehensive test suite in `tests/unit/connectors/test_zai_max_tokens.py` covers:
- Default behavior (None, 0, negative values)
- Preservation of explicit valid values
- Minimum boundary clamping
- Maximum boundary clamping
- Exact boundary values

All tests pass successfully.

## Benefits

1. **Prevents 422 Errors**: Ensures max_tokens is always valid
2. **Maximizes Output**: Uses 128K by default for agentic coding tasks
3. **Client Control**: Allows explicit override within the valid range
4. **Robust**: Handles edge cases (None, 0, negative, out-of-range)
5. **Consistent**: Same logic across both ZAI connectors