feat(feishu): CardKit streaming cards and SSE hang fixes (#287) #292
Merged
wisdomqin merged 5 commits into dataelement:main on Apr 5, 2026
Conversation
…ataelement#287) Replace the im.message.patch-based streaming approach with CardKit streaming APIs (create_card_entity, stream_card_content, set_card_streaming_mode, update_cardkit_card) for silky-smooth typewriter-style streaming output in Feishu cards.
Key changes:
- Add CardKit API methods to FeishuService (create_card_entity, send_card_by_card_id, stream_card_content, set_card_streaming_mode, update_cardkit_card) using the lark-oapi SDK
- Refactor streaming output in feishu.py to use CardKit as the primary path, with automatic fallback to IM patch when CardKit is unavailable
- Use the schema 2.0 card format with streaming_mode and element_id for incremental content updates (typewriter animation handled by Feishu)
- Add a collapsible thinking panel to the final card
- Reduce the streaming flush interval to 0.5s for the CardKit path (vs 1.0s for the IM patch fallback)
Refs: dataelement#287
websockets >= 13 auto-detects macOS system proxy settings. When a local proxy is configured but cannot handle the WSS upgrade, the connection fails with 'did not receive a valid HTTP response from proxy'. Force proxy=None to bypass this. Refs: dataelement#287
Three root causes fixed:
1. AnthropicClient.stream(): break on stop_reason instead of waiting for the message_stop event. Zhipu's Anthropic-compatible API may not send message_stop, causing aiter_lines() to hang forever.
2. _heartbeat_task cancel: CancelledError inherits from BaseException in Python 3.9+, so except Exception does not catch it. This caused the final card update to be skipped after LLM completion.
3. httpx client hardening: proxy=None to avoid system-proxy interference with SSE streams, and an asyncio.wait_for timeout on aclose() to prevent indefinite blocking when closing connections.
…eams OpenAICompatibleClient now breaks on finish_reason in addition to [DONE], protecting all providers (Minimax, Custom, DeepSeek, Qwen, etc.) from hanging if [DONE] is never sent. GeminiClient now breaks on both [DONE] and finishReason instead of relying solely on connection close to end the SSE stream.
Reduce log noise in production by downgrading verbose SSE/streaming diagnostic logs from info to debug level. Only warnings and errors remain at info level.
wisdomqin (Contributor) approved these changes on Apr 5, 2026 and left a comment:
Great work! The CardKit streaming integration is well-structured with a solid 3-tier fallback design (CardKit -> IM Patch -> plain text), and the SSE hang fixes address real root causes across all three LLM clients. Approving for merge.
We will address the following in a follow-up commit after merge:
- Add lark-oapi to requirements.txt
- Scope the websockets proxy patch (avoid global monkey-patch)
- Add tool call status display to the CardKit streaming path
- Add a size cap to _lark_clients cache to prevent unbounded growth
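For the last follow-up item, the size cap could be as simple as LRU eviction. The sketch below uses hypothetical names (BoundedClientCache, get_or_create); the real _lark_clients cache in feishu_service.py may key and construct clients differently:

```python
from collections import OrderedDict


class BoundedClientCache:
    """LRU-capped cache sketch for the _lark_clients mapping.

    Hypothetical implementation: illustrates only the size cap
    discussed in the review, not the shipped code.
    """

    def __init__(self, max_size: int = 32):
        self.max_size = max_size
        self._clients: "OrderedDict[str, object]" = OrderedDict()

    def get_or_create(self, app_id: str, factory):
        if app_id in self._clients:
            self._clients.move_to_end(app_id)  # mark as recently used
            return self._clients[app_id]
        client = factory()
        self._clients[app_id] = client
        if len(self._clients) > self.max_size:
            self._clients.popitem(last=False)  # evict least recently used
        return client
```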
Summary
Implements CardKit streaming card API for smooth typewriter-style output in Feishu (Lark), and fixes multiple SSE stream hanging issues that caused the bot to become unresponsive.
Closes #287
What's New
CardKit Streaming Card Integration
- create_card_entity(): create CardKit card entities for streaming
- send_card_by_card_id(): send cards by card_id via the IM API
- stream_card_content(): element-level streaming content push (500ms refresh interval)
- set_card_streaming_mode(): enable/disable streaming mode on cards
- update_cardkit_card(): full card content update after streaming completes

Dual-Path Design with Graceful Degradation
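The degradation path might be sketched as follows. All service methods here are hypothetical stand-ins for the FeishuService wrappers named above; the real signatures and payloads (schema 2.0 cards, element_id, sequence numbers) differ:

```python
import asyncio


async def stream_reply(service, chat_id: str, chunks) -> None:
    """Sketch of the CardKit -> IM patch -> plain text degradation.

    `service` methods are illustrative stand-ins, not real signatures.
    `chunks` is a single async iterator; the shared buffer means a tier
    switch mid-stream does not lose already-received text.
    """
    buffer = []

    # Tier 1: CardKit streaming entity (typewriter animation by Feishu).
    try:
        card_id = await service.create_card_entity()
        await service.send_card_by_card_id(chat_id, card_id)
        async for text in chunks:
            buffer.append(text)
            await service.stream_card_content(card_id, "".join(buffer))
        await service.set_card_streaming_mode(card_id, enabled=False)
        return
    except Exception:
        pass  # CardKit unavailable; fall through with what was buffered

    # Tier 2: legacy im.message.patch updates on an ordinary card.
    try:
        message_id = await service.send_card(chat_id)
        async for text in chunks:
            buffer.append(text)
            await service.patch_message(message_id, "".join(buffer))
        return
    except Exception:
        pass

    # Tier 3: plain text, no incremental updates.
    async for text in chunks:
        buffer.append(text)
    await service.send_text(chat_id, "".join(buffer))
```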
Streaming Flush Control
- asyncio.Lock-protected _flush_stream(): prevents sequence conflicts
- _SerialPatchQueue: serializes IM patch requests to prevent out-of-order overwrites
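The serialization idea can be sketched with a plain asyncio.Lock; the real _SerialPatchQueue may also coalesce pending updates, and the send callable here is a hypothetical stand-in for one IM patch request:

```python
import asyncio


class SerialPatchQueue:
    """Sketch: serialize patch requests so a slow earlier request
    cannot complete after (and overwrite) a newer one."""

    def __init__(self, send):
        self._send = send          # async callable performing one patch
        self._lock = asyncio.Lock()
        self._seq = 0

    async def patch(self, content: str) -> None:
        async with self._lock:     # at most one in-flight request
            self._seq += 1
            await self._send(self._seq, content)
```

Because asyncio.Lock wakes waiters in FIFO order, patches issued concurrently still reach the Feishu API strictly in submission order.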
Bug Fixes
1. Anthropic SSE Stream Hang (Root Cause)
AnthropicClient.stream() waited indefinitely for a message_stop event that Zhipu's Anthropic-compatible API never sends. After receiving message_delta with stop_reason, the aiter_lines() loop hung forever.
Fix: Break immediately on stop_reason in message_delta, before waiting for message_stop.
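A minimal sketch of that termination check (the helper name is illustrative; event types follow Anthropic's SSE format):

```python
def should_stop(event_type: str, data: dict) -> bool:
    """Return True when the Anthropic-style SSE stream is finished.

    Treats a message_delta carrying a stop_reason as end-of-stream,
    instead of waiting for a message_stop event that some
    Anthropic-compatible APIs (e.g. Zhipu's) never send.
    """
    if event_type == "message_stop":
        return True
    if event_type == "message_delta":
        return data.get("delta", {}).get("stop_reason") is not None
    return False
```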
2. CancelledError Not Caught in Heartbeat Cleanup
In Python 3.9+, asyncio.CancelledError inherits from BaseException, not Exception. The except Exception: pass block in the heartbeat task cleanup did not catch it, causing the exception to propagate and skip the final card update entirely.
Fix: Changed to except (Exception, asyncio.CancelledError).
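The hardened cleanup can be sketched as below (stop_heartbeat is a hypothetical name for the cleanup step; the real code lives in the heartbeat task teardown):

```python
import asyncio


async def stop_heartbeat(task: "asyncio.Task") -> None:
    """Cancel the heartbeat task and swallow its cancellation.

    CancelledError subclasses BaseException on modern Python, so it
    must be named explicitly: `except Exception` alone would let it
    propagate and skip whatever cleanup follows this call.
    """
    task.cancel()
    try:
        await task
    except (Exception, asyncio.CancelledError):
        pass
```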
3. httpx System Proxy Interference
httpx.AsyncClient auto-detects macOS system proxy settings, which can interfere with long-lived SSE connections to LLM APIs.
Fix: Added proxy=None to all httpx.AsyncClient constructors.
4. httpx aclose() Indefinite Blocking
When a streaming connection was terminated early (via break), httpx.AsyncClient.aclose() could hang indefinitely waiting for the server to finish sending.
Fix: Wrapped aclose() in asyncio.wait_for(..., timeout=5.0) for all client classes.
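A sketch of the bounded shutdown (close_with_timeout is a hypothetical helper; the real change wraps the existing aclose() call sites directly):

```python
import asyncio


async def close_with_timeout(client, timeout: float = 5.0) -> None:
    """Close a client, but give up after `timeout` seconds.

    If aclose() blocks (e.g. the server keeps streaming after we broke
    out of the SSE loop), the handler moves on instead of hanging; the
    underlying connection is dropped when the client is reclaimed.
    """
    try:
        await asyncio.wait_for(client.aclose(), timeout=timeout)
    except asyncio.TimeoutError:
        pass
```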
5. OpenAI/Gemini SSE Stream Termination
OpenAICompatibleClient broke only on [DONE], not on finish_reason; if a provider sends finish_reason without [DONE], the stream hangs. In GeminiClient, [DONE] was only continued (not breaking), and finishReason was recorded but never acted on; the client relied solely on HTTP connection close.
Fix: Added finish_reason break protection to both clients, matching the Anthropic client's defensive pattern.
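The defensive check for OpenAI-style streams might look like this sketch (sse_done is a hypothetical helper; the real clients break out of their aiter_lines() loops inline):

```python
import json


def sse_done(line: str) -> bool:
    """Return True when an OpenAI-style SSE line ends the stream.

    Stops on the [DONE] sentinel OR on any chunk carrying a
    finish_reason, so a provider that omits [DONE] (or closes the
    connection late) cannot hang the stream.
    """
    if not line.startswith("data:"):
        return False
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return True
    try:
        chunk = json.loads(payload)
    except json.JSONDecodeError:
        return False
    choices = chunk.get("choices") or []
    return any(c.get("finish_reason") for c in choices)
```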
Files Changed
- backend/app/api/feishu.py
- backend/app/api/websocket.py
- backend/app/services/feishu_service.py
- backend/app/services/feishu_ws.py
- backend/app/services/llm_client.py

Testing