Fix: add sources after web search in stream mode #156
Conversation
Pull Request Overview
This pull request fixes the streaming chat completion endpoint to properly include web search sources in the final streamed payload when web search is enabled. The implementation ensures sources are added only to the last chunk, the one with finish_reason "stop".
Key changes:
- Refactored streaming logic to buffer payloads and append sources only on the final chunk
- Added comprehensive unit test to verify sources are included correctly in streamed responses
- Improved error handling and token usage tracking in the streaming generator
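For illustration, a minimal sketch of that buffering pattern (hypothetical names; the actual generator in nilai-api/src/nilai_api/routers/private.py differs in detail):

```python
import json
from typing import Any, AsyncIterator


async def stream_with_sources(
    chunks: AsyncIterator[Any],  # OpenAI-style ChatCompletionChunk objects
    sources: list[dict],
    web_search_enabled: bool,
) -> AsyncIterator[str]:
    """Yield SSE events, attaching sources to the final 'stop' chunk."""
    async for chunk in chunks:
        payload = chunk.model_dump(exclude_unset=True)
        choices = payload.get("choices") or []
        finish_reason = choices[0].get("finish_reason") if choices else None
        # Only the chunk that ends generation carries the web search sources.
        if web_search_enabled and finish_reason == "stop":
            payload["sources"] = sources
        yield f"data: {json.dumps(payload)}\n\n"
    yield "data: [DONE]\n\n"
```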
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| nilai-api/src/nilai_api/routers/private.py | Refactored streaming logic to buffer payloads, add sources to final chunk, and improve error handling |
| tests/unit/nilai_api/routers/test_private.py | Added unit test to verify sources are properly included in final streamed payload |
```python
try:
    stop_condition = payload["choices"][0].get("finish_reason")
except Exception:
    stop_condition = None
```
This is an important issue.
Entering a try block is cheap in Python, but raising and handling an exception is comparatively expensive and can slow a program down noticeably (see https://paltman.com/try-except-performance-in-python-a-simple-test/ for a simple benchmark).
That cost is negligible when the block runs once and normally takes the "try" branch, but this code runs for every single token the stream routine produces, and it takes the except branch even more often. If this snippet slowed the program down by 1 ms per call, a 1000-token generation would already cost that user an extra second.
The easy option that works most of the time: use get("abc", None) and check for None.
An even easier option for this case: I see you are updating to if chunk.usage is not None. In the current setup, chunk.usage is only reported on the last streamed chunk, so a non-None chunk.usage already tells you this is the last chunk, and at that point you can reply with the sources. A sketch follows below.
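A sketch of that suggestion, assuming (as described above) that usage is only populated on the final streamed chunk:

```python
import json


async def stream_events(chunks, sources, web_search_enabled):
    async for chunk in chunks:
        # chunk is a typed Pydantic model, so no try/except per token.
        # With usage reported only on the final chunk, a non-None usage
        # doubles as the end-of-stream signal.
        if chunk.usage is not None and web_search_enabled:
            payload = chunk.model_dump(exclude_unset=True)
            payload["sources"] = sources
            yield f"data: {json.dumps(payload)}\n\n"
        else:
            yield f"data: {chunk.model_dump_json(exclude_unset=True)}\n\n"
```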
```python
payload = chunk.model_dump(exclude_unset=True)

if chunk.usage is not None:
    last_seen_usage = payload.get("usage", last_seen_usage)
    if last_seen_usage:
        prompt_token_usage = last_seen_usage.get(
            "prompt_tokens", prompt_token_usage
        )
        completion_token_usage = last_seen_usage.get(
            "completion_tokens", completion_token_usage
        )
```
This aligns with what I told you on Friday: you should not dump the model. The response is a Pydantic model from the OpenAI client, and dumping it to a dict is what forces you to check whether fields such as "choices" or "finish_reason" exist. On the typed model they are always there; they may be None when optional, which is easy to check.
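A sketch of the direct-access style suggested here, using the typed fields of the OpenAI client's ChatCompletionChunk (hypothetical helper name):

```python
from openai.types.chat import ChatCompletionChunk


def extract_stream_state(
    chunk: ChatCompletionChunk,
) -> tuple[bool, int | None, int | None]:
    """Read the stop condition and token usage straight off the typed model."""
    # choices and finish_reason are typed attributes; optional fields
    # are simply None rather than missing dict keys.
    finish_reason = chunk.choices[0].finish_reason if chunk.choices else None
    is_final = finish_reason == "stop"
    if chunk.usage is not None:
        return is_final, chunk.usage.prompt_tokens, chunk.usage.completion_tokens
    return is_final, None, None
```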
The usage parameter is used for the stop condition now. That is possible because I updated continuous_usage_stats so that usage only appears in the last chunk.
The Pydantic object is now used directly instead of a dictionary. A sketch of the corresponding client-side request follows below.
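For context, a hedged sketch of how a client would request that behavior against a vLLM-style OpenAI-compatible backend (continuous_usage_stats is a vLLM extension to stream_options; the endpoint and model name here are placeholders, and exact option support depends on the backend version):

```python
from openai import OpenAI

# Hypothetical endpoint and model; the point is the stream_options payload.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="my-model",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    # include_usage=True attaches usage to the final chunk only;
    # continuous_usage_stats=True would instead report usage on every
    # chunk, so it stays off for the usage-as-stop-signal trick to work.
    stream_options={"include_usage": True, "continuous_usage_stats": False},
)

for chunk in stream:
    if chunk.usage is not None:  # true exactly once, on the last chunk
        print(chunk.usage.prompt_tokens, chunk.usage.completion_tokens)
```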
jcabrero
left a comment
Please have a look at the comments.
b694367 to 5425fcc
jcabrero
left a comment
👍 LGTM
This pull request fixes the streaming chat completion endpoint to include sources when web search is on. It also adds a unit test to verify that sources are properly included in the final streamed payload.
Streaming response improvements:
- Refactored `chat_completion_stream_generator` to buffer payloads, append sources only on the final chunk (when `finish_reason == "stop"`), and yield JSON-encoded SSE events; also improved token usage tracking and error handling in the stream. [1] [2] [3]
Testing enhancements:
- Added `test_chat_completion_stream_includes_sources` to verify that web search sources are only included in the final streamed payload, ensuring correct SSE formatting and payload structure.
Code cleanup:
- Removed unused imports (`asyncio`) and added necessary ones (`json`) in both the main router and test files. [1] [2]
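For reference, a minimal sketch of the kind of assertion such a test can make (hypothetical helper and hard-coded stand-in data; the real test in tests/unit/nilai_api/routers/test_private.py differs):

```python
import json


def parse_sse(body: str) -> list[dict]:
    """Split an SSE body into decoded JSON payloads, skipping [DONE]."""
    return [
        json.loads(line[len("data: "):])
        for line in body.splitlines()
        if line.startswith("data: ") and line != "data: [DONE]"
    ]


def test_sources_only_on_final_chunk():
    # Stand-in for the body returned by the streaming endpoint under test.
    body = (
        'data: {"choices": [{"delta": {"content": "Hi"}, "finish_reason": null}]}\n\n'
        'data: {"choices": [{"delta": {}, "finish_reason": "stop"}], '
        '"sources": [{"source": "https://example.com"}]}\n\n'
        "data: [DONE]\n\n"
    )
    payloads = parse_sse(body)
    assert all("sources" not in p for p in payloads[:-1])
    assert payloads[-1]["choices"][0]["finish_reason"] == "stop"
    assert payloads[-1]["sources"]
```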