Feat/support streaming intermediate results with non live #4170
base: main
Conversation
…un_async method with streaming mode.
…-with-stream' into feat/generative-function-calling-with-stream
# Conflicts:
#   src/google/adk/agents/readonly_context.py
#   src/google/adk/flows/llm_flows/functions.py
#   src/google/adk/tools/_function_parameter_parse_util.py
#   src/google/adk/tools/function_tool.py
#   tests/unittests/testing_utils.py
Summary of Changes

Hello @Lin-Nikaido, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the LLM flow by enabling the streaming of intermediate results from tool executions. This allows long-running tools to provide real-time progress updates, improving the user experience for interactive agents. The changes update the core function-calling logic to process generator and async generator outputs as events, ensuring that each yielded value is propagated when operating in Server-Sent Events (SSE) streaming mode. This makes the execution of complex tools more transparent and responsive to the end user.
Code Review
This pull request introduces a valuable feature to support streaming intermediate results from tools that are generators, which is particularly useful for long-running tasks. The changes are well-implemented across the LLM flow and tool handling logic, correctly distinguishing between streaming and non-streaming modes. The addition of a new, comprehensive test file for this feature is also a great contribution. I've identified a few opportunities for refactoring to reduce code duplication, which would enhance the maintainability of the new logic.
```python
if inspect.isasyncgen(function_response) or isinstance(
    function_response, AsyncIterator
):
  res = None
  async for res in function_response:
    if inspect.isawaitable(res):
      res = await res
    if (
        invocation_context.run_config.streaming_mode
        == StreamingMode.SSE
    ):
      yield __build_response_event(
          tool, res, tool_context, invocation_context
      )
  function_response = res
elif inspect.isgenerator(function_response) or isinstance(
    function_response, Iterator
):
  res = None
  for res in function_response:
    if inspect.isawaitable(res):
      res = await res
    if (
        invocation_context.run_config.streaming_mode
        == StreamingMode.SSE
    ):
      yield __build_response_event(
          tool, res, tool_context, invocation_context
      )
  function_response = res
```
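For context on the two branches above, here is a small standalone sketch (not from the ADK codebase) showing how `inspect.isgenerator` and `inspect.isasyncgen` classify tool return values:

```python
import inspect

def sync_tool():
    # Ordinary generator function: calling it returns a generator object.
    yield "progress: 50%"
    yield "done"

async def async_tool():
    # Async generator function: calling it returns an async generator object.
    yield "progress: 50%"
    yield "done"

gen = sync_tool()
agen = async_tool()

print(inspect.isgenerator(gen))   # True
print(inspect.isasyncgen(gen))    # False
print(inspect.isasyncgen(agen))   # True
print(inspect.isgenerator(agen))  # False
```

This is why the code checks `isasyncgen` first and falls back to the synchronous `isgenerator`/`Iterator` case.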
```python
if origin in {ABCGenerator, ABCIterator, ABCIterable}:
  schema.type = types.Type.ARRAY
  item_ann = args[0] if args else Any
  schema.items = _parse_schema_from_parameter(
      variant,
      inspect.Parameter(
          'item',
          inspect.Parameter.POSITIONAL_OR_KEYWORD,
          annotation=item_ann,
      ),
      func_name,
  )
  if param.default is not inspect.Parameter.empty:
    if not _is_default_value_compatible(param.default, param.annotation):
      raise ValueError(default_value_error_msg)
    schema.default = param.default
  _raise_if_schema_unsupported(variant, schema)
  return schema
if origin in {ABCAsyncGenerator, ABCAsyncIterator, ABCAsyncIterable}:
  schema.type = types.Type.ARRAY
  item_ann = args[0] if args else Any
  schema.items = _parse_schema_from_parameter(
      variant,
      inspect.Parameter(
          'item',
          inspect.Parameter.POSITIONAL_OR_KEYWORD,
          annotation=item_ann,
      ),
      func_name,
  )
  if param.default is not inspect.Parameter.empty:
    if not _is_default_value_compatible(param.default, param.annotation):
      raise ValueError(default_value_error_msg)
    schema.default = param.default
  _raise_if_schema_unsupported(variant, schema)
  return schema
```
The logic in these two if blocks for handling synchronous and asynchronous generators is identical. You can combine them into a single block to reduce code duplication and improve maintainability.
```python
if origin in {
    ABCGenerator,
    ABCIterator,
    ABCIterable,
    ABCAsyncGenerator,
    ABCAsyncIterator,
    ABCAsyncIterable,
}:
  schema.type = types.Type.ARRAY
  item_ann = args[0] if args else Any
  schema.items = _parse_schema_from_parameter(
      variant,
      inspect.Parameter(
          'item',
          inspect.Parameter.POSITIONAL_OR_KEYWORD,
          annotation=item_ann,
      ),
      func_name,
  )
  if param.default is not inspect.Parameter.empty:
    if not _is_default_value_compatible(param.default, param.annotation):
      raise ValueError(default_value_error_msg)
    schema.default = param.default
  _raise_if_schema_unsupported(variant, schema)
  return schema
```

```python
elif is_generator:
  # If streaming_mode is SSE, return the generator object as-is.
  if tool_context.run_config.streaming_mode == StreamingMode.SSE:
    return target(**args_to_call)

  # If streaming_mode is not SSE, return the last yielded value.
  res = None
  for res in target(**args_to_call):
    if inspect.isawaitable(res):
      res = await res
  return res

elif is_asyncgen:
  # If streaming_mode is SSE, return the async generator object as-is.
  if tool_context.run_config.streaming_mode == StreamingMode.SSE:
    return target(**args_to_call)

  # If streaming_mode is not SSE, return the last yielded value.
  res = None
  async for res in target(**args_to_call):
    if inspect.isawaitable(res):
      res = await res
  return res
```
Please ensure you have read the contribution guide before creating a pull request.
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
2. Or, if no issue exists, describe the change:
Solution:
Enable a will_continue-like function response in the BaseLlmFlow.run_async method with streaming mode.
The tool is expected to return a generator; when streaming_mode is StreamingMode.SSE, the runner.run_async method yields each generator result as an Event. When streaming_mode is not SSE, there is no change in behavior. The expected use case is a function tool that takes a few minutes in total, where the user wants to be notified of its progress.
e.g. a function like this:
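The example function from the original PR description is not preserved in this page; a minimal hypothetical sketch of the kind of tool described (all names here are illustrative, not from the PR) might look like:

```python
import asyncio

async def generate_report(topic: str):
    """Hypothetical long-running function tool that streams progress updates.

    In SSE mode, each intermediate yield can be surfaced to the user as an
    Event; the final yield carries the actual result.
    """
    yield {"status": "collecting sources...", "progress": 0.2}
    await asyncio.sleep(0)  # stand-in for work that takes minutes
    yield {"status": "drafting...", "progress": 0.7}
    yield {"status": "complete", "report": f"Report on {topic}"}

async def demo():
    async for update in generate_report("streaming tools"):
        print(update)

asyncio.run(demo())
```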
Testing Plan
Please describe the tests that you ran to verify your changes. This is required
for all PRs that are not small documentation or typo fixes.
Unit Tests:
Please include a summary of passed pytest results.

Added tests/unittests/tools/test_tools_generative_call.py, and it passed. All other unit tests also pass locally.
Manual End-to-End (E2E) Tests:
Checklist
Additional context
This capability has been strongly requested, and I believe it will significantly enhance the developer experience when building practical, production-grade agents with ADK.
I would appreciate it if you could review and consider this change. If you have any change requests, I will address them as soon as possible.
Thank you.