feat: add support for /responses background parameter #4824
cdoern wants to merge 3 commits into llamastack:main
Conversation
✱ Stainless preview builds
✅ llama-stack-client-python studio · code · diff
✅ llama-stack-client-kotlin studio · code · diff
✅ llama-stack-client-node studio · code · diff
⏳ These are partial results; builds are still running. This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
Add OpenAI-compatible background mode for the Responses API, allowing responses to be queued for asynchronous processing.

- Added `background` parameter (bool, default: false)
- When true, returns immediately with status "queued"
- Added `background` field to `OpenAIResponseObject`
- New status values: "queued", "in_progress"
- `agents/models.py`: Added `background` to `CreateResponseRequest`
- `openai_responses.py`: Added `background` field to response object
- `openai_responses.py`: Background processing with `_create_background_response` and `_process_background_response`
- `responses_store.py`: Added `update_response_object` for status updates
- Issue: llamastack#4701
- OpenAI docs: https://platform.openai.com/docs/guides/background

Signed-off-by: Charlie Doern <cdoern@redhat.com>

force-pushed from b03ebdf to a3b69f6
```python
# the root directory of this source tree.


def remove_null_from_anyof(schema: dict) -> None:
```
FYI, moved this to a helper so it can be used by multiple APIs.
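For readers following along, here is one plausible body for that helper; only the signature appears in the diff, so treat this as a guess at the intent:

```python
def remove_null_from_anyof(schema: dict) -> None:
    """Drop `{"type": "null"}` variants from `anyOf` lists, in place.

    If exactly one variant remains, collapse `anyOf` into the parent schema.
    """
    if "anyOf" in schema:
        variants = [v for v in schema["anyOf"] if v.get("type") != "null"]
        if len(variants) == 1:
            schema.pop("anyOf")
            schema.update(variants[0])
        else:
            schema["anyOf"] = variants
    # Recurse into nested schemas (properties, items, etc.).
    for value in list(schema.values()):
        if isinstance(value, dict):
            remove_null_from_anyof(value)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    remove_null_from_anyof(item)
```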
```python
final_response = None
failed_response = None
```
```python
except Exception as update_error:
    logger.exception(f"Failed to update response {response_id} with error status: {update_error}")
```
what will a user see / not see if this happens?
hmm good point. the server will log an error, but the client might not get a useful one. let me see if I can propagate this in a better way.
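One possible shape for that propagation (a hedged sketch; the `error` field shape and the helper name are assumptions, not this PR's code):

```python
import logging

logger = logging.getLogger(__name__)


async def _mark_failed(store, response_object, response_id: str, original_error: Exception) -> None:
    """Write a terminal 'failed' status back so polling clients see the failure."""
    try:
        response_object.status = "failed"
        response_object.error = {"message": str(original_error)}  # assumed field shape
        await store.update_response_object(response_object)
    except Exception as update_error:
        # Last resort: the client would only see a response stuck in "in_progress".
        logger.exception(f"Failed to update response {response_id} with error status: {update_error}")
```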
```python
if not self.sql_store:
    raise ValueError("Responses store is not initialized")
```
this looks like a fail-on-startup kind of situation
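In other words, something like this hypothetical startup check, so the server refuses to come up misconfigured instead of failing per request:

```python
class ResponsesStore:
    def __init__(self, sql_store) -> None:
        # Fail fast at construction time; a missing store is a deployment
        # error, not a per-request condition.
        if sql_store is None:
            raise RuntimeError("ResponsesStore requires a configured sql_store")
        self.sql_store = sql_store
```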
```python
# Preserve existing messages if not provided
if messages is not None:
    data["messages"] = [msg.model_dump() for msg in messages]
else:
    data["messages"] = existing_data.get("messages", [])
```
when would the response have a messages field?
```python
)

# Schedule background processing task
asyncio.create_task(
```
how does this behave with concurrent users requesting background processing?
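For context, two guards commonly paired with fire-and-forget `asyncio.create_task` (a sketch, not this PR's code): hold strong references so pending tasks aren't garbage-collected mid-flight, and cap concurrency so many simultaneous background requests can't exhaust the worker.

```python
import asyncio

_background_tasks: set[asyncio.Task] = set()
_limit = asyncio.Semaphore(32)  # assumed cap


async def _bounded(coro) -> None:
    # Run the coroutine only once a slot is free.
    async with _limit:
        await coro


def schedule_background(coro) -> None:
    task = asyncio.create_task(_bounded(coro))
    # Keep a strong reference until the task finishes; create_task alone
    # does not guarantee the task survives garbage collection.
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
```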
```python
# Schedule background processing task
asyncio.create_task(
    self._process_background_response(
        response_id=response_id,
```
what happens when a user gets the response_id and uses it as previous_response_id before this original request has terminated?
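One hypothetical guard (not part of this PR) would be to reject chaining until the referenced response reaches a terminal status; the store method name here is assumed:

```python
async def _resolve_previous_response(self, previous_response_id: str):
    previous = await self.responses_store.get_response_object(previous_response_id)
    if previous.status in ("queued", "in_progress"):
        # Chaining onto an unfinished response would race with the
        # background task that is still writing its output.
        raise ValueError(
            f"Response {previous_response_id} is still {previous.status} "
            "and cannot be used as previous_response_id yet"
        )
    return previous
```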
```python
if not existing_row:
    raise ValueError(f"Response with id {response_object.id} not found")
```
if there's no row then there's some serious internal logic error. lots of logging here, maybe even crash the server.
```python
# Preserve existing input if not provided
if input is not None:
    data["input"] = [input_item.model_dump() for input_item in input]
else:
    data["input"] = existing_data.get("input", [])
```
this will be another place where previous_response_id chains will have to be followed for #3646
```python
existing_row = await self.sql_store.fetch_one(
    self.reference.table_name,
    where={"id": response_object.id},
)
```
i expect this will be heavy - every new event will require a query for the old event, a few ser/des rounds and an update. until we can optimize the storage schema, only do this dance when necessary.
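A sketch of that optimization, with the surrounding names and the `sql_store.update` signature assumed: intermediate events stay in memory, and only terminal transitions pay for the fetch/serialize/update round trip.

```python
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


async def update_response_object(self, response_object) -> None:
    # Skip the query/ser-des/update dance for intermediate events;
    # only terminal status transitions need to hit the store.
    if response_object.status not in TERMINAL_STATUSES:
        return
    await self.sql_store.update(
        self.reference.table_name,
        data={"status": response_object.status, "response_object": response_object.model_dump()},
        where={"id": response_object.id},
    )
```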
What does this PR do?
Add OpenAI-compatible background mode for the Responses API, allowing responses to be queued for asynchronous processing. Adds a `background` parameter (bool, default: false); when true, the request returns immediately with status "queued". `openai_responses.py` now implements background processing via `_create_background_response` and `_process_background_response`. Also adds new integration tests using the field, plus the associated recordings.
closes: #4701
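A minimal sketch of that flow; only the two method names come from this PR, while `OpenAIResponseObject`, the store calls, and the inner inference call are assumptions:

```python
import asyncio


async def _create_background_response(self, response_id: str, **params):
    # Persist a placeholder immediately so the client can poll it.
    response = OpenAIResponseObject(  # other required fields omitted for brevity
        id=response_id, status="queued", background=True
    )
    await self.responses_store.store_response_object(response)  # assumed store call

    # Kick off async processing and return without awaiting it.
    asyncio.create_task(self._process_background_response(response_id=response_id, **params))
    return response  # client sees status "queued" right away


async def _process_background_response(self, response_id: str, **params):
    response = await self._create_response_impl(**params)  # assumed inner call
    await self.responses_store.update_response_object(response)  # terminal status
```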
Test Plan
New integration tests + recordings using the OpenAI client should pass; the usage they exercise looks roughly like the sketch below.
Saving the /cancel route for a separate PR.
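A hedged sketch of that client usage; the base URL, API key, model name, and polling cadence are placeholders:

```python
import time

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")  # placeholder endpoint

resp = client.responses.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    input="Write a haiku about queues.",
    background=True,
)
assert resp.status == "queued"  # returns immediately instead of blocking

# Poll until the background task reaches a terminal status.
while resp.status in ("queued", "in_progress"):
    time.sleep(1)
    resp = client.responses.retrieve(resp.id)

print(resp.status)
```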