Conversation

@s-alexey (Contributor) commented on Jan 7, 2026

Major refactor of the LLM chat architecture to improve code organization,
maintainability, and type safety.

Key Changes:

  • Split `LLMChat` subclasses into distinct Non-Streaming and Streaming
    implementations. Streaming logic (primarily for notebooks) was
    complicating the core classes; this split makes the primary actors more
    concise and less error-prone.
  • Moved provider-specific implementations into separate files:
    `openai.py` and `genai.py`.
  • Replaced the generic `LLMResponse` with a strictly typed version,
    specifically enforcing types for `tool_usage` and `token_usage`.
  • Updated the `invoke` method to accept explicit arguments.
  • Migrated the OpenAI integration from the `completion` API to the more
    user-friendly `responses` API.

Testing:

  • Added coverage for common use cases using real APIs (tests run
    conditionally if environment keys are present).
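
For reference, here is a rough sketch of the new typed shapes and the explicit `invoke` signature described above. Field names are approximated from the snippets quoted in this PR and may not match the final implementation exactly.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class FunctionCall:
    # A single tool invocation requested by the model.
    name: str
    arguments: dict[str, Any]
    output: Any | None = None


@dataclasses.dataclass
class Usage:
    # Token accounting; exact field names here are an assumption.
    input_tokens: int = 0
    output_tokens: int = 0


@dataclasses.dataclass
class LLMMessage:
    # Strictly typed response replacing the generic LLMResponse.
    content: str
    thinking: str | None = None
    tool_calls: list[FunctionCall] | None = None
    usage: Usage | None = None


# The invoke method now takes explicit arguments instead of a **kwargs bag, e.g.:
# def invoke(self, prompt, *, temperature: float | None = 0, seed: int = 0,
#            tools: list[Any] | None = None) -> LLMMessage: ...
```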

@s-alexey requested a review from dolaameng on Jan 7, 2026, 16:04
@s-alexey added the wip (Work in progress) label on Jan 7, 2026

@dolaameng (Contributor) left a comment:

Thanks for the refactoring; it's really helpful and much clearer! I left some questions and comments to understand it better.

We can keep iterating on it.

"""
Defines a chat agent that interacts with a Large Language Model (LLM).
Design Note:

Contributor:

nit: shall we update here too?

Contributor Author:

Good idea, I will get back to that once the internal API is more stable.



@dataclasses.dataclass
class FunctionCall:

Contributor:

Shall we also consider the thought signature for reasoning models? Context here. Or is it already covered by "LLMMessage.thinking"?

Contributor Author:

I would love to have it as well, perhaps in a separate PR as this one is already overcomplicated

temperature: float | None = 0,
seed: int = 0,
tools: list[Any] | None = None,
) -> LLMMessage[str]:

Contributor:

Being explicit is nice! I wonder if we also want to keep flexibility for other parameters, e.g., the config.
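
One way that flexibility could be kept while staying explicit; this is a rough sketch, not part of this diff, and `extra_config` is an illustrative name:

```python
from typing import Any


# Sketch only: method body shown outside its class for brevity. Common
# parameters stay explicit, while provider-specific options can still be
# passed through a single optional mapping.
def invoke(
    self,
    prompt: str,
    *,
    temperature: float | None = 0,
    seed: int = 0,
    tools: list[Any] | None = None,
    extra_config: dict[str, Any] | None = None,
) -> "LLMMessage":  # LLMMessage is the message type introduced in this PR
    request: dict[str, Any] = {"temperature": temperature, "seed": seed}
    if tools:
        request["tools"] = tools
    request.update(extra_config or {})  # provider-specific options win on conflict
    return self._complete(prompt, **request)  # _complete is a placeholder helper
```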

return result


class ModelProxyOpenAI(OpenAI):

Contributor:

Do we still support streaming output for Model Proxy?

Contributor Author:

We should, but I would rather wait until they fully support the responses API so we don't have to implement it twice.

answer = response.content
response._meta.update(chat=chat, schema=schema, raw_content=answer, **kwargs)
response._meta.update(

Contributor:

Do we still use _meta?

@develra left a comment:

LGTM. It's a bit hard for me, as someone fairly unfamiliar with this code, to tell whether the test coverage is sufficient to be confident that these changes are safe. I think it would be good to think through what might break as a result of these changes and make sure we have test coverage for it, especially given the somewhat sensitive timing of a new launch.

args: ["run", "--group", "test", "pytest", "tests"]
waitFor: ["push-image"]
env:
- "LLM_DEFAULT=x"

It really doesn't matter, but I'm wondering why the MODEL_PROXY_* ones dropped the letter while LLM_DEFAULT didn't?

arguments=json.loads(item.arguments),
output=output,
)
# {"name": item.name, "arguments": item.arguments, "output": output}

nit - clean up if this is unused

{
"role": message.sender.role
if message.sender.role != "tool"
else "system", # TODO: Remove this renaming once ModelProxy supports tools

Looking at this TODO - do we know if that is on the roadmap for ModelProxy?

if not self.support_structured_outputs and schema_instructions:
raw_messages.append(
{
"role": "system",

Contributor:

Some models don't support the system role; shall we fall back to user?
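
A minimal sketch of that fallback; the `supports_system_role` attribute and the `content` key are assumptions for illustration, not part of this diff:

```python
# Fall back to the "user" role for models that reject "system" messages.
if not self.support_structured_outputs and schema_instructions:
    role = "system" if getattr(self, "supports_system_role", True) else "user"
    raw_messages.append(
        {
            "role": role,
            "content": schema_instructions,
        }
    )
```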

temperature: float | None = 0,
seed: int = 0,
tools: list[Any] | None = None,
) -> LLMMessage[str]:

Contributor:

For invoke, shall we return LLMMessage[T] to allow image output, etc.?
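
For what it's worth, a rough sketch of what a generic return could look like, using overloads keyed on an `output_type` argument; the class name and `output_type` parameter are hypothetical, not part of this diff:

```python
from typing import Any, TypeVar, overload

T = TypeVar("T")


class LLMChatSketch:
    # Overloads keep invoke() precise: plain calls stay LLMMessage[str],
    # while a caller asking for another content type gets LLMMessage[T].
    @overload
    def invoke(self, prompt: str) -> "LLMMessage[str]": ...
    @overload
    def invoke(self, prompt: str, *, output_type: type[T]) -> "LLMMessage[T]": ...
    def invoke(self, prompt: str, *, output_type: type[Any] = str):
        ...
```

Whether that is worth the extra complexity now, or better deferred until image output actually lands, is an open question.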

tool_calls: list[FunctionCall] | None = None
usage: Usage | None = None

def add_chunk(self, chunk: str):

Contributor:

Curious: what are the different uses of this new method versus Message.stream?


@dataclasses.dataclass
class LLMMessage(messages.Message[T]):
content: T

Contributor:

Looking at the add_chunk method, do we essentially assume content is always a string? If so, shall we just rename it to text?

We could add another field such as image for image output in the future?
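
A tiny sketch of that shape; the names are purely illustrative, not part of the diff:

```python
import dataclasses


@dataclasses.dataclass
class LLMMessage:
    # Rename content -> text if it is always a string, and leave room for
    # a separate image field for future image output.
    text: str
    image: bytes | None = None
```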

content: T
_status: utils.Status = utils.Status.RUNNING
thinking: str | None = None
tool_calls: list[FunctionCall] | None = None

Contributor:

I think these fields would also be useful on Message, don't you think?

Contributor:

Discussed with @s-alexey that it would be great for us to test more of the existing examples to avoid backward incompatibility.
