
Streaming Does Not Work with OpenAI Responses API #8577

@ninjudd

Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code reproduces the issue:

import { HumanMessage } from '@langchain/core/messages';
import { MessagesAnnotation, StateGraph } from '@langchain/langgraph';
import { ChatOpenAI } from '@langchain/openai';

async function testDirectModelStreaming(useResponsesApi: boolean): Promise<void> {
  console.warn(`\n=== Direct Model Streaming (useResponsesApi: ${useResponsesApi ? 'true' : 'false'}) ===`);

  const model = new ChatOpenAI({
    modelName: 'gpt-4',
    temperature: 0.7,
    // streaming: true, // same result with or without this
    useResponsesApi,
  });

  const messages = [new HumanMessage('Say "Hello world" in 10 words.')];
  const stream = model.streamEvents(messages, { version: 'v2' });

  for await (const event of stream) {
    console.warn(event.event);
  }
}

async function testLangGraphStreaming(useResponsesApi: boolean): Promise<void> {
  console.warn(`\n=== LangGraph Streaming (useResponsesApi: ${useResponsesApi ? 'true' : 'false'}) ===`);

  const model = new ChatOpenAI({
    modelName: 'gpt-4',
    temperature: 0.7,
    // streaming: true, // same result with or without this
    useResponsesApi,
  });

  async function callModel(state: typeof MessagesAnnotation.State): Promise<{ messages: unknown[] }> {
    const messages = [new HumanMessage('Say "Hello world" in 10 words.'), ...state.messages];
    const response = await model.invoke(messages);
    return { messages: [response] };
  }

  const workflow = new StateGraph(MessagesAnnotation)
    .addNode('agent', callModel)
    .addEdge('__start__', 'agent')
    .addConditionalEdges('agent', () => '__end__');

  const agent = workflow.compile();
  const eventStream = agent.streamEvents({ messages: [] }, { version: 'v2', streamMode: 'messages' });

  for await (const event of eventStream) {
    console.warn(event.event);
  }
}

async function runTests(): Promise<void> {
  await testDirectModelStreaming(false);
  await testDirectModelStreaming(true);
  await testLangGraphStreaming(true);
  await testLangGraphStreaming(false);
}

runTests().catch(console.error);

Error Message and Stack Trace (if applicable)

No response

Description

Token-by-token streaming doesn't work when useResponsesApi: true is enabled in LangGraph applications, even though it works fine when calling the model directly.

Root Cause

The issue is in the ChatOpenAI class. When useResponsesApi: true is enabled:

  1. ChatOpenAI._streamResponseChunks() delegates to this.responses._streamResponseChunks().
  2. However, it does NOT forward the runManager parameter to the Responses method (see the sketch below).
  3. Without a runManager, ChatOpenAIResponses._streamResponseChunks() cannot emit handleLLMNewToken events.
  4. The callback system therefore never receives token-by-token updates.
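A minimal sketch of what the missing plumbing presumably looks like. The signature follows the standard _streamResponseChunks contract from @langchain/core; the _useResponsesApi guard is a stand-in for whatever check the real source uses, so the exact code will differ:

import { CallbackManagerForLLMRun } from '@langchain/core/callbacks/manager';
import { BaseMessage } from '@langchain/core/messages';
import { ChatGenerationChunk } from '@langchain/core/outputs';

// Inside ChatOpenAI (sketch, not the actual source):
async *_streamResponseChunks(
  messages: BaseMessage[],
  options: this['ParsedCallOptions'],
  runManager?: CallbackManagerForLLMRun,
): AsyncGenerator<ChatGenerationChunk> {
  if (this._useResponsesApi(options)) {
    // Bug: runManager is currently dropped at this hand-off. Forwarding it
    // lets the Responses implementation call runManager.handleLLMNewToken()
    // for each streamed chunk, which is what LangGraph's streaming relies on.
    yield* this.responses._streamResponseChunks(messages, options, runManager);
    return;
  }
  // ... existing Chat Completions streaming path ...
}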

I am working on a fix for this with a test, and I'll submit a PR.
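For reference, a regression test for this can simply count handleLLMNewToken invocations while draining a stream. This is a sketch assuming a live OPENAI_API_KEY, not the actual test from the PR:

import { ChatOpenAI } from '@langchain/openai';

async function testNewTokenCallback(): Promise<void> {
  const tokens: string[] = [];
  const model = new ChatOpenAI({
    modelName: 'gpt-4',
    useResponsesApi: true,
    callbacks: [{ handleLLMNewToken: (token: string) => { tokens.push(token); } }],
  });

  // Drain the stream; each chunk should also trigger handleLLMNewToken.
  for await (const _chunk of await model.stream('Say "Hello world" in 10 words.')) {
    // no-op
  }

  // Without the fix, the Responses path never invokes the callback and
  // tokens stays empty; with it, there is one entry per streamed token.
  console.assert(tokens.length > 0, 'expected handleLLMNewToken to fire');
}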

System Info

@langchain/openai@0.6.3
@langchain/core@0.3.66
@langchain/langgraph@0.3.11
