[BUG] ChatAgent fails while parsing LLMOutput #4136

@pyek-bot

What is the bug?
Example 1:

Received response from remote service: {"metrics":{"latencyMs":6787},"output":{"message":{"content":[{"text":"I've examined the conversation data stored in the `.plugins-ml-memory-message` index. There are two message records:\n\n1. **Message ID: HX1_45gB10LfX3LkjKQG**:\n   - Created Time: 1756163380693 (Unix timestamp)\n   - Modified Time: 1756163380693 (Unix timestamp)\n   - Type: \"input\"\n   - Conversation ID: \"Hn1_45gB10LfX3LkVaRV\"\n   - Input:\n     - Raw: \"Use ListIndexTool to get an overview of all indices in the cluster\"\n     - Processed: \"Use ListIndexTool to get an overview of all indices in the cluster\"\n   - Response:\n     - Raw: \"I'll help you get an overview of all indices in the cluster using the ListIndexTool.\\n\\n"}],"role":"assistant"}},"stopReason":"tool_use","usage":{"cacheReadInputTokenCount":0,"cacheReadInputTokens":0,"cacheWriteInputTokenCount":0,"cacheWriteInputTokens":0,"inputTokens":3894,"outputTokens":263,"totalTokens":4157}}
[2025-08-25T16:17:37,792][ERROR][o.o.m.e.a.a.MLChatAgentRunner] [integTest-0] Failed to run chat agent
java.util.NoSuchElementException: null
        at java.base/java.util.ArrayList.getFirst(ArrayList.java:440) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.AgentUtils.parseLLMOutput(AgentUtils.java:364) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.MLChatAgentRunner.lambda$runReAct$0(MLChatAgentRunner.java:342) ~[?:?]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.ListenableFuture$1.doRun(ListenableFuture.java:126) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.OpenSearchExecutors$DirectExecutorService.execute(OpenSearchExecutors.java:341) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:120) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.ListenableFuture.lambda$done$0(ListenableFuture.java:112) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1604) [?:?]
        at org.opensearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:112) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.BaseFuture.set(BaseFuture.java:160) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.common.util.concurrent.ListenableFuture.onResponse(ListenableFuture.java:141) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.action.StepListener.innerOnResponse(StepListener.java:79) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.core.action.NotifyOnceListener.onResponse(NotifyOnceListener.java:58) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:115) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:109) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.core.action.ActionListener$6.onResponse(ActionListener.java:301) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.core.action.ActionListener$5.onResponse(ActionListener.java:268) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.core.action.ActionListener$5.onResponse(ActionListener.java:268) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.ml.task.MLPredictTaskRunner.lambda$runPredict$0(MLPredictTaskRunner.java:505) [opensearch-ml-3.2.0.0-SNAPSHOT.jar:3.2.0.0-SNAPSHOT]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.ml.engine.algorithms.remote.RemoteConnectorExecutor.lambda$executeAction$0(RemoteConnectorExecutor.java:64) [opensearch-ml-algorithms-3.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.action.support.GroupedActionListener.onResponse(GroupedActionListener.java:81) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.action.support.RetryableAction$RetryingListener.onResponse(RetryableAction.java:183) [opensearch-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
        at org.opensearch.ml.engine.algorithms.remote.MLSdkAsyncHttpResponseHandler.response(MLSdkAsyncHttpResponseHandler.java:208) [opensearch-ml-algorithms-3.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.ml.engine.algorithms.remote.MLSdkAsyncHttpResponseHandler$MLResponseSubscriber.onComplete(MLSdkAsyncHttpResponseHandler.java:167) [opensearch-ml-algorithms-3.2.0.0-SNAPSHOT.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$DataCountingPublisher$1.onComplete(ResponseHandler.java:511) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:246) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$600(ResponseHandler.java:76) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:367) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.publishMessage(HandlerPublisher.java:402) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.flushBuffer(HandlerPublisher.java:338) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.receivedDemand(HandlerPublisher.java:291) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.access$200(HandlerPublisher.java:61) [netty-nio-client-2.30.18.jar:?]
        at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher$ChannelSubscription$1.run(HandlerPublisher.java:495) [netty-nio-client-2.30.18.jar:?]
        at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) [netty-common-4.1.118.Final.jar:4.1.118.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) [netty-common-4.1.118.Final.jar:4.1.118.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [netty-common-4.1.118.Final.jar:4.1.118.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566) [netty-transport-4.1.118.Final.jar:4.1.118.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998) [netty-common-4.1.118.Final.jar:4.1.118.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.118.Final.jar:4.1.118.Final]
        at java.base/java.lang.Thread.run(Thread.java:1447) [?:?]

Example 2:

[2025-08-20T08:14:33,447][DEBUG][o.o.m.e.a.r.MLSdkAsyncHttpResponseHandler] [opensearch-cluster-master-0] Received response from remote service: {"metrics":{"latencyMs":5547},"output":{"message":{"content":[],"role":"assistant"}},"stopReason":"end_turn","usage":{"cacheReadInputTokenCount":0,"cacheReadInputTokens":0,"cacheWriteInputTokenCount":0,"cacheWriteInputTokens":0,"inputTokens":36721,"outputTokens":3,"totalTokens":36724}}
[2025-08-20T08:14:33,447][ERROR][o.o.m.e.a.a.MLChatAgentRunner] [opensearch-cluster-master-0] Failed to run chat agent
java.util.NoSuchElementException: null
        at java.base/java.util.ArrayList.getFirst(ArrayList.java:440) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.AgentUtils.parseLLMOutput(AgentUtils.java:364) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.MLChatAgentRunner.lambda$runReAct$6(MLChatAgentRunner.java:342) ~[?:?]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-3.2.0.jar:3.2.0]
        at org.opensearch.common.util.concurrent.ListenableFuture$1.doRun(ListenableFuture.java:126) [opensearch-3.2.0.jar:3.2.0]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-3.2.0.jar:3.2.0]
        at

How can one reproduce the bug?
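Intermittent: whether the failure occurs depends on the LLM response. In Example 2 the model returned an empty content array with stopReason "end_turn". In Example 1 the stopReason was "tool_use", but the content array held only a text block.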

What is the expected behavior?
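All LLM responses, including those with empty or unexpected content, should be handled gracefully instead of failing the agent run with a NoSuchElementException.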
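
To illustrate what "handled gracefully" could mean here, below is a minimal sketch, not the actual ml-commons code or its fix. It assumes the response's content array has already been parsed into a list of maps. The class name `LlmOutputGuard` and the method `firstBlockWith` are hypothetical. The idea is to return an `Optional` instead of calling `getFirst()` on a list that may be empty, so the caller can decide how to proceed.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of ml-commons.
public class LlmOutputGuard {

    /**
     * Returns the first content block that contains the given key,
     * or Optional.empty() when the list is null, empty, or has no such block.
     */
    public static Optional<Map<String, Object>> firstBlockWith(
            List<Map<String, Object>> content, String key) {
        if (content == null) {
            return Optional.empty();
        }
        return content.stream()
            .filter(block -> block != null && block.containsKey(key))
            .findFirst();
    }

    public static void main(String[] args) {
        // Shape of Example 2: "content": []
        List<Map<String, Object>> empty = List.of();
        System.out.println(firstBlockWith(empty, "toolUse").isPresent()); // false

        // Shape of Example 1: only a text block, no tool-use block.
        // The "toolUse" key name is an assumption made for this sketch.
        List<Map<String, Object>> textOnly = List.of(Map.of("text", "I've examined the conversation data..."));
        System.out.println(firstBlockWith(textOnly, "toolUse").isPresent()); // false
        System.out.println(firstBlockWith(textOnly, "text").isPresent());    // true
    }
}
```

With a guard like this, the caller could treat a missing block as an empty or final answer, or report a descriptive error, rather than letting the exception propagate.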

Metadata

Labels

bug: Something isn't working

Projects

Status

Done
