Description
What is the bug?
The BWC rolling-upgrade tests in neural search are failing because of recent changes in ML-Commons.
The failure occurs when the BWC version is 2.10 and the current version is 2.12.0-SNAPSHOT:
./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.10.0
The test fails with the following error:
org.opensearch.transport.RemoteTransportException: [neuralSearchBwcCluster-rolling-1][127.0.0.1:35231][cluster:admin/opensearch/ml/predict]
Caused by: java.lang.IllegalStateException: unexpected byte [0x48]
    at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:592) ~[opensearch-core-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:582) ~[opensearch-core-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.ml.common.transport.prediction.MLPredictionTaskRequest.<init>(MLPredictionTaskRequest.java:56) ~[?:?]
    at org.opensearch.transport.RequestHandlerRegistry.newRequest(RequestHandlerRegistry.java:85) ~[opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundHandler.newRequest(InboundHandler.java:283) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundHandler.handleRequest(InboundHandler.java:243) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:133) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:115) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:767) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:175) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:150) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:115) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:95) [transport-netty4-client-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.101.Final.jar:4.1.101.Final]
This happens because the PR mentioned above added one more constructor to MLPredictionTaskRequest.
In the BWC tests the cluster has 3 nodes: 2 of them are still running 2.10 and one has been upgraded to 2.12.0-SNAPSHOT.
When neural search calls the predict API, initializing MLPredictionTaskRequest on the upgraded node fails at the in.readBoolean() call (MLPredictionTaskRequest.java:56 in the stack trace) with the error shown above.
The likely cause is that the incoming stream does not match what the new constructor expects: at the point where in.readBoolean() runs, the stream holds either extra data or no boolean at all.
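The mismatch described above can be reproduced in isolation. The following is a minimal sketch using only java.io, not the ML-Commons code: the class WireCompatSketch, the version ids V_OLD and V_NEW, the user field, and the payload layout are all hypothetical. It assumes the newer version inserted an optional field, announced by a boolean, in the middle of the wire format. Reading an old-format message without a version check then consumes the first payload byte as the boolean, which fails the same way as the stack trace; gating the read on the sender's version avoids it. In OpenSearch such a guard is typically written against the stream version (for example in.getVersion().onOrAfter(...)), but whether that is the right fix here depends on what the PR actually changed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class WireCompatSketch {
    // Hypothetical version ids for illustration; not the real OpenSearch Version constants.
    static final int V_OLD = 21000;
    static final int V_NEW = 21200;

    static final class Request {
        final String modelId;
        final String user;    // hypothetical optional field that only the newer version knows about
        final String payload;

        Request(String modelId, String user, String payload) {
            this.modelId = modelId;
            this.user = user;
            this.payload = payload;
        }
    }

    // Serialize using the layout of the given version: the optional field is written only for V_NEW and later.
    static byte[] write(Request r, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(r.modelId);
            if (version >= V_NEW) {
                out.writeBoolean(r.user != null);
                if (r.user != null) {
                    out.writeUTF(r.user);
                }
            }
            out.write(r.payload.getBytes(StandardCharsets.UTF_8));
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Strict boolean read: any byte other than 0 or 1 is rejected, as in the error from the stack trace.
    static boolean readStrictBoolean(DataInputStream in) throws IOException {
        int b = in.readUnsignedByte();
        if (b == 0) return false;
        if (b == 1) return true;
        throw new IllegalStateException("unexpected byte [0x" + Integer.toHexString(b) + "]");
    }

    // Deserialize; when versionGuard is false the optional field is read unconditionally.
    static Request read(byte[] data, int senderVersion, boolean versionGuard) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String modelId = in.readUTF();
            String user = null;
            if (!versionGuard || senderVersion >= V_NEW) {
                if (readStrictBoolean(in)) {
                    user = in.readUTF();
                }
            }
            String payload = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return new Request(modelId, user, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A message in the old layout, as a not-yet-upgraded node would send it.
        byte[] fromOldNode = write(new Request("model-1", null, "Hello"), V_OLD);
        try {
            read(fromOldNode, V_OLD, false);
        } catch (IllegalStateException e) {
            System.out.println("unguarded read failed: " + e.getMessage());
        }
        Request ok = read(fromOldNode, V_OLD, true);
        System.out.println("guarded read payload: " + ok.payload);
    }
}
```

In this sketch the payload "Hello" starts with the byte 0x48 ('H'), so the unguarded read reports "unexpected byte [0x48]", while the guarded read skips the optional field and returns the payload intact.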
The same failure occurs when the BWC tests are run against OpenSearch 2.9.
The tests pass against OpenSearch 2.11, but the run is flaky:
if, out of the 3 nodes, the predict API call goes through a node running 2.11 rather than the 2.12.0-SNAPSHOT node, it hits the same issue.
How can one reproduce the bug?
Steps to reproduce the behavior:
- Clone the neural search project
- Check out the 2.x branch
- Run ./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.10.0
- See the error
What is the expected behavior?
The BWC rolling-upgrade tests should run against all previous versions without errors.
What is your host/environment?
- OS: Linux
- Version
- Plugins: Neural Search