[Bug] BWC Rolling Upgrade Tests are failing on neural search due to changes in MLPredictionTaskRequest Object #1838

@vibrantvarun

What is the bug?
The BWC rolling upgrade tests in neural search are failing because of recent changes in ML-Commons.
The failing scenario is a BWC version of 2.10 with a current version of 2.12.0-SNAPSHOT:

./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.10.0 

The tests fail with the following error:

org.opensearch.transport.RemoteTransportException: [neuralSearchBwcCluster-rolling-1][127.0.0.1:35231][cluster:admin/opensearch/ml/predict]
Caused by: java.lang.IllegalStateException: unexpected byte [0x48]
	at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:592) ~[opensearch-core-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:582) ~[opensearch-core-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.ml.common.transport.prediction.MLPredictionTaskRequest.<init>(MLPredictionTaskRequest.java:56) ~[?:?]
	at org.opensearch.transport.RequestHandlerRegistry.newRequest(RequestHandlerRegistry.java:85) ~[opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundHandler.newRequest(InboundHandler.java:283) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundHandler.handleRequest(InboundHandler.java:243) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:133) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:115) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:767) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:175) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:150) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:115) [opensearch-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:95) [transport-netty4-client-2.12.0-SNAPSHOT.jar:2.12.0-SNAPSHOT]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.101.Final.jar:4.1.101.Final]

This happens because an additional constructor for MLPredictionTaskRequest was added in the PR mentioned above.

In the BWC tests the cluster has 3 nodes: 2 of them are running 2.10 and 1 has been upgraded to 2.12.0-SNAPSHOT.
When neural search calls the predict API, initializing the MLPredictionTaskRequest here fails at this line with the error mentioned above.

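in.readBoolean() accepts only the bytes 0 and 1, so it throws when it finds any other byte (here 0x48) at the position where the boolean is expected. This suggests that the serialized request sent by the other node contains either more or fewer fields than the receiving constructor expects.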
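
To illustrate the kind of mismatch described above, here is a minimal sketch that uses only `java.io`, not the OpenSearch `StreamInput`/`StreamOutput` classes. The class `StreamMismatchDemo`, its field layout, and the payload are all hypothetical; the payload is chosen so that the first unexpected byte is 0x48, matching the reported message, and says nothing about what the real stream contains. The writer omits a boolean that the reader expects, so the reader interprets a payload byte as the boolean.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class StreamMismatchDemo {

    // Strict boolean read: only 0 and 1 are valid, anything else is rejected.
    static boolean readStrictBoolean(DataInputStream in) throws IOException {
        int value = in.readUnsignedByte();
        if (value == 0) {
            return false;
        }
        if (value == 1) {
            return true;
        }
        throw new IllegalStateException("unexpected byte [0x" + Integer.toHexString(value) + "]");
    }

    // Hypothetical "old" wire format: model id followed directly by raw payload bytes.
    static byte[] writeOldFormat(String modelId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(modelId);
            out.writeBytes("Hello"); // first payload byte is 'H' (0x48)
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Hypothetical "new" reader: expects model id, then a boolean flag, then the payload.
    static boolean readNewFormat(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF();                 // model id, same in both formats
            return readStrictBoolean(in); // old writer never wrote this field
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            readNewFormat(writeOldFormat("model-1"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "unexpected byte [0x48]"
        }
    }
}
```

The point of the sketch is only that a reader and writer that disagree on the field layout fail at the first field that differs, which is consistent with the failure surfacing in the constructor that deserializes the request.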

The same failure occurs when the BWC tests are executed against OpenSearch 2.9.

The tests pass against OpenSearch 2.11, but the execution is flaky: if, out of the 3 nodes, the predict API call goes through a node running 2.11 rather than the one running 2.12.0-SNAPSHOT, it will hit the same issue.
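
One common way to keep a request readable in a mixed-version cluster is to gate any newly added field on the version of the stream, on both the write and the read side. The sketch below shows that pattern with plain `java.io` streams. It is an assumption about a possible approach, not the confirmed fix for this issue; the class `VersionGatedRequestDemo`, the integer version ids, and the choice of `FLAG_ADDED_IN` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionGatedRequestDemo {

    // Hypothetical version ids; FLAG_ADDED_IN is the version assumed to introduce the new field.
    static final int V_2_10 = 21000;
    static final int V_2_12 = 21200;
    static final int FLAG_ADDED_IN = V_2_12;

    // Writes the new flag only when the stream version says the peer can read it.
    static byte[] write(String modelId, boolean flag, int streamVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(modelId);
            if (streamVersion >= FLAG_ADDED_IN) {
                out.writeBoolean(flag);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the new flag only when the stream version says the peer wrote it,
    // and falls back to a default (false) otherwise.
    static String read(byte[] data, int streamVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String modelId = in.readUTF();
            boolean flag = streamVersion >= FLAG_ADDED_IN && in.readBoolean();
            return modelId + ":" + flag;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Old peer: the flag is neither written nor read, so the default applies.
        System.out.println(read(write("model-1", true, V_2_10), V_2_10));
        // New peer: the flag round-trips.
        System.out.println(read(write("model-1", true, V_2_12), V_2_12));
    }
}
```

With both sides gated on the same stream version, it no longer matters which node in the cluster handles the call, which is the property the flaky 2.11 case lacks.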

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Clone the neural search project
  2. Check out the 2.x branch
  3. Run ./gradlew :qa:rolling-upgrade:testAgainstOneThirdUpgradedCluster -Dbwc.version=2.10.0
  4. See error

What is the expected behavior?
The BWC rolling upgrade tests should pass against all previous versions without errors.

What is your host/environment?

  • OS: Linux
  • Version
  • Plugins: Neural Search

Labels: bug
Status: Done