[ci] :x-pack:rolling-upgrade:with-system-key times out when starting oneThirdUpgradedTestCluster node0 #32566

Closed
@andyb-elastic

Description

This happened in a CI intake job and in a PR job, and I was able to reproduce it locally.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2448/console

CI log
build-2448.txt

Test cluster logs
v6.5.0-SNAPSHOT#oldClusterTestCluster-node0.log
v6.5.0-SNAPSHOT#oldClusterTestCluster-node1.log
v6.5.0-SNAPSHOT#oldClusterTestCluster-node2.log
v6.5.0-SNAPSHOT#oneThirdUpgradedTestCluster-node0.log

I'm not sure whether this deserialization error is the real cause, but it appeared in the cluster logs of all three instances I looked at. There may have been some recent changes in this area (for example #32319).

[2018-08-01T20:22:35,314][WARN ][o.e.d.z.ZenDiscovery     ] [node-1] failed to validate incoming join request from node [{upgraded-node-0}{_DluXPheQ3q0NQzXEPKpzQ}{pb58T916Q0azZH_9t9KsVw}{127.0.0.1}{127.0.0.1:44168}{testattr=test, upgraded=true, ml.machine_memory=31606448128, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}]
org.elasticsearch.transport.RemoteTransportException: [upgraded-node-0][127.0.0.1:44168][internal:discovery/zen/join/validate]
Caused by: java.lang.IllegalStateException: unexpected byte [0x04]
        at org.elasticsearch.common.io.stream.StreamInput.readBoolean(StreamInput.java:439) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.common.io.stream.StreamInput.readBoolean(StreamInput.java:429) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.common.io.stream.StreamInput.readOptionalLong(StreamInput.java:322) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.xpack.core.ml.job.config.Job.<init>(Job.java:242) ~[?:?]
        at org.elasticsearch.xpack.core.ml.MlMetadata.<init>(MlMetadata.java:140) ~[?:?]
        at org.elasticsearch.common.io.stream.NamedWriteableAwareStreamInput.readNamedWriteable(NamedWriteableAwareStreamInput.java:46) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.common.io.stream.NamedWriteableAwareStreamInput.readNamedWriteable(NamedWriteableAwareStreamInput.java:39) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.cluster.metadata.MetaData.readFrom(MetaData.java:834) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.cluster.ClusterState.readFrom(ClusterState.java:727) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.discovery.zen.MembershipAction$ValidateJoinRequest.readFrom(MembershipAction.java:173) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.common.io.stream.Streamable.lambda$newWriteableReader$0(Streamable.java:51) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.transport.RequestHandlerRegistry.newRequest(RequestHandlerRegistry.java:56) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1633) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1501) ~[elasticsearch-6.5.0-SNAPSHOT.jar:6.5.0-SNAPSHOT]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) ~[?:?]
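
For anyone reading the trace above: the failure surfaces in `readBoolean`, called from `readOptionalLong`, which suggests the reader hit a byte that is neither 0 nor 1 where it expected a presence flag. One way that can happen is when the writer and reader disagree about the field layout. The sketch below is only an illustration of that failure mode using plain `java.io` streams. It is not the Elasticsearch `StreamInput` code, the class and method names are hypothetical, and it makes no claim about which field or which node was actually at fault here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a versioned wire format; not Elasticsearch code.
public class WireMismatchSketch {

    // Strict boolean read: only 0 and 1 are valid, anything else means
    // the reader is no longer aligned with what the writer produced.
    static boolean readBoolean(DataInputStream in) throws IOException {
        int b = in.readUnsignedByte();
        if (b == 0) return false;
        if (b == 1) return true;
        throw new IllegalStateException(String.format("unexpected byte [0x%02x]", b));
    }

    // Optional long: a presence flag followed by the value if present.
    static Long readOptionalLong(DataInputStream in) throws IOException {
        return readBoolean(in) ? in.readLong() : null;
    }

    static void writeOptionalLong(DataOutputStream out, Long value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeLong(value);
        }
    }

    // Assumed "newer" layout: one extra byte-sized field precedes the optional long.
    static byte[] writeNewFormat(Long optional) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(4); // extra field the other side does not know about
            writeOptionalLong(out, optional);
        }
        return bytes.toByteArray();
    }

    // Assumed "older" layout: the optional long comes first.
    static Long readOldFormat(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return readOptionalLong(in);
        }
    }

    public static void main(String[] args) throws IOException {
        try {
            readOldFormat(writeNewFormat(42L));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "unexpected byte [0x04]"
        }
    }
}
```

If something like this is what is going on, the usual remedy is to make both the read and the write of the new field conditional on the stream's wire version, so that mixed-version nodes agree on the layout.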

Bonus logs from my local reproduction
oldClusterTestCluster node0 run.log
oldClusterTestCluster node1 run.log
oldClusterTestCluster node2 run.log
oneThirdUpgradedTestCluster node0 run.log

Metadata

Labels

:ml (Machine learning), >test-failure (Triaged test failures from CI)
