Zookeeper in 3 node Kubernetes cluster does not pass health check in 2.8.0 #11070

Closed
@andrekramer1

Description

Describe the bug
When running the Kubernetes Helm chart with a 2.8.0 Pulsar image, Zookeeper pod 0 starts but fails repeatedly, and the rest of the Pulsar cluster waits on Zookeeper initialization. The Zookeeper health check is not passing: "echo ruok | nc localhost 2181" does not return "imok" and, I think, just hangs, so the Kubernetes health check fails repeatedly. Zookeeper also complains about not resolving zookeeper-1 and zookeeper-2, as usual, but the zookeeper-0 log has a new NullPointerException that we've not seen before:

10:58:55.990 [epollEventLoopGroup-4-2] WARN org.apache.zookeeper.server.NettyServerCnxnFactory - Exception caught
java.lang.NullPointerException: null
at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelActive(NettyServerCnxnFactory.java:258) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:209) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1398) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:230) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:216) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:895) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:522) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486) [io.netty-netty-transport-4.1.63.Final.jar:4.1.63.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-native-epoll-4.1.63.Final-linux-x86_64.jar:4.1.63.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.63.Final.jar:4.1.63.Final]
at java.lang.Thread.run(Thread.java:829) [?:?]
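
To tell a hanging probe apart from a refused connection or an unexpected reply, the "ruok" check described above can be run with an explicit timeout. The following is a minimal sketch, not part of the Helm chart. The function names `zk_four_letter` and `probe`, the 5 second default timeout, and the result labels are assumptions made for illustration. It also assumes the server closes the connection after answering.

```python
import socket


def zk_four_letter(host: str, port: int, command: str = "ruok",
                   timeout: float = 5.0) -> str:
    """Send a four-letter command and return the raw reply.

    Hypothetical helper: raises socket.timeout if the server accepts
    the connection but does not answer within `timeout` seconds.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:  # peer closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def probe(host: str = "localhost", port: int = 2181,
          timeout: float = 5.0) -> str:
    """Classify the health check as 'ok', 'hang', 'refused' or 'unexpected: ...'."""
    try:
        reply = zk_four_letter(host, port, "ruok", timeout)
    except socket.timeout:
        return "hang"  # connected, but no complete reply before the timeout
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on the port
    return "ok" if reply == "imok" else f"unexpected: {reply!r}"


if __name__ == "__main__":
    print(probe())
```

Run inside the zookeeper-0 pod, a result of "hang" would match the behaviour suspected in this report, while "refused" would point to the server not listening at all.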

To Reproduce
Deploy the Pulsar 2.8.0 docker image on Kubernetes using the Helm chart. Changing just the Zookeeper image to 2.7.0 allowed the cluster to come up.

Additional context

Possibly Java 11 is part of the problem, or is it just a bad version of Zookeeper?

Labels

release/blocker: Indicate the PR or issue that should block the release until it gets resolved
type/bug: The PR fixed a bug or issue reported a bug
