Why is there a zombie process in the container? #497

Open
sjt157 opened this issue May 25, 2019 · 6 comments
@sjt157

sjt157 commented May 25, 2019

[Three screenshots attached to the original issue; images not preserved.]

platform: Ubuntu 16.04

docker-compose.yml

version: '2.1'

services:
  kafka1:
    image: wurstmeister/kafka:2.12-2.0.1
    restart: always
    hostname: kafka4
    container_name: kafka4
    ports:
    - 9097:9092
    environment:
  
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka4:9092
      KAFKA_LISTENERS: PLAINTEXT://kafka4:9092
      KAFKA_BROKER_ID: 18
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181
      # 6 partitions and 3 replicas
      KAFKA_CREATE_TOPICS: "wave2018021:6:3,wave2018031:6:3,wave2018041:6:3"
    volumes:
    - /home/ubuntu16/Docker/data/kafka1:/kafka
    - /home/ubuntu16/Docker/logs/kafka1:/opt/kafka/logs
    external_links:
    - zoo1
    - zoo2
    - zoo3
    networks:
      mybridge:
        ipv4_address: 172.18.20.230

  kafka2:
    image: wurstmeister/kafka:2.12-2.0.1
    restart: always
    hostname: kafka5
    container_name: kafka5
    ports:
    - 9098:9092
    environment:
     # KAFKA_ADVERTISED_HOST_NAME: kafka2
     # KAFKA_ADVERTISED_PORT: 9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka5:9092
      KAFKA_LISTENERS: PLAINTEXT://kafka5:9092
      KAFKA_BROKER_ID: 19
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181
      KAFKA_CREATE_TOPICS: "taxi1:6:3"
    volumes:
    - /home/ubuntu16/Docker/data/kafka2:/kafka
    - /home/ubuntu16/Docker/logs/kafka2:/opt/kafka/logs
    external_links:
    - zoo1
    - zoo2
    - zoo3
    networks:
      mybridge:
        ipv4_address: 172.18.20.231

  kafka3:
    image: wurstmeister/kafka:2.12-2.0.1
    restart: always
    hostname: kafka6
    container_name: kafka6
    ports:
    - 9099:9092

    environment:
     # KAFKA_ADVERTISED_HOST_NAME: kafka3
     # KAFKA_ADVERTISED_PORT: 9094
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka6:9092
      KAFKA_LISTENERS: PLAINTEXT://kafka6:9092
      KAFKA_BROKER_ID: 20
      KAFKA_ZOOKEEPER_CONNECT: zoo1:2181,zoo2:2181,zoo3:2181
      KAFKA_CREATE_TOPICS: "camera1:6:3"
    volumes:
    - /home/ubuntu16/Docker/data/kafka1:/kafka
    - /home/ubuntu16/Docker/logs/kafka1:/opt/kafka/logs
    external_links:
    - zoo1
    - zoo2
    - zoo3
    networks:
      mybridge:
        ipv4_address: 172.18.20.232

networks:
  mybridge:
    external:
      name: mybridge
@sscaling
Collaborator

The create-topics script runs in the background and is started by PID 1. Using disown might allow it to be reaped once it has finished; however, since the start_kafka script itself runs as PID 1, I'm not sure it will work. It needs a little investigation to test this.
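
For readers unfamiliar with the mechanism, here is a minimal sketch of how a zombie appears and how it is reaped. It is not taken from the kafka-docker scripts; the function names `proc_state` and `demo_zombie` are made up for illustration, and it assumes a Linux system with /proc mounted.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state of a process from /proc/<pid>/stat (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        stat = f.read()
    # The state letter is the first field after the parenthesised command name.
    return stat.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Parent: the child has exited but nobody has waited for it yet,
    # so the kernel keeps its process-table entry in state "Z" (zombie).
    deadline = time.monotonic() + 5.0
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    # Reaping: waitpid() collects the exit status and frees the entry.
    _, status = os.waitpid(pid, 0)
    gone = not os.path.exists("/proc/%d" % pid)
    return state, os.WEXITSTATUS(status), gone
```

The child stays in state "Z" until its parent calls wait. A parent that never waits, as appears to be the case for the process running as PID 1 here, leaves the entry behind for as long as the parent lives.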

@sjt157
Author

sjt157 commented May 28, 2019

Do you mean adding disown to the start_kafka script? And where would disown go, after create-topics.sh &?
I am not very familiar with shell scripting.

@sscaling
Collaborator

It would be create-topics.sh & disown - but as the script runs as PID 1, I don't think the kernel will reap the process, because reaping is PID 1's responsibility. We'd probably need to introduce a lightweight init system such as dumb-init to handle this scenario: https://github.com/Yelp/dumb-init#why-you-need-an-init-system
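
To illustrate what such an init process does, here is a rough sketch of the reaping loop only. It is not dumb-init's actual implementation (which is written in C and also forwards signals), and the name `run_with_reaping` is hypothetical. It assumes a POSIX system.

```python
import os


def run_with_reaping(argv):
    """Start argv as a child, reap every child that exits, return argv's exit code."""
    main_pid = os.fork()
    if main_pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child with the real command
        finally:
            os._exit(127)  # only reached if exec failed

    while True:
        # Blocks until any child exits. When this process is PID 1, that also
        # covers orphans the kernel has re-parented to it, so none stay zombies.
        pid, status = os.waitpid(-1, 0)
        if pid == main_pid:
            if os.WIFSIGNALED(status):
                return 128 + os.WTERMSIG(status)
            return os.WEXITSTATUS(status)
```

Used as the container's PID 1, a wrapper like this would launch the start script as `argv` and collect any background helpers as they finish. A real init would additionally pass signals such as SIGTERM on to the child, which this sketch leaves out.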

@sjt157
Author

sjt157 commented May 28, 2019

I see. What do you think of this solution? https://github.com/phusion/baseimage-docker/blob/rel-0.9.16/image/bin/my_init
Which is more suitable for handling this scenario?

@sscaling
Collaborator

I think for most users it's probably not a huge issue, so unless it's causing problems (such as filling up the last slot in the process table, in which case you probably have bigger issues) there's nothing to do. The Phusion solution requires Python, which seems like a lot of extra baggage to pull in (100 MB vs < 1 MB).

@theBNT

theBNT commented Aug 17, 2021

Hey, we are seeing this issue: eventually no new processes can be spawned on the host because of zombie processes that all have the same parent. The deployment consists of a single broker, ZooKeeper and AKHQ, started via docker-compose on a SLES system.

Any hints on how to debug/improve this further?

Process is started by this container:
kafka-docker_kafka "start-kafka.sh" 29 hours ago Up 29 hours 0.0.0.0:9095->9095/tcp kafka-docker_kafka_1

So every time a new topic is created (e.g. via AKHQ), a new defunct process is left hanging in the system (where 20653 is the Kafka process):

root 32753 20653 0 07:20 ? 00:00:00 [timeout] <defunct>
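
As one possible debugging aid, the sketch below groups zombies by parent PID so the process that fails to reap them stands out. The helper names `zombies_by_parent` and `current_zombies` are hypothetical, and it assumes a `ps` that accepts `-eo pid,ppid,stat,comm` (as procps on Linux does).

```python
import subprocess
from collections import defaultdict


def zombies_by_parent(ps_output):
    """Map parent PID -> [(pid, command), ...] for each zombie in the ps output."""
    result = defaultdict(list)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ppid, stat, comm = parts
        if stat.startswith("Z"):  # "Z" marks a zombie (defunct) process
            result[int(ppid)].append((int(pid), comm.strip()))
    return dict(result)


def current_zombies():
    """Run ps on this machine and group its zombies by parent PID."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return zombies_by_parent(out)
```

Running `current_zombies()` on the host and looking at which parent PID accumulates entries over time would show whether every defunct process really hangs off the same parent.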
