Lost leader after restarting chef-backend #1473

maciu10 · 2018-02-15T17:14:18Z

Using a chef-backend cluster with 3 backend nodes: backend1, backend2 and backend3. backend1 is the current leader. When I stop chef-backend (chef-backendctl stop) on all three machines and then start it up again, the cluster has no leader and never recovers.

Expected Behavior

Leader to be re-elected after restarting chef-backend on all three backend machines.

Current Behavior

Leader is never re-elected after starting the services and leaderl logs show:

2018-02-15_17:03:14.91161 [I] <0.715.0> leader_elector:init => leader_elector is starting in state: initializing
2018-02-15_17:03:14.91226 [I] <0.716.0> status_updater:init => status_updater started
2018-02-15_17:03:14.97625 [I] <0.563.0> no_mod:no_fun => Application leaderl started on node 'leaderl@127.0.0.1'
2018-02-15_17:03:14.97720 [I] <0.563.0> no_mod:no_fun => Application eper started on node 'leaderl@127.0.0.1'
2018-02-15_17:03:14.97785 [I] <0.563.0> no_mod:no_fun => Application recon started on node 'leaderl@127.0.0.1'
2018-02-15_17:03:15.09875 [I] <0.730.0> key_watcher:handle_info => Initial start of watcher on behalf of leader_block_state for "cb/control/blocked_leaders"
2018-02-15_17:03:15.09967 [I] <0.729.0> leader_elector:do_connect => Connecting as node backend1 (10.40.10.56,da5d0b3758d410e434be81a1453b4c24)
2018-02-15_17:03:15.09997 [I] <0.729.0> leader_elector:do_connect => Leader no_leader, Boot no_bootstrap_node
2018-02-15_17:03:15.10002 [I] <0.729.0> leader_elector:do_connect => No leader in place.
2018-02-15_17:03:15.10048 [I] <0.729.0> leader_elector:do_connect => I am NOT a bootstrap node because bootstrap_key returned no_bootstrap_node so I'll wait for a leader.

cluster-status shows:

Name            IP           GUID                              Role                PG        ES
backend2  10.40.10.57  59540d34fa370a28ed3098cc78b2a245  waiting_for_leader  follower  not_master
backend1  10.40.10.56  da5d0b3758d410e434be81a1453b4c24  waiting_for_leader  leader    not_master
backend3  10.40.10.55  8c4ae52bf8a7b77ee65b638f7159199a  waiting_for_leader  follower  master
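
For anyone who wants to detect this state from a script, here is a minimal sketch that parses the cluster-status text shown above. It assumes the whitespace-separated column layout from this paste and that a healthy cluster reports a node with Role "leader"; the function names are hypothetical and not part of chef-backend.

```python
# Sketch only: the column layout is assumed from the cluster-status paste above.
STATUS = """\
Name            IP           GUID                              Role                PG        ES
backend2  10.40.10.57  59540d34fa370a28ed3098cc78b2a245  waiting_for_leader  follower  not_master
backend1  10.40.10.56  da5d0b3758d410e434be81a1453b4c24  waiting_for_leader  leader    not_master
backend3  10.40.10.55  8c4ae52bf8a7b77ee65b638f7159199a  waiting_for_leader  follower  master
"""


def parse_cluster_status(text):
    """Turn the status table into a list of dicts keyed by column header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def has_leader(nodes):
    """True if any node reports the cluster role 'leader' (assumed healthy value)."""
    return any(n["Role"] == "leader" for n in nodes)


def pg_leader(nodes):
    """Name of the single node whose PG column says 'leader', else None."""
    names = [n["Name"] for n in nodes if n["PG"] == "leader"]
    return names[0] if len(names) == 1 else None


nodes = parse_cluster_status(STATUS)
print("cluster has leader:", has_leader(nodes))
print("PostgreSQL leader:", pg_leader(nodes))
```

With the output above, no node holds the leader role even though backend1 is still marked as the PostgreSQL leader.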

Steps to Reproduce (for bugs)

Stop chef-backend with "chef-backendctl stop", starting with backend1, then backend2, followed by backend3.
Bring chef-backend back up in the reverse order with "chef-backendctl start", starting with backend3, then backend2, followed by backend1.
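
The ordering can be written down as a small sketch. It only builds the list of (host, command) pairs described in the steps (stop in the listed order, start in reverse) and does not execute anything; `restart_plan` is a hypothetical helper, and how the commands reach each host is left open.

```python
def restart_plan(nodes):
    """Stop nodes in the given order, then start them in reverse order."""
    stop = [(n, "chef-backendctl stop") for n in nodes]
    start = [(n, "chef-backendctl start") for n in reversed(nodes)]
    return stop + start


plan = restart_plan(["backend1", "backend2", "backend3"])
for host, cmd in plan:
    print(f"{host}: {cmd}")
```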

Your Environment

Chef Server Version: chef-backend 2.0.1
Total/free RAM and disk space: Total RAM 8GB / free RAM 5.2 GB
Operating System and Version: ubuntu 14.04.5 LTS
If upgrading, previous Chef Server version: N/A
Running in a container? no

stevendanna · 2018-02-17T11:50:39Z

@maciu10 If your cluster is still down, please contact chef-support for more immediate assistance. This looks like a case in which using the force-leader command on backend1 should resolve the issue, but support can help if there is any doubt.

maciu10 · 2018-02-21T19:03:16Z

@stevendanna Thanks for the response. I was able to bring back the leader right away using force-leader. I opened this issue because I don't think this should happen (i.e. a leader should come back online after chef-backend is restarted).

PrajaktaPurohit · 2019-10-18T19:49:32Z

Chef-backend is not opened for users to log issues. Keeping this here with label component:chef-backend.

maciu10 changed the title ~~Lost leader after stopping chef-backend~~ Lost leader after restating chef-backend Feb 15, 2018

maciu10 changed the title ~~Lost leader after restating chef-backend~~ Lost leader after restarting chef-backend Feb 15, 2018

tas50 removed the Aspect: Correctness label May 12, 2021
