
FE restart fails when a follower is not alive #4673

Open
@BabySid

Description

Describe the bug
I upgraded my Doris cluster to the master version and the FE failed to restart.
The FE log contains the following:

2020-09-26 11:37:18,961 WARN (UNKNOWN 172.28.18.140_9010_1591588831143(-1)|1) [Catalog.notifyNewFETypeTransfer():2356] notify new FE type transfer: UNKNOWN
2020-09-26 11:37:20,967 WARN (RepNode 172.28.18.140_9010_1591588831143(-1)|56) [Catalog.notifyNewFETypeTransfer():2356] notify new FE type transfer: MASTER
2020-09-26 11:37:21,162 ERROR (stateListener|67) [EditLog.loadJournal():804] Operation Type 29
java.lang.NullPointerException: null
        at org.apache.doris.consistency.ConsistencyChecker.replayFinishConsistencyCheck(ConsistencyChecker.java:373) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:332) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2497) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.transferToMaster(Catalog.java:1167) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.access$1100(Catalog.java:261) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog$4.runOneCycle(Catalog.java:2414) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
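
For triage, the relevant facts in the log above are the journal operation type being replayed, the exception class, and the first stack frame. The following is a minimal Python sketch that pulls those three items out of a log excerpt. The function name `summarize_replay_failure` and the regular expressions are illustrative assumptions based only on the lines quoted in this issue; they are not part of Doris and may not match other log formats.

```python
import re

# Excerpt copied from the log in this issue (shortened to the failing entry).
LOG = """\
2020-09-26 11:37:21,162 ERROR (stateListener|67) [EditLog.loadJournal():804] Operation Type 29
java.lang.NullPointerException: null
        at org.apache.doris.consistency.ConsistencyChecker.replayFinishConsistencyCheck(ConsistencyChecker.java:373) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:332) [palo-fe.jar:3.4.0]
"""

# Assumed patterns, derived from the excerpt above.
OP_TYPE_RE = re.compile(r"\[EditLog\.loadJournal\(\):\d+\] Operation Type (\d+)")
EXC_RE = re.compile(r"^([\w.$]+(?:Exception|Error))\b", re.MULTILINE)
FRAME_RE = re.compile(r"^\s+at ([\w.$]+)\(([\w$]+\.java):(\d+)\)", re.MULTILINE)


def summarize_replay_failure(log_text):
    """Return (operation_type, exception, top_frame), or None if no
    'Operation Type' error line is present in log_text."""
    op = OP_TYPE_RE.search(log_text)
    if op is None:
        return None
    # Only look at the text that follows the failing journal entry.
    rest = log_text[op.end():]
    exc = EXC_RE.search(rest)
    frame = FRAME_RE.search(rest)
    top_frame = None
    if frame:
        top_frame = f"{frame.group(1)} ({frame.group(2)}:{frame.group(3)})"
    return (int(op.group(1)), exc.group(1) if exc else None, top_frame)


if __name__ == "__main__":
    print(summarize_replay_failure(LOG))
```

On the excerpt, this reports operation type 29, a `java.lang.NullPointerException`, and `ConsistencyChecker.replayFinishConsistencyCheck` at `ConsistencyChecker.java:373` as the throwing frame, which is the place to start looking.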

To Reproduce
Steps to reproduce the behavior:

  1. Run the command add follower ... on the old version of Doris. (The follower FE being added has NOT been started yet.)
  2. Run bin/stop_fe.sh to stop the old version of the FE.
  3. Upgrade the files to the new version of the FE, e.g. lib/* and webroot/*.
  4. Run bin/start_fe.sh to start the new version of the FE.
  5. Check the log and find the error shown above.

Workaround (a trick to avoid the failure):

  1. Roll back to the old version.
  2. Run the command drop follower ... on the old version of Doris.
  3. Upgrade the files and restart the FE.
  4. The FE starts successfully.
  5. Run add follower ... on the new version.

Expected behavior

  1. Add a follower on the old version, whether or not the follower's service is alive.
  2. Upgrade the FE to the new version.
  3. The FE restarts successfully.


Labels: kind/fix (categorizes issue or PR as related to a bug)
