balance remove error #3654

Closed
@HarrisChu

Description

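Recording two issues seen with BALANCE IN ZONE REMOVE.

Issue 1:
The src and dst of every balance task are the same host (see show job 2 below).

Issue 2:
After the metad leader changes, the job cannot be stopped (job 3 below stays RUNNING with no tasks).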

(root@nebula) [(none)]>       CREATE SPACE test(vid_type=int, replica_factor=3, partition_num=10) on "z1","z2","z3","z4";
Execution succeeded (time spent 10524/20016 us)

Thu, 06 Jan 2022 17:47:06 CST

(root@nebula) [(none)]> use test
Execution succeeded (time spent 2591/13982 us)

Thu, 06 Jan 2022 17:47:10 CST

(root@nebula) [test]>
(root@nebula) [test]> show parts
+--------------+-------------------+-----------------------------------------------------+-------+
| Partition ID | Leader            | Peers                                               | Losts |
+--------------+-------------------+-----------------------------------------------------+-------+
| 1            | "127.0.0.1:12391" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:17318" | ""    |
| 2            | "127.0.0.1:18320" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:19179" | ""    |
| 3            | "127.0.0.1:17318" | "127.0.0.1:12391, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 4            | "127.0.0.1:19179" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 5            | "127.0.0.1:17318" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:17318" | ""    |
| 6            | "127.0.0.1:18320" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:19179" | ""    |
| 7            | "127.0.0.1:12391" | "127.0.0.1:12391, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 8            | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 9            | "127.0.0.1:17318" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:17318" | ""    |
| 10           | "127.0.0.1:18320" | "127.0.0.1:12391, 127.0.0.1:18320, 127.0.0.1:19179" | ""    |
+--------------+-------------------+-----------------------------------------------------+-------+
Got 10 rows (time spent 2588/16467 us)

Thu, 06 Jan 2022 17:47:12 CST

(root@nebula) [test]> SUBMIT JOB BALANCE IN ZONE REMOVE 127.0.0.1:12391
+------------+
| New Job Id |
+------------+
| 2          |
+------------+
Got 1 rows (time spent 6921/24940 us)

Thu, 06 Jan 2022 17:47:39 CST

(root@nebula) [test]> show job 2
+------------------------+------------------------------------+---------------+---------------------------------+----------------------------+
| Job Id(spaceId:partId) | Command(src->dst)                  | Status        | Start Time                      | Stop Time                  |
+------------------------+------------------------------------+---------------+---------------------------------+----------------------------+
| 2                      | "DATA_BALANCE"                     | "RUNNING"     | "2022-01-06T09:47:39.000000000" | "__EMPTY__"                |
| "2, 1:1"               | "127.0.0.1:12391->127.0.0.1:12391" | "IN_PROGRESS" | 2022-01-06T09:47:39.000000      |                            |
| "2, 1:2"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "2, 1:3"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "2, 1:5"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "2, 1:6"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "2, 1:7"               | "127.0.0.1:12391->127.0.0.1:12391" | "IN_PROGRESS" | 2022-01-06T09:47:39.000000      |                            |
| "2, 1:9"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "2, 1:10"              | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "Total:8"              | "Succeeded:0"                      | "Failed:6"    | "In Progress:2"                 | "Invalid:0"                |
+------------------------+------------------------------------+---------------+---------------------------------+----------------------------+
Got 10 rows (time spent 2760/38490 us)

Thu, 06 Jan 2022 17:47:42 CST

(root@nebula) [test]> show job 2
+------------------------+------------------------------------+------------+---------------------------------+---------------------------------+
| Job Id(spaceId:partId) | Command(src->dst)                  | Status     | Start Time                      | Stop Time                       |
+------------------------+------------------------------------+------------+---------------------------------+---------------------------------+
| 2                      | "DATA_BALANCE"                     | "FAILED"   | "2022-01-06T09:47:39.000000000" | "2022-01-06T09:47:44.000000000" |
| "2, 1:1"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:44.000000      |
| "2, 1:2"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "2, 1:3"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "2, 1:5"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "2, 1:6"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "2, 1:7"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:44.000000      |
| "2, 1:9"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "2, 1:10"              | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"   | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000      |
| "Total:8"              | "Succeeded:0"                      | "Failed:8" | "In Progress:0"                 | "Invalid:0"                     |
+------------------------+------------------------------------+------------+---------------------------------+---------------------------------+
Got 10 rows (time spent 2420/57369 us)

Thu, 06 Jan 2022 17:47:45 CST

(root@nebula) [test]> show parts
+--------------+-------------------+-----------------------------------------------------+-------+
| Partition ID | Leader            | Peers                                               | Losts |
+--------------+-------------------+-----------------------------------------------------+-------+
| 1            | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:12391" | ""    |
| 2            | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:19179, 127.0.0.1:12391" | ""    |
| 3            | "127.0.0.1:17318" | "127.0.0.1:17318, 127.0.0.1:19179, 127.0.0.1:12391" | ""    |
| 4            | "127.0.0.1:19179" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 5            | "127.0.0.1:17318" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:12391" | ""    |
| 6            | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:19179, 127.0.0.1:12391" | ""    |
| 7            | "127.0.0.1:17318" | "127.0.0.1:17318, 127.0.0.1:19179, 127.0.0.1:12391" | ""    |
| 8            | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:19179" | ""    |
| 9            | "127.0.0.1:17318" | "127.0.0.1:18320, 127.0.0.1:17318, 127.0.0.1:12391" | ""    |
| 10           | "127.0.0.1:18320" | "127.0.0.1:18320, 127.0.0.1:19179, 127.0.0.1:12391" | ""    |
+--------------+-------------------+-----------------------------------------------------+-------+
Got 10 rows (time spent 2836/7545 us)

Thu, 06 Jan 2022 18:24:32 CST

(root@nebula) [test]> SUBMIT JOB BALANCE IN ZONE REMOVE 127.0.0.1:12391
+------------+
| New Job Id |
+------------+
| 3          |
+------------+
Got 1 rows (time spent 6842/10417 us)

Thu, 06 Jan 2022 18:24:59 CST

(root@nebula) [test]> show job 3
[ERROR (-1005)]: LeaderChanged: Leader changed!

Thu, 06 Jan 2022 18:25:05 CST

(root@nebula) [test]> show job 3
+------------------------+-------------------+------------+---------------------------------+-------------+
| Job Id(spaceId:partId) | Command(src->dst) | Status     | Start Time                      | Stop Time   |
+------------------------+-------------------+------------+---------------------------------+-------------+
| 3                      | "DATA_BALANCE"    | "RUNNING"  | "2022-01-06T10:24:59.000000000" | "__EMPTY__" |
| "Total:0"              | "Succeeded:0"     | "Failed:0" | "In Progress:0"                 | "Invalid:0" |
+------------------------+-------------------+------------+---------------------------------+-------------+
Got 2 rows (time spent 3328/7089 us)

Thu, 06 Jan 2022 18:25:07 CST

(root@nebula) [test]> show jobs
+--------+----------------+-----------+----------------------------+----------------------------+
| Job Id | Command        | Status    | Start Time                 | Stop Time                  |
+--------+----------------+-----------+----------------------------+----------------------------+
| 3      | "DATA_BALANCE" | "RUNNING" | 2022-01-06T10:24:59.000000 |                            |
| 2      | "DATA_BALANCE" | "FAILED"  | 2022-01-06T09:47:39.000000 | 2022-01-06T09:47:44.000000 |
+--------+----------------+-----------+----------------------------+----------------------------+
Got 2 rows (time spent 7155/11466 us)

Thu, 06 Jan 2022 18:25:38 CST
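
A minimal sketch for spotting issue 1 automatically. It assumes the console table layout shown in the show job 2 output above, with the task id in the first column and a quoted src->dst pair in the second. The names TASK_RE and find_self_moves are hypothetical helpers for illustration, not part of NebulaGraph or its clients.

```python
import re

# Matches a quoted "host:port->host:port" cell as printed by the console.
# Assumption: the console output format shown in this issue.
TASK_RE = re.compile(r'"(?P<src>[^"\s>]+:\d+)->(?P<dst>[^"\s>]+:\d+)"')


def find_self_moves(show_job_output: str) -> list[tuple[str, str]]:
    """Return (task_id, host) for every balance task whose src equals its dst."""
    suspicious = []
    for line in show_job_output.splitlines():
        # Split a table row into cells; border rows yield a single cell.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        match = TASK_RE.fullmatch(cells[1])
        if match and match.group("src") == match.group("dst"):
            suspicious.append((cells[0].strip('"'), match.group("src")))
    return suspicious


# Excerpt of the first `show job 2` output from this issue.
sample = """
| 2                      | "DATA_BALANCE"                     | "RUNNING"     | "2022-01-06T09:47:39.000000000" | "__EMPTY__"                |
| "2, 1:1"               | "127.0.0.1:12391->127.0.0.1:12391" | "IN_PROGRESS" | 2022-01-06T09:47:39.000000      |                            |
| "2, 1:2"               | "127.0.0.1:12391->127.0.0.1:12391" | "FAILED"      | 2022-01-06T09:47:39.000000      | 2022-01-06T09:47:39.000000 |
| "Total:8"              | "Succeeded:0"                      | "Failed:6"    | "In Progress:2"                 | "Invalid:0"                |
"""

for task_id, host in find_self_moves(sample):
    print(f"task {task_id}: src and dst are both {host}")
```

Any row this returns is a task that tries to move a partition onto the host it is supposed to leave, which is the symptom reported in issue 1.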

Labels: type/bug (Type: something is unexpected)
