I'm using topicmappr built from the v4.2.1 tag (https://github.com/DataDog/kafka-kit/tree/v4.2.1).
I have the following topic in production, spread across brokers 1001 to 1009:
Topic:webrequest_text PartitionCount:24 ReplicationFactor:3 Configs:
Topic: webrequest_text Partition: 0 Leader: 1008 Replicas: 1008,1001,1004 Isr: 1001,1004,1008
Topic: webrequest_text Partition: 1 Leader: 1002 Replicas: 1002,1005,1008 Isr: 1002,1005,1008
Topic: webrequest_text Partition: 2 Leader: 1007 Replicas: 1007,1009,1002 Isr: 1002,1007,1009
Topic: webrequest_text Partition: 3 Leader: 1001 Replicas: 1001,1005,1006 Isr: 1001,1005,1006
Topic: webrequest_text Partition: 4 Leader: 1009 Replicas: 1009,1002,1007 Isr: 1002,1007,1009
Topic: webrequest_text Partition: 5 Leader: 1005 Replicas: 1005,1008,1001 Isr: 1005,1008,1001
Topic: webrequest_text Partition: 6 Leader: 1003 Replicas: 1003,1007,1009 Isr: 1003,1009,1007
Topic: webrequest_text Partition: 7 Leader: 1004 Replicas: 1003,1004,1006 Isr: 1003,1004,1006
Topic: webrequest_text Partition: 8 Leader: 1006 Replicas: 1006,1007,1001 Isr: 1006,1007,1001
Topic: webrequest_text Partition: 9 Leader: 1008 Replicas: 1008,1004,1003 Isr: 1003,1004,1008
Topic: webrequest_text Partition: 10 Leader: 1002 Replicas: 1002,1008,1004 Isr: 1002,1004,1008
Topic: webrequest_text Partition: 11 Leader: 1007 Replicas: 1007,1002,1009 Isr: 1002,1007,1009
Topic: webrequest_text Partition: 12 Leader: 1001 Replicas: 1001,1007,1008 Isr: 1001,1007,1008
Topic: webrequest_text Partition: 13 Leader: 1009 Replicas: 1009,1003,1005 Isr: 1003,1009,1005
Topic: webrequest_text Partition: 14 Leader: 1005 Replicas: 1005,1009,1002 Isr: 1005,1009,1002
Topic: webrequest_text Partition: 15 Leader: 1003 Replicas: 1003,1005,1006 Isr: 1003,1005,1006
Topic: webrequest_text Partition: 16 Leader: 1004 Replicas: 1003,1004,1006 Isr: 1003,1004,1006
Topic: webrequest_text Partition: 17 Leader: 1006 Replicas: 1002,1006,1007 Isr: 1002,1006,1007
Topic: webrequest_text Partition: 18 Leader: 1008 Replicas: 1008,1003,1004 Isr: 1003,1004,1008
Topic: webrequest_text Partition: 19 Leader: 1002 Replicas: 1002,1005,1009 Isr: 1002,1009,1005
Topic: webrequest_text Partition: 20 Leader: 1007 Replicas: 1007,1008,1001 Isr: 1001,1007,1008
Topic: webrequest_text Partition: 21 Leader: 1001 Replicas: 1001,1009,1005 Isr: 1001,1005,1009
Topic: webrequest_text Partition: 22 Leader: 1009 Replicas: 1009,1001,1005 Isr: 1001,1009,1005
Topic: webrequest_text Partition: 23 Leader: 1005 Replicas: 1005,1002,1006 Isr: 1005,1002,1006
I attempted to generate a chunked reassignment plan using --force-rebuild, because I'm evacuating brokers 1001 to 1006 onto brokers 1010 to 1015 while keeping brokers 1007 to 1009. A rebuild without --force-rebuild tries to minimize the number of partition movements, which would leave an uneven number of partitions per broker.
However, that chunked force rebuild produces invalid partition movements:
$ topicmappr rebuild --topics webrequest_text --brokers 1015,1014,1013,1012,1011,1010,1009,1008,1007 --chunk-step-size 1 --force-rebuild | grep -v no-op
Topics:
webrequest_text
Broker change summary:
Broker 1001 marked for removal
Broker 1004 marked for removal
Broker 1005 marked for removal
Broker 1003 marked for removal
Broker 1002 marked for removal
Broker 1006 marked for removal
New broker 1012
New broker 1015
New broker 1010
New broker 1014
New broker 1011
New broker 1013
-
Replacing 6, added 6, missing 0, total count changed by 0
Action:
Rebuild topic with 6 broker(s) marked for replacement
Force rebuilding map
Partition map changes:
webrequest_text p0: [1008 1001 1004] -> [1008 1010 1014] replaced broker
webrequest_text p1: [1002 1005 1008] -> [1013 1012 1007] replaced broker
webrequest_text p2: [1007 1009 1002] -> [1007 1014 1008] replaced broker
webrequest_text p3: [1001 1006 1005] -> [1015 1010 1009] replaced broker
webrequest_text p4: [1009 1002 1007] -> [1011 1015 1008] replaced broker
webrequest_text p5: [1005 1008 1001] -> [1009 1007 1013] replaced broker
webrequest_text p6: [1003 1007 1009] -> [1010 1013 1007] replaced broker
webrequest_text p7: [1004 1003 1006] -> [1012 1009 1014] replaced broker
webrequest_text p8: [1006 1001 1007] -> [1014 1012 1009] replaced broker
webrequest_text p9: [1008 1004 1003] -> [1008 1014 1011] replaced broker
webrequest_text p10: [1002 1008 1004] -> [1013 1011 1007] replaced broker
webrequest_text p11: [1007 1002 1009] -> [1007 1008 1011] replaced broker
webrequest_text p12: [1001 1007 1008] -> [1015 1010 1009] replaced broker
webrequest_text p13: [1009 1003 1005] -> [1011 1008 1013] replaced broker
webrequest_text p14: [1005 1009 1002] -> [1009 1012 1015] replaced broker
webrequest_text p15: [1003 1005 1006] -> [1010 1007 1015] replaced broker
webrequest_text p16: [1004 1006 1003] -> [1012 1015 1008] replaced broker
webrequest_text p17: [1006 1002 1007] -> [1014 1011 1007] replaced broker
webrequest_text p18: [1008 1003 1004] -> [1008 1014 1010] replaced broker
webrequest_text p19: [1002 1005 1009] -> [1013 1009 1012] replaced broker
webrequest_text p20: [1007 1008 1001] -> [1007 1013 1011] replaced broker
webrequest_text p21: [1001 1009 1005] -> [1015 1010 1008] replaced broker
webrequest_text p22: [1009 1001 1005] -> [1011 1015 1009] replaced broker
webrequest_text p23: [1005 1006 1002] -> [1009 1012 1013] replaced broker
Broker distribution:
degree [min/max/avg]: 5/6/5.78 -> 6/8/6.44
-
Broker 1007 - leader: 3, follower: 6, total: 9
Broker 1008 - leader: 3, follower: 6, total: 9
Broker 1009 - leader: 3, follower: 6, total: 9
Broker 1010 - leader: 2, follower: 5, total: 7
Broker 1011 - leader: 3, follower: 5, total: 8
Broker 1012 - leader: 2, follower: 5, total: 7
Broker 1013 - leader: 3, follower: 5, total: 8
Broker 1014 - leader: 2, follower: 5, total: 7
Broker 1015 - leader: 3, follower: 5, total: 8
Generating reassignments in chunks 1 brokers at a time:
Skipping noop map output for brokers 1015
Skipping noop map output for brokers 1014
Skipping noop map output for brokers 1013
Skipping noop map output for brokers 1012
Skipping noop map output for brokers 1011
Skipping noop map output for brokers 1010
Changes for partition map chunk 1 for brokers 1009
Partition map changes:
webrequest_text p2: [1007 1009 1002] -> [1007 1014 1002] replaced broker
webrequest_text p4: [1009 1002 1007] -> [1011 1002 1007] replaced broker
webrequest_text p6: [1003 1007 1009] -> [1003 1007 1007] replaced broker
webrequest_text p11: [1007 1002 1009] -> [1007 1002 1011] replaced broker
webrequest_text p13: [1009 1003 1005] -> [1011 1003 1005] replaced broker
webrequest_text p14: [1005 1009 1002] -> [1005 1012 1002] replaced broker
webrequest_text p19: [1002 1005 1009] -> [1002 1005 1012] replaced broker
webrequest_text p21: [1001 1009 1005] -> [1001 1010 1005] replaced broker
webrequest_text p22: [1009 1001 1005] -> [1011 1001 1005] replaced broker
Changes for partition map chunk 2 for brokers 1008
Partition map changes:
webrequest_text p1: [1002 1005 1008] -> [1002 1005 1007] replaced broker
webrequest_text p5: [1005 1008 1001] -> [1005 1007 1001] replaced broker
webrequest_text p10: [1002 1008 1004] -> [1002 1011 1004] replaced broker
webrequest_text p12: [1001 1007 1008] -> [1001 1007 1009] replaced broker
webrequest_text p20: [1007 1008 1001] -> [1007 1013 1001] replaced broker
Changes for partition map chunk 3 for brokers 1007
Partition map changes:
webrequest_text p4: [1011 1002 1007] -> [1011 1002 1008] replaced broker
webrequest_text p6: [1003 1007 1007] -> [1003 1013 1007] replaced broker
webrequest_text p8: [1006 1001 1007] -> [1006 1001 1009] replaced broker
webrequest_text p12: [1001 1007 1009] -> [1001 1010 1009] replaced broker
Changes for partition map chunk 4 for brokers 1006
Partition map changes:
webrequest_text p3: [1001 1006 1005] -> [1001 1010 1005] replaced broker
webrequest_text p7: [1004 1003 1006] -> [1004 1003 1014] replaced broker
webrequest_text p8: [1006 1001 1009] -> [1014 1001 1009] replaced broker
webrequest_text p15: [1003 1005 1006] -> [1003 1005 1015] replaced broker
webrequest_text p16: [1004 1006 1003] -> [1004 1015 1003] replaced broker
webrequest_text p17: [1006 1002 1007] -> [1014 1002 1007] replaced broker
webrequest_text p23: [1005 1006 1002] -> [1005 1012 1002] replaced broker
Changes for partition map chunk 5 for brokers 1005
Partition map changes:
webrequest_text p1: [1002 1005 1007] -> [1002 1012 1007] replaced broker
webrequest_text p3: [1001 1010 1005] -> [1001 1010 1009] replaced broker
webrequest_text p5: [1005 1007 1001] -> [1009 1007 1001] replaced broker
webrequest_text p13: [1011 1003 1005] -> [1011 1003 1013] replaced broker
webrequest_text p14: [1005 1012 1002] -> [1009 1012 1002] replaced broker
webrequest_text p15: [1003 1005 1015] -> [1003 1007 1015] replaced broker
webrequest_text p19: [1002 1005 1012] -> [1002 1009 1012] replaced broker
webrequest_text p21: [1001 1010 1005] -> [1001 1010 1008] replaced broker
webrequest_text p22: [1011 1001 1005] -> [1011 1001 1009] replaced broker
webrequest_text p23: [1005 1012 1002] -> [1009 1012 1002] replaced broker
Changes for partition map chunk 6 for brokers 1004
Partition map changes:
webrequest_text p0: [1008 1001 1004] -> [1008 1001 1014] replaced broker
webrequest_text p7: [1004 1003 1014] -> [1012 1003 1014] replaced broker
webrequest_text p9: [1008 1004 1003] -> [1008 1014 1003] replaced broker
webrequest_text p10: [1002 1011 1004] -> [1002 1011 1007] replaced broker
webrequest_text p16: [1004 1015 1003] -> [1012 1015 1003] replaced broker
webrequest_text p18: [1008 1003 1004] -> [1008 1003 1010] replaced broker
Changes for partition map chunk 7 for brokers 1003
Partition map changes:
webrequest_text p6: [1003 1013 1007] -> [1010 1013 1007] replaced broker
webrequest_text p7: [1012 1003 1014] -> [1012 1009 1014] replaced broker
webrequest_text p9: [1008 1014 1003] -> [1008 1014 1011] replaced broker
webrequest_text p13: [1011 1003 1013] -> [1011 1008 1013] replaced broker
webrequest_text p15: [1003 1007 1015] -> [1010 1007 1015] replaced broker
webrequest_text p16: [1012 1015 1003] -> [1012 1015 1008] replaced broker
webrequest_text p18: [1008 1003 1010] -> [1008 1014 1010] replaced broker
Changes for partition map chunk 8 for brokers 1002
Partition map changes:
webrequest_text p1: [1002 1012 1007] -> [1013 1012 1007] replaced broker
webrequest_text p2: [1007 1014 1002] -> [1007 1014 1008] replaced broker
webrequest_text p4: [1011 1002 1008] -> [1011 1015 1008] replaced broker
webrequest_text p10: [1002 1011 1007] -> [1013 1011 1007] replaced broker
webrequest_text p11: [1007 1002 1011] -> [1007 1008 1011] replaced broker
webrequest_text p14: [1009 1012 1002] -> [1009 1012 1015] replaced broker
webrequest_text p17: [1014 1002 1007] -> [1014 1011 1007] replaced broker
webrequest_text p19: [1002 1009 1012] -> [1013 1009 1012] replaced broker
webrequest_text p23: [1009 1012 1002] -> [1009 1012 1013] replaced broker
Changes for partition map chunk 9 for brokers 1001
Partition map changes:
webrequest_text p0: [1008 1001 1014] -> [1008 1010 1014] replaced broker
webrequest_text p3: [1001 1010 1009] -> [1015 1010 1009] replaced broker
webrequest_text p5: [1009 1007 1001] -> [1009 1007 1013] replaced broker
webrequest_text p8: [1014 1001 1009] -> [1014 1012 1009] replaced broker
webrequest_text p12: [1001 1010 1009] -> [1015 1010 1009] replaced broker
webrequest_text p20: [1007 1013 1001] -> [1007 1013 1011] replaced broker
webrequest_text p21: [1001 1010 1008] -> [1015 1010 1008] replaced broker
webrequest_text p22: [1011 1001 1009] -> [1011 1015 1009] replaced broker
WARN:
[none]
New partition maps:
webrequest_text-phase1.json
webrequest_text-phase7.json
webrequest_text-phase8.json
webrequest_text-phase0.json
webrequest_text-phase2.json
webrequest_text-phase3.json
webrequest_text-phase4.json
webrequest_text-phase5.json
webrequest_text-phase6.json
Specifically, in the webrequest_text-phase0 plan (chunk 1, for broker 1009), broker 1007 appears twice in the new replica list for partition 6:
webrequest_text p6: [1003 1007 1009] -> [1003 1007 1007] replaced broker
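The chunk output above looks consistent with each chunk swapping the chunked broker for whatever the final map holds at the same replica index. I have not checked this against the topicmappr source, so the sketch below is only a guess at the mechanism; `chunk_step` is a hypothetical helper, not a kafka-kit function. It reproduces the p6 lines from chunks 1 and 3:

```python
def chunk_step(current, target, broker):
    """Hypothetical per-chunk step: wherever `broker` appears in `current`,
    substitute the broker that `target` holds at the same index."""
    return [t if c == broker else c for c, t in zip(current, target)]


def has_duplicates(replicas):
    """True if any broker ID appears more than once in the replica list."""
    return len(set(replicas)) != len(replicas)


# p6 as reported in the "Partition map changes" section above.
current = [1003, 1007, 1009]
target = [1010, 1013, 1007]

# Chunk 1 handles broker 1009. The final map places 1007 at index 2,
# but 1007 still sits at index 1 because its own chunk has not run yet.
phase0 = chunk_step(current, target, 1009)
print(phase0, has_duplicates(phase0))

# Chunk 3 handles broker 1007 and clears the duplicate again.
phase2 = chunk_step(phase0, target, 1007)
print(phase2, has_duplicates(phase2))
```

If that guess is right, any partition whose final map moves a kept broker into the slot of a broker processed in an earlier chunk would hit the same problem.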
When applying that plan, we get the following (expected) error from Kafka:
$ kafka reassign-partitions --reassignment-json-file ./webrequest_text-phase0.json --execute --throttle 80000000
Partitions reassignment failed due to Partition replica lists may not contain duplicate entries: webrequest_text-6 contains multiple entries for 1007
kafka.common.AdminCommandFailedException: Partition replica lists may not contain duplicate entries: webrequest_text-6 contains multiple entries for 1007
at kafka.admin.ReassignPartitionsCommand$.parseAndValidate(ReassignPartitionsCommand.scala:290)
at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:203)
at kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:199)
at kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:62)
at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
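As a stopgap, each phase file can be checked for duplicate replicas before it is passed to Kafka. This is a minimal sketch, assuming the phase files use the standard Kafka reassignment JSON layout (`version` plus a `partitions` list of `topic`/`partition`/`replicas`). The sample plan is transcribed from two lines of the chunk 1 listing above rather than copied from the actual file, and `find_duplicate_replicas` is a hypothetical helper name:

```python
import json
from collections import Counter


def find_duplicate_replicas(plan):
    """Return (topic, partition, duplicated_broker_ids) for every partition
    whose replica list contains the same broker more than once."""
    problems = []
    for entry in plan["partitions"]:
        counts = Counter(entry["replicas"])
        dupes = sorted(broker for broker, n in counts.items() if n > 1)
        if dupes:
            problems.append((entry["topic"], entry["partition"], dupes))
    return problems


# Two entries transcribed from the chunk 1 output (p2 and p6).
sample = """
{"version": 1, "partitions": [
  {"topic": "webrequest_text", "partition": 2, "replicas": [1007, 1014, 1002]},
  {"topic": "webrequest_text", "partition": 6, "replicas": [1003, 1007, 1007]}
]}
"""

plan = json.loads(sample)
for topic, partition, dupes in find_duplicate_replicas(plan):
    print(f"{topic}-{partition} contains multiple entries for {dupes}")
```

For a real file, `json.load(open("webrequest_text-phase0.json"))` would replace the inline sample.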
Let me know if you need any additional detail to reproduce the issue. Thanks!