cdc rolling upgrade / scale-in utilize two-phase-scheduling #1972

Merged
merged 64 commits into from
Jul 26, 2022
Changes from 58 commits
099e673
add cdc api .
3AceShowHand Jun 30, 2022
9d5e395
http client add put method
3AceShowHand Jul 1, 2022
a28115d
add a lot to support 2 phase scheduling
3AceShowHand Jul 1, 2022
182b4eb
add 2 files
3AceShowHand Jul 1, 2022
d7814af
add more http method
3AceShowHand Jul 1, 2022
6c19916
add more code.
3AceShowHand Jul 2, 2022
7dfaff9
add first version prerestart for cdc
3AceShowHand Jul 5, 2022
8db8be3
add first version prerestart for cdc
3AceShowHand Jul 5, 2022
077609f
refine cdc rolling upgrade .
3AceShowHand Jul 5, 2022
527940a
add changes.
3AceShowHand Jul 6, 2022
c72c276
fix go.sum
3AceShowHand Jul 6, 2022
92df189
upgrade is ready for test.
3AceShowHand Jul 6, 2022
119a271
finish scale-in, ready for test.
3AceShowHand Jul 6, 2022
aaaf4dc
fix
3AceShowHand Jul 6, 2022
1d3875c
tiny fix.
3AceShowHand Jul 6, 2022
f14756c
update go.mod
3AceShowHand Jul 6, 2022
8e59bd2
check status code when drain the capture.
3AceShowHand Jul 6, 2022
868557f
enlarge the timeout for get all captures.
3AceShowHand Jul 6, 2022
6d22b15
Merge branch 'cdc-scale-in-drain-capture' of https://github.com/3AceS…
3AceShowHand Jul 7, 2022
d54af39
refine by the first code review.
3AceShowHand Jul 7, 2022
411ebf8
fix log.
3AceShowHand Jul 7, 2022
4fbd54e
ignore 404 when all captures closed.
3AceShowHand Jul 7, 2022
a43363d
check status code for all http request.
3AceShowHand Jul 7, 2022
535b4cd
also checks body.
3AceShowHand Jul 7, 2022
653fc4d
check status before scale-in
3AceShowHand Jul 7, 2022
5f77b6f
do not force stop cdc when restart, if not all cdc nodes selected.
3AceShowHand Jul 7, 2022
4abfe3f
fix make check
3AceShowHand Jul 7, 2022
321a0c8
fix make check
3AceShowHand Jul 7, 2022
750dc82
fix stop and api
3AceShowHand Jul 8, 2022
4f7587f
refact the drain capture.
3AceShowHand Jul 8, 2022
6c19b95
refact the drain capture.
3AceShowHand Jul 8, 2022
99f6129
refact the drain capture.
3AceShowHand Jul 8, 2022
77ad85b
remove the log
3AceShowHand Jul 8, 2022
e3b5ab4
redact the log
3AceShowHand Jul 8, 2022
7caf37f
refact drain capture.
3AceShowHand Jul 8, 2022
0427d3a
remove unncessary change, and drop changes to restart command.
3AceShowHand Jul 8, 2022
dd3c143
refact one more time.
3AceShowHand Jul 8, 2022
42497ab
upgrade check instance status.
3AceShowHand Jul 8, 2022
b5165f2
add a log for debug
3AceShowHand Jul 8, 2022
4f31e63
refine cdc api url
3AceShowHand Jul 9, 2022
569aa82
for debug
3AceShowHand Jul 9, 2022
ed6d598
pass upgrade 6.0.0 to 6.1.0
3AceShowHand Jul 9, 2022
e9a5823
stop cdc cluster can be upgrade now.
3AceShowHand Jul 9, 2022
efa1826
fix stop cluster tlscfg not nil.
3AceShowHand Jul 9, 2022
9a46feb
fix inst count, and add timing to pre-restart and post-restart
3AceShowHand Jul 9, 2022
29382b3
change log level to debug, api layer return all errors, and handle by…
3AceShowHand Jul 9, 2022
6c0eb54
fix cdc api.
3AceShowHand Jul 9, 2022
4cf51ab
tiny fix on log.
3AceShowHand Jul 9, 2022
84f8b19
change level to debug
3AceShowHand Jul 9, 2022
40aeffb
fix old version does not support api, cause capture not found.
3AceShowHand Jul 9, 2022
1ea178a
tiny fix.
3AceShowHand Jul 10, 2022
8954563
refine drain capture error handling.
3AceShowHand Jul 10, 2022
fc5160b
revert change in go.mod
3AceShowHand Jul 10, 2022
c3fc898
fix cdc api.
3AceShowHand Jul 11, 2022
a47f234
revert change
3AceShowHand Jul 11, 2022
525824d
update api-timeout flag description
3AceShowHand Jul 11, 2022
642386f
tiny fix
3AceShowHand Jul 11, 2022
59bae37
sleep 2 seconds after new owner found.
3AceShowHand Jul 11, 2022
b005eac
add log for debug
3AceShowHand Jul 12, 2022
0240507
fix some typo.
3AceShowHand Jul 20, 2022
885d39b
tiny fix.
3AceShowHand Jul 20, 2022
5533f3f
fix scale-in, pass option directly.
3AceShowHand Jul 21, 2022
f7fa27f
refine the code, can be reviewed now.
3AceShowHand Jul 21, 2022
05ccc02
Merge branch 'master' into cdc-scale-in-drain-capture
3AceShowHand Jul 21, 2022
2 changes: 1 addition & 1 deletion components/cluster/command/patch.go
@@ -49,7 +49,7 @@ func newPatchCmd() *cobra.Command {
 	cmd.Flags().BoolVar(&overwrite, "overwrite", false, "Use this package in the future scale-out operations")
 	cmd.Flags().StringSliceVarP(&gOpt.Nodes, "node", "N", nil, "Specify the nodes")
 	cmd.Flags().StringSliceVarP(&gOpt.Roles, "role", "R", nil, "Specify the roles")
-	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders")
+	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders, also for TiCDC drain one capture")
 	cmd.Flags().BoolVarP(&offlineMode, "offline", "", false, "Patch a stopped cluster")
 	return cmd
 }
2 changes: 1 addition & 1 deletion components/cluster/command/reload.go
@@ -52,7 +52,7 @@ func newReloadCmd() *cobra.Command {
 	cmd.Flags().BoolVar(&gOpt.Force, "force", false, "Force reload without transferring PD leader and ignore remote error")
 	cmd.Flags().StringSliceVarP(&gOpt.Roles, "role", "R", nil, "Only reload specified roles")
 	cmd.Flags().StringSliceVarP(&gOpt.Nodes, "node", "N", nil, "Only reload specified nodes")
-	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders")
+	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders, also for TiCDC drain one capture")
 	cmd.Flags().BoolVarP(&gOpt.IgnoreConfigCheck, "ignore-config-check", "", false, "Ignore the config check result")
 	cmd.Flags().BoolVar(&skipRestart, "skip-restart", false, "Only refresh configuration to remote and do not restart services")
 
2 changes: 1 addition & 1 deletion components/cluster/command/scale_in.go
@@ -61,7 +61,7 @@ func newScaleInCmd() *cobra.Command {
 	}
 
 	cmd.Flags().StringSliceVarP(&gOpt.Nodes, "node", "N", nil, "Specify the nodes (required)")
-	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders")
+	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 300, "Timeout in seconds when transferring PD and TiKV store leaders, also for TiCDC drain one capture")
 	cmd.Flags().BoolVar(&gOpt.Force, "force", false, "Force just try stop and destroy instance before removing the instance from topo")
 
 	_ = cmd.MarkFlagRequired("node")
2 changes: 1 addition & 1 deletion components/cluster/command/upgrade.go
@@ -50,7 +50,7 @@ func newUpgradeCmd() *cobra.Command {
 		},
 	}
 	cmd.Flags().BoolVar(&gOpt.Force, "force", false, "Force upgrade without transferring PD leader")
-	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 600, "Timeout in seconds when transferring PD and TiKV store leaders")
+	cmd.Flags().Uint64Var(&gOpt.APITimeout, "transfer-timeout", 600, "Timeout in seconds when transferring PD and TiKV store leaders, also for TiCDC drain one capture")
 	cmd.Flags().BoolVarP(&gOpt.IgnoreConfigCheck, "ignore-config-check", "", false, "Ignore the config check result")
 	cmd.Flags().BoolVarP(&offlineMode, "offline", "", false, "Upgrade a stopped cluster")
 
3 changes: 2 additions & 1 deletion components/dm/command/scale_in.go
@@ -113,7 +113,7 @@ func ScaleInDMCluster(
 		continue
 	}
 	instCount[instance.GetHost()]--
-	if err := operator.StopAndDestroyInstance(ctx, topo, instance, options, instCount[instance.GetHost()] == 0); err != nil {
+	if err := operator.StopAndDestroyInstance(ctx, topo, instance, options, false, instCount[instance.GetHost()] == 0, tlsCfg); err != nil {
 		log.Warnf("failed to stop/destroy %s: %v", component.Name(), err)
 	}
 }
@@ -156,6 +156,7 @@ func ScaleInDMCluster(
 	[]dm.Instance{instance},
 	noAgentHosts,
 	options.OptTimeout,
+	false,
 	false, /* evictLeader */
 	&tls.Config{}, /* not used as evictLeader is false */
 ); err != nil {
37 changes: 24 additions & 13 deletions go.mod
@@ -9,7 +9,6 @@ require (
 	github.com/AstroProfundis/tabby v1.1.1-color
 	github.com/BurntSushi/toml v1.1.0
 	github.com/ScaleFT/sshkeys v0.0.0-20200327173127-6142f742bca5
-	github.com/VividCortex/ewma v1.2.0 // indirect
 	github.com/alecthomas/assert v1.0.0
 	github.com/appleboy/easyssh-proxy v1.3.10-0.20211209134747-6671f69d85f5
 	github.com/asaskevich/EventBus v0.0.0-20200907212545-49d423059eef
@@ -24,23 +23,19 @@ require (
 	github.com/gofrs/flock v0.8.1
 	github.com/gogo/protobuf v1.3.2
 	github.com/golang/protobuf v1.5.2
-	github.com/golang/snappy v0.0.4 // indirect
 	github.com/google/uuid v1.3.0
 	github.com/gorilla/mux v1.8.0
 	github.com/grpc-ecosystem/grpc-gateway v1.16.0
 	github.com/jeremywohl/flatten v1.0.1
 	github.com/joomcode/errorx v1.1.0
 	github.com/juju/ansiterm v0.0.0-20210929141451-8b71cc96ebdc
-	github.com/kr/text v0.2.0 // indirect
-	github.com/mattn/go-colorable v0.1.12 // indirect
 	github.com/mattn/go-runewidth v0.0.13
 	github.com/otiai10/copy v1.7.0
 	github.com/pingcap/check v0.0.0-20211026125417-57bd13f7b5f0
 	github.com/pingcap/errors v0.11.5-0.20201126102027-b0a155152ca3
 	github.com/pingcap/failpoint v0.0.0-20220423142525-ae43b7f4e5c3
 	github.com/pingcap/fn v1.0.0
 	github.com/pingcap/kvproto v0.0.0-20220525022339-6aaebf466305
-	github.com/pingcap/log v1.1.0 // indirect
 	github.com/pingcap/tidb-insight/collector v0.0.0-20220111101533-227008e9835b
 	github.com/pkg/errors v0.9.1
 	github.com/prometheus/client_model v0.2.0
@@ -55,29 +50,45 @@ require (
 	github.com/spf13/cobra v1.4.0
 	github.com/spf13/pflag v1.0.5
 	github.com/stretchr/testify v1.7.1
-	github.com/syndtr/goleveldb v1.0.1-0.20190318030020-c3a204f8e965 // indirect
 	github.com/tj/go-termd v0.0.1
-	github.com/tklauser/go-sysconf v0.3.10 // indirect
-	github.com/tklauser/numcpus v0.5.0 // indirect
-	github.com/yusufpapurcu/wmi v1.2.2 // indirect
 	go.etcd.io/etcd/client/pkg/v3 v3.5.4
 	go.etcd.io/etcd/client/v3 v3.5.4
 	go.uber.org/atomic v1.9.0
-	go.uber.org/multierr v1.8.0 // indirect
 	go.uber.org/zap v1.21.0
 	golang.org/x/crypto v0.0.0-20220525230936-793ad666bf5e
 	golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4
-	golang.org/x/net v0.0.0-20220524220425-1d687d428aca // indirect
 	golang.org/x/sync v0.0.0-20220513210516-0976fa681c29
 	golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a
 	golang.org/x/term v0.0.0-20220526004731-065cf7ba2467
 	golang.org/x/text v0.3.7
-	golang.org/x/tools v0.1.11 // indirect
-	google.golang.org/appengine v1.6.7 // indirect
 	google.golang.org/genproto v0.0.0-20220525015930-6ca3db687a9d
 	google.golang.org/grpc v1.46.2
 	gopkg.in/ini.v1 v1.66.4
 	gopkg.in/yaml.v2 v2.4.0
 	gopkg.in/yaml.v3 v3.0.0
 	software.sslmate.com/src/go-pkcs12 v0.2.0
 )
+
+require (
+	github.com/VividCortex/ewma v1.2.0 // indirect
+	github.com/benbjohnson/clock v1.3.0 // indirect
+	github.com/fsnotify/fsnotify v1.5.1 // indirect
+	github.com/golang/snappy v0.0.4 // indirect
+	github.com/google/go-cmp v0.5.7 // indirect
+	github.com/kr/pretty v0.3.0 // indirect
+	github.com/mattn/go-colorable v0.1.12 // indirect
+	github.com/onsi/ginkgo v1.16.5 // indirect
+	github.com/onsi/gomega v1.18.1 // indirect
+	github.com/pingcap/log v1.1.0 // indirect
+	github.com/syndtr/goleveldb v1.0.1-0.20210305035536-64b5b1c73954 // indirect
+	github.com/tinylib/msgp v1.1.6 // indirect
+	github.com/tklauser/go-sysconf v0.3.10 // indirect
+	github.com/tklauser/numcpus v0.5.0 // indirect
+	github.com/yusufpapurcu/wmi v1.2.2 // indirect
+	go.etcd.io/bbolt v1.3.6 // indirect
+	go.uber.org/goleak v1.1.12 // indirect
+	go.uber.org/multierr v1.8.0 // indirect
+	golang.org/x/net v0.0.0-20220524220425-1d687d428aca // indirect
+	golang.org/x/tools v0.1.11 // indirect
+	google.golang.org/appengine v1.6.7 // indirect
+)