mcs: support multi keyspace group in tso service (tikv#88)
* client: refine serviceModeKeeper code (tikv#6201)

ref tikv#5895

Some code refinements for `serviceModeKeeper`.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* *: use revision for watch test (tikv#6205)

ref tikv#6071

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* *: remove unnecessary rand init (tikv#6207)

close tikv#6134

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* Refactor TSO forward/dispatcher to be shared by both PD and TSO (tikv#6175)

ref tikv#5895

Add general tso forward/dispatcher for independent pd(tso)/tso services and cross cluster forwarding.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* Add basic multi-keyspace-group management (tikv#6214)

ref tikv#5895

Support basic functions of multi-keyspace-group management

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* *: support keyspace group RESTful API (tikv#6229)

ref tikv#6231

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: add more tso tests (tikv#6184)

ref tikv#5836

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* client: fix compatibility problem of pd client (tikv#6244)

close tikv#6243

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* *: unify the key prefix (tikv#6248)

ref tikv#5836

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* *: remove cluster dependency from keyspace (tikv#6249)

ref tikv#6231

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* *: make code clear by rename `isServing` to `isRunning` (tikv#6258)

ref tikv#4399

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* cgroup: fix the path problem due to special container name (tikv#6267)

close tikv#6266

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* server: fix watch keyspace revision (tikv#6251)

ref tikv#5895

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* tso, server: refine the TSO allocator manager parameters (tikv#6269)

ref tikv#5895

- Refine the TSO allocator manager parameters.
- Always run `tsoAllocatorLoop` to advance the Global TSO.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* tso: unify the TSO ServiceConfig and ConfigProvider interfaces (tikv#6272)

ref tikv#5895

Unify the TSO `ServiceConfig` and `ConfigProvider` interfaces.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* Load initial assignment and dynamically watch/apply keyspace groups' membership/distribution change (tikv#6247)

ref tikv#6232

Load initial keyspace group assignment.
Dynamically watch/apply keyspace groups' membership/distribution change.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* *: define user kind for keyspace group (tikv#6241)

ref tikv#6231

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* Add more failure tests when tso service loading initial keyspace groups assignment (tikv#6280)

ref tikv#6232

Add more failure tests when tso service loading initial keyspace groups assignment

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* Apply multi-keyspace-group membership to tso service and handle inconsistency issue (tikv#6282)

ref tikv#6232

Apply multi-keyspace-group membership to tso service and handle inconsistency issue.

1. Add KeyspaceLookupTable to endpoint.KeyspaceGroup
type KeyspaceGroup struct {
        ...
        // KeyspaceLookupTable is for fast lookup if a given keyspace belongs to this keyspace group.
        // It's not persisted and will be built when loading from storage.
        KeyspaceLookupTable map[uint32]struct{} `json:"-"`
}

2. After loading keyspace groups, the Keyspace Group Manager builds KeyspaceLookupTable for every keyspace group.

3. When the Keyspace Group Manager handles tso requests, it uses KeyspaceLookupTable to check whether the requested keyspace still belongs to the requested keyspace group. If not, it returns the current keyspace group id in the tso response header.
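
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the trimmed-down `KeyspaceGroup` struct, its `Keyspaces` field, and the helpers `buildLookupTable` and `currentGroupOf` are assumptions introduced only for this example.

```go
package main

import "fmt"

// KeyspaceGroup is a hypothetical, trimmed-down stand-in for endpoint.KeyspaceGroup.
type KeyspaceGroup struct {
	ID        uint32
	Keyspaces []uint32 // assumed persisted membership list
	// KeyspaceLookupTable is not persisted; it is rebuilt after loading.
	KeyspaceLookupTable map[uint32]struct{} `json:"-"`
}

// buildLookupTable fills KeyspaceLookupTable from the Keyspaces list (step 2).
func (g *KeyspaceGroup) buildLookupTable() {
	g.KeyspaceLookupTable = make(map[uint32]struct{}, len(g.Keyspaces))
	for _, id := range g.Keyspaces {
		g.KeyspaceLookupTable[id] = struct{}{}
	}
}

// currentGroupOf returns the ID of the group that currently contains keyspaceID.
// The second result is false when no loaded group contains the keyspace.
func currentGroupOf(groups []*KeyspaceGroup, keyspaceID uint32) (uint32, bool) {
	for _, g := range groups {
		if _, found := g.KeyspaceLookupTable[keyspaceID]; found {
			return g.ID, true
		}
	}
	return 0, false
}

func main() {
	g1 := &KeyspaceGroup{ID: 1, Keyspaces: []uint32{10, 11}}
	g2 := &KeyspaceGroup{ID: 2, Keyspaces: []uint32{20}}
	groups := []*KeyspaceGroup{g1, g2}
	for _, g := range groups {
		g.buildLookupTable()
	}

	// A request for keyspace 20 arrives carrying the stale group ID 1 (step 3).
	requestedGroup := uint32(1)
	current, ok := currentGroupOf(groups, 20)
	if ok && current != requestedGroup {
		fmt.Printf("keyspace 20 moved: requested group %d, current group %d\n", requestedGroup, current)
	}
}
```

A set-shaped map keeps the membership check O(1) per request, which matters because the check sits on the tso request path.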

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* *: auto assign keyspace group (tikv#6268)

close tikv#6231

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* keyspace, api: support the keyspace group split (tikv#6293)

ref tikv#6232

Support the keyspace group split and add related tests.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* Improve lock mechanism in tso.KeyspaceGroupManager (tikv#6305)

ref tikv#6232

Use the RWMutex instead of individual atomic types to better protect the state of the keyspace group manager

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* keyspace: add split-from field for endpoint.KeyspaceGroup (tikv#6309)

ref tikv#6232

Add `split-from` field for `endpoint.KeyspaceGroup`.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* Add read lock at one place for protection and better structure (tikv#6310)

ref tikv#6232, ref tikv#6305

follow-up tikv#6305 
Add read lock at one place for protection and better structure

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* tso: optimize function signatures to reduce parameter passing (tikv#6315)

ref tikv#6232

Optimize function signatures to reduce parameter passing.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* *: bootstrap keyspace group when server is in API mode (tikv#6308)

ref tikv#6231

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* keyspace: avoid keyspace being updated during the split (tikv#6316)

ref tikv#6232

Prevent keyspace from being updated during the split.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* *: fix `TestConcurrentlyReset` (tikv#6318)

close tikv#6275

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* bootstrap default keyspace group in the tso service (tikv#6306)

ref tikv#6232

Changes:

1. Introduce the initialization logic of the default keyspace group.
- If the default keyspace group isn't configured in etcd, every tso node/pod should initialize it and join the election for the primary of this group.
- If the default keyspace group is configured in etcd, the tso nodes/pods that are assigned to this group will initialize it and join the election for the primary of this group.

2. Introduce the keyspace group membership restriction -- default keyspace always belongs to default keyspace group.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* *: change the log level (tikv#6324)

ref tikv#6232

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* *: fix the missing log panic (tikv#6325)

close tikv#6257

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: fix watch primary address revision and update cache when meets not leader  (tikv#6279)

ref tikv#5895

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* tso, member: support TSO split based on keyspace group split (tikv#6313)

ref tikv#6232

Support TSO split based on keyspace group split.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: support metrics HTTP interface for tso/resource manager server (tikv#6329)

ref tikv#5895

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* tso: put finishSplitKeyspaceGroup into the critical section (tikv#6331)

ref tikv#6232

Put `finishSplitKeyspaceGroup` into the critical section.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* *: make `TestServerRegister` stable (tikv#6337)

close tikv#6334

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* tests: divide all tests into the CI chunks including submodule tests (tikv#6198)

ref tikv#6181, ref tikv#6183

Divide all tests into the CI chunks including submodule tests.

Signed-off-by: JmPotato <ghzpotato@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* tests: introduce TSO TestCluster in the test (tikv#6333)

ref tikv#6232

Introduce TSO `TestCluster` in the test.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: add balancer for keyspace group (tikv#6274)

close tikv#6233

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* Fixed bugs in tso service registry watching loop. (tikv#6346)

ref tikv#6343

Fixed the following two bugs:
1. When re-watching a range to continue from where the last watch left off, the revision must be wresp.Header.Revision + 1 instead of wresp.Header.Revision, where wresp.Header.Revision is the revision indicated in the response of the last watch. Because of this bug, the loop kept processing the same event endlessly.
2. In the tso service watch loop in pkg/keyspace/tso_keyspace_group.go, if the event is a delete event, json.Unmarshal(event.Kv.Value, s) fails with the error "unexpected end of JSON input", so there is no way to get s.serviceAddr from the result of json.Unmarshal.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* mcs: fix double compression of prom handler (tikv#6339)

ref prometheus/client_golang#622, ref tikv#5895

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* tests, tso: add more TSO split tests (tikv#6338)

ref tikv#6232

Add more TSO split tests.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* keyspace, tso: fix next revision to watch after watch/Get/RangeScan (tikv#6353)

ref tikv#6232

The next revision to watch should always be Header.Revision + 1, where Header is the response header of the last watch/Get/RangeScan.
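
The off-by-one can be illustrated with a toy, in-memory model. This sketch does not use the etcd client; the `store` type and `watchFrom` are hypothetical and only assume that a watch started at revision `rev` delivers every event whose revision is greater than or equal to `rev`.

```go
package main

import "fmt"

// event is a single change recorded at a given revision.
type event struct {
	Revision int64
	Key      string
}

// store is a toy stand-in for a revisioned key-value log.
type store struct{ events []event }

// watchFrom returns every event whose revision is >= rev, together with the
// latest revision of the store (the analogue of Header.Revision).
func (s *store) watchFrom(rev int64) (evs []event, headerRevision int64) {
	for _, e := range s.events {
		if e.Revision >= rev {
			evs = append(evs, e)
		}
		headerRevision = e.Revision
	}
	return evs, headerRevision
}

func main() {
	s := &store{events: []event{{1, "a"}, {2, "b"}, {3, "c"}}}

	// The first watch consumes everything and reports the header revision.
	_, header := s.watchFrom(1)

	// Resuming at header replays the last event; resuming at header+1 does not.
	replayed, _ := s.watchFrom(header)
	fresh, _ := s.watchFrom(header + 1)
	fmt.Println("resume at header:", len(replayed), "event(s) replayed")
	fmt.Println("resume at header+1:", len(fresh), "event(s) replayed")
}
```

In this model, resuming at `header` hands back the event at that revision again on every re-watch, which matches the endless reprocessing described in tikv#6346.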

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* mcs, tests: use TSO cluster to do the failover test (tikv#6356)

ref tikv#5895

Use TSO cluster to do the failover test.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* fix startWatchLoop leak (tikv#6352)

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* mcs: update client when meet transport is closing (tikv#6341)

* mcs: update client when meet transport is closing

Signed-off-by: lhy1024 <admin@liudos.us>

* address comments

Signed-off-by: lhy1024 <admin@liudos.us>

* add retry

Signed-off-by: lhy1024 <admin@liudos.us>

---------

Signed-off-by: lhy1024 <admin@liudos.us>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* add bootstrap test (tikv#6347)

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* Fix flaky TestLoadKeyspaceGroupsAssignment test (tikv#6365)

Reduce the number of keyspace groups from 4096 to 512 in TestKeyspaceGroupManagerTestSuite/TestLoadKeyspaceGroupsAssignment to avoid a timeout when the test runs slowly.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* mcs, tso: fix ts fallback caused by multi-primary of the same keyspace group  (tikv#6362)

* Change participant election-prefix from listen-addr to advertise-listen-addr to guarantee uniqueness.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* Add TestPariticipantStartWithAdvertiseListenAddr

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* Add comments to fix go fmt errors

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

---------

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Co-authored-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* fix log output (tikv#6364)

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* mcs: fix duplicate start of RaftCluster. (tikv#6358)

* Using double-checked locking to avoid duplicate start of RaftCluster.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* Handle feedback

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* improve locking

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* handle feedback

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

---------

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Co-authored-by: Ryan Leung <rleungx@gmail.com>
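
Double-checked locking, as named in tikv#6358 above, can be sketched like this. The `cluster` type, its fields, and `Start` are hypothetical stand-ins and not the RaftCluster code; the sketch only shows the pattern of a cheap check under a read lock followed by a re-check under the write lock.

```go
package main

import (
	"fmt"
	"sync"
)

// cluster is a hypothetical component that must be started at most once.
type cluster struct {
	mu         sync.RWMutex
	running    bool
	startCount int // counts how many times the start logic actually ran
}

// isRunning is the cheap first check, taken under the read lock.
func (c *cluster) isRunning() bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.running
}

// Start runs the start logic only if no other goroutine has done so already.
func (c *cluster) Start() {
	if c.isRunning() {
		return // fast path: already started
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	// Second check: another goroutine may have started the cluster between
	// the read-locked check and acquiring the write lock.
	if c.running {
		return
	}
	c.startCount++
	c.running = true
}

func main() {
	c := &cluster{}
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Start()
		}()
	}
	wg.Wait()
	fmt.Println("start logic ran", c.startCount, "time(s)")
}
```

Without the second check, two goroutines that both pass the read-locked check could each run the start logic once they take the write lock in turn.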

* Add retry mechanism for updating keyspace group (tikv#6372)

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* mcs: add set handler for balancer and alloc node for default keyspace group (tikv#6342)

ref tikv#6233

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* mcs, tso: fix nil pointer dereference when (*AllocatorManager).GetMember (tikv#6383)

close tikv#6381

If the desired keyspace group falls back to the default keyspace group and the AM isn't initialized, return a "not served" error.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* mcs, tso: support multi-keyspace-group and its service discovery in E2E path (tikv#6321)

ref tikv#6232

Support multi-keyspace-group in PD(TSO) client

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* client: add `NewClientWithKeyspaceName` for client (tikv#6380)

ref tikv#5895

Signed-off-by: Ryan Leung <rleungx@gmail.com>

* keyspace, tso: check the replica count before the split (tikv#6382)

ref tikv#6233

Check the replica count before the split.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: lhy1024 <admin@liudos.us>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* tso: fix bugs to make split test case to pass (tikv#6389)

ref tikv#6232

Fix bugs so that the split test case passes.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* keyspace: patrol keyspace assignment before the first split (tikv#6388)

ref tikv#6232

Patrol the keyspace assignment before the first split to make sure every keyspace has its group assignment.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* tests: fix Flaky TestMicroserviceTSOServer/TestConcurrentlyReset (tikv#6396)

close tikv#6385

Get a copy of now and then call base.add, because now is shared by all goroutines and now.add() modifies its receiver in place, which is neither atomic nor safe across goroutines.
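
A minimal sketch of the idea, assuming a timestamp type whose `add` method mutates its receiver. The `ts` type and the `offsets` helper are hypothetical and are not taken from the test itself.

```go
package main

import "fmt"

// ts is a hypothetical timestamp whose add method mutates the receiver.
type ts struct{ physical int64 }

func (t *ts) add(d int64) { t.physical += d }

// offsets computes base+i for i in [0, n) concurrently. Each goroutine works
// on its own copy of base, so the shared value is only ever read.
func offsets(base ts, n int) []int64 {
	out := make([]int64, n)
	done := make(chan struct{})
	for i := 0; i < n; i++ {
		go func(i int) {
			local := base // private copy; calling base.add here would be a data race
			local.add(int64(i))
			out[i] = local.physical // each goroutine writes a distinct element
			done <- struct{}{}
		}(i)
	}
	for i := 0; i < n; i++ {
		<-done
	}
	return out
}

func main() {
	fmt.Println(offsets(ts{physical: 100}, 4))
}
```

Calling `add` directly on the shared value from several goroutines would be an unsynchronized read-modify-write, which is the kind of race the fix describes.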

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* keyspace, slice: improve code efficiency in membership ops (tikv#6392)

ref tikv#6231

Improve code efficiency in membership ops

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* tests: enable TestTSOKeyspaceGroupSplitClient (tikv#6398)

ref tikv#6232

Enable `TestTSOKeyspaceGroupSplitClient`.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* tests: add more tests for multiple keyspace groups (tikv#6395)

ref tikv#5895

Add CheckMultiKeyspacesTSO() and WaitForMultiKeyspacesTSOAvailable in test utility. Add TestTSOKeyspaceGroupManager/TestKeyspacesServedByNonDefaultKeyspaceGroup. Cover TestGetTS, TestGetTSAsync, TestUpdateAfterResetTSO in TestMicroserviceTSOClient for multiple keyspace groups.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* tests: fix failpoint disable (tikv#6401)

ref tikv#4399

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* client: retry load keyspace meta when creating a new client (tikv#6402)

ref tikv#5895

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* Fix test issue in TestRandomResignLeader. (tikv#6410)

close tikv#6404

We need to make sure the selected keyspaces are from different keyspace groups; otherwise, multiple goroutines below could try to resign the primary of the same keyspace group and cause a race condition.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* keyspace, api2: fix the keyspace assignment patrol consistency (tikv#6397)

ref tikv#6232

Fix the keyspace assignment patrol consistency.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* election, tso: fix data race in lease.go (tikv#6379)

close tikv#6378

fix data race in lease.go

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* mcs: fix forward test with pd mode client (tikv#6290)

ref tikv#5895, ref tikv#6279, close tikv#6289

Signed-off-by: lhy1024 <admin@liudos.us>

* keyspace: patrol the keyspace assignment in batch (tikv#6411)

ref tikv#6232

Patrol the keyspace assignment in batch.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

* etcdutil: add watch loop (tikv#6390)

close tikv#6391

Signed-off-by: lhy1024 <admin@liudos.us>

* mcs, tso: add API interface to obtain the TSO keyspace group member info (tikv#6373)

ref tikv#6232

Add API interface to obtain the TSO keyspace group member info.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* pkg: move operator_check out from test_util tikv#6162

Signed-off-by: lhy1024 <admin@liudos.us>

* keyspace: wait region split when creating keyspace (tikv#6414)

ref tikv#6231

Signed-off-by: zeminzhou <zhouzemin@pingcap.com>
Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: zzm <zhouzemin@pingcap.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* mcs: use getClusterInfo to check whether api service is ready (tikv#6422)

ref tikv#5836

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* fix data race by replace clone (tikv#6242)

close tikv#6230

Signed-off-by: bufferflies <1045931706@qq.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

* fix test and git mod tidy

Signed-off-by: lhy1024 <admin@liudos.us>

* revert makefile

Signed-off-by: lhy1024 <admin@liudos.us>

* fix ctx in watch loop

Signed-off-by: lhy1024 <admin@liudos.us>

* delete pd-tests.yaml

Signed-off-by: lhy1024 <admin@liudos.us>

* pd-ctl, tests: add the keyspace group commands (tikv#6423)

ref tikv#6232

Add the keyspace group commands to show and split keyspace groups.

Signed-off-by: JmPotato <ghzpotato@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* Handle compatibility issue in GetClusterInfo RPC (tikv#6434)

ref tikv#5895, close tikv#6448

Handle the compatibility issue in the GetClusterInfo RPC

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* Provide GetMinTS API to solve the compatibility issue brought by multi-timeline tso (tikv#6421)

ref tikv#6142

1. Import kvproto change to introduce GetMinTS rpc in the TSO service.
2. Add server side implementation for GetMinTS rpc.
3. Add client side implementation for GetMinTS rpc.
4. Add unit test

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* tso: use less interval when waiting api service (tikv#6451)

close tikv#6449

Signed-off-by: lhy1024 <admin@liudos.us>

* etcdutil: fix ctx in watch loop (tikv#6445)

close tikv#6439

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* Fix "non-default keyspace groups use the same timestamp path by mistake" (tikv#6457)

close tikv#6453, close tikv#6465

The tso servers load keyspace groups asynchronously. Make sure all keyspace groups
are available for serving tso requests from the corresponding keyspaces by querying
IsKeyspaceServing(keyspaceID, the desired KeyspaceGroupID). If the default keyspace group id
is used in the query, it always returns true, because the keyspace is served by the default keyspace group
before the keyspace groups are loaded.

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>

* TSO microservice discovery fallback path shouldn't call FindGroupByKeyspaceID (tikv#6473)

close tikv#6472

TSO microservice discovery fallback path shouldn't call FindGroupByKeyspaceID

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* Revert "cgroup: fix the path problem due to special container name (tikv#6267)"

This reverts commit 0c4cf7f947799e5c45d6e37448475b921044bdde.

* *: rm debug file (tikv#6458)

ref tikv#4399

Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* Revert "*: remove unnecessary rand init (tikv#6207)"

This reverts commit 7383ded7581c417a3866da271eb2ec0a27b5a6c8.

* mcs, tso: handle null keyspace (tikv#6476)

ref tikv#5895

For API V1 and the legacy path (NewClientWithContext w/o keyspace id/name),
use the Null Keyspace ID (uint32max) instead of the default keyspace id and
make sure it can be served by the default keyspace group's timeline. Modify the tests accordingly.
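
The routing rule can be sketched as below. The constant values, the `assignment` map, and `resolveGroup` are assumptions made for illustration; in particular, sending unassigned keyspaces to the default group is a choice of this sketch, not something the commit message states.

```go
package main

import (
	"fmt"
	"math"
)

// All three values are assumptions made for this sketch.
const (
	defaultKeyspaceGroupID uint32 = 0
	defaultKeyspaceID      uint32 = 0
	nullKeyspaceID         uint32 = math.MaxUint32 // "uint32max" in the commit message
)

// resolveGroup returns the keyspace group whose timeline should serve keyspaceID.
// assignment maps keyspace IDs to keyspace group IDs.
func resolveGroup(keyspaceID uint32, assignment map[uint32]uint32) uint32 {
	// The null keyspace (legacy clients) and the default keyspace are always
	// served by the default keyspace group.
	if keyspaceID == nullKeyspaceID || keyspaceID == defaultKeyspaceID {
		return defaultKeyspaceGroupID
	}
	if groupID, ok := assignment[keyspaceID]; ok {
		return groupID
	}
	// Assumption: keyspaces without an assignment fall back to the default group.
	return defaultKeyspaceGroupID
}

func main() {
	assignment := map[uint32]uint32{7: 3}
	fmt.Println(resolveGroup(nullKeyspaceID, assignment)) // legacy client
	fmt.Println(resolveGroup(7, assignment))              // keyspace with an assignment
}
```

Using a dedicated sentinel rather than the default keyspace id lets the server tell a legacy client apart from a client that explicitly asked for the default keyspace.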

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* mcs, tso: print TSO service discovery fallback log just once (tikv#6478)

ref tikv#5895

Print TSO service discovery fallback log just once

Signed-off-by: Bin Shi <binshi.bing@gmail.com>

* client: return error if the keyspace meta cannot be found (tikv#6479)

ref tikv#6142

Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* client: support use API context to create client (tikv#6482)

ref tikv#6142

Signed-off-by: Ryan Leung <rleungx@gmail.com>

---------

Signed-off-by: Bin Shi <binshi.bing@gmail.com>
Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: JmPotato <ghzpotato@gmail.com>
Co-authored-by: JmPotato <ghzpotato@gmail.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: Ryan Leung <rleungx@gmail.com>
Co-authored-by: Bin Shi <39923490+binshi-bing@users.noreply.github.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Co-authored-by: zzm <zhouzemin@pingcap.com>
Co-authored-by: buffer <1045931706@qq.com>
8 people authored May 19, 2023
1 parent 60e97c4 commit dd7f753
Showing 171 changed files with 11,439 additions and 2,752 deletions.
355 changes: 286 additions & 69 deletions client/client.go

Large diffs are not rendered by default.

19 changes: 10 additions & 9 deletions client/client_test.go
@@ -24,6 +24,7 @@ import (
 	"github.com/stretchr/testify/require"
 	"github.com/tikv/pd/client/testutil"
 	"github.com/tikv/pd/client/tlsutil"
+	"github.com/tikv/pd/client/tsoutil"
 	"go.uber.org/goleak"
 	"google.golang.org/grpc"
 )
@@ -32,13 +33,13 @@ func TestMain(m *testing.M) {
 	goleak.VerifyTestMain(m, testutil.LeakOptions...)
 }
 
-func TestTsLessEqual(t *testing.T) {
+func TestTSLessEqual(t *testing.T) {
 	re := require.New(t)
-	re.True(tsLessEqual(9, 9, 9, 9))
-	re.True(tsLessEqual(8, 9, 9, 8))
-	re.False(tsLessEqual(9, 8, 8, 9))
-	re.False(tsLessEqual(9, 8, 9, 6))
-	re.True(tsLessEqual(9, 6, 9, 8))
+	re.True(tsoutil.TSLessEqual(9, 9, 9, 9))
+	re.True(tsoutil.TSLessEqual(8, 9, 9, 8))
+	re.False(tsoutil.TSLessEqual(9, 8, 8, 9))
+	re.False(tsoutil.TSLessEqual(9, 8, 9, 6))
+	re.True(tsoutil.TSLessEqual(9, 6, 9, 8))
 }
 
 func TestUpdateURLs(t *testing.T) {
@@ -58,11 +59,11 @@ func TestUpdateURLs(t *testing.T) {
 	cli := &pdServiceDiscovery{option: newOption()}
 	cli.urls.Store([]string{})
 	cli.updateURLs(members[1:])
-	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2]}), cli.GetURLs())
+	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2]}), cli.GetServiceURLs())
 	cli.updateURLs(members[1:])
-	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2]}), cli.GetURLs())
+	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2]}), cli.GetServiceURLs())
 	cli.updateURLs(members)
-	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2], members[0]}), cli.GetURLs())
+	re.Equal(getURLs([]*pdpb.Member{members[1], members[3], members[2], members[0]}), cli.GetServiceURLs())
 }
 
 const testClientURL = "tmp://test.url:5255"
40 changes: 25 additions & 15 deletions client/errs/errno.go
@@ -21,27 +21,37 @@ import (
)

const (
// NotLeaderErr indicates the the non-leader member received the requests which should be received by leader.
// NotLeaderErr indicates the non-leader member received the requests which should be received by leader.
// Note: keep the same as the ones defined on the server side, because the client side checks if an error message
// contains this string to judge whether the leader is changed.
NotLeaderErr = "is not leader"
// MismatchLeaderErr indicates the the non-leader member received the requests which should be received by leader.
// MismatchLeaderErr indicates the non-leader member received the requests which should be received by leader.
// Note: keep the same as the ones defined on the server side, because the client side checks if an error message
// contains this string to judge whether the leader is changed.
MismatchLeaderErr = "mismatch leader id"
RetryTimeoutErr = "retry timeout"
// NotServedErr indicates an tso node/pod received the requests for the keyspace groups which are not served by it.
// Note: keep the same as the ones defined on the server side, because the client side checks if an error message
// contains this string to judge whether the leader is changed.
NotServedErr = "is not served"
RetryTimeoutErr = "retry timeout"
)

// client errors
var (
ErrClientGetProtoClient = errors.Normalize("failed to get proto client", errors.RFCCodeText("PD:client:ErrClientGetProtoClient"))
ErrClientCreateTSOStream = errors.Normalize("create TSO stream failed, %s", errors.RFCCodeText("PD:client:ErrClientCreateTSOStream"))
ErrClientTSOStreamClosed = errors.Normalize("encountered TSO stream being closed unexpectedly", errors.RFCCodeText("PD:client:ErrClientTSOStreamClosed"))
ErrClientGetTSOTimeout = errors.Normalize("get TSO timeout", errors.RFCCodeText("PD:client:ErrClientGetTSOTimeout"))
ErrClientGetTSO = errors.Normalize("get TSO failed, %v", errors.RFCCodeText("PD:client:ErrClientGetTSO"))
ErrClientGetLeader = errors.Normalize("get leader from %v error", errors.RFCCodeText("PD:client:ErrClientGetLeader"))
ErrClientGetMember = errors.Normalize("get member failed", errors.RFCCodeText("PD:client:ErrClientGetMember"))
ErrClientGetClusterInfo = errors.Normalize("get cluster info failed", errors.RFCCodeText("PD:client:ErrClientGetClusterInfo"))
ErrClientUpdateMember = errors.Normalize("update member failed, %v", errors.RFCCodeText("PD:client:ErrUpdateMember"))
ErrClientProtoUnmarshal = errors.Normalize("failed to unmarshal proto", errors.RFCCodeText("PD:proto:ErrClientProtoUnmarshal"))
ErrClientGetMultiResponse = errors.Normalize("get invalid value response %v, must only one", errors.RFCCodeText("PD:client:ErrClientGetMultiResponse"))
ErrClientGetServingEndpoint = errors.Normalize("get serving endpoint failed", errors.RFCCodeText("PD:client:ErrClientGetServingEndpoint"))
ErrClientGetProtoClient = errors.Normalize("failed to get proto client", errors.RFCCodeText("PD:client:ErrClientGetProtoClient"))
ErrClientCreateTSOStream = errors.Normalize("create TSO stream failed, %s", errors.RFCCodeText("PD:client:ErrClientCreateTSOStream"))
ErrClientTSOStreamClosed = errors.Normalize("encountered TSO stream being closed unexpectedly", errors.RFCCodeText("PD:client:ErrClientTSOStreamClosed"))
ErrClientGetTSOTimeout = errors.Normalize("get TSO timeout", errors.RFCCodeText("PD:client:ErrClientGetTSOTimeout"))
ErrClientGetTSO = errors.Normalize("get TSO failed, %v", errors.RFCCodeText("PD:client:ErrClientGetTSO"))
ErrClientGetMinTSO = errors.Normalize("get min TSO failed, %v", errors.RFCCodeText("PD:client:ErrClientGetMinTSO"))
ErrClientGetLeader = errors.Normalize("get leader failed, %v", errors.RFCCodeText("PD:client:ErrClientGetLeader"))
ErrClientGetMember = errors.Normalize("get member failed", errors.RFCCodeText("PD:client:ErrClientGetMember"))
ErrClientGetClusterInfo = errors.Normalize("get cluster info failed", errors.RFCCodeText("PD:client:ErrClientGetClusterInfo"))
ErrClientUpdateMember = errors.Normalize("update member failed, %v", errors.RFCCodeText("PD:client:ErrUpdateMember"))
ErrClientProtoUnmarshal = errors.Normalize("failed to unmarshal proto", errors.RFCCodeText("PD:proto:ErrClientProtoUnmarshal"))
ErrClientGetMultiResponse = errors.Normalize("get invalid value response %v, must only one", errors.RFCCodeText("PD:client:ErrClientGetMultiResponse"))
ErrClientGetServingEndpoint = errors.Normalize("get serving endpoint failed", errors.RFCCodeText("PD:client:ErrClientGetServingEndpoint"))
ErrClientFindGroupByKeyspaceID = errors.Normalize("can't find keyspace group by keyspace id", errors.RFCCodeText("PD:client:ErrClientFindGroupByKeyspaceID"))
)

// grpcutil errors
8 changes: 4 additions & 4 deletions client/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ require (
github.com/opentracing/opentracing-go v1.2.0
github.com/pingcap/errors v0.11.5-0.20211224045212-9687c2b0f87c
github.com/pingcap/failpoint v0.0.0-20210918120811-547c13e3eb00
github.com/pingcap/kvproto v0.0.0-20230321060725-1841520d34ba
github.com/pingcap/kvproto v0.0.0-20230511011722-6e0e8a7deaa1
github.com/pingcap/log v1.1.1-0.20221110025148-ca232912c9f3
github.com/prometheus/client_golang v1.11.0
github.com/stretchr/testify v1.8.1
Expand All @@ -31,9 +31,9 @@ require (
github.com/prometheus/procfs v0.6.0 // indirect
go.uber.org/atomic v1.9.0 // indirect
go.uber.org/multierr v1.7.0 // indirect
golang.org/x/net v0.2.0 // indirect
golang.org/x/sys v0.2.0 // indirect
golang.org/x/text v0.4.0 // indirect
golang.org/x/net v0.7.0 // indirect
golang.org/x/sys v0.5.0 // indirect
golang.org/x/text v0.7.0 // indirect
google.golang.org/genproto v0.0.0-20221202195650-67e5cbc046fd // indirect
google.golang.org/protobuf v1.28.1 // indirect
gopkg.in/natefinch/lumberjack.v2 v2.0.0 // indirect
16 changes: 8 additions & 8 deletions client/go.sum
@@ -81,8 +81,8 @@ github.com/pingcap/errors v0.11.5-0.20211224045212-9687c2b0f87c h1:xpW9bvK+HuuTm
github.com/pingcap/errors v0.11.5-0.20211224045212-9687c2b0f87c/go.mod h1:X2r9ueLEUZgtx2cIogM0v4Zj5uvvzhuuiu7Pn8HzMPg=
github.com/pingcap/failpoint v0.0.0-20210918120811-547c13e3eb00 h1:C3N3itkduZXDZFh4N3vQ5HEtld3S+Y+StULhWVvumU0=
github.com/pingcap/failpoint v0.0.0-20210918120811-547c13e3eb00/go.mod h1:4qGtCB0QK0wBzKtFEGDhxXnSnbQApw1gc9siScUl8ew=
-github.com/pingcap/kvproto v0.0.0-20230321060725-1841520d34ba h1:7g2yM0llENlRqtjboBKFBJ8N9SE01hPDpKuTwxBLpLM=
-github.com/pingcap/kvproto v0.0.0-20230321060725-1841520d34ba/go.mod h1:RjuuhxITxwATlt5adgTedg3ehKk01M03L1U4jNHdeeQ=
+github.com/pingcap/kvproto v0.0.0-20230511011722-6e0e8a7deaa1 h1:VXQ6Du/nKZ9IQnI9NWMzKbftWu8NV5pQkSLKIRzzGN4=
+github.com/pingcap/kvproto v0.0.0-20230511011722-6e0e8a7deaa1/go.mod h1:guCyM5N+o+ru0TsoZ1hi9lDjUMs2sIBjW3ARTEpVbnk=
github.com/pingcap/log v1.1.1-0.20221110025148-ca232912c9f3 h1:HR/ylkkLmGdSSDaD8IDP+SZrdhV1Kibl9KrHxJ9eciw=
github.com/pingcap/log v1.1.1-0.20221110025148-ca232912c9f3/go.mod h1:DWQW5jICDR7UJh4HtxXSM20Churx4CQL0fwL/SoOSA4=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
@@ -161,8 +161,8 @@ golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLL
golang.org/x/net v0.0.0-20200625001655-4c5254603344/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
-golang.org/x/net v0.2.0 h1:sZfSu1wtKLGlWI4ZZayP0ck9Y73K1ynO6gqzTdBVdPU=
-golang.org/x/net v0.2.0/go.mod h1:KqCZLdyyvdV855qA2rE3GC2aiw5xGR5TEjj8smXukLY=
+golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g=
+golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
@@ -187,14 +187,14 @@ golang.org/x/sys v0.0.0-20210124154548-22da62e12c0c/go.mod h1:h1NjWce9XRLGQEsW7w
golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20210603081109-ebe580a85c40/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.2.0 h1:ljd4t30dBnAvMZaQCevtY0xLLD0A+bRZXbgLMLU1F/A=
-golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
+golang.org/x/sys v0.5.0 h1:MUK/U/4lj1t1oPg0HfuXDN/Z1wv31ZJ/YcPiGccS4DU=
+golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
-golang.org/x/text v0.4.0 h1:BrVqGRd7+k1DiOgtnFvAkoQEWQvBc25ouMJM6429SFg=
-golang.org/x/text v0.4.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
+golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo=
+golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20191029041327-9cc4af7d6b2c/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
9 changes: 5 additions & 4 deletions client/grpcutil/grpcutil.go
@@ -90,12 +90,13 @@ func GetOrCreateGRPCConn(ctx context.Context, clientConns *sync.Map, addr string
 	if err != nil {
 		return nil, err
 	}
-	old, ok := clientConns.LoadOrStore(addr, cc)
-	if !ok {
+	conn, loaded := clientConns.LoadOrStore(addr, cc)
+	if !loaded {
 		// Successfully stored the connection.
 		return cc, nil
 	}
 	cc.Close()
-	log.Debug("use old connection", zap.String("target", cc.Target()), zap.String("state", cc.GetState().String()))
-	return old.(*grpc.ClientConn), nil
+	cc = conn.(*grpc.ClientConn)
+	log.Debug("use existing connection", zap.String("target", cc.Target()), zap.String("state", cc.GetState().String()))
+	return cc, nil
 }

0 comments on commit dd7f753
