
async rpc in put message to cluster #24

Open. Wants to merge 1 commit into base: master.
Conversation

@DoraALin (Collaborator) commented on May 9, 2017:

Signed-off-by: 元守 <lulin@youzan.com>
Change the RPC call to async in PutMessageToCluster & PutMessagesToCluster.
async in put messages to cluster

@@ -222,9 +224,10 @@ func (self *NsqdCoordinator) PutMessagesToCluster(topic *nsqd.Topic,
 	}
 	return false
 }
-	clusterErr := self.doSyncOpToCluster(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
+	clusterErr := self.doSyncOpToClusterAsync(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
@absolute8511 (Owner) commented on the diff:

No async for the cluster write; async is only for slave replication.

@@ -116,8 +117,8 @@ func (self *NsqdCoordinator) PutMessageToCluster(topic *nsqd.Topic,
 	return false
 }

-	clusterErr := self.doSyncOpToCluster(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
-		doRefresh, doSlaveSync, handleSyncResult)
+	clusterErr := self.doSyncOpToClusterAsync(true, coord, doLocalWrite, doLocalExit, doLocalCommit, doLocalRollback,
@absolute8511 (Owner) commented on the diff:

There is no async for the cluster write; async is only for slave replication.

if rpcErr == nil {
	wg.Add(1)
	id := nodeID
	go func() {
@absolute8511 (Owner) commented on the diff:

I think no goroutines should be used here; just waiting on the slave result channel is enough.
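
To illustrate the suggestion, here is a minimal, self-contained Go sketch (asyncSlaveSync, the node names, and the timeout are hypothetical stand-ins, not the coordinator's real API): the async RPCs are issued for every slave up front, and the caller then simply receives from each result channel, with no per-slave goroutine or WaitGroup.

```go
package main

import (
	"fmt"
	"time"
)

// asyncSlaveSync stands in for an async slave-replication RPC; it returns
// immediately with a channel that will carry the slave's reply.
func asyncSlaveSync(nodeID string) <-chan error {
	ch := make(chan error, 1)
	go func() { // this goroutine belongs to the simulated RPC layer, not the caller
		time.Sleep(10 * time.Millisecond) // simulated network round trip
		ch <- nil
	}()
	return ch
}

func main() {
	nodes := []string{"slave-1", "slave-2", "slave-3"}
	timeout := 200 * time.Millisecond

	// Fire all replication RPCs up front so they run concurrently.
	pending := make(map[string]<-chan error, len(nodes))
	for _, id := range nodes {
		pending[id] = asyncSlaveSync(id)
	}

	// Then just wait on each result channel; no extra goroutines here.
	for id, ch := range pending {
		select {
		case err := <-ch:
			fmt.Println(id, "replied, err =", err)
		case <-time.After(timeout):
			fmt.Println(id, "timed out")
		}
	}
}
```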

	syncStart = time.Now()
}
select {
case <-time.After(c.c.RequestTimeout):
@absolute8511 (Owner) commented on the diff:

Can we use only one timeout for all slaves? Once it times out, cancel all the slave replication and retry, or just fail.

@DoraALin (Collaborator, Author) replied:

Do we need one timeout for all? My thought is that if all async RPC calls need to wait, the elapsed time just degrades to that of a sync RPC call.
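
For comparison, a minimal sketch of the single-timeout approach being discussed (all names are hypothetical): one deadline is computed for the whole replication round and every per-slave wait is bounded by the time remaining, so the total wait stays within one RequestTimeout even though the slaves are drained one after another; it does not sum per-slave timeouts.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	nodes := []string{"slave-1", "slave-2", "slave-3"}
	requestTimeout := 200 * time.Millisecond

	// Pretend these channels were returned by already-issued async RPCs.
	pending := make(map[string]chan error, len(nodes))
	for _, id := range nodes {
		ch := make(chan error, 1)
		pending[id] = ch
		go func() { ch <- nil }() // simulated slave reply
	}

	deadline := time.Now().Add(requestTimeout) // one deadline shared by all waits
	failed := make(map[string]struct{})

	for id, ch := range pending {
		remaining := time.Until(deadline)
		if remaining <= 0 {
			failed[id] = struct{}{} // overall deadline already passed
			continue
		}
		select {
		case err := <-ch:
			if err != nil {
				failed[id] = struct{}{}
			}
		case <-time.After(remaining):
			// Deadline hit: the caller can cancel the remaining slaves and
			// retry or fail the whole write, as suggested above.
			failed[id] = struct{}{}
		}
	}
	fmt.Println("failed nodes:", failed)
}
```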

@DoraALin (Collaborator, Author) commented:

@absolute8511 for your review again.

@DoraALin (Collaborator, Author) commented:

@absolute8511 for your review

Commits:
- async in put messages to cluster (Signed-off-by: 元守 <lulin@youzan.com>)
- remove debug log (Signed-off-by: 元守 <lulin@youzan.com>)
- one doSyncOpToCluster (Signed-off-by: 元守 <lulin@youzan.com>)
@DoraALin (Collaborator, Author) commented:

@absolute8511 for your review. The async RPC response wait is now incorporated in SlaveWriteResult.GetResult().
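
A rough sketch of what such a wrapper could look like (the fields and construction below are assumptions for illustration, not the actual SlaveWriteResult type in this PR): the RPC layer fills a reply channel, and GetResult() blocks on it, bounded by a timeout.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type SlaveWriteResult struct {
	reply   chan error    // filled by the RPC layer when the slave responds
	timeout time.Duration // bound on how long GetResult may block
}

// GetResult blocks until the async RPC response arrives or the timeout hits.
func (r *SlaveWriteResult) GetResult() error {
	select {
	case err := <-r.reply:
		return err
	case <-time.After(r.timeout):
		return errors.New("wait slave write result timeout")
	}
}

func main() {
	res := &SlaveWriteResult{
		reply:   make(chan error, 1),
		timeout: 100 * time.Millisecond,
	}
	go func() { res.reply <- nil }() // simulated async RPC completion
	fmt.Println("slave write err:", res.GetResult())
}
```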

@@ -318,15 +320,12 @@ retrysync:
 	}
 }
 success = 0
-failedNodes = make(map[string]struct{})
@absolute8511 (Owner) commented on the diff:

Why remove this? While we retry, we need to reset the failed info for all nodes.

 var start time.Time
 if checkCost {
 	start = time.Now()
 }
 // should retry if failed, and the slave should keep the last success write to avoid the duplicated
-rpcErr = doSlaveSync(c, nodeID, tcData)
+retSlave := doSlaveSync(c, nodeID, tcData)
+rpcResps = append(rpcResps, RpcResp{nodeID, retSlave})
@absolute8511 (Owner) commented on the diff:

rpcResps and totalTimeout should be reset on retry.
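
A minimal sketch of the reset-on-retry point (hypothetical names and a trivial stand-in for the replication call): each pass through the retry loop re-initializes the success counter, failedNodes, rpcResps, and the overall deadline before doing any work, so state from a previous attempt cannot leak into the next one.

```go
package main

import (
	"fmt"
	"time"
)

type rpcResp struct {
	nodeID string
	err    error
}

// replicateOnce stands in for one replication attempt across all slaves.
func replicateOnce(nodes []string) []rpcResp {
	out := make([]rpcResp, 0, len(nodes))
	for _, id := range nodes {
		out = append(out, rpcResp{nodeID: id})
	}
	return out
}

func main() {
	nodes := []string{"slave-1", "slave-2"}
	const maxRetry = 3

	for attempt := 0; attempt < maxRetry; attempt++ {
		// Reset per-attempt state at the top of every retry.
		success := 0
		failedNodes := make(map[string]struct{})
		rpcResps := make([]rpcResp, 0, len(nodes))
		deadline := time.Now().Add(200 * time.Millisecond)

		rpcResps = append(rpcResps, replicateOnce(nodes)...)
		for _, r := range rpcResps {
			if r.err != nil || time.Now().After(deadline) {
				failedNodes[r.nodeID] = struct{}{}
				continue
			}
			success++
		}
		if len(failedNodes) == 0 {
			fmt.Printf("attempt %d: all %d slaves ok\n", attempt, success)
			return
		}
	}
	fmt.Println("replication failed after retries")
}
```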

@absolute8511 (Owner) commented:

@DoraALin Could you please compare detailed benchmarks before and after these changes (both the benchmark test and the real server test)?

@DoraALin (Collaborator, Author) commented:

@absolute8511 The benchmark shows the async RPC call is NOT faster than the sync replication. I will try the server test.

Sync RPC call:
BenchmarkNsqdCoordPub1Replicator128-4     10000    211161 ns/op    0.61 MB/s
BenchmarkNsqdCoordPub2Replicator128-4     10000    279250 ns/op    0.46 MB/s
BenchmarkNsqdCoordPub3Replicator128-4     10000    297141 ns/op    0.43 MB/s
BenchmarkNsqdCoordPub1Replicator1024-4     5000    251618 ns/op    4.07 MB/s
BenchmarkNsqdCoordPub2Replicator1024-4    10000    331672 ns/op    3.09 MB/s
BenchmarkNsqdCoordPub3Replicator1024-4     5000    307006 ns/op    3.34 MB/s
BenchmarkNsqdCoordPub5Replicator128-4      3000    380150 ns/op    0.34 MB/s
BenchmarkNsqdCoordPub5Replicator1024-4     5000    427857 ns/op    2.39 MB/s
BenchmarkNsqdCoordPub4Replicator128-4      5000    325689 ns/op    0.39 MB/s
BenchmarkNsqdCoordPub4Replicator1024-4     5000    344935 ns/op    2.97 MB/s

Async RPC call:
BenchmarkNsqdCoordPub1Replicator128-4     10000    223244 ns/op    0.57 MB/s
BenchmarkNsqdCoordPub2Replicator128-4      5000    344346 ns/op    0.37 MB/s
BenchmarkNsqdCoordPub3Replicator128-4      5000    350341 ns/op    0.37 MB/s
BenchmarkNsqdCoordPub1Replicator1024-4     5000    205962 ns/op    4.97 MB/s
BenchmarkNsqdCoordPub2Replicator1024-4     5000    322390 ns/op    3.18 MB/s
BenchmarkNsqdCoordPub3Replicator1024-4     5000    457972 ns/op    2.24 MB/s
BenchmarkNsqdCoordPub5Replicator128-4      3000    441307 ns/op    0.29 MB/s
BenchmarkNsqdCoordPub5Replicator1024-4     3000    392285 ns/op    2.61 MB/s
BenchmarkNsqdCoordPub4Replicator128-4      5000    361425 ns/op    0.35 MB/s
BenchmarkNsqdCoordPub4Replicator1024-4     3000    377754 ns/op    2.71 MB/s
