
Query metric per host 1155 #1156

Merged
merged 6 commits, Aug 31, 2018

Conversation

alourie
Contributor

@alourie alourie commented Aug 17, 2018

Attempt to solve #1155

This ended up being a bit bigger than initially expected.

Nevertheless, the majority of the work was updating the tests to fetch the correct host they're doing the work for and send it over as a parameter to other functions. Additionally, it should now be somewhat easier to add new metrics to the query execution.

The tests seem to pass successfully now.

@Zariel would love your feedback on this.

Contributor

@annismckenzie annismckenzie left a comment

Returning the favor because you reviewed my PR as well – hope that's okay. 😄

session_test.go Outdated
t.Fatalf("expceted batch.Attempts() to return %v, got %v", 1, b.Attempts())
b.metrics[ip] = &queryMetric{attempts: 1}
if b.Attempts(host) != 1 {
t.Fatalf("expceted batch.Attempts() to return %v, got %v", 1, b.Attempts(host))
Contributor

Fix expceted spelling while you're here?

Contributor Author

sure :-)

session.go Outdated
s.mu.RUnlock()
}

func initializeMetrics(s *Session) map[string]*queryMetric {
Contributor

Any reason why this is a lone function and not func (q *Query) initializeMetrics()?

Contributor Author

the main reason is that the Query and Batch initialisers are the same and don't really need to work on a query object.

On the other hand, acking the comment on the initialiser itself, I will probably get rid of it completely as it would no longer make sense if the nodes are added later.

session.go Outdated
@@ -658,6 +658,11 @@ func (s *Session) connect(host *HostInfo, errorHandler ConnErrorHandler) (*Conn,
return s.dial(host, s.connCfg, errorHandler)
}

type queryMetric struct {
attempts int
Contributor

Make those uint64 because there's no good reason why it should ever be negative.

Contributor Author

agreed.
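For reference, the change being agreed to here would make the per-host metrics struct look roughly like this; a sketch based on the fields visible in the diff, not the final code:

```go
package main

import "fmt"

// queryMetrics mirrors the struct discussed above: attempts becomes a
// uint64 since the count can never be negative, while totalLatency stays
// signed because it holds nanoseconds from time.Duration arithmetic.
type queryMetrics struct {
	attempts     uint64
	totalLatency int64
}

func main() {
	m := &queryMetrics{}
	m.attempts++
	fmt.Println(m.attempts) // 1
}
```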

session.go Outdated
defer s.pool.mu.RUnlock()

// Range over all defined IPs and prebuild metrics map
for _, hostConn := range s.pool.hostConnPools {
Contributor

Is there a possibility that a host is added to the pool after this is done? I'm pretty sure that can happen if you add a new node to the cluster and it's gossiped around the cluster through the control connection. In that case you might panic in the map accesses below.

Contributor Author

as mentioned above, will probably remove this completely.
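The panic the reviewer warns about comes from dereferencing a missing map entry: a host gossiped into the pool after the metrics map is prebuilt has no entry, so the lookup returns nil. A minimal reproduction of the hazard (addresses are made up):

```go
package main

import "fmt"

type queryMetrics struct{ attempts int }

func main() {
	// Metrics prebuilt only for the hosts known at init time.
	metrics := map[string]*queryMetrics{"10.0.0.1": {}}

	// A host added later has no entry; the lookup returns nil...
	m := metrics["10.0.0.9"]
	fmt.Println(m == nil) // true
	// ...so m.attempts++ here would panic with a nil pointer dereference.
	// Guarding with the comma-ok idiom avoids that:
	if m, ok := metrics["10.0.0.9"]; ok {
		m.attempts++
	}
}
```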

conn_test.go Outdated
func NewTestServer(t testing.TB, protocol uint8, ctx context.Context) *TestServer {
laddr, err := net.ResolveTCPAddr("tcp", "127.0.0.1:0")
func NewTestServerWithAddress(addr string, t testing.TB, protocol uint8, ctx context.Context) *TestServer {

Contributor

drop this empty line

conn_test.go Outdated
@@ -721,6 +805,11 @@ func NewTestServer(t testing.TB, protocol uint8, ctx context.Context) *TestServe
go srv.serve()

return srv

Contributor

drop this empty line

session.go Outdated
q.attempts++
q.totalLatency += end.Sub(start).Nanoseconds()
// TODO: track latencies per host and things as well instead of just total

Contributor

drop this empty line

session.go Outdated
b.attempts++
b.totalLatency += end.Sub(start).Nanoseconds()
// TODO: track latencies per host and things as well instead of just total

Contributor

drop this empty line

session.go Outdated
}

Contributor

drop this empty line

@alourie
Contributor Author

alourie commented Aug 20, 2018

@annismckenzie more than OK, I really appreciate it!

@alourie
Contributor Author

alourie commented Aug 20, 2018

@annismckenzie I've updated the PR, would love if you could have a look. Thanks.

Contributor

@annismckenzie annismckenzie left a comment

Good progress on making it less complex! Still have a couple nits.

session.go Outdated
s.mu.RUnlock()
}

func (q *Query) getHostMetrics(host *HostInfo) *queryMetrics {
_, exists := q.metrics[host.connectAddress.String()]
Contributor

Only look it up once in this method?

Contributor Author

Considering that this call is used both for lookup and for initialising a zero value, I really can't think of a more "efficient" or better way to do it, except splitting it into two functions: one to initialise the map and the other to fetch. Personally, I don't think that makes it all that better, but is that what you're thinking of?

Contributor

I meant something like this:

func (q *Query) getHostMetrics(host *HostInfo) *queryMetrics {
	metrics, exists := q.metrics[host.connectAddress.String()]
	if !exists {
		// if the host is not in the map, it means it's been accessed for the first time
		metrics = &queryMetrics{attempts: 0, totalLatency: 0}
		q.metrics[host.connectAddress.String()] = metrics
	}
	return metrics
}

Contributor Author

yea, thought about this one too, but didn't have a good opinion on it... would you think it looks better? If so, can easily change.

Contributor

I think it does look better. 🙈

Contributor Author

Done

session.go Outdated
if b.attempts > 0 {
return b.totalLatency / int64(b.attempts)
func (b *Batch) Latency(host *HostInfo) int64 {
hostMetric := b.getHostMetrics(host)
Contributor

Why doesn't the Batch have a mutex while the Query still does? 🤔

Contributor Author

They both don't have a mutex. The only mutexes in Query (but also in Batches) are the ones embedded in Session, where there are 2: one for the session itself and the second is for the hostPool access. Am I missing something? :-)

Contributor

@annismckenzie annismckenzie Aug 21, 2018

Don't know what I was seeing – weird. Disregard. 😬

@Zariel
Contributor

Zariel commented Aug 21, 2018

@alourie needs a rebase now

@alourie
Contributor Author

alourie commented Aug 22, 2018

@Zariel @annismckenzie okay... so I've rebased onto the latest master, cleaned up the code a bit, and removed the locks that aren't necessary yet. I also found a small bug in the query_executor, which I fixed, added a separate test case for it for future reference, and finally updated the testRetryPolicy to be a bit more flexible.

Travis is now happy, would love your feedback/approval ;-)

Contributor

@annismckenzie annismckenzie left a comment

I have a couple questions, see my comments.

return p.GetRetryType(err), nil
}

// p.Attempt is false, should not retry anything
return p.GetRetryType(err), err
Contributor

The correct fix would be to change this to return Rethrow, err. Then the change below can be dropped again as well.

Contributor Author

@alourie alourie Aug 23, 2018

Well, that's also a problem. If the retryPolicy is implemented similarly to the new testRetryPolicy, then when those attempts are done it returns RetryNextHost, not Rethrow. If you ignore that, then you're throwing an error instead of trying the next host.

That's why I ended up accepting the need to call it here, but check it in the queryExecutor. I don't like both solutions, but at least the current one seems to be working correctly.

All of the above is wrong. That is indeed the correct fix; I misunderstood how RetryPolicy works, hence the change above. So I will implement it.

Contributor

Would you mind if I fix this in my branch because I'm working on #1161 and it touches the same code path again in the query executor?

session.go Outdated
@@ -826,8 +846,8 @@ func (q *Query) attempt(keyspace string, end, start time.Time, iter *Iter, host
End: end,
Rows: iter.numRows,
Host: host,
Metrics: q.metrics[host.ConnectAddress().String()],
Contributor

Pass hostMetric as it contains the correct information (and is also not used currently in this function anyway?).

Contributor Author

you're right, don't know what I was thinking...will fix

session.go Outdated
@@ -713,14 +729,18 @@ func (q Query) String() string {
}

//Attempts returns the number of times the query was executed.
func (q *Query) Attempts() int {
return q.attempts
func (q *Query) Attempts(host *HostInfo) uint64 {
Contributor

The main gripe I have with this approach is that it a) breaks backwards compatibility in a major way and b) now produces different behavior even with the default SimpleRetryPolicy. The metrics are tracked per host, and the default retry type of the SimpleRetryPolicy is RetryNextHost, which means the attempts are now also counted per host. The effective maximum becomes roughly number of hosts * max retries - (number of hosts - 1); the gist is that the configured max retries is no longer honored, because on each successive retry (on the next host) the attempt count for that host starts at 0, so the query will exhaust the host pool and fail with a different error than before. Mind you, I'm not harping on the general direction or the feature (it is very useful), but maybe leave the current signatures alone, keep track of a summed version of attempts and latencies (as it is right now), and make the per-host latencies and attempts available to the retry policy through some other means, so that going forward one can create an advanced retry policy that takes those into account? 🤔

Contributor Author

Hmmm...now I'm kinda confused.

So, you're right, the current queryExecution only checks that the total accumulated attempts across all hosts have not surpassed the number set in the retry policy, and this approach changes that. I was wrong in my understanding of what it does.

So...I think I understand it now. What I will probably do is introduce a way to find the total count by summing the map values; also will see how to keep signatures the same.

Thanks for the review!
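The fix the author describes — finding the total count by summing the map values — could look like the following sketch; the queryMetrics shape follows the diff, while the helper name totalAttempts is made up for illustration:

```go
package main

import "fmt"

type queryMetrics struct {
	attempts     uint64
	totalLatency int64
}

// totalAttempts restores the old Attempts() semantics by summing the
// per-host counters, so callers and retry policies still see the
// accumulated count across all hosts.
func totalAttempts(metrics map[string]*queryMetrics) int {
	var total int
	for _, m := range metrics {
		total += int(m.attempts)
	}
	return total
}

func main() {
	metrics := map[string]*queryMetrics{
		"10.0.0.1": {attempts: 2},
		"10.0.0.2": {attempts: 1},
	}
	fmt.Println(totalAttempts(metrics)) // 3
}
```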

@alourie
Contributor Author

alourie commented Aug 23, 2018 via email

@annismckenzie
Contributor

Nah, already deep in the weeds on #1161. I'll count on your 👀 though. 😬

Contributor

@annismckenzie annismckenzie left a comment

Went through it again and restoring the public API was the right call (don't want people running away because of high number of breaking changes) – I do have some thoughts. 😬

conn_test.go Outdated

func (o *testQueryObserver) ObserveQuery(ctx context.Context, q ObservedQuery) {
Logger.Printf("Observed query %q. Returned %v rows, took %v on host %q. Error: %q\n", q.Statement, q.Rows, q.End.Sub(q.Start), q.Host.ConnectAddress().String(), q.Err)
if o.metrics == nil {
Contributor

I'd drop this because it's noisy. It's a test so initialize metrics when creating the observer below.

conn_test.go Outdated
Logger.Printf("Observed query %q. Returned %v rows, took %v on host %q. Error: %q\n", q.Statement, q.Rows, q.End.Sub(q.Start), host, q.Err)
}

// Uncomment and add verbosity?
Contributor

Drop and move the missing log entries q.Metrics.attempts and q.Metrics.totalLatency into the log message above.

control.go Outdated
@@ -453,7 +453,11 @@ func (c *controlConn) query(statement string, values ...interface{}) (iter *Iter
Logger.Printf("control: error executing %q: %v\n", statement, iter.err)
}

q.attempts++
host := c.getConn().host
Contributor

Collapse this down to:

c.session.mu.Lock()
q.getHostMetrics(c.getConn().host).attempts++
c.session.mu.Unlock()

It's just a lot of variables for no purpose. You can leave the host := line but I'd definitely collapse the attempt increase because it reads better.

Contributor

Actually, this whole business of increasing the attempts in here is wrong and it needs to go. The query constructed above doesn't have a retry policy (and rightly so) but it will nevertheless be running through the whole query execution chain so it will also run through the query executor and end up calling attempt() on the query below which will handle recording the metrics. Am I missing something?

Contributor

In that case you can just drop all of those lines.

Contributor

As there's only one instance of the query running at any one time (it's blocking and wrapped in a for loop) a lock wouldn't be necessary anyway but yeah, dropping the whole code block is better I guess. 😅

Contributor Author

I'm not sure this one follows the regular query execution. It seems we're constructing and executing the statement directly on the conn, not via the regular query mechanisms. The retry policy is not constructed, which I reckon means it would be set to SimpleRetryPolicy{numAtt: 3}... so we still need to count attempts, but probably without the locking... :-)

conn_test.go Outdated
// q.Statement, q.Rows, q.End.Sub(q.Start), host, q.Metrics.attempts, q.Metrics.totalLatency, q.Err)
}

func (o *testQueryObserver) GetMetrics(host *HostInfo) *queryMetrics {
Contributor

no need for an exported function in a test

Contributor Author

indeed

conn_test.go Outdated
@@ -57,13 +58,17 @@ func TestJoinHostPort(t *testing.T) {
}
}

func testCluster(addr string, proto protoVersion) *ClusterConfig {
cluster := NewCluster(addr)
func testMultiNodeCluster(addresses []string, proto protoVersion) *ClusterConfig {
Contributor

Nice, I'll need this too. 👍

Err: iter.err,
Attempt: b.attempts,
Contributor

If you're good with my suggestions above, you may want to restore this field and pass b.totalAttempts. On the other hand, query observers could be fine with calculating it from hostMetrics. Your call.

@@ -826,8 +859,8 @@ func (q *Query) attempt(keyspace string, end, start time.Time, iter *Iter, host
End: end,
Rows: iter.numRows,
Host: host,
Metrics: hostMetrics,
Contributor

I hope it's okay that the query observer will access the fields in hostMetrics while they could be changed – you're protecting write access above but that mutex is unlocked in line 852. Could that race? 🤔 😅

Contributor

Pretty sure it's okay as it's serialized.

session.go Outdated
s.mu.RUnlock()
}

func (q *Query) getHostMetrics(host *HostInfo) *queryMetrics {
q.session.mu.Lock()
defer q.session.mu.Unlock()
Contributor

As there's no way you're returning from this function early, I'd drop the defer and move the mutex unlock down. It's pretty clear so there's no need for the defer overhead, especially because this could be called a sizable number of times while a query is executing and the retry policy consulted. Come to think of it, making this available to the retry policy isn't implemented yet, right?

Contributor Author

no, not implemented, and I'm not sure retryPolicy needs to deal with metrics at all. But considering the Attempt() gets the whole query, it may, in fact, use the metrics if relevant for the retry policy.
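The defer-free version the reviewer suggests could look like this sketch, with stub types standing in for gocql's Query and HostInfo; since there is no early return, an explicit Unlock is sufficient and skips the defer overhead on a path that may run on every retry-policy consultation:

```go
package main

import (
	"fmt"
	"sync"
)

type queryMetrics struct{ attempts uint64 }

// query is a stub for gocql's Query; mu stands in for whatever mutex
// guards the metrics map in the real code.
type query struct {
	mu      sync.Mutex
	metrics map[string]*queryMetrics
}

// getHostMetrics lazily creates the per-host entry. No early return, so
// the mutex is unlocked explicitly rather than via defer.
func (q *query) getHostMetrics(addr string) *queryMetrics {
	q.mu.Lock()
	m, ok := q.metrics[addr]
	if !ok {
		m = &queryMetrics{}
		q.metrics[addr] = m
	}
	q.mu.Unlock()
	return m
}

func main() {
	q := &query{metrics: map[string]*queryMetrics{}}
	q.getHostMetrics("10.0.0.1").attempts++
	fmt.Println(q.metrics["10.0.0.1"].attempts) // 1
}
```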

policies_test.go Outdated
@@ -263,8 +263,12 @@ func TestSimpleRetryPolicy(t *testing.T) {
{5, false},
}

// initiate metrics map
Contributor

I think you can do away with this comment as well as the ones in lines 269 and 354.

Contributor Author

:-)

session.go Outdated
}
s.mu.RUnlock()
return batch
}

func (b *Batch) getHostMetrics(host *HostInfo) *queryMetrics {
b.mu.Lock()
Contributor

The same suggestion I had above with changing this machinery applies to the Batch as well. Shame that it's duplicated. This screams separate manager object that holds query attempt metrics as well as latencies and total attempts. It would also include the mutex. Sorry for bringing it up. 😇

Contributor Author

I totally agree :-) I've almost implemented it on Friday, just was too sleepy to make sure the tests pass. Hopefully it will resolve all the gripes with the current implementation.

conn_test.go Outdated
@@ -57,13 +58,17 @@ func TestJoinHostPort(t *testing.T) {
}
}

func testCluster(addr string, proto protoVersion) *ClusterConfig {
cluster := NewCluster(addr)
func testMultiNodeCluster(addresses []string, proto protoVersion) *ClusterConfig {
Contributor

Define this like

func testMultiNodeCluster(proto protoVersion, addresses ...string) *ClusterConfig

That way callers dont need to wrap a single addr in a slice

Contributor Author

wasn't sure that it's ok to change the order, but will do now :-)
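The variadic signature proposed above lets a single-host caller skip the slice literal; a minimal sketch with a stub ClusterConfig (the real helper would delegate to gocql's NewCluster):

```go
package main

import "fmt"

// clusterConfig is a stub for gocql's ClusterConfig, holding only the
// host list for illustration.
type clusterConfig struct{ hosts []string }

// testMultiNodeCluster takes the protocol first and the addresses as a
// variadic parameter, so single-host callers pass the address directly.
func testMultiNodeCluster(proto int, addresses ...string) *clusterConfig {
	return &clusterConfig{hosts: addresses}
}

func main() {
	single := testMultiNodeCluster(4, "127.0.0.1")
	multi := testMultiNodeCluster(4, "127.0.0.1", "127.0.0.2", "127.0.0.3")
	fmt.Println(len(single.hosts), len(multi.hosts)) // 1 3
}
```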

session.go Outdated
@@ -714,13 +732,26 @@ func (q Query) String() string {

//Attempts returns the number of times the query was executed.
func (q *Query) Attempts() int {
return q.attempts
q.session.mu.Lock()
defer q.session.mu.Unlock()
Contributor

Why are we now locking the session mutex around the place?

Contributor Author

will change

session.go Outdated
s.mu.RUnlock()
}

func (q *Query) getHostMetrics(host *HostInfo) *queryMetrics {
q.session.mu.Lock()
Contributor

What is this lock protecting?

Contributor

The metrics map but it's unnecessary as per query there's only ever one attempt running at any given time so retries are serialized and there's no need for locking this structure. If we add speculative retries (@alourie said he'd like to work on that) we'll have to construct a wait group (or some other container for keeping track of parallel query attempts) and the metrics will have to be updated once at the end (first successful result, the other attempts will be canceled), making this approach valid still. I'm pretty sure we can drop it but we should run stress (I recently learned about that from someone 🏅) against the tests to make sure it's safe. But I obviously don't know the codebase like you do, what do you think?

Contributor Author

@alourie alourie Aug 27, 2018

@annismckenzie You're right, I am working on speculative queries executions, and indeed I use wait groups to keep track of the parallel query executions. The lock here is protecting from the parallel access of the attempts number in the query from different goroutines ....but mind you that I've built the speculative stuff a week before I started this work, and since then I've discovered a few holes in my understanding of how retryPolicies and retries in general work. So much could change and I might remove the locks after all.

integration.sh Outdated

ccm clear
ccm start
sleep 1s

go test -tags "ccm gocql_debug" -timeout=5m -race $args
go test -tags "ccm gocql_debug" -timeout=15m -race $args
Contributor

remove unrelated change

Contributor Author

of course

@Zariel
Contributor

Zariel commented Aug 25, 2018

@alourie this is a -1 from me due to the lock required on the session; I don't see why this should require locking the session from down inside the query/control connection. The locking is non-obvious and will add a maintenance burden, because when the code changes the locking needs to be moved correctly too. As well as being non-obvious, it will likely incur overall query performance degradation due to lock contention.

Contributor

@annismckenzie annismckenzie left a comment

Went through it once more while keeping @Zariel's valid concerns in mind. Maybe we can do away with the locks, see my comments.


@@ -48,7 +48,8 @@ func (q *queryExecutor) checkRetryPolicy(rq ExecutableQuery, err error) (RetryTy
if p.Attempt(rq) {
return p.GetRetryType(err), nil
}
return p.GetRetryType(err), err
Contributor

A rebase against the current master will make this go away as the PR has been merged already.

session.go Outdated
@@ -658,6 +658,11 @@ func (s *Session) connect(host *HostInfo, errorHandler ConnErrorHandler) (*Conn,
return s.dial(host, s.connCfg, errorHandler)
}

type queryMetrics struct {
attempts int
Contributor

You're passing this to the query observer which can't access unexported fields – you'll have to at least export those.

Contributor Author

Interesting point. Should we have an integration suite that's external to the main package? This way we'd catch issues like this.

Contributor

That might be a bit much – one just has to remember that observers are being used with functions provided by the user. 😆

Contributor Author

maybe :-)


session.go Outdated
hostMetrics := q.getHostMetrics(host)
q.session.mu.Lock()
hostMetrics.attempts++
hostMetrics.totalLatency += end.Sub(start).Nanoseconds()
Contributor

And that lock should not be necessary if my reasoning above is correct.


@Zariel
Contributor

Zariel commented Aug 27, 2018

If it requires locking for concurrent metric updating then it should use much finer-grained locking, per host/per query, or it should just use atomics.

@alourie alourie changed the title Query metric per host 1155 WIP: Query metric per host 1155 Aug 27, 2018
@alourie
Copy link
Contributor Author

alourie commented Aug 28, 2018

@Zariel @annismckenzie so I've cleared up all the things. Now not using locks at all, as indeed things run sequentially anyway. Will deal with locking when necessary in speculative query executions work.

Additionally:

  • Exported the queryMetric fields so that an Observer can use them.
  • Internal test functions are updated so that it is possible to launch a multinode mock cluster without providing different calls or slices of strings.
  • A few subjective syntax improvements here and there
  • Cleanups

@annismckenzie
Copy link
Contributor

Could you redo the changes into 4 or 5 logical commits? The last 3 are pretty generic and don't really say what they contain. 😬

@alourie
Copy link
Contributor Author

alourie commented Aug 28, 2018

@annismckenzie do you mean the whole thing or just the last 3? I can squash it all (or just the last ones), so that it is all just 1 commit. Also, many of the commits are just draft work; you probably have no interest in those anyway.

@annismckenzie
Copy link
Contributor

Well, I see @Zariel usually squashes, but that squash merge commit still links to this PR and I'd like it to be comprehensible. I really did mean to just do git reset master to reset back to the current master state (keeping the changes in the working tree) and redo the changes in 5 smaller logical commits:

  • Run Cassandra integration tests on 3.11.3 as well
  • Add myself as a contributor
  • the code for creating multinode mock clusters in tests
  • the whole Record query metrics on a per-host basis (could be argued that the tests should be their own commit)
  • lastly, Provide query observers with the per-host metrics

I'd find that quite valuable and the squash commit merge is also nice and short.

@alourie
Copy link
Contributor Author

alourie commented Aug 28, 2018

@annismckenzie uhm, you realize that what you're asking is quite a bit of manual work with git's history, right? ;-) I don't mind doing it, it's just that it will require a force push, and while GitHub will manage that fine, some comments/content may disappear. Is that ok with you?

@annismckenzie
Copy link
Contributor

Yup, exactly what I'm asking. You basically did 3 implementations here and while it's sometimes valuable to see the evolution of a feature through time that isn't the case here. We should strive to make the implementation you chose clear (it also helps with issues being reported to point to a specific commit that introduced it).

Alex Lourie added 5 commits August 28, 2018 23:51
Signed-off-by: Alex Lourie <alex@instaclustr.com>
Signed-off-by: Alex Lourie <alex@instaclustr.com>
* Now it's possible to spin multi-node test clusters
* Batch tests moved from legacy to session.<> calls.

Signed-off-by: Alex Lourie <alex@instaclustr.com>
Signed-off-by: Alex Lourie <alex@instaclustr.com>
Signed-off-by: Alex Lourie <alex@instaclustr.com>
@alourie alourie force-pushed the queryMetricPerHost_1155 branch from 0776f52 to 9446847 Compare August 28, 2018 14:26
@alourie
Copy link
Contributor Author

alourie commented Aug 28, 2018

@annismckenzie Done. Hope it is more clear this way.

@annismckenzie
Copy link
Contributor

Very much so! 🎉

@alourie
Copy link
Contributor Author

alourie commented Aug 29, 2018

yay! Thanks @annismckenzie

@alourie alourie changed the title WIP: Query metric per host 1155 Query metric per host 1155 Aug 29, 2018
session.go Outdated
if q.attempts > 0 {
return q.totalLatency / int64(q.attempts)
attempts := 0
var latency int64 = 0
Copy link
Contributor


use `latency := 0` or `var attempts int`; don't mix both styles

Copy link
Contributor Author


Done

Signed-off-by: Alex Lourie <alex@instaclustr.com>
hostMetrics, exists := b.metrics[host.ConnectAddress().String()]
if !exists {
	// if the host is not in the map, it means it's been accessed for the first time
	hostMetrics = &queryMetrics{Attempts: 0, TotalLatency: 0}
Copy link
Contributor


nit: Go will initialize these values to 0, so you don't need to set them explicitly
