Make mergeQueryable more generically usable #4117

simonswine · 2021-04-26T10:31:43Z

What this PR does:

This change allows external projects to use mergeQueryable to aggregate results from multiple underlying Queryables.

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

trevorwhitney

Just a few minor things, nothing blocking. I did have a lot of trouble understanding what each of those test cases was testing, particularly with the nested table tests in selectorCases, so it was hard to verify the new logic was tested.

pkg/querier/tenantfederation/merge_queryable.go

trevorwhitney · 2021-04-28T02:47:50Z

pkg/querier/tenantfederation/merge_queryable.go

+		// ensure to byPass a single querier and avoid adding a tenant label to
+		// ensure backward compatabiltiy when enabling multi-tenant query
+		// federation.
+		return tenantIDs, queriers, true, nil


I didn't understand that comment until I read the GenericQuerierCallback further down, maybe worth pulling that boolean into a variable for the sake of naming/clarity on what it does?

Good point, I think the way I rearranged it makes it much clearer.

pkg/querier/tenantfederation/merge_queryable.go

simonswine · 2021-04-28T12:59:52Z

I did have a lot of trouble understanding what each of those test cases was testing, particularly with the nested table tests in selectorCases, so it was hard to verify the new logic was tested.

@trevorwhitney good point, and thanks for diving into the depths there. Any ideas on how to decouple the testing? I couldn't think of a simpler way to ensure the Queryable behaves correctly. I guess part of the problem is that the defined interfaces contain nested interfaces.

trevorwhitney · 2021-04-28T14:08:10Z

@simonswine I think the biggest logic change to make sure we test is the bypass with single querier logic. That might be tested in the following test case but I wasn't sure:

{
			name:       "single tenant",
			tenants:    []string{"team-a"},
			labelNames: []string{"instance", "tenant-team-a"},
			expectedLabelValues: map[string][]string{
				"instance": {"host1", "host2.team-a"},
			},
		},

simonswine · 2021-04-28T14:17:34Z

@trevorwhitney I think we do test the skip case: if we didn't bypass with a single querier, the labelNames would also contain identifyingLabelName in that test case:

  	labelNames: []string{"instance", "tenant-team-a"},
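The reasoning above can be sketched in a few lines of Go. This is only an illustration, not the code from the PR: the `labelNames` helper and its parameters are hypothetical, and the only detail taken from the thread is the `__tenant_id__` label name. It shows why a test expecting exactly `"instance"` and `"tenant-team-a"` would fail if the single-querier bypass were not applied.

```go
package main

import (
	"fmt"
	"sort"
)

// idLabelName is the identifying label mentioned in the thread.
const idLabelName = "__tenant_id__"

// labelNames is a hypothetical stand-in for the merge querier's LabelNames.
// It returns the underlying label names and adds idLabelName unless the
// single-querier bypass applies.
func labelNames(underlying []string, queriers int, byPassWithSingleQuerier bool) []string {
	names := append([]string(nil), underlying...)
	if !(byPassWithSingleQuerier && queriers == 1) {
		names = append(names, idLabelName)
	}
	sort.Strings(names)
	return names
}

func main() {
	underlying := []string{"instance", "tenant-team-a"}

	// Bypass active: the label names match the test expectation.
	fmt.Println(labelNames(underlying, 1, true))

	// Bypass inactive: the identifying label shows up as an extra name.
	fmt.Println(labelNames(underlying, 1, false))
}
```

With the bypass disabled, the extra identifying label would make the expected `labelNames` in the "single tenant" case no longer match.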

pstibrany

I think you went too far with making this "generic". I see the value in supplying a custom tenant ID label name or prefix for conflicting labels, but the renames seem unnecessary (they are not even complete... there is genericQueryable but mergeQuerier) and make the diff too large.

One thing that's missing in mergeQueryable is tracing -- I think we should add a span to it. It can be a separate PR.

pkg/querier/tenantfederation/merge_queryable.go

pstibrany · 2021-04-28T15:34:59Z

pkg/querier/tenantfederation/merge_queryable.go

+// byPassWithSingleQuerierUsing determines if it should still be using the
+// mergeQuerier (and adding a `identifyingLabelName` label) or if just the
+// underlying querier should be returned.
+type GenericQuerierCallback func(ctx context.Context, mint int64, maxt int64) (labelValues []string, queriers []storage.Querier, byPassWithSingleQuerier bool, err error)


In what scenario does it make sense for the callback to return byPassWithSingleQuerier=false?

I think in general the callback should be returning byPassWithSingleQuerier=false. Even though we only have a single cluster we are querying from, we still want the idLabelName on the results.

In my use case I want to be able to aggregate results from multiple clusters, and I want to make sure that even if the call returns only a single cluster, this cluster is identifiable using the idLabelName. If a user adds a second cluster later on, there is no user-visible change to the results, because the label is already present in the single-querier case.

The reason byPassWithSingleQuerier=true for tenant federation is so that we don't introduce breaking behavior when enabling tenant federation in a Cortex cluster.
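To make the multi-cluster use case concrete, here is a minimal sketch, assuming simplified types. The `series` and `source` types and the `mergeWithIDLabel` function are hypothetical and do not appear in the PR, and handling of a pre-existing label with the same name is deliberately left out. The point is only that the identifying label is attached even when a single source is returned, so adding a second source later does not change the shape of existing results.

```go
package main

import "fmt"

// series is a simplified, hypothetical stand-in for a labelled time series.
type series map[string]string

// source is a hypothetical stand-in for one underlying querier, e.g. a cluster.
type source struct {
	id     string
	series []series
}

// mergeWithIDLabel copies every series from every source and sets idLabelName
// to the id of the source it came from, so each result stays attributable.
func mergeWithIDLabel(idLabelName string, sources []source) []series {
	var out []series
	for _, src := range sources {
		for _, s := range src.series {
			labelled := series{}
			for k, v := range s {
				labelled[k] = v
			}
			// Assumption: conflicts with an existing label of the same name
			// are ignored here; the identifying label simply wins.
			labelled[idLabelName] = src.id
			out = append(out, labelled)
		}
	}
	return out
}

func main() {
	clusterA := source{id: "cluster-a", series: []series{{"instance": "host1"}}}
	clusterB := source{id: "cluster-b", series: []series{{"instance": "host1"}}}

	// A single cluster is still labelled ...
	fmt.Println(mergeWithIDLabel("cluster", []source{clusterA}))
	// ... so adding a second one later leaves the first result unchanged.
	fmt.Println(mergeWithIDLabel("cluster", []source{clusterA, clusterB}))
}
```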

Even though we only have a single cluster we are querying from, we still want the idLabelName on the results.

Makes sense to me.

The reason byPassWithSingleQuerier=true for tenant federation is so that we don't introduce breaking behavior when enabling tenant federation in a Cortex cluster.

This feature is still marked as experimental. As long as we properly document the change in the changelog, it's OK to break current behavior, especially if it is not quite correct. And supporting only a single case in the code will be easier. WDYT?

I think I am less worried about a breaking change in the experimental feature than about the future experience of turning tenant federation on. An additional "id" label in single-user queries would be quite disruptive, whereas otherwise nothing would change. I still think it's worth keeping.

//cc @jtlisi wdyt?

So just to clarify the breaking change here would be always returning the __tenant_id__ on every single query?

I'm personally fine with that change. I honestly don't see it being a change that bothers people. I also think it would help end users become more familiar with tenancy in Cortex.

Exactly that, once tenant federation is switched on, every query would return a __tenant_id__ label.

OK, maybe I was a bit too worried about how user-facing this change is. I will change this PR accordingly and add a [CHANGE] note to the changelog.

Discussed offline. We agreed on keeping the mechanism to bypass the mergeQuerier logic when only a single tenant is queried. Otherwise all queries would start returning extra labels when using the feature.

I have moved byPassWithSingleQuerier to now be a parameter to NewQueryable and added a test case

I have moved byPassWithSingleQuerier to now be a parameter to NewQueryable and added a test case

Awesome, I much prefer doing it in the creation function and not the callback
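The outcome of this discussion, deciding the bypass once at construction rather than per callback invocation, might look roughly like the sketch below. It is an assumption-laden illustration: the `querier` interface, `staticQuerier`, `mergeQuerier`, and `newQueryable` here are hypothetical, simplified stand-ins, and the real `NewQueryable` signature is not shown in this thread.

```go
package main

import "fmt"

// querier is a hypothetical, minimal stand-in for storage.Querier.
type querier interface {
	LabelNames() []string
}

// staticQuerier returns a fixed set of label names.
type staticQuerier struct{ names []string }

func (q staticQuerier) LabelNames() []string { return q.names }

// mergeQuerier aggregates label names and always adds idLabelName.
type mergeQuerier struct {
	idLabelName string
	queriers    []querier
}

func (m mergeQuerier) LabelNames() []string {
	seen := map[string]struct{}{m.idLabelName: {}}
	names := []string{m.idLabelName}
	for _, q := range m.queriers {
		for _, n := range q.LabelNames() {
			if _, ok := seen[n]; !ok {
				seen[n] = struct{}{}
				names = append(names, n)
			}
		}
	}
	return names
}

// queryable stores the bypass decision made once at construction time.
type queryable struct {
	idLabelName             string
	byPassWithSingleQuerier bool
	callback                func() []querier
}

// newQueryable is a hypothetical constructor; the flag is a parameter here
// instead of a value returned by the callback.
func newQueryable(idLabelName string, byPassWithSingleQuerier bool, callback func() []querier) *queryable {
	return &queryable{idLabelName: idLabelName, byPassWithSingleQuerier: byPassWithSingleQuerier, callback: callback}
}

// Querier returns the underlying querier directly when the bypass applies,
// and a mergeQuerier otherwise.
func (q *queryable) Querier() querier {
	queriers := q.callback()
	if q.byPassWithSingleQuerier && len(queriers) == 1 {
		return queriers[0]
	}
	return mergeQuerier{idLabelName: q.idLabelName, queriers: queriers}
}

func main() {
	single := func() []querier {
		return []querier{staticQuerier{names: []string{"instance"}}}
	}
	// Tenant-federation style: a single querier passes through untouched.
	fmt.Println(newQueryable("__tenant_id__", true, single).Querier().LabelNames())
	// Multi-cluster style: the identifying label is always added.
	fmt.Println(newQueryable("__tenant_id__", false, single).Querier().LabelNames())
}
```

Keeping the flag on the constructor means the callback only has to resolve IDs and queriers, which is the separation the reviewers say they prefer.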


jtlisi

LGTM, I much prefer handling this at instantiation

simonswine · 2021-04-29T17:44:24Z

One thing that's missing in mergeQueryable is tracing -- I think we should add a span to it. It can be a separate PR.

That's something I am not particularly familiar with, but I think this will be very valuable. Factored this out into #4147

pstibrany · 2021-04-29T18:21:16Z

That's something I am not particularly familiar with, but I think this will be very valuable. Factored this out into #4147

Thanks for creating the issue.

pstibrany

Thanks!

pull-request-size bot added the size/XL label Apr 26, 2021

simonswine mentioned this pull request Apr 26, 2021

Add concurrency to the mergeQueryable #4065

Merged

2 tasks

simonswine force-pushed the 20210423_make-merge-queryable-more-generic branch from ea0f085 to 34d8b75 Compare April 27, 2021 17:11

pull-request-size bot added size/L and removed size/XL labels Apr 27, 2021

simonswine marked this pull request as ready for review April 27, 2021 17:15

trevorwhitney approved these changes Apr 28, 2021


pstibrany reviewed Apr 28, 2021


Make mergeQueryable more generically reusable

83b87ce

This change allows mergeQueryables to be reused in other cases that require multiple underlying queryables to be aggregated. Signed-off-by: Christian Simon <simon@swine.de>

simonswine force-pushed the 20210423_make-merge-queryable-more-generic branch from 00ec9b4 to 83b87ce Compare April 29, 2021 10:08

Move byPassSingleQuerier to parameter

6f31fac

Signed-off-by: Christian Simon <simon@swine.de>

jtlisi approved these changes Apr 29, 2021


simonswine mentioned this pull request Apr 29, 2021

Add tracing to the tenant federation mergeQueryable #4147

Closed

pstibrany approved these changes May 5, 2021


pstibrany merged commit b6eea5f into cortexproject:master May 5, 2021
