Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tsdb/inverted index #6233

Merged
merged 3 commits into from
May 24, 2022
Merged

Tsdb/inverted index #6233

merged 3 commits into from
May 24, 2022

Conversation

owen-d
Copy link
Member

@owen-d owen-d commented May 24, 2022

In order to maintain query consistency, we must ensure the same shard (i.e. 1_of_16) corresponds to the same series. Since TSDB calculates shard inclusion differently (by using bit prefixes instead of modulos), we need to account for this change when resolving streams in the ingester's inverted index.

This is the first PR which doesn't use the new code paths.
Also, because the bit prefixed inverted index will support dynamic sharding of higher or lower shard factors, it occasionally needs to do additional computation when translating higher granularity shard factors to lower granularity ones (i.e. 16 -> 4).

ref #5428

@owen-d owen-d requested a review from a team as a code owner May 24, 2022 13:20
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
-        distributor	-0.3%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@owen-d owen-d merged commit c032da9 into grafana:main May 24, 2022
Copy link
Contributor

@chaudum chaudum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, have a view questions though.

// we can _exactly_ match ob1 => (ob10, ob11) and know all fingerprints in those
// resulting shards have the requested ob1 prefix (don't need to filter).
var filter bool
if shard.Of > len(ii.shards) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between len(ii.shards) and ii.totalShards? Since it seems that the length of ii.shards does not change, we could use ii.totalShards instead.

Suggested change
if shard.Of > len(ii.shards) {
if shard.Of > ii.totalShards {

// the requested shard's min/max fingerprint values
// in order to calculate the indices for the inverted index's
// shard factor.
requiredBits := index.NewShard(0, uint32(len(ii.shards))).RequiredBits()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for index.NewShard(), why not index.NewShard(0, ii.totalShards)?

Comment on lines +127 to +137
if len(matchers) == 0 {
for i := range shards {
fps := shards[i].allFPs()
result = append(result, fps...)
}
} else {
for i := range shards {
fps := shards[i].lookup(matchers)
result = append(result, fps...)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this extra logic for no matchers be abstracted into the lookup() function of indexShard?
I saw that we do the same logic in the Lookup() function of InvertedIndex as well.

}

if 1<<(shard.TSDB().RequiredBits()) != shard.Of {
return fmt.Errorf("Shard factor must be a power of two, got %d", shard.Of)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error strings should not be capitalized

Suggested change
return fmt.Errorf("Shard factor must be a power of two, got %d", shard.Of)
return fmt.Errorf("shard factor must be a power of two, got %d", shard.Of)

require.Nil(t, err)
}

func Test_BitPrefixDeleteAddLoopkup(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

Suggested change
func Test_BitPrefixDeleteAddLoopkup(t *testing.T) {
func Test_BitPrefixDeleteAddLookup(t *testing.T) {

Comment on lines +101 to +102
// for _, shard := range []uint32{2, 4, 8, 16, 32, 64, 128} {
for _, shard := range []uint32{2} {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you only tested for 2 shards?

return nil, err
}

var extractor func(unlockIndex) []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wdyt about extracting a type for this function?

@chaudum
Copy link
Contributor

chaudum commented May 25, 2022

@owen-d I was late to the game and submitted my review only after the PR was already merged. Any chance you could answer the questions if it's not too much effort? 😇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants