Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bloomstore): Introduce specialised cache for blocks #12257

Merged
merged 12 commits into from
Mar 20, 2024

Conversation

chaudum
Copy link
Contributor

@chaudum chaudum commented Mar 19, 2024

What this PR does / why we need it:

The blocks cache is an in-memory cache that represents the single source of truth of what blocks are available on the file system and manages their EOL lifecycle.

Special notes for your reviewer:

There is a follow-up branch to wire up the new cache, based on this branch. See https://github.com/grafana/loki/compare/chaudum/blocks-cache-impl?expand=1

@chaudum chaudum force-pushed the chaudum/blocks-cache branch 2 times, most recently from 2771174 to b99d663 Compare March 19, 2024 13:01
The blocks cache is an in-memory cache that represents the single source
of truth of what blocks are available on the filesystem and manages
their EOL lifecycle.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@chaudum chaudum marked this pull request as ready for review March 19, 2024 16:27
@chaudum chaudum requested a review from a team as a code owner March 19, 2024 16:27
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@chaudum
Copy link
Contributor Author

chaudum commented Mar 20, 2024

Regarding commit a1ad00d

Before:

$ gotest -v -count=1 ./pkg/storage/stores/shipper/bloomshipper/... -test.run=^$ -test.bench=Benchmark_ -benchtime=1000x
goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/storage/stores/shipper/bloomshipper
cpu: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Benchmark_BlocksCacheOld
Benchmark_BlocksCacheOld-8   	    1000	     39545 ns/op	     679 B/op	       7 allocs/op
Benchmark_BlocksCacheNew
Benchmark_BlocksCacheNew-8   	    1000	    177112 ns/op	   32719 B/op	    2008 allocs/op

After:

$ gotest -v -count=1 ./pkg/storage/stores/shipper/bloomshipper/... -test.run=^$ -test.bench=Benchmark_ -benchtime=1000x 
goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/storage/stores/shipper/bloomshipper
cpu: 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Benchmark_BlocksCacheOld
Benchmark_BlocksCacheOld-8   	    1000	     41728 ns/op	     678 B/op	       7 allocs/op
Benchmark_BlocksCacheNew
Benchmark_BlocksCacheNew-8   	    1000	     50552 ns/op	     701 B/op	       8 allocs/op

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
_, found = cache.Get(ctx, "b")
require.False(t, found)

require.Equal(t, int64(8), cache.currSizeBytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add here another key which is older than b but has a refcount and therefore shouldn't be evicted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is actually a bug in the LRU eviction job


// write
for k, v := range entries {
err := cache.Store(ctx, []string{k}, []BlockDirectory{v})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IIUC this is the cache that we want to replace with the new implementation. Maybe we can rename the new PutMany by Store so we don't need to change it wherever we use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the bloomstore we use only ever use Put() to add a single entry.
In the benchmark I wanted to make the calls the same, though.

require.NoError(t, err)

err = cache.Put(ctx, "key", CacheValue("b", 10))
require.ErrorContains(t, err, "entry already exists: key")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can use errAlreadyExists right away so if one changes, the test does as well.


ctx := context.Background()

_ = cache.Put(ctx, "a", CacheValue("a", 5))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should test PutInc here.

return nil
}

// NewFsBlocksCache returns a new initialised BlocksCache with the key and value of requested types.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is outdated

return entry.Value, true
}

func (c *BlocksCache) get(_ context.Context, key string) *Entry {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ctx is unused, I think we should remove it.

close(c.done)
}

func (c *BlocksCache) sizeOf(entry *Entry) int64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I' rather use entry.Value.Size() right away instead of this method.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@chaudum chaudum enabled auto-merge (squash) March 20, 2024 15:13
@chaudum chaudum merged commit 0f48ce8 into main Mar 20, 2024
12 checks passed
@chaudum chaudum deleted the chaudum/blocks-cache branch March 20, 2024 15:40
edsoncelio pushed a commit to edsoncelio/loki that referenced this pull request Mar 22, 2024
The blocks cache is an in-memory cache that represents the single source of truth of what blocks are available on the file system and manages their end-of-life behaviour.

There is a follow-up branch to wire up the new cache, based on this branch. 

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
The blocks cache is an in-memory cache that represents the single source of truth of what blocks are available on the file system and manages their end-of-life behaviour.

There is a follow-up branch to wire up the new cache, based on this branch. 

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants