-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deletion: fix log deletion with line filters #8151
deletion: fix log deletion with line filters #8151
Conversation
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
@@ -332,75 +339,85 @@ func newChunkRewriter(chunkClient client.Client, tableName string, chunkIndexer | |||
} | |||
} | |||
|
|||
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, intervalFilters []IntervalFilter) (bool, error) { | |||
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, filterFunc filter.Func) (bool, bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment that explains the return values? It's only at the end of this long function that you can see what they mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is way easier to follow. LGTM with a couple of comments
}, | ||
Filter: ff, | ||
}) | ||
// delete request is without a line filter so initialize it to always return true to simplify the code below for partial chunk deletion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is maybe stylistic:
Because everything is derived from a DeleteRequest
, I'd expect everything from lines 146 -> 168 to be par of d.FilterFunction
. It would definitely simplify this function and keep everything associated with creating a "correct" filter function together.
wdyt?
@@ -332,75 +339,85 @@ func newChunkRewriter(chunkClient client.Client, tableName string, chunkIndexer | |||
} | |||
} | |||
|
|||
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, intervalFilters []IntervalFilter) (bool, error) { | |||
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, filterFunc filter.Func) (bool, bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
1 similar comment
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-8151-to-release-2.7.x origin/release-2.7.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x b2d0481cbeee17355a4305ba48b7517c0cdea080
# Push it to GitHub
git push --set-upstream origin backport-8151-to-release-2.7.x
git switch main
# Remove the local backport branch
git branch -D backport-8151-to-release-2.7.x Then, create a pull request where the |
(cherry picked from commit b2d0481)
(cherry picked from commit b2d0481)
What this PR does / why we need it:
When a delete request is issued with a line filter, the current deletion code always assumes that the chunk would be re-written with a new id and marks the existing chunk for deletion. However, if no lines match the line filter, then the chunk is not changed. This causes us to delete the whole chunk unexpectedly.
This PR takes care of the issue by detecting if there was any data deleted from the chunk. If there is no change in the chunk then it should be a no-op.
Additionally, I have refactored the code to simplify partial chunk deletion by accepting
time.Time
in filter.Func. This way we do not have to calculate the chunk remainders beforehand and use thisfilter.Func
to decide whether the log line is supposed to be deleted or not while iterating through the logs in the chunk. This should not add much overhead since the chunk slicer anyways iterates through the log lines for building a new chunk.I have also improved the integration tests to cover all the deletion modes and test the deletion with/without query time filtering.
Checklist
CHANGELOG.md
updated