Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deletion: fix log deletion with line filters #8151

Merged

Conversation

sandeepsukhani
Copy link
Contributor

@sandeepsukhani sandeepsukhani commented Jan 16, 2023

What this PR does / why we need it:
When a delete request is issued with a line filter, the current deletion code always assumes that the chunk would be re-written with a new id and marks the existing chunk for deletion. However, if no lines match the line filter, then the chunk is not changed. This causes us to delete the whole chunk unexpectedly.

This PR takes care of the issue by detecting if there was any data deleted from the chunk. If there is no change in the chunk then it should be a no-op.
Additionally, I have refactored the code to simplify partial chunk deletion by accepting time.Time in filter.Func. This way we do not have to calculate the chunk remainders beforehand and use this filter.Func to decide whether the log line is supposed to be deleted or not while iterating through the logs in the chunk. This should not add much overhead since the chunk slicer anyways iterates through the log lines for building a new chunk.

I have also improved the integration tests to cover all the deletion modes and test the deletion with/without query time filtering.

Checklist

  • Tests updated
  • CHANGELOG.md updated
Not Found
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@@ -332,75 +339,85 @@ func newChunkRewriter(chunkClient client.Client, tableName string, chunkIndexer
}
}

func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, intervalFilters []IntervalFilter) (bool, error) {
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, filterFunc filter.Func) (bool, bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment that explains the return values? It's only at the end of this long function that you can see what they mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

@MasslessParticle MasslessParticle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is way easier to follow. LGTM with a couple of comments

},
Filter: ff,
})
// delete request is without a line filter so initialize it to always return true to simplify the code below for partial chunk deletion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe stylistic:

Because everything is derived from a DeleteRequest, I'd expect everything from lines 146 -> 168 to be par of d.FilterFunction. It would definitely simplify this function and keep everything associated with creating a "correct" filter function together.

wdyt?

@@ -332,75 +339,85 @@ func newChunkRewriter(chunkClient client.Client, tableName string, chunkIndexer
}
}

func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, intervalFilters []IntervalFilter) (bool, error) {
func (c *chunkRewriter) rewriteChunk(ctx context.Context, ce ChunkEntry, tableInterval model.Interval, filterFunc filter.Func) (bool, bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

1 similar comment
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@sandeepsukhani sandeepsukhani changed the title fix chunk deletion with filters deletion: fix log deletion with line filters Jan 30, 2023
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@MichelHollands MichelHollands left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MasslessParticle MasslessParticle merged commit b2d0481 into grafana:main Jan 30, 2023
@DylanGuedes DylanGuedes added the type/bug Somehing is not working as expected label Feb 23, 2023
@DylanGuedes DylanGuedes added the backport release-2.7.x add to a PR to backport it into release 2.7.x label Feb 23, 2023
@grafanabot
Copy link
Collaborator

The backport to release-2.7.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-8151-to-release-2.7.x origin/release-2.7.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x b2d0481cbeee17355a4305ba48b7517c0cdea080
# Push it to GitHub
git push --set-upstream origin backport-8151-to-release-2.7.x
git switch main
# Remove the local backport branch
git branch -D backport-8151-to-release-2.7.x

Then, create a pull request where the base branch is release-2.7.x and the compare/head branch is backport-8151-to-release-2.7.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport release-2.7.x add to a PR to backport it into release 2.7.x backport-failed size/XXL type/bug Somehing is not working as expected
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants