-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ingester/fix] Apply sanitizers to avoid panic on span.process=nil #3819
Conversation
assert.Equal(t, 0, len(dbSpanWithNilTags.Process.Tags)) | ||
assert.Equal(t, 0, len(dbSpanWithNilProcess.Process.Tags)) | ||
|
||
tagsMap := map[string]interface{}(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if this was expected but this is what I got from the convertKeyValuesString
func if the Tags was nil.
fb2a295
to
d21e22d
Compare
return Process{ | ||
ServiceName: process.ServiceName, | ||
ServiceName: process.GetServiceName(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is exactly the type of change that I did not want to make. It avoids a panic, but does it silently, creating invalid data in the process (service name is not allowed to be empty, it could break many other places, including the UI).
I've added a sanitizer in #3631. We can try invoking it early in the Kafka consumer path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look at this and making things clearer. Let me take a look at the sanitizer and try what you suggested.
It make sense that the service name must not be empty, so I might revert the change to process.GetServiceName() here. For the process.GetTags, I think it still nicely handles the nil
cases those led to the panics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it still nicely handles the nil cases those led to the panics.
Yes, but again, it does that accidentally, it so happens that in this context the behavior is appropriate. But fundamentally you're trying to work around malformed data that should've never gotten to this point in the pipeline. I don't mind this specific change, but at least with the current code we knew that there is a data issue (even though panicing on it is not ideal reaction).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yurishkuro , thank you so much for your review. As you mentioned, I understand it's important to detect data issue in our system, so should we return error here and let caller know that we do not accept the nil process for example?
I think panic should be considered if we do want application immediately crash. Otherwise, we can consider error.
Thank you for your explanation and effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we apply the sanitizer earlier in the pipeline then this code does need to change as it will never have to deal with nil process.
assert.Equal(t, 0, len(dbSpanWithNilTags.Process.Tags)) | ||
assert.Equal(t, 0, len(dbSpanWithNilProcess.Process.Tags)) | ||
|
||
tagsMap := map[string]interface{}(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tagsMap := map[string]interface{}(nil) | |
nilMap := map[string]interface{}(nil) |
dbSpanWithNilTags := converter.FromDomainEmbedProcess(&spanWithNilProcessTags) | ||
dbSpanWithNilProcess := converter.FromDomainEmbedProcess(&spanWithNilProcess) | ||
|
||
assert.Equal(t, 3, len(dbSpanWithNilTags.Tags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert.Equal(t, 3, len(dbSpanWithNilTags.Tags)) | |
assert.Len(t, 3, dbSpanWithNilTags.Tags) |
dd9064b
to
7b08a60
Compare
Add `all-in-one` component to the list of components to build docker images for. Signed-off-by: Albert Teoh <see.kwang.teoh@gmail.com> Signed-off-by: Loc Mai <locmai0201@gmail.com>
…g#3814) Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.36.0 to 0.37.0. - [Release notes](https://github.com/prometheus/common/releases) - [Commits](prometheus/common@v0.36.0...v0.37.0) --- updated-dependencies: - dependency-name: github.com/prometheus/common dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
* Tenancy for queries Signed-off-by: Ed Snible <snible@us.ibm.com> * New parameter for RegisterGRPCGateway() test function Signed-off-by: Ed Snible <snible@us.ibm.com> * More tests that are local to package itself Signed-off-by: Ed Snible <snible@us.ibm.com> * Additional test cases to raise test coverage Signed-off-by: Ed Snible <snible@us.ibm.com> * Fix test Signed-off-by: Ed Snible <snible@us.ibm.com> * spelling Signed-off-by: Ed Snible <snible@us.ibm.com> * Rename file Signed-off-by: Ed Snible <snible@us.ibm.com> * Refactor tenancy packages Signed-off-by: Ed Snible <snible@us.ibm.com> * restore empty_test.go as part of refactoring tenancy out of storage Signed-off-by: Ed Snible <snible@us.ibm.com> * lint/gofumpt Signed-off-by: Ed Snible <snible@us.ibm.com> * Enforce tenancy on non-streaming gRPC and add additional tests Signed-off-by: Ed Snible <snible@us.ibm.com> * Test for tenant flow to storage for both streaming and unary RPC Signed-off-by: Ed Snible <snible@us.ibm.com> * HTTP tenancy test Signed-off-by: Ed Snible <snible@us.ibm.com> * Unit test for unary tenancy handler Signed-off-by: Ed Snible <snible@us.ibm.com> * Factor out rendundent test function Signed-off-by: Ed Snible <snible@us.ibm.com> * Address review comments Signed-off-by: Ed Snible <snible@us.ibm.com> * Error name Signed-off-by: Ed Snible <snible@us.ibm.com> * Refactor TenancyConfig to TenancyManager Signed-off-by: Ed Snible <snible@us.ibm.com> * Address review comments Signed-off-by: Ed Snible <snible@us.ibm.com> * Refactor so that NewTenancyManager() only called from main and tests Signed-off-by: Ed Snible <snible@us.ibm.com> Signed-off-by: Loc Mai <locmai0201@gmail.com>
Signed-off-by: Loc Mai <locmai0201@gmail.com>
7b08a60
to
26a4398
Compare
Signed-off-by: Loc Mai <locmai0201@gmail.com>
… fix/check-nil-process
model.Int64("b.b", 1), | ||
} | ||
|
||
spanWithNilTags := model.Span{Tags: tags, Process: &model.Process{Tags: nil}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test would pass without any code changes in the PR. It's the nil Process that's causing panic in production, not the nil tags (nil tags are still iterable).
Signed-off-by: Yuri Shkuro <github@ysh.us>
Codecov Report
@@ Coverage Diff @@
## main #3819 +/- ##
==========================================
+ Coverage 97.58% 97.59% +0.01%
==========================================
Files 291 291
Lines 16938 16939 +1
==========================================
+ Hits 16529 16532 +3
+ Misses 323 321 -2
Partials 86 86
Continue to review full report at Codecov.
|
Which problem is this PR solving?
Short description of the changes