-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/stanza] Reconsider aggresive sampling policy for logger #23801
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
@BinaryFissionGames I think this is a good idea. The behavior is a holdover from when stanza was an independent agent. I see no reason we should be overriding global behaviors at this point. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
**Description:** Remove sampling policy from logger **Link to tracking Issue:** Fixes #23801
After #29886 was merged, I noted a change in the benchmarks. It looks like a 5-10% increase in CPU usage for at least some of the benchmark tests: |
**Description:** Remove sampling policy from logger **Link to tracking Issue:** Fixes open-telemetry#23801
I spent quite some time looking into this today and found nothing conclusive. However, there was a preexisting problem, fixed by #30301, which would may have contributed to excess logging. I can't see the full history but it appears these two tests specifically show better performance than before this change. I'm closing this issue but we feel free to reopen if there is more to discuss. |
Component(s)
pkg/stanza
Describe the issue you're reporting
Stanza pipelines are built with a very aggressive sampling policy.
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/2d23e4d9e0eb313da5d192e1066f444d19b8601f/pkg/stanza/pipeline/config.go#L38C2-L38C2
If I'm reading and understanding this correctly, this means that every second, 1 initial log of a particular message is allowed, then only 1/10000 logs of that particular message is allowed.
This seems overly aggressive. In fact, it actually makes the file consumer only log the first log it finds on startup here:
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/2d23e4d9e0eb313da5d192e1066f444d19b8601f/pkg/stanza/fileconsumer/file.go#L188C35-L188C35
Meaning, if you have 10 files that are picked up on startup, only one of them is logged. That poses a challenge for debugging whether a file is picked up at all.
In my opinion, we should rely on the logger already being sampled. The collector already supports sampling. It also samples by default with initial = 100, and thereafter = 100.
The text was updated successfully, but these errors were encountered: