Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/cumulativetodelta] Heavy memory usage of histograms #13751

Closed
balintzs opened this issue Aug 31, 2022 · 5 comments
Closed

[processor/cumulativetodelta] Heavy memory usage of histograms #13751

balintzs opened this issue Aug 31, 2022 · 5 comments
Assignees
Labels
priority:p2 Medium processor/cumulativetodelta Cumulative To Delta processor

Comments

@balintzs
Copy link
Contributor

Problem Description:
The Cumulative to Delta processor needs to record all numerical values of cumulative datapoints in order to calculate the difference between them. Values are stored based on their identity, which contains all resource, scope and datapoint attributes.

For histograms, the sum, the count and all bucket counts are stored as separate values. This causes huge spikes of heap memory usage when a large number of histogram datapoints pass through the processor.

The following screenshots show the heap memory allocation bytes and working set bytes for the same amount of data:
A) With Cumulative to Delta Processor turned on:
Screenshot 2022-08-31 at 16 51 33
B) With Cumulative to Delta Processor turned off:
Screenshot 2022-08-31 at 16 54 03

Proposed Solution:
After discussing the issue with @mistodon, we believe that instead of treating each numerical value in a histogram datapoint separately, we should store them together under a single entity, and add handling for calculating the delta of each value contained within.

@balintzs
Copy link
Contributor Author

Link to issue requesting histogram support in Cumulative to Delta Processor: #12423

@TylerHelmuth
Copy link
Member

@balintzs is this something you plan to work on?

@TylerHelmuth TylerHelmuth added priority:p2 Medium processor/cumulativetodelta Cumulative To Delta processor labels Sep 1, 2022
@balintzs
Copy link
Contributor Author

balintzs commented Sep 1, 2022

@TylerHelmuth, yes we have already started working on it.

codeboten pushed a commit that referenced this issue Sep 21, 2022
#13751) (#14007)

Currently a MetricIdentity struct is allocated for each count/sum/bucket of a histogram which means that all of the attributes for the datapoint are duplicated across each component within the datapoint. This change bundles the histogram values into one unit (HistogramValue) to be converted, which requires only one MetricIdentity, reducing memory footprint.
@balintzs
Copy link
Contributor Author

balintzs commented Oct 5, 2022

we found the cause of the memory issue and opened a PR with the fix: #14729

@balintzs
Copy link
Contributor Author

balintzs commented Oct 6, 2022

We have been running OTel with the above fix in place (using a custom image we built) and it seems that this issue is now effectively resolved.
See the following charts for the before and after (fix deployed on Oct 5 at 1pm):
Screenshot 2022-10-06 at 08 58 55

@balintzs balintzs closed this as completed Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:p2 Medium processor/cumulativetodelta Cumulative To Delta processor
Projects
None yet
Development

No branches or pull requests

2 participants