Memory Limiter Processor

Status
Stability: beta (traces, metrics, logs)
Distributions: core, contrib, k8s

Overview

The memory limiter processor prevents out-of-memory situations on the collector. Because the amount and type of data the collector processes is environment-specific, and the collector's resource utilization also depends on the configured processors, it is important to put checks on memory usage in place.

Functionality

The memory limiter processor performs periodic checks of memory usage and will begin refusing data and forcing GC to reduce memory consumption when defined limits have been exceeded.

The processor uses soft and hard memory limits. The hard limit is defined via the limit_mib configuration option and is always greater than or equal to the soft limit. The difference between the soft limit and the hard limit is defined via the spike_limit_mib configuration option.
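The relationship between the two limits can be sketched as follows. This is a minimal illustration, not the processor's own code: the function name softLimitMiB and its error messages are hypothetical, and the validation only mirrors the constraints stated in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// softLimitMiB derives the soft limit from the hard limit (limit_mib) and
// the spike allowance (spike_limit_mib). The name is illustrative only.
func softLimitMiB(limitMiB, spikeLimitMiB uint64) (uint64, error) {
	if limitMiB == 0 {
		return 0, errors.New("limit_mib must be greater than zero")
	}
	if spikeLimitMiB >= limitMiB {
		return 0, errors.New("spike_limit_mib must be less than limit_mib")
	}
	// soft limit = hard limit - spike limit
	return limitMiB - spikeLimitMiB, nil
}

func main() {
	soft, err := softLimitMiB(4000, 800)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Printf("hard limit: 4000 MiB, soft limit: %d MiB\n", soft)
}
```

With limit_mib: 4000 and spike_limit_mib: 800, the soft limit works out to 3200 MiB.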

The processor enters memory limited mode and starts refusing data when memory usage exceeds the soft limit. It does so by returning errors to the preceding component in the pipeline that made the ConsumeLogs/Traces/Metrics function call.

In memory limited mode, the error returned by the ConsumeLogs/Traces/Metrics function is a non-permanent error. When receivers see this error, they are expected to retry sending the same data. Receivers may also apply backpressure to their own data sources to slow the inflow of data into the Collector and allow memory usage to drop below the set limits.

Warning: Data will be permanently lost if the component preceding the memory limiter in the telemetry pipeline does not correctly retry sending data after it has been refused by the memory limiter. We consider such components to be incorrectly implemented.
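The retry behavior expected from the preceding component might look roughly like the sketch below. It is self-contained and does not use the Collector's packages: errRefused, errPermanent, consumeFunc, and sendWithRetry are hypothetical stand-ins, and the doubling backoff is one possible policy rather than a documented requirement.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRefused stands in for the non-permanent error returned in memory
// limited mode; errPermanent marks errors that must not be retried.
// Both are hypothetical, not the Collector's error types.
var (
	errRefused   = errors.New("data refused due to high memory usage")
	errPermanent = errors.New("permanent error")
)

// consumeFunc is a stand-in for the next component's Consume call.
type consumeFunc func(batch string) error

// sendWithRetry resends the same batch while the error is non-permanent,
// doubling the wait between attempts (a simple form of backpressure).
func sendWithRetry(consume consumeFunc, batch string, maxAttempts int, initialBackoff time.Duration) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	backoff := initialBackoff
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = consume(batch)
		if err == nil {
			return nil
		}
		if errors.Is(err, errPermanent) {
			return err // retrying cannot help
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	// Simulated limiter: refuses the first two calls, then accepts.
	consume := func(batch string) error {
		calls++
		if calls <= 2 {
			return errRefused
		}
		return nil
	}
	err := sendWithRetry(consume, "batch-1", 5, 10*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
```

A component that drops the batch after the first refusal instead of retrying is the case the warning above describes.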

When memory usage is above the hard limit, the processor additionally forces garbage collection.

Normal operation resumes when memory usage drops below the soft limit: data is no longer refused and the processor no longer forces garbage collection.
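The mode transitions described in this section can be summarized in a small sketch. It models only what the text states; the limiter type and check method are hypothetical names, and the treatment of usage exactly equal to a limit (counted as not exceeding it) is an assumption.

```go
package main

import "fmt"

// limiter holds the two thresholds and the current mode.
// This is an illustrative model, not the processor's implementation.
type limiter struct {
	softLimitMiB uint64
	hardLimitMiB uint64
	refusing     bool // true while in memory limited mode
}

// check applies one periodic memory measurement. It updates the mode and
// reports whether garbage collection should be forced. A real caller
// could respond to forceGC by calling runtime.GC().
func (l *limiter) check(usageMiB uint64) (forceGC bool) {
	// Above the soft limit: refuse data. Otherwise: normal operation.
	l.refusing = usageMiB > l.softLimitMiB
	// Above the hard limit: additionally force garbage collection.
	return usageMiB > l.hardLimitMiB
}

func main() {
	l := &limiter{softLimitMiB: 3200, hardLimitMiB: 4000}
	for _, usage := range []uint64{2500, 3300, 4100, 3000} {
		forceGC := l.check(usage)
		fmt.Printf("usage=%d MiB refusing=%t forceGC=%t\n", usage, l.refusing, forceGC)
	}
}
```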

Best Practices

Note that while the processor can help mitigate out-of-memory situations, it is not a replacement for properly sizing and configuring the collector. If the soft limit is crossed, the collector returns errors to all receive operations until enough memory is freed. This may eventually result in dropped data, since receivers may not be able to retry indefinitely.

It is highly recommended to configure the GOMEMLIMIT environment variable as well as the memory_limiter processor on every collector. GOMEMLIMIT should be set to 80% of the hard memory limit of your collector. For the memory_limiter processor, the best practice is to add it as the first processor in a pipeline. This is to ensure that backpressure can be sent to applicable receivers and minimize the likelihood of dropped data when the memory_limiter gets triggered.
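As a rough illustration of the 80% guideline, the sketch below formats a GOMEMLIMIT value from a hard memory limit given in MiB. The helper name goMemLimitValue is hypothetical; the MiB suffix is one of the unit suffixes the Go runtime accepts for GOMEMLIMIT.

```go
package main

import "fmt"

// goMemLimitValue returns a GOMEMLIMIT string equal to 80% of the given
// hard memory limit, rounded down to whole MiB. The name is illustrative.
func goMemLimitValue(hardLimitMiB uint64) string {
	return fmt.Sprintf("%dMiB", hardLimitMiB*80/100)
}

func main() {
	// Example: a collector whose hard memory limit is 4000 MiB.
	fmt.Printf("GOMEMLIMIT=%s\n", goMemLimitValue(4000))
}
```

For a 4000 MiB hard limit this yields GOMEMLIMIT=3200MiB, which would be set in the collector's environment before it starts.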

The value of the spike_limit_mib configuration option should be selected in a way that ensures that memory usage cannot increase by more than this value within a single memory check interval. Otherwise, memory usage may exceed the hard limit, even if temporarily. A good starting point for spike_limit_mib is 20% of the hard limit. Bigger spike_limit_mib values may be necessary for spiky traffic or for longer check intervals.
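One way to reason about this sizing rule is sketched below, assuming you can estimate a worst-case memory growth rate for your workload. The growth rate, the function names, and the choice to take the larger of the two candidates are all assumptions for illustration, not part of the processor.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// minSpikeLimitMiB estimates the smallest spike_limit_mib that covers the
// memory growth possible within one check interval. growthMiBPerSec is a
// hypothetical worst-case estimate supplied by the operator.
func minSpikeLimitMiB(growthMiBPerSec float64, checkInterval time.Duration) uint64 {
	if growthMiBPerSec <= 0 || checkInterval <= 0 {
		return 0
	}
	return uint64(math.Ceil(growthMiBPerSec * checkInterval.Seconds()))
}

// startingSpikeLimitMiB is the suggested starting point: 20% of the hard limit.
func startingSpikeLimitMiB(hardLimitMiB uint64) uint64 {
	return hardLimitMiB / 5
}

func main() {
	const hardLimitMiB = 4000
	needed := minSpikeLimitMiB(500, 2*time.Second) // 500 MiB/s over a 2s interval
	spike := startingSpikeLimitMiB(hardLimitMiB)
	if needed > spike {
		spike = needed // spiky traffic or long intervals need more headroom
	}
	fmt.Printf("needed=%d MiB, chosen spike_limit_mib=%d\n", needed, spike)
}
```

Shortening the check interval lowers the required headroom in the same proportion, which is why the two settings are tuned together.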

Configuration

Please refer to memorylimiter.go for the config spec.

The following configuration options must be changed:

  • check_interval (default = 0s): Time between measurements of memory usage. The recommended value is 1 second. If the expected traffic to the Collector is very spiky, decrease check_interval or increase spike_limit_mib to keep memory usage from going over the hard limit.
  • limit_mib (default = 0): Maximum amount of memory, in MiB, targeted to be allocated by the process heap. Note that the total memory usage of the process will typically be about 50 MiB higher than this value. This defines the hard limit.
  • spike_limit_mib (default = 20% of limit_mib): Maximum spike expected between the measurements of memory usage. The value must be less than limit_mib. The soft limit value will be equal to (limit_mib - spike_limit_mib). The recommended value for spike_limit_mib is about 20% of limit_mib.
  • limit_percentage (default = 0): Maximum amount of total memory targeted to be allocated by the process heap, expressed as a percentage. This option is supported on Linux systems with cgroups and is intended for dynamic platforms such as Docker. It is used to calculate memory_limit from the total available memory. For instance, a setting of 75% with a total memory of 1 GiB results in a limit of 768 MiB. The fixed memory setting (limit_mib) takes precedence over the percentage configuration.
  • spike_limit_percentage (default = 0): Maximum spike expected between the measurements of memory usage. The value must be less than limit_percentage. This option is used to calculate spike_limit_mib from the total available memory. For instance, a setting of 25% with a total memory of 1 GiB results in a spike limit of 256 MiB. This option is intended to be used only with limit_percentage.
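How the fixed and percentage options combine can be sketched as below. This is a simplified model of the rules listed above, not the processor's code: the config struct, the resolve function, and the treatment of a zero value as "unset" are assumptions, and the behavior for an unset spike_limit_percentage is not stated in this document, so the sketch simply computes it literally.

```go
package main

import (
	"errors"
	"fmt"
)

// config mirrors the options described above; the type is illustrative.
type config struct {
	LimitMiB             uint64
	SpikeLimitMiB        uint64
	LimitPercentage      uint64
	SpikeLimitPercentage uint64
}

// resolve returns the hard limit and spike limit in MiB for a host with
// totalMiB of available memory. Zero is treated as "unset" (assumption).
func resolve(c config, totalMiB uint64) (hardMiB, spikeMiB uint64, err error) {
	switch {
	case c.LimitMiB > 0:
		// The fixed setting takes precedence over the percentage setting.
		hardMiB = c.LimitMiB
		spikeMiB = c.SpikeLimitMiB
		if spikeMiB == 0 {
			spikeMiB = hardMiB / 5 // default: 20% of limit_mib
		}
	case c.LimitPercentage > 0:
		hardMiB = totalMiB * c.LimitPercentage / 100
		spikeMiB = totalMiB * c.SpikeLimitPercentage / 100
	default:
		return 0, 0, errors.New("either limit_mib or limit_percentage must be set")
	}
	if spikeMiB >= hardMiB {
		return 0, 0, errors.New("spike limit must be less than the hard limit")
	}
	return hardMiB, spikeMiB, nil
}

func main() {
	// 1 GiB is 1024 MiB, so 75% and 25% give 768 MiB and 256 MiB.
	hard, spike, err := resolve(config{LimitPercentage: 75, SpikeLimitPercentage: 25}, 1024)
	fmt.Println(hard, spike, err)
	// Fixed values win even when a percentage is also present.
	hard, spike, err = resolve(config{LimitMiB: 4000, LimitPercentage: 50}, 1024)
	fmt.Println(hard, spike, err)
}
```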

Examples:

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000
    spike_limit_mib: 800

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 50
    spike_limit_percentage: 30

Refer to config.yaml for detailed examples on using the processor.