Status | |
---|---|
Stability | alpha: logs, beta: traces |
Distributions | core, contrib, aws, observiq, splunk, sumo |
Issues | |
Code Owners | @jpkrohling |
The probabilistic sampler supports two types of sampling for traces:
- sampling.priority semantic convention as defined by OpenTracing
- Trace ID hashing
The sampling.priority semantic convention takes priority over trace ID hashing. As the name
implies, trace ID hashing samples based on hash values determined by trace IDs. See Hashing for more information.
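The precedence described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the processor's actual implementation: the FNV-1a hash, the 4-byte little-endian seed encoding, the bucket count `NUM_BUCKETS`, and the function names are all chosen for the example. It also assumes the OpenTracing reading of sampling.priority, where a value greater than 0 keeps the span and 0 drops it.

```python
from typing import Optional

FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193
NUM_BUCKETS = 0x4000  # assumed number of hash buckets for this sketch


def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def hash_sampled(trace_id: bytes, hash_seed: int, sampling_percentage: float) -> bool:
    """Keep the trace if its hash bucket falls below the percentage threshold."""
    if sampling_percentage >= 100:
        return True
    if sampling_percentage <= 0:
        return False
    threshold = int(sampling_percentage * NUM_BUCKETS / 100)
    seed_bytes = hash_seed.to_bytes(4, "little")  # assumed seed encoding
    bucket = fnv1a_32(seed_bytes + trace_id) % NUM_BUCKETS
    return bucket < threshold


def should_sample(
    trace_id: bytes,
    hash_seed: int,
    sampling_percentage: float,
    sampling_priority: Optional[int] = None,
) -> bool:
    """sampling.priority, when present, overrides trace ID hashing."""
    if sampling_priority is not None:
        return sampling_priority > 0
    return hash_sampled(trace_id, hash_seed, sampling_percentage)
```

Because the decision depends only on the trace ID, the seed, and the percentage, repeated calls with the same inputs always agree, which is what lets every span of a trace receive the same decision.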
The following configuration options can be modified:
- hash_seed (no default): An integer used to compute the hash algorithm. Note that all collectors for a given tier (e.g. behind the same load balancer) should have the same hash_seed.
- sampling_percentage (default = 0): Percentage at which traces are sampled; >= 100 samples all traces
Examples:
processors:
  probabilistic_sampler:
    hash_seed: 22
    sampling_percentage: 15.3
The probabilistic sampler supports sampling logs according to their trace ID, or by a specific log record attribute.
The probabilistic sampler may optionally use a hash_seed to compute the hash of a log record.
This sampler samples based on hash values determined by log records. See Hashing for more information.
The following configuration options can be modified:
- hash_seed (no default, optional): An integer used to compute the hash algorithm. Note that all collectors for a given tier (e.g. behind the same load balancer) should have the same hash_seed.
- sampling_percentage (required): Percentage at which logs are sampled; >= 100 samples all logs, 0 rejects all logs.
- attribute_source (default = traceID, optional): Defines where to look for the attribute in from_attribute. The allowed values are traceID or record.
- from_attribute (default = null, optional): The optional name of a log record attribute used for sampling purposes, such as a unique log record ID. The value of the attribute is only used if the trace ID is absent or if attribute_source is set to record.
- sampling_priority (default = null, optional): The optional name of a log record attribute used to set a different sampling priority from the sampling_percentage setting. 0 means to never sample the log record, and >= 100 means to always sample the log record.
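How these log options interact can be sketched as two small helpers. This is a hedged illustration of the rules listed above, not the processor's code: the function names, the treatment of an all-zero trace ID as absent, and the UTF-8 encoding of the attribute value are assumptions made for the example.

```python
from typing import Mapping, Optional


def pick_hash_input(
    trace_id: Optional[bytes],
    attributes: Mapping[str, object],
    attribute_source: str = "traceID",
    from_attribute: Optional[str] = None,
) -> Optional[bytes]:
    """Return the bytes to hash, or None if no sampling source is available."""
    # Prefer the trace ID unless attribute_source is "record".
    # An all-zero trace ID is treated as absent (assumption of this sketch).
    if attribute_source == "traceID" and trace_id is not None and any(trace_id):
        return trace_id
    # Fall back to the named attribute when the trace ID is absent
    # or when attribute_source is "record".
    if from_attribute is not None and from_attribute in attributes:
        return str(attributes[from_attribute]).encode("utf-8")  # assumed encoding
    return None


def effective_percentage(
    attributes: Mapping[str, object],
    sampling_percentage: float,
    sampling_priority: Optional[str] = None,
) -> float:
    """Use the record's priority attribute, if numeric, instead of the default."""
    if sampling_priority is not None:
        value = attributes.get(sampling_priority)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return float(value)
    return sampling_percentage
```

For example, with `sampling_priority: priority` configured, a record carrying the attribute `priority = 100` would be evaluated at 100 percent and always kept, while a record without the attribute falls back to `sampling_percentage`.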
In order for hashing to work, all collectors for a given tier (e.g. behind the same load balancer)
must have the same hash_seed. It is also possible to leverage a different hash_seed at
different collector tiers to support additional sampling requirements. Please refer to
config.go for the config spec.
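The effect of the seed across tiers can be illustrated with a small simulation. This is a sketch under assumptions, not the processor's implementation: the FNV-1a hash, the seed encoding, the bucket count, the synthetic trace IDs, and the seed values 22 and 99 are all hypothetical choices for the example.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193
NUM_BUCKETS = 0x4000  # assumed number of hash buckets for this sketch


def fnv1a_32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


def sampled(trace_id: bytes, hash_seed: int, sampling_percentage: float) -> bool:
    """Seeded hash decision: keep the ID if its bucket is below the threshold."""
    threshold = int(sampling_percentage * NUM_BUCKETS / 100)
    seed_bytes = hash_seed.to_bytes(4, "little")  # assumed seed encoding
    return fnv1a_32(seed_bytes + trace_id) % NUM_BUCKETS < threshold


# Synthetic 16-byte trace IDs, for illustration only.
trace_ids = [i.to_bytes(16, "big") for i in range(1, 1001)]

# Two collectors in the same tier with the same seed agree on every trace.
collector_a = [t for t in trace_ids if sampled(t, 22, 50)]
collector_b = [t for t in trace_ids if sampled(t, 22, 50)]

# A second tier that reuses the seed and percentage repeats the same decision,
# so it keeps everything the first tier kept.
tier2_same_seed = [t for t in collector_a if sampled(t, 22, 50)]

# A second tier with a different seed makes a fresh decision on what it receives.
tier2_other_seed = [t for t in collector_a if sampled(t, 99, 50)]
```

With a different seed at the second tier, the survivors are re-sampled, so the combined rate is expected to be roughly the product of the two percentages, assuming the hash spreads IDs evenly across buckets.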
Examples:
Sample 15% of the logs:
processors:
  probabilistic_sampler:
    sampling_percentage: 15
Sample logs according to their logID attribute:
processors:
  probabilistic_sampler:
    sampling_percentage: 15
    attribute_source: record # possible values: one of record or traceID
    from_attribute: logID # value is required if the source is not traceID
Sample logs according to the attribute priority:
processors:
  probabilistic_sampler:
    sampling_percentage: 15
    sampling_priority: priority
Refer to config.yaml for detailed examples on using the processor.