Composite policy doesn't allow for overflow between subpolicies if a policy sets Sampled but has gone over its max rate

### Component(s)

processor/tailsampling

### What happened?

## Description

[Within the algorithm for processing of subpolicies within the composite policy](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.116.0/processor/tailsamplingprocessor/internal/sampling/composite.go#L102-L120), the following may happen:

1. the current subpolicy being processed has set the current trace as `Sampled`
2. we check if allowing this sampling would put this subpolicy over its specific spans/sec limit or the global limit
3. if either check evaluates to true, we return `NotSampled` and exit the function, thereby never checking any of the other subpolicies.

My personal opinion is that this is a bug, but I can see how this might be construed as a feature.

## Steps to Reproduce

1. start up an otelcol with the provided config
2. send a bunch of traces in (for example, by using [xk6-client-tracing](https://github.com/grafana/xk6-client-tracing/blob/main/examples/template/template.js))

## Expected Result

The always sample subpolicy (ASSP) receives the traces first, and any unsampled traces then move on to the probabilistic subpolicy (PSP). 

If a ramping increase of spans/sec is used, we will at first see a 100% sampling rate for this Composite policy followed by an inflection point where the ASSP saturates at 100 spans/sec and 80 % of spans above this point are not sampled. Similarly, when we look at the exporter's rate/sec rate, we can see that there is an inflection at around this time, where the slope of the graph after the inflection is 20% of the slop before.

## Actual Result

ASSP always receives all traces and PSP never receives any traces. The spans per second flatlines at the maximum number of spans per second to be allocated to ASSP

### Collector version

v0.116.0

### Environment information

## Environment
OS: Ubuntu 22.04
Compiler(if manually compiled): ocb@v0.116.0 go1.23.4


### OpenTelemetry Collector configuration

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:14317"

processors:
  tail_sampling:
    policies:
      - name: composite-policy
        type: composite
        composite:
          max_total_spans_per_second: 10_000
          policy_order:
            - always-sample-policy
            - probabilistic-policy
          composite_sub_policy:
            - name: always-sample-policy
              type: always_sample
            - name: probabilistic-policy
              type: probabilistic
              probabilistic:
                sampling_percentage: 20
          rate_allocation:
            - policy: always-sample-policy
              percent: 5
            - policy: probabilistic-policy
              percent: 95

exporters:
  otlp:
    endpoint: "localhost:4317"
    tls:
      insecure: true
  debug:
    verbosity: detailed

service:
  telemetry:
    logs:
      level: DEBUG
    metrics:
      level: detailed
      readers:
        - pull:
            exporter:
              prometheus:
                host: "0.0.0.0"
                port: 8888
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [otlp]
```


### Log output

```shell
2024-12-24T17:29:51.319-0500	debug	tailsamplingprocessor@v0.116.0/processor.go:271	Sampling Policy Evaluation ticked	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}
2024-12-24T17:29:51.320-0500	debug	sampling/always_sample.go:29	Evaluating spans in always-sample filter	{"kind": "processor", "name": "tail_sampling", "pipeline": "traces", "policy": "always_sample"}

```
```


### Additional context

We need two changes

## Return On Global Max Right Away

We should move the check for
```
spansInSecondIfSampled <= c.maxTotalSPS
```
Out of the loop. There's no need to bother with processing any of the subpolicies if the trace will put the global system over global limit regardless of which subpolicy samples it.

I actually think this inequality might be a mistake.
`spansInSecondIfSampled` [only takes into account the number of sampled spans *for this subpolicy* and the number of spans in the current trace](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.116.0/processor/tailsamplingprocessor/internal/sampling/composite.go#L106). It doesn't make sense to compare this to the `maxTotalSPS` of the full `Composite` policy.

## Modify Logic for `Sampled`/`NotSampled` for Subpolicies

[This logic](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.116.0/processor/tailsamplingprocessor/internal/sampling/composite.go#L116-L120) is counter-intuitive and -- I would argue -- restricts the utility of the `Composite` policy. It's saying that if any one of the subpolicies is over its limit then we should give up on this trace.

We should get rid of this `return NotSampled, nil` and allow the remaining subpolicies to process the trace. We still have a `return NotSampled, nil` at the end of the function in case no policy wants to sample the trace or no policy is able to sample the trace based on its SPS limits.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Composite policy doesn't allow for overflow between subpolicies if a policy sets Sampled but has gone over its max rate #36959

Component(s)

What happened?

Description

Steps to Reproduce

Expected Result

Actual Result

Collector version

Environment information

Environment

OpenTelemetry Collector configuration

Log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Composite policy doesn't allow for overflow between subpolicies if a policy sets Sampled but has gone over its max rate #36959

Description

Component(s)

What happened?

Description

Steps to Reproduce

Expected Result

Actual Result

Collector version

Environment information

Environment

OpenTelemetry Collector configuration

Log output

Activity

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions