Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(otelarrowreceiver): add admission control to otlp path #35021

Merged
merged 28 commits into from
Sep 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
fb4bbe8
add admission control to otlp path
moh-osman3 Sep 5, 2024
9706583
support deprecated option + changelog + unit tests
moh-osman3 Sep 6, 2024
35d9d09
rename
moh-osman3 Sep 6, 2024
b3feccf
replace statement?
moh-osman3 Sep 6, 2024
36fc7b2
lint
moh-osman3 Sep 6, 2024
75788fd
lint
moh-osman3 Sep 6, 2024
0944757
make crosslink?
moh-osman3 Sep 6, 2024
6eeee4b
lint again
moh-osman3 Sep 6, 2024
692484f
address review feedback
moh-osman3 Sep 6, 2024
2dcdd8e
fix tests
moh-osman3 Sep 6, 2024
4e4a4ed
lint
moh-osman3 Sep 6, 2024
4202afd
add yaml
moh-osman3 Sep 6, 2024
2d26469
fix test after rebase
moh-osman3 Sep 10, 2024
d80fff4
fix test
moh-osman3 Sep 10, 2024
58c525e
change admission config level and name
moh-osman3 Sep 10, 2024
1fae410
gofmt
moh-osman3 Sep 10, 2024
57f9d56
add nil check
moh-osman3 Sep 10, 2024
c04c5bd
rebase + review feedback
moh-osman3 Sep 12, 2024
3469d9a
review feedback
moh-osman3 Sep 17, 2024
c5791ae
indent chlog?
moh-osman3 Sep 17, 2024
3a878ac
update readme link
moh-osman3 Sep 17, 2024
f5f170b
move log warning to newOtelArrowReceiver
moh-osman3 Sep 18, 2024
c748c5e
move mutation to Unmarshal
jmacd Sep 20, 2024
73217bf
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Sep 20, 2024
62e8f82
tidy
jmacd Sep 20, 2024
91c1ddc
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Sep 23, 2024
3510b3a
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
jmacd Sep 23, 2024
411d5f7
comments
jmacd Sep 24, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions .chloggen/admission-control-for-otlp.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement
crobert-1 marked this conversation as resolved.
Show resolved Hide resolved

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: otelarrowreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: |
Add admission control in the otelarrow receiver's standard otlp data path.
Also moves admission control config options to a separate block.
arrow.admission_limit_mib -> admission.request_limit_mib
arrow.waiter_limit -> admission.waiter_limit

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [35021]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
7 changes: 7 additions & 0 deletions internal/otelarrow/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,19 @@ require (
github.com/fxamacker/cbor/v2 v2.4.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-viper/mapstructure/v2 v2.1.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/snappy v0.0.5-0.20220116011046-fa5810519dcb // indirect
github.com/google/flatbuffers v24.3.25+incompatible // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/klauspost/cpuid/v2 v2.2.8 // indirect
github.com/knadh/koanf/maps v0.1.1 // indirect
github.com/knadh/koanf/providers/confmap v0.1.0 // indirect
github.com/knadh/koanf/v2 v2.1.1 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mitchellh/copystructure v1.2.0 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/mostynb/go-grpc-compression v1.2.3 // indirect
Expand All @@ -69,6 +75,7 @@ require (
go.opentelemetry.io/collector/config/configretry v1.16.0 // indirect
go.opentelemetry.io/collector/config/configtls v1.16.0 // indirect
go.opentelemetry.io/collector/config/internal v0.110.0 // indirect
go.opentelemetry.io/collector/confmap v1.16.0 // indirect
go.opentelemetry.io/collector/consumer/consumerprofiles v0.110.0 // indirect
go.opentelemetry.io/collector/extension v0.110.0 // indirect
go.opentelemetry.io/collector/extension/auth v0.110.0 // indirect
Expand Down
17 changes: 10 additions & 7 deletions receiver/otelarrowreceiver/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,16 @@ Several common configuration structures provide additional capabilities automati
- [gRPC settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md)
- [TLS and mTLS settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configtls/README.md)

### Admission Control Configuration

In the `admission` configuration block the following settings are available:

- `request_limit_mib` (default: 128): limits the number of requests that are received by the stream in terms of *uncompressed request size*. This should not be confused with `arrow.memory_limit_mib` which limits allocations made by the consumer when translating arrow records into pdata objects. i.e. request size is used to control how much traffic we admit, but does not control how much memory is used during request processing.

- `waiter_limit` (default: 1000): limits the number of requests waiting on admission once `admission_limit_mib` is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.

`request_limit_mib` and `waiter_limit` are arguments supplied to [admission.BoundedQueue](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/internal/otelarrow/admission). This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.

### Arrow-specific Configuration

In the `arrow` configuration block, the following settings are available:
Expand All @@ -87,13 +97,6 @@ When the limit is reached, the receiver will return RESOURCE_EXHAUSTED
error codes to the receiver, which are [conditionally retryable, see
exporter retry configuration](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md).

- `admission_limit_mib` (default: 64): limits the number of requests that are received by the stream based on request size information available. This should not be confused with `memory_limit_mib` which limits allocations made by the consumer when translating arrow records into pdata objects. i.e. request size is used to control how much traffic we admit, but does not control how much memory is used during request processing.

- `waiter_limit` (default: 1000): limits the number of requests waiting on admission once `admission_limit_mib` is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.

`admission_limit_mib` and `waiter_limit` are arguments supplied to [admission.BoundedQueue](https://github.com/open-telemetry/otel-arrow/tree/main/collector/admission). This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.


### Compression Configuration

In the `arrow` configuration block, `zstd` sub-section applies to all
Expand Down
51 changes: 43 additions & 8 deletions receiver/otelarrowreceiver/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import (

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/configgrpc"
"go.opentelemetry.io/collector/confmap"

"github.com/open-telemetry/opentelemetry-collector-contrib/internal/otelarrow/compression/zstd"
)
Expand All @@ -18,22 +19,30 @@ type Protocols struct {
Arrow ArrowConfig `mapstructure:"arrow"`
}

type AdmissionConfig struct {
crobert-1 marked this conversation as resolved.
Show resolved Hide resolved
// RequestLimitMiB limits the number of requests that are received by the stream based on
// uncompressed request size. Request size is used to control how much traffic we admit
// for processing.
RequestLimitMiB uint64 `mapstructure:"request_limit_mib"`
jmacd marked this conversation as resolved.
Show resolved Hide resolved

// WaiterLimit is the limit on the number of waiters waiting to be processed and consumed.
// This is a dimension of memory limiting to ensure waiters are not consuming an
// unexpectedly large amount of memory in the arrow receiver.
WaiterLimit int64 `mapstructure:"waiter_limit"`
}

// ArrowConfig support configuring the Arrow receiver.
type ArrowConfig struct {
// MemoryLimitMiB is the size of a shared memory region used
// by all Arrow streams, in MiB. When too much load is
// passing through, they will see ResourceExhausted errors.
MemoryLimitMiB uint64 `mapstructure:"memory_limit_mib"`

// AdmissionLimitMiB limits the number of requests that are received by the stream based on
// request size information available. Request size is used to control how much traffic we admit
// for processing, but does not control how much memory is used during request processing.
AdmissionLimitMiB uint64 `mapstructure:"admission_limit_mib"`
// Deprecated: This field is no longer supported, use cfg.Admission.RequestLimitMiB instead.
DeprecatedAdmissionLimitMiB uint64 `mapstructure:"admission_limit_mib"`

// WaiterLimit is the limit on the number of waiters waiting to be processed and consumed.
// This is a dimension of memory limiting to ensure waiters are not consuming an
// unexpectedly large amount of memory in the arrow receiver.
WaiterLimit int64 `mapstructure:"waiter_limit"`
// Deprecated: This field is no longer supported, use cfg.Admission.WaiterLimit instead.
DeprecatedWaiterLimit int64 `mapstructure:"waiter_limit"`

// Zstd settings apply to OTel-Arrow use of gRPC specifically.
Zstd zstd.DecoderConfig `mapstructure:"zstd"`
Expand All @@ -43,6 +52,8 @@ type ArrowConfig struct {
type Config struct {
// Protocols is the configuration for gRPC and Arrow.
Protocols `mapstructure:"protocols"`
// Admission is the configuration for controlling amount of request memory entering the receiver.
Admission AdmissionConfig `mapstructure:"admission"`
}

var _ component.Config = (*Config)(nil)
Expand All @@ -54,3 +65,27 @@ func (cfg *ArrowConfig) Validate() error {
}
return nil
}

func (cfg *Config) Validate() error {
crobert-1 marked this conversation as resolved.
Show resolved Hide resolved
if err := cfg.GRPC.Validate(); err != nil {
return err
}
if err := cfg.Arrow.Validate(); err != nil {
return err
}
return nil
}

// Unmarshal will apply deprecated field values to assist the user with migration.
func (cfg *Config) Unmarshal(conf *confmap.Conf) error {
if err := conf.Unmarshal(cfg); err != nil {
return err
}
if cfg.Admission.RequestLimitMiB == 0 && cfg.Arrow.DeprecatedAdmissionLimitMiB != 0 {
cfg.Admission.RequestLimitMiB = cfg.Arrow.DeprecatedAdmissionLimitMiB
crobert-1 marked this conversation as resolved.
Show resolved Hide resolved
}
if cfg.Admission.WaiterLimit == 0 && cfg.Arrow.DeprecatedWaiterLimit != 0 {
cfg.Admission.WaiterLimit = cfg.Arrow.DeprecatedWaiterLimit
}
jmacd marked this conversation as resolved.
Show resolved Hide resolved
return nil
}
40 changes: 34 additions & 6 deletions receiver/otelarrowreceiver/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -77,15 +77,41 @@ func TestUnmarshalConfig(t *testing.T) {
},
},
Arrow: ArrowConfig{
MemoryLimitMiB: 123,
AdmissionLimitMiB: 80,
WaiterLimit: 100,
MemoryLimitMiB: 123,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: 80,
WaiterLimit: 100,
},
}, cfg)

}

// Tests that a deprecated config validation sets RequestLimitMiB and WaiterLimit in the correct config block.
func TestValidateDeprecatedConfig(t *testing.T) {
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "deprecated.yaml"))
require.NoError(t, err)
cfg := &Config{}
assert.NoError(t, cm.Unmarshal(cfg))
assert.NoError(t, cfg.Validate())
assert.Equal(t,
&Config{
Protocols: Protocols{
Arrow: ArrowConfig{
MemoryLimitMiB: 123,
DeprecatedAdmissionLimitMiB: 80,
DeprecatedWaiterLimit: 100,
},
},
Admission: AdmissionConfig{
// cfg.Validate should now set these fields.
RequestLimitMiB: 80,
WaiterLimit: 100,
},
}, cfg)
}

func TestUnmarshalConfigUnix(t *testing.T) {
cm, err := confmaptest.LoadConf(filepath.Join("testdata", "uds.yaml"))
require.NoError(t, err)
Expand All @@ -103,11 +129,13 @@ func TestUnmarshalConfigUnix(t *testing.T) {
ReadBufferSize: 512 * 1024,
},
Arrow: ArrowConfig{
MemoryLimitMiB: defaultMemoryLimitMiB,
AdmissionLimitMiB: defaultAdmissionLimitMiB,
WaiterLimit: defaultWaiterLimit,
MemoryLimitMiB: defaultMemoryLimitMiB,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: defaultRequestLimitMiB,
WaiterLimit: defaultWaiterLimit,
},
}, cfg)
}

Expand Down
14 changes: 8 additions & 6 deletions receiver/otelarrowreceiver/factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,9 @@ import (
const (
defaultGRPCEndpoint = "0.0.0.0:4317"

defaultMemoryLimitMiB = 128
defaultAdmissionLimitMiB = defaultMemoryLimitMiB / 2
defaultWaiterLimit = 1000
defaultMemoryLimitMiB = 128
defaultRequestLimitMiB = 128
defaultWaiterLimit = 1000
)

// NewFactory creates a new OTel-Arrow receiver factory.
Expand All @@ -47,11 +47,13 @@ func createDefaultConfig() component.Config {
ReadBufferSize: 512 * 1024,
},
Arrow: ArrowConfig{
MemoryLimitMiB: defaultMemoryLimitMiB,
AdmissionLimitMiB: defaultAdmissionLimitMiB,
WaiterLimit: defaultWaiterLimit,
MemoryLimitMiB: defaultMemoryLimitMiB,
},
},
Admission: AdmissionConfig{
RequestLimitMiB: defaultRequestLimitMiB,
WaiterLimit: defaultWaiterLimit,
},
}
}

Expand Down
26 changes: 24 additions & 2 deletions receiver/otelarrowreceiver/internal/logs/otlp.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,12 @@ import (
"context"

"go.opentelemetry.io/collector/consumer"
"go.opentelemetry.io/collector/pdata/plog"
"go.opentelemetry.io/collector/pdata/plog/plogotlp"
"go.opentelemetry.io/collector/receiver/receiverhelper"
"go.uber.org/zap"

"github.com/open-telemetry/opentelemetry-collector-contrib/internal/otelarrow/admission"
)

const dataFormatProtobuf = "protobuf"
Expand All @@ -18,13 +22,19 @@ type Receiver struct {
plogotlp.UnimplementedGRPCServer
nextConsumer consumer.Logs
obsrecv *receiverhelper.ObsReport
boundedQueue *admission.BoundedQueue
sizer *plog.ProtoMarshaler
logger *zap.Logger
}

// New creates a new Receiver reference.
func New(nextConsumer consumer.Logs, obsrecv *receiverhelper.ObsReport) *Receiver {
func New(logger *zap.Logger, nextConsumer consumer.Logs, obsrecv *receiverhelper.ObsReport, bq *admission.BoundedQueue) *Receiver {
return &Receiver{
nextConsumer: nextConsumer,
obsrecv: obsrecv,
boundedQueue: bq,
sizer: &plog.ProtoMarshaler{},
logger: logger,
}
}

Expand All @@ -37,7 +47,19 @@ func (r *Receiver) Export(ctx context.Context, req plogotlp.ExportRequest) (plog
}

ctx = r.obsrecv.StartLogsOp(ctx)
err := r.nextConsumer.ConsumeLogs(ctx, ld)

sizeBytes := int64(r.sizer.LogsSize(req.Logs()))
err := r.boundedQueue.Acquire(ctx, sizeBytes)
if err != nil {
return plogotlp.NewExportResponse(), err
}
defer func() {
if releaseErr := r.boundedQueue.Release(sizeBytes); releaseErr != nil {
r.logger.Error("Error releasing bytes from semaphore", zap.Error(releaseErr))
}
}()

err = r.nextConsumer.ConsumeLogs(ctx, ld)
r.obsrecv.EndLogsOp(ctx, dataFormatProtobuf, numSpans, err)

return plogotlp.NewExportResponse(), err
Expand Down
Loading
Loading