Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OTEL kafka receiver/ingester panic #2512

Merged
merged 1 commit into from
Sep 25, 2020

Conversation

pavolloffay
Copy link
Member

Signed-off-by: Pavol Loffay ploffay@redhat.com

Fixes the following message. The description is in the code.

2020-09-21T09:00:55.075Z INFO service/service.go:396 Starting Jaeger OpenTelemetry Ingester... {"Version": "v1.19.2", "GitHash": "a2fb067f01551b47aef4f58bcc0e925ffeb32f5f", "NumCPU": 2}
2020-09-21T09:00:55.075Z INFO service/service.go:240 Setting up own telemetry...
2020-09-21T09:00:55.075Z INFO service/telemetry.go:107 Serving Prometheus metrics {"address": "localhost:8888", "legacy_metrics": false, "new_metrics": true, "level": 3, "service.instance.id": "6a816386-9c00-4531-90a7-819b4b0ff73e"}
2020-09-21T09:00:55.123Z INFO service/service.go:277 Loading configuration...
2020-09-21T09:00:55.138Z INFO service/service.go:288 Applying configuration...
2020-09-21T09:00:55.138Z INFO service/service.go:309 Starting extensions...
2020-09-21T09:00:55.138Z INFO builder/extensions_builder.go:53 Extension is starting... {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-09-21T09:00:55.138Z INFO healthcheckextension/healthcheckextension.go:40 Starting health_check extension {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check", "config": {"TypeVal":"health_check","NameVal":"health_check","Port":14269}}
2020-09-21T09:00:55.138Z INFO builder/extensions_builder.go:59 Extension started. {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-09-21T09:00:55.156Z INFO esclient/client.go:216 Elasticsearch detected {"component_kind": "exporter", "component_type": "jaeger_elasticsearch", "component_name": "jaeger_elasticsearch", "version": 6}
2020-09-21T09:00:55.254Z INFO builder/exporters_builder.go:308 Exporter is enabled. {"component_kind": "exporter", "exporter": "jaeger_elasticsearch"}
2020-09-21T09:00:55.254Z INFO service/service.go:324 Starting exporters...
2020-09-21T09:00:55.254Z INFO builder/exporters_builder.go:90 Exporter is starting... {"component_kind": "exporter", "component_type": "jaeger_elasticsearch", "component_name": "jaeger_elasticsearch"}
2020-09-21T09:00:55.254Z INFO builder/exporters_builder.go:95 Exporter started. {"component_kind": "exporter", "component_type": "jaeger_elasticsearch", "component_name": "jaeger_elasticsearch"}
2020-09-21T09:00:55.254Z INFO builder/pipelines_builder.go:207 Pipeline is enabled. {"pipeline_name": "", "pipeline_datatype": "traces"}
2020-09-21T09:00:55.254Z INFO service/service.go:337 Starting processors...
2020-09-21T09:00:55.254Z INFO builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "", "pipeline_datatype": "traces"}
2020-09-21T09:00:55.254Z INFO builder/pipelines_builder.go:61 Pipeline is started. {"pipeline_name": "", "pipeline_datatype": "traces"}
2020-09-21T09:00:55.261Z INFO builder/receivers_builder.go:235 Receiver is enabled. {"component_kind": "receiver", "component_type": "jaeger_kafka", "component_name": "jaeger_kafka", "datatype": "traces"}
2020-09-21T09:00:55.261Z INFO service/service.go:349 Starting receivers...
2020-09-21T09:00:55.261Z INFO builder/receivers_builder.go:70 Receiver is starting... {"component_kind": "receiver", "component_type": "jaeger_kafka", "component_name": "jaeger_kafka"}
2020-09-21T09:00:55.261Z INFO builder/receivers_builder.go:75 Receiver started. {"component_kind": "receiver", "component_type": "jaeger_kafka", "component_name": "jaeger_kafka"}
2020-09-21T09:00:55.261Z INFO healthcheck/handler.go:128 Health Check state change {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check", "status": "ready"}
2020-09-21T09:00:55.261Z INFO service/service.go:252 Everything is ready. Begin running and processing data.
2020-09-21T09:00:55.261Z INFO consumer/consumer.go:78 Starting main loop {"component_kind": "receiver", "component_type": "jaeger_kafka", "component_name": "jaeger_kafka"}
panic: non-positive interval for NewTicker

goroutine 135 [running]:
time.NewTicker(0x0, 0xc0000b5140)
time/tick.go:23 +0x151
github.com/bsm/sarama-cluster.(*Consumer).cmLoop(0xc000620000, 0xc0003af860)
github.com/bsm/sarama-cluster@v2.1.13+incompatible/consumer.go:458 +0x5a
github.com/bsm/sarama-cluster.(*loopTomb).Go.func1(0xc0005e2d40, 0xc0005e81a0)
github.com/bsm/sarama-cluster@v2.1.13+incompatible/util.go:73 +0x7b
created by github.com/bsm/sarama-cluster.(*loopTomb).Go
github.com/bsm/sarama-cluster@v2.1.13+incompatible/util.go:69 +0x66

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
// However the Jaeger OTEL module pulls in newer samara version (from OTEL collector)
// that does not set saramaConfig.Consumer.Offsets.CommitInterval to its default value 1s.
// then the samara-cluster fails if the default interval is not 1s.
saramaConfig.Consumer.Offsets.CommitInterval = time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check whether the value has already been set, for use in non-OTEL context? Or would it only ever be the default value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One second is always the default value we never override it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the whole kafka implementation will be replaced to OTEL one that does not use samara-cluster #2494

@joe-elliott
Copy link
Member

I had to fix this in my branch as well. I did it by registering receiver flags here:

https://github.com/jaegertracing/jaeger/pull/2494/files#diff-049d063f5338468c88d1c32ad60055efR114

This loaded receiver defaults which includes a non-zero value for this param. I intend to add the requested tests today if we just want to merge that.

@codecov
Copy link

codecov bot commented Sep 25, 2020

Codecov Report

Merging #2512 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2512   +/-   ##
=======================================
  Coverage   95.18%   95.18%           
=======================================
  Files         208      208           
  Lines        9158     9158           
=======================================
  Hits         8717     8717           
  Misses        365      365           
  Partials       76       76           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e1b869...31ab409. Read the comment docs.

@pavolloffay
Copy link
Member Author

https://github.com/jaegertracing/jaeger/pull/2494/files#diff-049d063f5338468c88d1c32ad60055efR114

This seems like a different problem, the commit interval is not exposed via flags.

@pavolloffay pavolloffay merged commit 9ea89d9 into jaegertracing:master Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants