Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cmd/telemetrygen] Duration flag is not respected when tls handshake fails #31401

Closed
AlexDCraig opened this issue Feb 23, 2024 · 6 comments · Fixed by #31631
Closed

[cmd/telemetrygen] Duration flag is not respected when tls handshake fails #31401

AlexDCraig opened this issue Feb 23, 2024 · 6 comments · Fixed by #31631
Labels
bug Something isn't working cmd/telemetrygen telemetrygen command

Comments

@AlexDCraig
Copy link
Contributor

Component(s)

cmd/telemetrygen

What happened?

Description

If you use the duration flag but point to an endpoint that will fail tls handshake failure, the command will run indefinitely and not respect the duration flag.

Steps to Reproduce

telemetrygen metrics \
--otlp-endpoint [my-favorite-otel-endpoint]:443 \
--duration 5s

Expected Result

The command exits in failure after 5s

Actual Result

Command runs forever

Collector version

v0.95.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@AlexDCraig AlexDCraig added bug Something isn't working needs triage New item requiring triage labels Feb 23, 2024
@github-actions github-actions bot added the cmd/telemetrygen telemetrygen command label Feb 23, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Copy link
Member

I was able to reproduce using both logs and traces, but logs fails right away when it can't connect.

I believe this is because the GRPC connections for metrics and traces are configured to block until the connection is successful. The documentation for the WithBlock method explicitly states using this feature is not recommended. This functionality was added here: #27007.

Since this is a tool used for testing, it may be best to simply fail when a connection cannot be made, rather than blocking until it's successful. Unless the code owners disagree, I think the WithBlock option can be removed.

@crobert-1 crobert-1 removed the needs triage New item requiring triage label Feb 28, 2024
@mx-psi
Copy link
Member

mx-psi commented Feb 29, 2024

@crobert-1 I think the option was already there before the PR you mention, it was just moved to the current file from a different place. I am fine removing WithBlock but would be good to dig where and why we added the option in the first place.

@crobert-1
Copy link
Member

Good catch @mx-psi, I missed that. It looks like WithBlock was added in metrics some time ago, and was re-using a lot of the traces logic. WithBlock was added to traces as a part of moving to telemetrygen from tracegen. WithBlock was added to tracegen with the initial commit, but I'm not seeing any comments related to why the option was originally added.

It looks like tracegen was brought over from the same concept within Jaeger, but the Jaeger implementation only added GRPC exporting less than a year ago, and they don't use WithBlock, so it looks like this was original to the collector's tracegen functionality.

@jpkrohling Do you happen to remember why WithBlock was added for tracegen's grpc trace exporting functionality?

@jpkrohling
Copy link
Member

It's been a while... I don't remember the explicit reason, and I'd be fine removing as well.

@AlexDCraig
Copy link
Contributor Author

Thanks folks! I may put in a PR to remove that block if I free up. I use this tool for testing and added some logic to force quit the command if it hits x seconds which I'd like to remove and just rely on that duration flag

dmitryax pushed a commit that referenced this issue Mar 8, 2024
…31631)

**Description:**

Metrics and traces in telemetrygen were sent with WithBlock() which was
a blocking call waiting on a successful connection to be formed. In the
event a connection could not succeed, like a TLS handshake failure, the
block would loop indefinitely.

**Link to tracking Issue:** Resolves
#31401

**Testing:**

- Run telemetrygen on main pointed at a host that won't resolve TLS and
observe that the command loops indefinitely
- Run telemetrygen on this commit pointed at that same host and observe
it exit after failing

**Documentation:** <Describe the documentation added.>
DougManton pushed a commit to DougManton/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024
…pen-telemetry#31631)

**Description:**

Metrics and traces in telemetrygen were sent with WithBlock() which was
a blocking call waiting on a successful connection to be formed. In the
event a connection could not succeed, like a TLS handshake failure, the
block would loop indefinitely.

**Link to tracking Issue:** Resolves
open-telemetry#31401

**Testing:**

- Run telemetrygen on main pointed at a host that won't resolve TLS and
observe that the command loops indefinitely
- Run telemetrygen on this commit pointed at that same host and observe
it exit after failing

**Documentation:** <Describe the documentation added.>
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024
…pen-telemetry#31631)

**Description:**

Metrics and traces in telemetrygen were sent with WithBlock() which was
a blocking call waiting on a successful connection to be formed. In the
event a connection could not succeed, like a TLS handshake failure, the
block would loop indefinitely.

**Link to tracking Issue:** Resolves
open-telemetry#31401

**Testing:**

- Run telemetrygen on main pointed at a host that won't resolve TLS and
observe that the command loops indefinitely
- Run telemetrygen on this commit pointed at that same host and observe
it exit after failing

**Documentation:** <Describe the documentation added.>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cmd/telemetrygen telemetrygen command
Projects
None yet
4 participants