Metrics fail to be exported: One or more TimeSeries could not be written: [...] Distribution |explicit_buckets.bounds| does not have at least one entry #11900

Closed
nioncode opened this issue Jul 25, 2024 · 3 comments
Labels
bug Something isn't working needs triage New issue that requires triage

Comments

@nioncode

Describe the bug

I'm not sure whether this is the correct repo for this bug report. Please let me know if I should post it somewhere else.

We are using auto instrumentation via the opentelemetry-javaagent (v1.33.4) and the auto exporter (exporter-auto-0.31.0-alpha-shaded). We instrument our app with the following command:

ENTRYPOINT ["java", \
  "-javaagent:opentelemetry-javaagent.jar", \
  "-Dotel.javaagent.extensions=exporter-auto-0.31.0-alpha-shaded.jar", \
  "-Dotel.traces.exporter=google_cloud_trace", \
  "-Dotel.metrics.exporter=google_cloud_monitoring", \
  "-Dotel.logs.exporter=none", \
  "-Dotel.service.name=my-service", \
  "-Dotel.javaagent.logging=application", \
  "-Dorg.jboss.logging.provider=slf4j", \
  "-Dlogback.configurationFile=/app/logback.xml", \
  "-cp", \
  "app.jar", \
  "com.example.app.Main" \
]

When deploying our app to GCP, it seems to collect metrics correctly (at least some statistics show up for http.server.duration in Google Cloud Monitoring), but every minute our app prints a warning that it failed to write some TimeSeries (see below for the exact stack trace).

How can we find out which metric could not be written (since at least the http.server.duration seems to work fine) and/or how can we avoid the warning?

Steps to reproduce

Create a Java app, instrument it with the agent and the auto exporter, and deploy it to GCP. Wait a couple of minutes until the app has started, then check its logs.

Expected behavior

There should be no warnings.

Actual behavior

We get warnings in the logs that some TimeSeries cannot be exported:

Exporter threw an Exception
shadow.com.google.api.gax.rpc.InvalidArgumentException: shadow.io.grpc.StatusRuntimeException: INVALID_ARGUMENT: One or more TimeSeries could not be written: Field timeSeries[7].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| does not have at least one entry.
    at shadow.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
    at shadow.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:98)
    at shadow.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
    at shadow.com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
    at shadow.com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:84)
    at shadow.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1127)
    at shadow.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at shadow.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1286)
    at shadow.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1055)
    at shadow.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:807)
    at shadow.io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568)
    at shadow.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538)
    at shadow.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at shadow.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at shadow.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at shadow.com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:570)
    at shadow.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
    at shadow.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
    at shadow.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
    at shadow.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
    at shadow.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at shadow.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
caused by: shadow.io.grpc.StatusRuntimeException: INVALID_ARGUMENT: One or more TimeSeries could not be written: Field timeSeries[7].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| does not have at least one entry.
    at shadow.io.grpc.Status.asRuntimeException(Status.java:537)
    ... 14 common frames elided
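
The message itself carries one clue: the field path `timeSeries[7].points[0]` presumably gives the position of the rejected series and point within the write request. As a minimal sketch, assuming the message format shown above stays stable, the indices can be pulled out of the exception text with a regular expression. The class and method names here are hypothetical and not part of any OpenTelemetry or Google library.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the series/point indices from the
// INVALID_ARGUMENT message quoted in the log above.
class RejectedSeriesLocator {
    // Matches a fragment such as "timeSeries[7].points[0]".
    // Assumption: the backend keeps using this field-path format.
    private static final Pattern FIELD_PATH =
        Pattern.compile("timeSeries\\[(\\d{1,9})\\]\\.points\\[(\\d{1,9})\\]");

    /** Returns {seriesIndex, pointIndex}, or empty if no field path is found. */
    static Optional<int[]> locate(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = FIELD_PATH.matcher(message);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2))
        });
    }

    public static void main(String[] args) {
        String msg = "INVALID_ARGUMENT: One or more TimeSeries could not be written: "
            + "Field timeSeries[7].points[0].distributionValue had an invalid value: "
            + "Distribution |explicit_buckets.bounds| does not have at least one entry.";
        locate(msg).ifPresent(idx ->
            System.out.println("series index " + idx[0] + ", point index " + idx[1]));
    }
}
```

The index alone does not name the metric. It only becomes useful if the contents of the failing batch can be inspected and matched against it.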

Javaagent or library instrumentation version

v1.33.4

Environment

JDK: Temurin 17
OS: Ubuntu 24.04
Undertow 2.3.12

Additional context

No response

@nioncode nioncode added bug Something isn't working needs triage New issue that requires triage labels Jul 25, 2024
@laurit
Contributor

laurit commented Jul 25, 2024

@nioncode this is not the correct repository. Report it to Google, or whoever is the author of that exporter.

@nioncode
Author

Thank you for the fast reply! I also posted this on open-telemetry/opentelemetry-collector-contrib#34250 now.

I was just wondering whether this is actually an issue with the exporter or whether something along the chain is not configuring the metric correctly. Since most metrics work fine, I'm confused about why some others don't and how I would find out where the error comes from. Do you have any suggestions for how I could find the root cause?

@laurit
Contributor

laurit commented Jul 25, 2024

Just follow GoogleCloudPlatform/opentelemetry-operations-java#359. The Google engineers who built the exporter have the best chance of figuring out why this happens.

@laurit laurit closed this as completed Jul 25, 2024