Skip to content
This repository has been archived by the owner on Dec 23, 2023. It is now read-only.

StackdriverTraceExporter exceptions with "Panic! This is a bug!" #1963

Closed
blt-gw opened this issue Aug 18, 2019 · 4 comments
Closed

StackdriverTraceExporter exceptions with "Panic! This is a bug!" #1963

blt-gw opened this issue Aug 18, 2019 · 4 comments

Comments

@blt-gw
Copy link

blt-gw commented Aug 18, 2019

What version of OpenCensus are you using?

0.23.0

What JVM are you using (java -version)?

 > java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)

What did you do?

If possible, provide a recipe for reproducing the error.

I've created a small repository to demonstrate the issue, here. From the root of the repository run bazel run //:echo. You will see something like:

> bazel run //:echo
INFO: Analyzed target //:echo (1 packages loaded, 2 targets configured).
INFO: Found 1 target...
Target //:echo up-to-date:
  bazel-bin/echo.jar
  bazel-bin/echo
INFO: Elapsed time: 0.404s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
Aug 18, 2019 12:05:45 AM com.google.auth.oauth2.DefaultCredentialsProvider warnAboutProblematicCredentials
WARNING: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/.
Aug 18, 2019 12:05:48 AM io.opencensus.implcore.trace.RecordEventsSpanImpl finalize
SEVERE: Span Sent.com.google.api.client.http.HttpRequest.execute is GC'ed without being ended.

if you have existing Google Credentials (please be aware that this may write traces into your account). Note the last line:

SEVERE: Span Sent.com.google.api.client.http.HttpRequest.execute is GC'ed without being ended.

To trigger the issue either wait for enough traces to buffer or send traffic to force the issue. I do the following:

>  ghz --insecure --proto=proto/echo.proto --call=com.example.echo.service.Echo.Echo localhost:8080

Running ghz from the root of the echo project.

What did you expect to see?

I expected to see traces in the Stackdriver Trace console.

What did you see instead?

Aug 18, 2019 12:07:21 AM io.opencensus.implcore.trace.export.SpanExporterImpl$Worker onBatchExport
WARNING: Exception thrown by the service export io.opencensus.exporter.trace.stackdriver.StackdriverTraceExporter
com.google.api.gax.rpc.InternalException: io.grpc.StatusRuntimeException: INTERNAL: Panic! This is a bug!
	at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:67)
	at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
	at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
	at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
	at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1052)
	at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:398)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1030)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:871)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:716)
	at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:507)
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:482)
	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:510)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:630)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:518)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:692)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:681)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
		at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
		at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
		at com.google.cloud.trace.v2.TraceServiceClient.batchWriteSpans(TraceServiceClient.java:234)
		at com.google.cloud.trace.v2.TraceServiceClient.batchWriteSpans(TraceServiceClient.java:182)
		at io.opencensus.exporter.trace.stackdriver.StackdriverV2ExporterHandler.export(StackdriverV2ExporterHandler.java:444)
		at io.opencensus.implcore.trace.export.SpanExporterImpl$Worker.onBatchExport(SpanExporterImpl.java:248)
		at io.opencensus.implcore.trace.export.SpanExporterImpl$Worker.exportBatches(SpanExporterImpl.java:350)
		at io.opencensus.implcore.trace.export.SpanExporterImpl$Worker.run(SpanExporterImpl.java:290)
		... 1 more
Caused by: io.grpc.StatusRuntimeException: INTERNAL: Panic! This is a bug!
	at io.grpc.Status.asRuntimeException(Status.java:533)
	... 23 more
Caused by: java.lang.AbstractMethodError: Receiver class io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder$NettyTransportFactory does not define or inherit an implementation of the resolved method abstract newClientTransport(Ljava/net/SocketAddress;Lio/grpc/internal/ClientTransportFactory$ClientTransportOptions;Lio/grpc/ChannelLogger;)Lio/grpc/internal/ConnectionClientTransport; of interface io.grpc.internal.ClientTransportFactory.
	at io.grpc.internal.CallCredentialsApplyingTransportFactory.newClientTransport(CallCredentialsApplyingTransportFactory.java:49)
	at io.grpc.internal.InternalSubchannel.startNewTransport(InternalSubchannel.java:246)
	at io.grpc.internal.InternalSubchannel.access$400(InternalSubchannel.java:65)
	at io.grpc.internal.InternalSubchannel$2.run(InternalSubchannel.java:196)
	at io.grpc.SynchronizationContext.drain(SynchronizationContext.java:95)
	at io.grpc.SynchronizationContext.execute(SynchronizationContext.java:127)
	at io.grpc.internal.ManagedChannelImpl$NameResolverListener.onResult(ManagedChannelImpl.java:1365)
	at io.grpc.internal.DnsNameResolver$Resolve.resolveInternal(DnsNameResolver.java:311)
	at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:213)
	... 3 more

Additional context

I'm unsure if this issue is related to the one I present in #1962. I ran into this issue trying to build a small reproducible case but hit this instead.

@songy23
Copy link
Contributor

songy23 commented Aug 19, 2019

There seem to be two issues here:

  1. Span Sent.com.google.api.client.http.HttpRequest.execute is not ended correctly. This is usually because span.end() is not called before program exits. This is being tracked in com.google.api.client.http.HttpRequest.execute doesn't close Span googleapis/google-http-java-client#788.
  2. The gRPC AbstractMethodError due to shaded netty artifact. My guess is it has something to do with the dependencies in Bazel build. Bazel is not good at resolving shading and transitive dependencies, and if you have incompatible Google Cloud client libraries and gRPC versions you may get this kind of error. Can you paste your Bazel WORKSPACE file here? Also /cc @zhangkun83 @ejona86 any suggestions here?

@ejona86
Copy link

ejona86 commented Aug 19, 2019

The AbstractMethodError looks to be because grpc is getting included twice in the build: once via http_archive()/java_grpc_library() and another time via maven_install() (via io.opencensus:opencensus-exporter-*-stackdriver). Since maven_install is the first time there's been transitive dep tracking in Bazel, I've not really thought about this case at all. It seems overriding targets to point to their Bazel equivalents seems the appropriate solution. gRPC could maybe maintain a mapping that could be added to the override_targets.

I tried to reproduce but couldn't. bazel run //:echo ran fine for me. That's not surprising, given it is dependent on the classloading order and a few things that are "random" (not strictly speaking, but in practice). But something like this should fix the issue:

maven_install(
    ...
    override_targets = {
        "io.grpc:grpc-alts": "@io_grpc_grpc_java//alts",
        "io.grpc:grpc-auth": "@io_grpc_grpc_java//auth",
        "io.grpc:grpc-context": "@io_grpc_grpc_java//context",
        "io.grpc:grpc-core": "@io_grpc_grpc_java//core",
        "io.grpc:grpc-grpclb": "@io_grpc_grpc_java//grpclb",
        "io.grpc:grpc-netty-shaded": "@io_grpc_grpc_java//netty",
        "io.grpc:grpc-protobuf": "@io_grpc_grpc_java//protobuf",
        "io.grpc:grpc-protobuf-lite": "@io_grpc_grpc_java//protobuf-lite",
        "io.grpc:grpc-stub": "@io_grpc_grpc_java//stub",
    },
    ...
)

Note that //netty isn't fully equivalent to grpc-netty-shaded. They have different package names (which should generally be okay; libraries should not refer to any of grpc-netty/grpc-netty-shaded's APIs because they are unstable) but also grpc-netty-shaded includes netty tcnative. So I also needed to add @io_netty_netty_tcnative_boringssl_static//jar to the runtime_deps of //:echo. Note that isn't required in all cases, like if using Java 9+.

@blt-gw
Copy link
Author

blt-gw commented Aug 20, 2019

@ejona86 your override_targets does appear to work on my system and the target deployment system as well. Thanks very much.

ejona86 added a commit to ejona86/grpc-java that referenced this issue Aug 21, 2019
This can be used to prevent duplicate classes in the classpath, one via
Maven and one via Bazel-native.

See census-instrumentation/opencensus-java#1963 and grpc#5359
ejona86 added a commit to grpc/grpc-java that referenced this issue Aug 22, 2019
This can be used to prevent duplicate classes in the classpath, one via
Maven and one via Bazel-native.

See census-instrumentation/opencensus-java#1963 and #5359
@songy23
Copy link
Contributor

songy23 commented Aug 22, 2019

Fix has been made to upstream. Mark this issue as resolved for now.

@songy23 songy23 closed this as completed Aug 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants