-
Notifications
You must be signed in to change notification settings - Fork 829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Addresses opencensus-shim trace issues under otel javaagent #4900
Addresses opencensus-shim trace issues under otel javaagent #4900
Conversation
please don't include commits from other streams of work. And, it looks like the build is failing because the code has not been run through the formatter. |
1977b7c
to
a3e2fde
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #4900 +/- ##
============================================
+ Coverage 90.91% 90.93% +0.01%
- Complexity 4884 4902 +18
============================================
Files 550 551 +1
Lines 14476 14495 +19
Branches 1370 1370
============================================
+ Hits 13161 13181 +20
+ Misses 917 915 -2
- Partials 398 399 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Sorry about that -- I was based on a tag bc my local gradle was struggling. Should be better now? |
@jsuereth as our local census guru, can you take a look? |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@trask I wasn't involved in the initial discussion about this, and I don't have any OC expertise. Not sure how we should proceed on this. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Bump. I've moved back onto work that could really use this. It's but one of the alternatives as I've pointed out, but it's really dealer's choice: a) make this a central change (this PR) allowing it to coexist up front with the behavior of otel javaagent and its requirement for the specific type of Span or b) instrument it such as with open-telemetry/opentelemetry-java-instrumentation#6875. a) cleaner but less clear without inspection (+ no out of box tests without a "no op" integration that simply tests it); b) arguably messier (and just more code) but explicit and has the benefit of tests by virtue of being an instrumentation to tell us it no longer works. |
I'm having trouble getting people who are knowledgeable about this issue to review it. Still pinging @trask about it. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this solves the opencensus-javaagent bridging issue I have no objection to it.
if you want to add tests on the javaagent side (but no actual instrumentation), there's precedent for that, e.g. see https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/cdi-testing
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package io.opentelemetry.api.trace; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the right package for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. No, this package is not correct as it is already used in the API itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I move it over to the opencensusshim package or somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, please
*/ | ||
// todo make this unnecessary | ||
@SuppressWarnings("UngroupedOverloads") | ||
public interface ProxyingSpan extends Span { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might be clearer as an abstract class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but in this case it can't be as written: the opencensus shim code extends io.opencensus.trace.Span
which is itself an abstract class. I suppose there's other ways of working that out, but as written it's not as easy as just changing it out.
Want me to rework this into an abstract class all the same? It may involve changes upstream though just given how it gets used -- hard to say without digging in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this work?
abstract class ProxyingSpan extends io.opencensus.trace.Span implements io.opentelemetry.api.trace.Span
also, does ProxyingSpan
need to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point: in this case, no I don't think it needs to be public. (See original thinking below)
That should work, yes. I'll make the change.
My original thinking on this was that ProxyingSpan
could be used by the opentracing setup as well -- necessitating keeping it as decoupled from anything in opencensus -- if only to ensure all shims were correctly proxying the calls to the various otel Span, etc. methods to their agent-minted ApplicationSpan
objects. But, stepping back that seems 1) ahead of myself and 2) acting without evidence that opentracing has a similar issue (I suspect it might, but it can also be fixed later).
So at this point I'm happy with either approach. But do lmk if you concur with my original thinking and think this should be pulled out and made usable for any other shims.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spoke too soon: there's a conflict between the end()
methods of the OC and otel Span
types: OC's is marked final giving us no way to intercept that call for the otel side without plugging in the final proxying method, end()
into the implementation... Bit of a betrayal of the intent I had in mind, I fear, and muddies the implementation details by blurring the responsibility.
Do you still think the abstract class approach is worth it? I'm not so sure anymore and would personally want to keep the types decoupled, but I'm a newcomer to this project and am happy to defer.
OK that push with a ton of extra commits was my bad. Did not pay attention. Cleaning this up now. |
aec6428
to
d20e865
Compare
@trask thanks for the suggestion 👍 ; is open-telemetry/opentelemetry-java-instrumentation#7488 like what you had in mind with adding test-only to the otel-javaagent repo? |
import io.opentelemetry.context.Scope; | ||
import org.junit.jupiter.api.Test; | ||
|
||
// todo remove if/once the underlying issue is agreed upon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this todo
intended to be merged? If so, then we need to have a GH issue created to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the reminder; no, it is not intended to be merged. With open-telemetry/opentelemetry-java-instrumentation#7488 in play, this class is solved elsewhere and I can remove it. I'm going to spend a little time adding to any existing tests in both PRs to ensure better overall coverage of cases I've either uncovered or can conceive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is however, this todo, which exists as a result of comments I made in the original issue. Summarized as a question: "can and should the javaagent be changed to not require the ApplicationSpan
type internally and thereby relieve shim/others implementations of needing to proxy calls to the specific javaagent-minted Span
s?"
Not a small body of work, to be sure. But if the policy is one-issue-per-todo, wanted to either a) ensure adequate coverage or b) decide it's not worth tracking and remove it. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this test doesn't seem to have any assertions in it, could we beef it up to assert what is expected, at least? I'm really not familiar with this issue much at all... I just want to make sure we don't leave un-tracked work left to be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfectly understandable. This test was just a stand-in to verify some assumptions before we introduced the dedicated integration tests PR. I've now removed it in favor of a unit test which ensures the methods are all appropriately proxied.
Before we would want to merge this, we would need to make sure that we have test coverage on uncovered lines. |
d20e865
to
460495b
Compare
Looks like that last round of tests filled out the coverage, and I've rebased against main. |
opencensus-shim/src/main/java/io/opentelemetry/opencensusshim/ProxyingSpan.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems ok to me; I'm not all that familiar with OC, though.
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Bump |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
opencensus-shim/src/main/java/io/opentelemetry/opencensusshim/ProxyingSpan.java
Outdated
Show resolved
Hide resolved
opencensus-shim/src/main/java/io/opentelemetry/opencensusshim/ProxyingSpan.java
Outdated
Show resolved
Hide resolved
opencensus-shim/src/test/java/io/opentelemetry/opencensusshim/ProxyingSpanTest.java
Outdated
Show resolved
Hide resolved
326f9d2
to
b4290da
Compare
Rebased on main now that it looks like I was able to get to all the comments. |
Follows suggestion from open-telemetry/opentelemetry-java-instrumentation#6875 (comment) and related issue to address OC shim's breakage under otel javaagent utilization.
This approach pushes the concern of delegating calls to the underlying Span upstream rather than instrument the hidden (package-private classes/non-public api) classes. Based on my analysis of the problem and understanding of the shim, the breakage is rooted in the "incomplete" (lack of a better term) proxying of calls to the
io.opentelemetry.api.trace.Span
api by theOpentelemetrySpanImpl
class. In short, the javaagent requires the instance it minted,AgentSpan
, be used in all context-related interactions so the javaagent can bridge the context and changes across the javaagent/application boundary, and the current state of opencensus-shim doesn't do that universally and breaks down.The integration test here is just to prove the case: revert the change on the
OpentelemetrySpanImpl
class and run it and the logger shows the breakage. (The test is also copy-paste from the original PR in the javaagent). I'm not especially sure what's the correct way to handle this: upstream (opentelemetry-java) or instrumentation (opentelemetry-java-instrumentation), as I mention in the source issue.Anyway, appreciate the attention and patience to this issue. Just wanted to put some art to illustrate the preferred alternative to my original PR.
Builds based on
main
weren't working due to jcenter issues (I think?), so I based on the latest release.