Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V1.26.0 #6

Merged
merged 177 commits into from
May 26, 2023
Merged

V1.26.0 #6

merged 177 commits into from
May 26, 2023

Conversation

lrwh
Copy link
Collaborator

@lrwh lrwh commented May 26, 2023

merge otel V1.26.0

add instrumentation

#1
#2
#3
#4
#5

新增guance change log

Mateusz Rzeszutek and others added 30 commits April 3, 2023 09:13
I applied [this
comment](open-telemetry#8131 (comment))
to the whole codebase and removed some `Optional`s that were used in the
hot path
this is a follow-up to open-telemetry#7043

I tried to add a test when that PR was opened:

*
trask@d2c6399

but it doesn't really verify anything, since the NPE is throw/caught and
same behavior occurs
…ndler chain (open-telemetry#8160)

the current implementation of Start and End around the invocation of a
Jax WS is asymmetric around the JAX-WS Handler Chain.

Current behavior:
(execution of incoming MessageHandlers) -> (TracingStartInInterceptor)
-> (WebService Invocation) -> (execution of outgoing MessageHandlers) ->
(TracingEndInInterceptor)

if I understood the code of this cxf instrumentation correctly, the
intent was to build the span close around the WebService Invocation
(without Handler Chains).

So the desired behavior would look like this:
(execution of incoming MessageHandlers) -> (TracingStartInInterceptor)
-> (WebService Invocation) -> (TracingEndInInterceptor) -> (execution of
outgoing MessageHandlers)

Unfortunately CXF is calling the Outgoing Chain inside the POST_INVOKE
Phase of Cxf (so the outgoing chain is technically a sub-chain in the
incoming chain... which is documented but quite surprising...).

So the solution in the fix at least guarantees the the outgoing chain is
invoked AFTER end of tracing. For any extra Interceptors in the
POST_INVOKE Phase there is still no guarantee of ordering, but I think
this is not a opentelemetry issue but a design-flaw of CXF...

---------

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Co-authored-by: Lauri Tulmin <ltulmin@splunk.com>
the auto-update will be triggered from the doc repo after
open-telemetry/opentelemetry.io#2568
Resolves
open-telemetry#8069
The javaagent instrumentation also supports propagating context into
[BodyHandler](https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpResponse.BodyHandler.html)
implemented in
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/java-http-client/javaagent/src/main/java/io/opentelemetry/javaagent/instrumentation/httpclient/BodyHandlerWrapper.java
I think the initial idea behind it was that this allowed propagating
context into callbacks. Because this didn't work for
`connectionErrorUnopenedPortWithCallback` test later we also added
wrapping completable future to take care of propagating context into
callbacks. Should I also implement context propagation for `BodyHandler`
in library instrumentation or should I just delete it? I guess it could
come handy if someone builds a custom `BodyHandler` and wants to emit
spans from there, though this doesn't feel too likely. I'd like deleting
it more.

---------

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Reverts
open-telemetry#8183
Perhaps the problem is that the gc happens before reading jfr events is
started.
)

Based on open-telemetry#8190 and
open-telemetry#8131 (comment)

---------

Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>
Related discussion open-telemetry#7257
Resolves open-telemetry#3413
Resolves open-telemetry#5059
Resolves open-telemetry#6258
Resolves open-telemetry#7179

Adds a logging implementation that'll collect agent logs in memory until
slf4j is detected in the instrumented application; and when that happens
will dump all the logs into the application slf4j and log directly to
the application logger from that time on.

It's still in a POC state, unfortunately: while it works fine with an
app that uses & initializes slf4j directly, Spring Boot applications
actually reconfigure the logging implementation (e.g. logback) a while
after slf4j is loaded; which causes all the startup agent logs (debug
included) to be dumped with the default logback pattern.

Future work:
* ~~Make sure all logs produces by the agent are sent to loggers named
`io.opentelemetry...`
(open-telemetry#7446
DONE
* Make this work on Spring Boot
* Documentation
* Smoke test?
Resolves open-telemetry#7437.

A few caveats about this. The TL;DR on open-telemetry#7437 is that a non-containerized
process was reporting a `container.id` attribute. The submitter narrowed
it down and I was able to confirm with the test case in this PR.

I hunted for other means for code to determine if it's containerized
with the idea to not even do the parsing if not containerized, but I
couldn't find anything useful. In fact, most approaches of detecting
containerization at all do involve parsing cgroups. Wacky.

So I attempted to verify that container IDs should always be 64
characters. I found:
* podman - docs
[here](https://docs.podman.io/en/latest/markdown/podman-container-inspect.1.html)
"Container ID (full 64-char hash)"
* docker - UID generator source
[here](https://github.com/moby/moby/blob/634a848b8e3bdd8aed834559f3b2e0dfc7f5ae3a/pkg/stringid/stringid.go#L36)
shows 32 bytes (and even guards against fully numeric!)
* lxc [man page
](https://linuxcontainers.org/lxc/manpages/man1/lxc-info.1.html)says
"container identifier format is an alphanumeric string". If this maps
into cgroups (no idea!), it would have already been broken in some cases
because we enforce hex.

I'm a little concerned about this approach because the [otel
spec](https://github.com/open-telemetry/opentelemetry-specification/blob/94c9c75c4f82fbda08bddff02ce9a0c582fdd55c/specification/resource/semantic_conventions/container.md)
suggests that "The UUID might be abbreviated.", but it's
unclear/non-specific about the circumstances that might cause this.

Open to hearing about why the approach presented here is a bad idea. 🙃
Thanks @breedx-splk for all of your efforts in this repository reviewing
PRs!
…lemetry#8208)

Currently server span end time verification is only implemented for
groovy tests.
Currently we run spotless and other checks for each of the parallel test
steps which seems wasteful. Here is an attempt to run only the tests in
given partition without any extra checks in the `test` step and run all
the checks in the `build` step.
I think that the only way zipslip could happen is when name contains
`..` but codeql isn't able to cope with that. Removing the `..` check
gets rid of the code scanning alert.
This pr gives classes defined in agent and extension class loaders all
permissions. Injected helper classes are also defined with all
permissions. Agent startup is altered so that we won't call methods that
require permission before we are able to get those permissions.
This pr does not attempt to address issues where agent code could allow
user code to circumvent security manager e.g.
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/javaagent-bootstrap/src/main/java/io/opentelemetry/javaagent/bootstrap/InstrumentationHolder.java
gives access to `Instrumentation` that could be used to redefine classes
and remove security checks. Also this pr does not address failed
permission checks that could arise from user code calling agent code.
When user code, that does not have privileges, calls agent code, that
has the privileges, and agent code performs a sensitive operation then
permission check would fail because it is performed for all calling
classes, including the user classes. To fix this agent code should uses
`AccessController.doPrivileged` which basically means that, hey I have
done all the checks, run this call with my privileges and ignore the
privileges of my callers.
Resolves
open-telemetry#8143
Resolves
open-telemetry#6081
Resolves
open-telemetry#5137
Using the same approach as in
open-telemetry#6243
and as used by DataDog. Unlike in open-telemetry#6243 this pr does not attempt to
prevent leaking scopes into actors but rather instruments the actor to
reset context to get rid of the leaked scopes (DataDog does the same).
This `package.json` contains outdated dependences, it is only used in a
test.

---------

Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
opentelemetrybot and others added 24 commits May 6, 2023 22:55
Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Co-authored-by: Lauri Tulmin <ltulmin@splunk.com>
…/examples/distro (open-telemetry#8445)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…/examples/extension (open-telemetry#8444)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…en-telemetry#8443)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: opentelemetrybot <107717825+opentelemetrybot@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
….13.1 to 3.13.2 (open-telemetry#8461)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lauri Tulmin <ltulmin@splunk.com>
Co-authored-by: Fabrizio Ferri-Benedetti <fferribenedetti@splunk.com>
Co-authored-by: Mateusz Rzeszutek <mrzeszutek@splunk.com>
@guance-review-bot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lrwh

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@CLAassistant
Copy link

CLAassistant commented May 26, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 14 committers have signed the CLA.

❌ siyuniu-ms
❌ sfriberg
❌ heyams
❌ laurit
❌ Mateusz Rzeszutek
❌ ericsyh
❌ shuwpan
❌ rahuldimri
❌ trask
❌ dependabot[bot]
❌ opentelemetrybot
❌ kaibocai7
❌ niteshs7
❌ dmarkwat


Mateusz Rzeszutek seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@lrwh lrwh merged commit 843d8ae into guance May 26, 2023
lrwh pushed a commit that referenced this pull request May 29, 2023
lrwh pushed a commit that referenced this pull request May 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.