Skip to content

Conversation

@jinjagit
Copy link
Member

@jinjagit jinjagit commented Mar 17, 2021

commit msg:

"First attempt at abstracting some of the verbosity of Tracer.set_attributes to a module def, rather than repeating similar code in each definition we wish to observe.

Currently only setting tracer spans in 2 resolver defs, and only for votes (1 of 2 defs) and create_vote (1 of 3 defs).
Not sure we need to also observe the related core api defs, as the data returned is very similar, and the resolver defs 'wrap' the related core api defs."

UPDATE: This PR now focuses on extending the span attributes & improving the span name. The refactoring mentioned in this description, is not included.

First attempt at abstracting some of the verbosity of
Tracer.set_attributes to a module def, rather than repeating
similar code in each def we wish to observe.

Currently only setting tracer spans in 2 resolver defs,
and only for votes (1 of 2 defs) and create_vote (1 of 3 defs).
Not sure we need to also observe the related core api defs, as
the data returned is very similar, and the resolver defs 'wrap'
the related core api defs.
@jinjagit
Copy link
Member Author

jinjagit commented Mar 17, 2021

It'd be great to get some feedback on this idea / approach. Some questions that cross my mind:

  1. Should such a generic approach have its own module (here, I created LV/tracers module), or better to add defs to application.ex, for example?
  2. Better to unpack the vars field into constituent key-value pairs? We can still filter by key or value on Honeycomb without doing this.
  3. Should we ignore (not observe) sensitive vars (e.g. emails)? Or do we consider Honeycomb (and communication with it) secure?
  4. Any way to abstract to a higher level the Tracer.with_span "span_name" do ... end block that I am using to wrap whatever code we wish to observe? Conceptually, this seems a bit tricky, since the block wraps the content in specific defs we wish to pull def-specific (and request_specific) data from.

@jinjagit
Copy link
Member Author

jinjagit commented Mar 17, 2021

I'm also starting to think that we should also be observing the :ok or :error result from the resolver defs that include these.

@jinjagit jinjagit requested a review from oliverbarnes March 17, 2021 16:17
@oliverbarnes
Copy link
Member

It's hard to tell what the PR is about from the title, and it's the second time we have this one. Would you please make it more descriptive? See how I edited this one (last update, towards the bottom): #211

@jinjagit jinjagit changed the title Add opentelemetry Set general tracer span attributes in separate module. Mar 17, 2021
@jinjagit
Copy link
Member Author

It's hard to tell what the PR is about from the title, and it's the second time we have this one. Would you please make it more descriptive? See how I edited this one (last update, towards the bottom): #211

Yes, sorry. I didn't check the title, and assumed it reflected the commit, forgetting there are 2 commits here. Changed now.

@oliverbarnes
Copy link
Member

much better 👏 thanks :)

@davefrey
Copy link
Member

As a noob reading this PR: I like extracting the tracer details so the functions read better 👍.

But I'm also trusting some magic in how the attributes are passed and assigned. How do we know this works? Is it reasonable to have a spec that asserts the trace behavior?

This is partly a refactor (extraction) and partly extension (more info captured if I read correctly: the whole of binding vars rather than just org id). I don't think it's worth redoing here but often it's nice to isolate the two efforts.

It's really nice to learn by reading PRs here.

@davefrey
Copy link
Member

To elaborate a bit: my reviewer brain reading refactors wants to know there is spec coverage that stays green; reading new functionality I expect new specs too. I don't know Elixir testing well but there should be a way to assert behavior like "when voting expect the tracer service to have been called with ..."

@jinjagit
Copy link
Member Author

But I'm also trusting some magic in how the attributes are passed and assigned. How do we know this works? Is it reasonable to have a spec that asserts the trace behavior?

Good point. There's no specs yet. I think it should be possible to include tests, and I could probably model them on some I have seen in OpenTelemetry repos. I am still wondering if this is even the right approach, but since it does seem to be yielding useful traces (at least better than none), maybe it is time to start developing some.

This is partly a refactor (extraction) and partly extension ... I don't think it's worth redoing here but often it's nice to isolate the two efforts.

True. I should really have separated the 2 different types of change. I think, at the moment, the new observability related code is so minimal that this doesn't really present a problem in terms of understanding or separating (should we wish to, say, remove 1 type of change and leave the other). Point taken, though.

Thanks for the feedback 👍

@jinjagit
Copy link
Member Author

jinjagit commented Mar 18, 2021

I've just noticed that when we include the related core api def(s), rather than just the resolver def(s),in the trace (by also wrapping in a tracer span), we can get this nice waterfall view on Honeycomb:

waterfall

I removed the related core-api traces in this PR, but now I can see how including them, and more importantly wrapping tracer spans around things like db queries / writes / etc. within the core api actions, could give us more granular info.

[Also added to #201]

# Sets general attributes to send in Tracer.with_span block.
def set_attributes(env, vars) do
Tracer.set_attributes([
{:action, env.function},
Copy link
Member Author

@jinjagit jinjagit Mar 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be more useful to name this explicitly as :function. We could then develop this to include the module name + function name, and then name the field something like :mod/func.

@oliverbarnes
Copy link
Member

Though I see the tracer code verbosity, I don't really see much value in abstracting it away if we still need the original Tracer.with_span "resolvers/voting" blocks wrapping it. And abstracting that in itself isn't worth the trouble. In the end I think it's too early for extracting and trying to refactor the tracing. So I'd remove the refactor part and keep the extension part that Dave pointed out, and adapt the PR title and description (with an Update section tacked on to the bottom of it).

I'll review the PR again once the above changes are applied.

I think the same might apply to specs: we should at least wait until we have a stable set of traces to spec them out. If we are to spec them out at all, that is. I'm not so sure it's worth doing it. It's not functionality per se, and nor is it user facing. We're the ones using it, and we can easily assert that it's working on Honeycomb itself - manual testing might be enough. It's a debate though: I've myself spec'd out logs in the past, but changed my stance later. We can discuss it down the line.

As for the questions and comments:

Should we ignore (not observe) sensitive vars (e.g. emails)? Or do we consider Honeycomb (and communication with it) secure?

Good point you brought up. We should anonymize them, so they're kept private while also allowing us to drill down traces based on them. We shouldn't be able to see it, and it's also mandatory to be compliant with GDPR. I'm creating a new issue for this.

I'm also starting to think that we should also be observing the :ok or :error result from the resolver defs that include these.

👍 yes, we should also trace both success and failure scenarios. And all different variations on them, when present

Copy link
Member

@oliverbarnes oliverbarnes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes requested in my last comment

This is more similar to how traces for Ecto and Phoenix are,
and will be, named.

It also means this information is more clearly presented on
Honeycomb, as it is listed first in trace rows, and in large
text, top-right of page when inspecting a particilar trace.
@jinjagit
Copy link
Member Author

I removed the refactoring to a separate module (of Tracer.set_attributes)

I also added the module name + __ENV__.function to the span name, and removed the span attribute {:action, __ENV__.function}

This is now similar to how traces for Ecto and Phoenix are, and will be, named.

It also means this information is more clearly presented on Honeycomb, as it is listed first in trace rows, and in large text, top-right of page when inspecting a particular trace.

Example Ecto trace:
ecto_query

Example of the trace developed here:
resolver_func

@jinjagit
Copy link
Member Author

Should I extend the attributes to include :ok or :error details in this PR, or a subsequent PR?

@jinjagit jinjagit requested a review from oliverbarnes March 19, 2021 12:57
@oliverbarnes
Copy link
Member

I removed the refactoring to a separate module (of Tracer.set_attributes)

Great, thanks. Can you also update the PR title, please

I also added the module name + __ENV__.function to the span name, and removed the span attribute {:action, __ENV__.function}

This is now similar to how traces for Ecto and Phoenix are, and will be, named.

Makes sense

It also means this information is more clearly presented on Honeycomb, as it is listed first in trace rows, and in large text, top-right of page when inspecting a particular trace.

Example Ecto trace:
ecto_query

Example of the trace developed here:
resolver_func

👌 very nice

otel_batch_processor: %{
exporter:
{OpenTelemetry.Honeycomb.Exporter,
write_key: "520527fcecf7c6b38bd1775da111ead3", dataset: "api-telemetry"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be in a environment variable. We'll need to regenerate our Honeycomb api key as well, now that's it's been exposed :)

All good, btw. Happens to all of us, and it's good when it happens early on so we can get it ingrained in our heads ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake. This doesn't need to be in config.ex at all. Just used for local testing. My fault for rushing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No prob, happens

"""
def list_votes(organization_id) do
Tracer.with_span "LV/voting" do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the span being removed here?

Copy link
Member Author

@jinjagit jinjagit Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relates to my general q: Should we be adding traces to the related core functions. E.g. resolver votes/3 calls list_votes. I am not sure a. if list_votes trace gives us useful information over the resolver trace? b. if it is outside the scope of focusing on Absinthe layer?

Copy link
Member

@oliverbarnes oliverbarnes Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if we add it on each layer, we'll see where most processing goes etc. But difference might be negligible, and then we can remove later. I'd stick with it for now

{:action, "votes"},
{:request_id, Logger.metadata()[:request_id]},
{:organization_id, organization_id}
{:vars, binding()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which vars are being sent, here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the args passed into the function and their values.

For example for create_vote, we get:

[args: %{participant_email: "rrr@somedomain.com", proposal_url: "https://other_one", yes: true}, email: "rrr@somedomain.com", organization_id: "b7a9cae5-6e3a-48b1-8730-8b5c8d6c9b5a"]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going with the principle that as much info. as possible is preferable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I was worried we'd be getting more internal stuff.

But vars[:args][:participant_email] is the same as vars [:email], and we also need to access this atomically in order to anonymise it. I know it's verbose, but I think it's better to access each one of these atomically.

This will also make it more explicit on what we're trying to track

@oliverbarnes
Copy link
Member

Should I extend the attributes to include :ok or :error details in this PR, or a subsequent PR?

Subsequent PR would be better

@jinjagit jinjagit changed the title Set general tracer span attributes in separate module. Extend the span attributes & improve the span name. Mar 19, 2021
@jinjagit
Copy link
Member Author

Great, thanks. Can you also update the PR title, please

Done. I previously forgot to click 'save' after editing the title.

@jinjagit jinjagit requested a review from oliverbarnes March 19, 2021 15:26
Only specifies trace attributes for args used by function, not
those that are ignored, using '_'.

Include related funcs (list_votes & create_vote) at core layer,
as well as existing resolver layer trace spans.
@jinjagit
Copy link
Member Author

I have (re) included the related 'core' functions. We are currently only developing traces in 2 action paths:

  1. lib/liquid_voting_web/resolvers/voting.ex/votes, which calls lib/liquid_voting/voting.ex/list_votes
  2. lib/liquid_voting_web/resolvers/voting.ex/create_vote, which calls lib/liquid_voting/voting.ex/create_vote

We are not yet exhaustively tracing all forms of a function (only one of 3 forms of resolvers/voting.ex/create_vote, for example).

@jinjagit
Copy link
Member Author

jinjagit commented Mar 19, 2021

The 'vars' attribute is now set atomically - each arg and related value is accessed atomically.
They are still grouped in one tracer attribute, since the attributes fields are listed alphabetically (and thus would be scattered throughout the fields if in individual fields), and it also seems logical to group the vars in one field.

Only specifies args used by function, not those that are ignored, using _.

Example raw data for single 'create vote' action:
create_vote_raw

Example traces of single 'create vote' action - with optional addition of 'vars' field to traces table view & mouse-over of that field:
create_vote_trace

We can still filter / search traces by any part of the 'vars' field.

@jinjagit jinjagit self-assigned this Mar 19, 2021
@jinjagit jinjagit requested review from oliverbarnes and removed request for oliverbarnes March 19, 2021 17:05
Copy link
Member

@oliverbarnes oliverbarnes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a couple of more changes

Tracer.with_span "#{__MODULE__} #{inspect(__ENV__.function)}" do
Tracer.set_attributes([
{:request_id, Logger.metadata()[:request_id]},
{:vars,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vars is too vague here, I think. How about params? That's more informative about the vars being the ones submitted in the request.

{:ok, _delegation} -> vote
{:error, changeset} -> Repo.rollback(changeset)
Repo.transaction(fn ->
# TODO: refactor case statements into small functions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think refactoring is always in our minds, we don't need to add a TODO. These tend to be there forever and just add noise

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an old TODO. Will remove.

@jinjagit jinjagit requested a review from oliverbarnes March 19, 2021 17:29
@oliverbarnes oliverbarnes merged commit 2eb243a into master Mar 19, 2021
@oliverbarnes oliverbarnes deleted the add-opentelemetry branch March 19, 2021 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants