
Add tracing to MessageDb module in an additive manner #348

Merged: 8 commits merged into jet:master on Nov 15, 2022

Conversation

@nordfjord (Contributor) commented Nov 14, 2022

I'm too used to tracing not to have it 😅.

Wondering if we can compromise on this one and add tracing in an additive way, as done in this PR.

There are four activities defined (see the sketch after the list):

  • TrySync: writing to a stream
  • ReadBatch: reading from a stream
  • ReadSlice: child of ReadBatch, emitted for each slice that's fetched
  • ReadLast: reading the last event of a stream
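
For a sense of the shape this takes, here's a minimal sketch using System.Diagnostics (the source name, tag names, and function names are illustrative, not the PR's actual code):

```fsharp
open System.Diagnostics

// Illustrative source name; the module's real ActivitySource may differ.
let private source = new ActivitySource("Equinox.MessageDb")

// Wrap a write in a "TrySync" activity. StartActivity returns null when no
// listener (e.g. an OpenTelemetry exporter) is subscribed; F#'s `use` skips
// Dispose on null, so this degrades to a no-op when tracing is off.
let trySyncTraced (streamName: string) (write: unit -> Async<'t>) = async {
    use activity = source.StartActivity("TrySync", ActivityKind.Client)
    if activity <> null then activity.SetTag("eqx.stream_name", streamName) |> ignore
    return! write () }
```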

Screenshots of the integration tests: three trace screenshots attached (Nov 14, 2022).

```diff
@@ -47,7 +54,7 @@ module Log =
             | None -> f log
             | Some retryPolicy ->
                 let withLoggingContextWrapping count =
-                    let log = if count = 1 then log else log |> prop contextLabel count
+                    let log = if count = 1 then log else Tracing.addTag("eqx.attempts", count); log |> prop contextLabel count
```
@nordfjord (author):
I've made it so we have one span for all the attempts rather than one per attempt.

Npgsql has tracing built in as well, so you can also see the individual queries made to the DB.

I'd prefer putting it on all spans so I could run queries like "show p99 duration by attempt count", but you've voiced a preference for not including the default value of 1.
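
For context, a minimal sketch of what a helper like that could look like (the PR's actual `Tracing.addTag` may differ):

```fsharp
open System.Diagnostics

module Tracing =
    // Tag the ambient activity, if any; a no-op when nothing is being recorded.
    // Tupled parameters to match the call shape Tracing.addTag("eqx.attempts", count).
    let addTag (key: string, value) =
        match Activity.Current with
        | null -> ()
        | activity -> activity.SetTag(key, box value) |> ignore
```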

@nordfjord (author):

Have I made it like that? 🤦 I don't think so.

@bartelink (collaborator):
If it's cumbersome to query when the tag only appears irregularly, having attempt 1 in there redundantly is not a problem.

@bartelink (collaborator):

Note there is an outer attempt count wrt the OCC retry loop; see https://github.com/jet/equinox/blob/master/src/Equinox.Core/Category.fs#L43.
If that needs to be passed inwards as an accumulating context, that can be done... (or does there need to be an outer activity?)

@nordfjord (author):

Oh interesting, so we have n*m retries!
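
To make the nesting concrete, a schematic sketch (not Equinox's actual Category.fs code; all names are made up): the outer OCC loop reloads and resyncs on a conflict, while each reload/write inside may itself be retried by the store's retryPolicy, so the worst case is outer × inner store calls.

```fsharp
type SyncResult<'r> = Written of 'r | Conflict

// Schematic only: up to maxResyncs outer OCC attempts; `reload` and `trySync`
// are assumed to carry their own inner retry policies, so the worst case is
// (outer attempts) * (inner attempts) physical calls.
let transact maxResyncs (reload: unit -> Async<'state>) (trySync: 'state -> Async<SyncResult<'r>>) =
    let rec loop attempt state = async {
        match! trySync state with
        | Written r -> return r
        | Conflict when attempt < maxResyncs ->
            let! state' = reload ()            // conflict: reload the stream, then resync
            return! loop (attempt + 1) state'
        | Conflict -> return failwith $"exceeded {maxResyncs} OCC resync attempts" }
    async {
        let! state = reload ()
        return! loop 1 state }
```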

@bartelink (collaborator):

I'd say in the vast majority of cases the OCC retry (with reload) is far more important, of course.
But when the read/write retry is needed, it's really needed.
Thanks to Async cancellation tokens you can 'simply' enforce a timeout when it all goes horribly wrong ;)
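
For illustration, one way to enforce such a timeout over a whole retrying operation (a sketch; the helper name is made up): Async.StartChild accepts an optional millisecondsTimeout and cancels the child via its CancellationToken, raising System.TimeoutException if the budget is exceeded.

```fsharp
open System

// Illustrative: bound an entire operation, retries and all, with one timeout.
let withTimeout (timeout: TimeSpan) (computation: Async<'t>) : Async<'t> = async {
    let! child = Async.StartChild(computation, millisecondsTimeout = int timeout.TotalMilliseconds)
    return! child }
```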

@bartelink (collaborator):

Great work paving the way for the other stores, much appreciated

@bartelink merged commit a1e123f into jet:master on Nov 15, 2022