Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query logging/tracing #1706

Open
bwplotka opened this issue Nov 1, 2019 · 44 comments
Open

Add query logging/tracing #1706

bwplotka opened this issue Nov 1, 2019 · 44 comments
Assignees
Labels
dont-go-stale Label for important issues which tells the stalebot not to close them feature request/improvement GSoC/Community Bridge/LFX

Comments

@bwplotka
Copy link
Member

bwplotka commented Nov 1, 2019

We should log queries and preserve those optionally in tracing. This is to clearly figure out which query caused unexpected behavior or hit limit!

Nice to have: middleware logging: grpc-ecosystem/go-grpc-middleware#223 cc @adrien-f (: Although we can host something temporarily.

Related Prometheus issue: prometheus/prometheus#6223

AC :

  • use separate request logger derived from base logger (only tags)
  • Use the same request-id and trace id for those.
  • Add as meaningful query statistics (it is only put inside metric system so far)
  • Be careful on sensitive data in labels 🤔
@bwplotka bwplotka self-assigned this Nov 1, 2019
@bwplotka bwplotka changed the title Added query audit logging/tracing Add query audit logging/tracing Nov 1, 2019
@bwplotka bwplotka changed the title Add query audit logging/tracing Add query logging/tracing Nov 10, 2019
@stale
Copy link

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 11, 2020
@stale stale bot closed this as completed Jan 18, 2020
@GiedriusS GiedriusS added pinned and removed stale labels Jan 18, 2020
@GiedriusS GiedriusS reopened this Jan 18, 2020
@bwplotka bwplotka removed the pinned label Jan 28, 2020
@stale
Copy link

stale bot commented Feb 27, 2020

This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.

@stale stale bot added the stale label Feb 27, 2020
@bwplotka
Copy link
Member Author

bwplotka commented Feb 27, 2020 via email

@stale stale bot removed the stale label Feb 27, 2020
@stale
Copy link

stale bot commented Mar 28, 2020

This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.

@stale stale bot added the stale label Mar 28, 2020
@bwplotka
Copy link
Member Author

Still needed, no progress on this yet. Help wanted!

@mkabischev did something promising!

#2321 (comment)

@bwplotka
Copy link
Member Author

Tracing is done, but we might want to log more on debug.

@bwplotka
Copy link
Member Author

This was merged: grpc-ecosystem/go-grpc-middleware#223 so we might want to think about upgrading middleware and getting this in.

However the current version is not fully correct on middleware sides, the v2 works just started, we could do that first: grpc-ecosystem/go-grpc-middleware#275

@vaibhavgulati
Copy link

valgrind-out.txt
Ran valgrind over thano0.12.0 it gave few errors I don't know if its helpful for you to find out memory leaks.

@obiesmans
Copy link
Contributor

Small feedback as a tracing user :

It's working great and it's a really useful tool to troubleshoot our Thanos architecture.
Filtering sensitive data could be a neat addition (our GCS service-account key is in ENV).
Also, having queryiers/stores/sidecar logging slow queries could be useful to those not willing to invest in a tracing infrastructure yet.

@yashrsharma44
Copy link
Contributor

Is this issue grpc-ecosystem/go-grpc-middleware#275 a blocker for solving #1706(Query Logging)?

@bwplotka
Copy link
Member Author

bwplotka commented Apr 30, 2020

Well not blocker, just some dependency we need to work on. I am maintainer on grpc middleware, as well, so yea we might want to push something forward.

TBH the v2 is mainly done so we can just try upgrade Thanos to that (: (just some minor import tweaks I need to fix first, probably tomorrow)

@aymericDD
Copy link
Contributor

Still an issue

@stale stale bot removed the stale label Jun 7, 2021
@stale
Copy link

stale bot commented Aug 7, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Aug 7, 2021
@GiedriusS GiedriusS removed the stale label Aug 11, 2021
@stale
Copy link

stale bot commented Oct 11, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Oct 11, 2021
@bwplotka
Copy link
Member Author

Let's get back to this since @aymericDD want to help in this.

Let's figure out what exactly is missing. Sounds to me like tracing is done. But logging might need more metadata/better usability.

  • Hard to discover logs that are interesting
  • Ideally we have start call and end call with some request id

Feel free @aymericDD to give more details on what's your current problem is.

@stale stale bot removed the stale label Oct 28, 2021
@bwplotka
Copy link
Member Author

Let's make sure we reuse exactly the same statistics and metadata as for tracing.

@bwplotka
Copy link
Member Author

@bwplotka
Copy link
Member Author

Some starting point:

grpc_logging.UnaryServerInterceptor(kit.InterceptorLogger(logger), logOpts...),

@stale
Copy link

stale bot commented Jan 9, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Jan 9, 2022
@GiedriusS GiedriusS removed the stale label Jan 10, 2022
@stale
Copy link

stale bot commented Apr 16, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Apr 16, 2022
@stale
Copy link

stale bot commented Apr 30, 2022

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Apr 30, 2022
@aymericDD
Copy link
Contributor

Still an issue

@GiedriusS GiedriusS reopened this Apr 30, 2022
@stale stale bot removed the stale label Apr 30, 2022
@stale
Copy link

stale bot commented Aug 13, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Aug 13, 2022
@GiedriusS GiedriusS removed the stale label Aug 16, 2022
@GiedriusS
Copy link
Member

#5521 done

@GiedriusS GiedriusS added the dont-go-stale Label for important issues which tells the stalebot not to close them label Aug 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dont-go-stale Label for important issues which tells the stalebot not to close them feature request/improvement GSoC/Community Bridge/LFX
Projects
None yet
Development

No branches or pull requests

9 participants