feat: report user query bytes#27188
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds per-user query response bytes tracking to InfluxDB's SHOW STATS output. The feature tracks the total number of bytes sent in query responses for each user, making it easier to monitor user-specific resource consumption.
Changes:
- Added per-user query bytes tracking using a concurrent map that stores atomic counters per username
- Updated statistics reporting to include per-user metrics with keys formatted as
queryRespBytesUser:<username> - Comprehensive test coverage for authenticated users, anonymous users, chunked queries, concurrent access, and SHOW STATS integration
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| services/httpd/handler.go | Adds queryBytesPerUser field, modifies Statistics() to include per-user metrics, adds addQueryBytesForUser() helper, and integrates tracking into query endpoints (InfluxQL, Prometheus, and Flux) |
| services/httpd/handler_test.go | Adds comprehensive test suite covering authenticated/anonymous users, chunked queries, concurrent access, and SHOW STATS output validation |
| pkg/data/gensyncmap/gensyncmap.go | Adds IsEmpty() method to the generic sync.Map wrapper (though unused in this PR) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
services/httpd/service.go
Outdated
| statFluxQueryRequests = "fluxQueryReq" // Number of flux query requests served. | ||
| statFluxQueryRequestDuration = "fluxQueryReqDurationNs" // Number of (wall-time) nanoseconds spent executing Flux query requests. | ||
| statFluxQueryRequestBytesTransmitted = "fluxQueryRespBytes" // Sum of all bytes returned in Flux query responses. | ||
| statQueryRespBytesValue = "queryRespBytes" // Value field for per-user query response bytes. |
There was a problem hiding this comment.
The constant name statQueryRespBytesValue is somewhat redundant since it already contains "value" in the name while being assigned the string "queryRespBytes" (which doesn't include "value"). Consider renaming to statQueryRespBytes for consistency with other constants in this file like statQueryRequestBytesTransmitted.
devanbenz
left a comment
There was a problem hiding this comment.
Just one comment in production code.
Need to take a look at the tests next.
There was a problem hiding this comment.
@davidby-influx asked me to consider the security aspects of this feature. Note, I did not review the PR implementation.
IME, this isn't really different than the types of things an admin in an application can typically do or what might typically be in an audit log[1]. Today in 1.x, an admin (with WITH ALL PRIVILEGES) can SHOW USERS, SHOW STATS and can SHOW QUERIES. Adding the user tag via userquerybytes as in the PR isn't terribly different and well within the purview of an admin's actions. Before reading the PR, I was thinking that making the behavior opt in would be a good idea and then I saw you did this with INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED (nice!). In this light, I think you designed the feature well for the admin user case. If you haven't already, please ensure you have positive and negative tests that the new user tag is only present when INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED=true AND the user is an admin (ie, that the new user tag is not displayed when INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED=false with either admin or non-admin and when INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED=true with non-admin (and the positive that INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED=true with admin works)).
In terms of documentation, in addition to describing the feature, I think we should:
- where the feature documents the opt-in behavior, be clear that admin is required
- update https://docs.influxdata.com/influxdb/v1/tools/api/#debugvars-http-endpoint to document some example of sensitive statistics that
/debug/varsshows (this does not need to be exhaustive; listing one or two and then using 'etc' is fine), with a link to https://docs.influxdata.com/influxdb/v1/administration/config/#pprof-auth-enabled on how to disable. - in the release notes for the feature, note that after upgrading with the feature enabled,
/debug/varswill start showing it and all existing admins will be able to see these new stats viaSHOW STATS(this isn't strictly required as it should be understood when reading the description of the feature, but it might connect the dots for some)
If at some future point you wanted to allow non-admin users access to this information, this could be allowed if SHOW STATS (with and without for 'userquerybytes') only shows stats for the current (non-admin) user.
[1] Aside: a future queryable 'audit log' feature should be limited to admin users or users who are GRANTed access to the audit log
| } | ||
|
|
||
| // TestHandler_QueryBytesPerUser tests that query response bytes are tracked per user. | ||
| func TestHandler_QueryBytesPerUser(t *testing.T) { |
There was a problem hiding this comment.
Semi-related to this PR: I just checked the codebase and I don't think we have any tests to verify that only Admins can call SHOW QUERIES. I know we use the queryAuthorizer to do so but I'm not seeing tests for this.
|
@jdstrand - The whole statistic is omitted when |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
services/httpd/service.go
Outdated
| statFluxQueryRequests = "fluxQueryReq" // Number of flux query requests served. | ||
| statFluxQueryRequestDuration = "fluxQueryReqDurationNs" // Number of (wall-time) nanoseconds spent executing Flux query requests. | ||
| statFluxQueryRequestBytesTransmitted = "fluxQueryRespBytes" // Sum of all bytes returned in Flux query responses. | ||
| statQueryRespBytes = "queryRespBytes" // Value field for per-user query response bytes. |
There was a problem hiding this comment.
statQueryRespBytes duplicates the existing statQueryRequestBytesTransmitted constant value (both are "queryRespBytes"). Having two constants with the same value but different meanings makes it easy to use the wrong one and harder to grep/maintain. Consider reusing the existing constant or renaming/refactoring so there’s a single source of truth for the field key.
| statQueryRespBytes = "queryRespBytes" // Value field for per-user query response bytes. | |
| statQueryRespBytes = statQueryRequestBytesTransmitted // Value field for per-user query response bytes. |
There was a problem hiding this comment.
Changed field name to userQueryRespBytes. See first comment showing _internal measurement and SHOW STATS output.
| for _, stat := range stats { | ||
| if stat.Name == "userquerybytes" && stat.Tags[httpd.StatUserTagKey] == user { | ||
| if v, ok := stat.Values["queryRespBytes"]; ok { | ||
| return v.(int64), true |
There was a problem hiding this comment.
findUserStat does a direct type assertion (v.(int64)) on the statistic value. If the underlying value type changes (e.g., to uint64 or json.Number) this will panic and produce a less-informative test failure. Prefer a checked assertion (type switch / require.IsType + cast) so failures report a clear mismatch instead of panicking.
| return v.(int64), true | |
| switch n := v.(type) { | |
| case int64: | |
| return n, true | |
| case uint64: | |
| if n <= math.MaxInt64 { | |
| return int64(n), true | |
| } | |
| } |
| }} | ||
|
|
||
| // Add per-user query bytes as separate statistics (one per user) if enabled | ||
| if h.Config.UserQueryBytesEnabled && !h.queryBytesPerUser.IsEmpty() { |
There was a problem hiding this comment.
The IsEmpty() check adds an extra full sync.Map traversal before the real Range() traversal. Since Range() is safe on an empty map, this doubles the work on every Statistics() call once any user entries exist. Consider removing the !h.queryBytesPerUser.IsEmpty() guard and just Range() when UserQueryBytesEnabled is true (or track a separate atomic entry count if you need a cheap emptiness check).
| if h.Config.UserQueryBytesEnabled && !h.queryBytesPerUser.IsEmpty() { | |
| if h.Config.UserQueryBytesEnabled { |
There was a problem hiding this comment.
IsEmpty() does not require a full traversal of the sync.Map. This comment is wrong
devanbenz
left a comment
There was a problem hiding this comment.
One more comment. Mostly looks good though.
| return false | ||
| } | ||
|
|
||
| t.Run("enabled=true admin=true: userquerybytes visible in /debug/vars", func(t *testing.T) { |
There was a problem hiding this comment.
These tests are testing the /debug/vars which is good. But just checking if we should add tests for authorization on the query SHOW QUERIES too?
There was a problem hiding this comment.
Let me look at the authorization path for that.
There was a problem hiding this comment.
You could probably extend the tests I wrote for #27196?
There was a problem hiding this comment.
SHOW QUERIES authorization testing should probably be part of your current work. Pertaining to this PR, SHOW STATS is protected by the InfluxQL standard privilege settings. These are tested here, and are unchanged by this PR.
I added a test to verify the privileges require admin users for SHOW STATS
In SHOW STATS, include per-user query response bytes. The httpd service, returns a new statistics object. userquerybytes. Off by default, it can be turned on with the environment variable INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED or the configuration parameter UserQueryBytesEnabled in the HTTP section. This is available through SHOW STATS either when all statistics are reported, or alone with `FOR 'userquerybytes'` It is returned by /debug/vars, and stored in the _internal database when that is enabled. (cherry picked from commit 858a2c6) Fixes #27195
In SHOW STATS, include per-user query response bytes. The httpd service, returns a new statistics object. userquerybytes. Off by default, it can be turned on with the environment variable INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED or the configuration parameter UserQueryBytesEnabled in the HTTP section. This is available through SHOW STATS either when all statistics are reported, or alone with `FOR 'userquerybytes'` It is returned by /debug/vars, and stored in the _internal database when that is enabled. (cherry picked from commit 858a2c6) Fixes #27195
Ok, I phrased that wrong but just wanted to ensure we had positive and negative tests. |
In SHOW STATS, include per-user query response bytes. The httpd service, returns a new statistics object. userquerybytes. Off by default, it can be turned on with the environment variable INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED or the configuration parameter UserQueryBytesEnabled in the HTTP section. This is available through SHOW STATS either when all statistics are reported, or alone with `FOR 'userquerybytes'` It is returned by /debug/vars, and stored in the _internal database when that is enabled. (cherry picked from commit 858a2c6) Fixes influxdata#27195
In
SHOW STATS, include per-user query bytes.A new statistics object is returned by the httpd service,
userquerybytes.Off by default, it can be turned on with the environment variable INFLUXDB_HTTP_USER_QUERY_BYTES_ENABLED=true
or the configuration parameter
UserQueryBytesEnabledin theHTTPsection.This is available through
SHOW STATSeither when all statistics are reported, or alone withFOR 'userquerybytes'It is returned by
/debug/vars, and stored in the_internaldatabase when that is enabled.The username is a tag, as seen here: