-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add log formatting options to Telemetry API #2094
Conversation
😊 Welcome @douglas-reid! This is either your first contribution to the Istio api repo, or it's been You can learn more about the Istio working groups, code of conduct, and contributing guidelines Thanks for contributing! Courtesy of your friendly welcome wagon. |
Skipping CI for Draft Pull Request. |
@mandarjog @kyessenov @howardjohn i'm trying to kick off Telemetry API work for 1.12. please take a look and provide some feedback. I opt'd out of trying to pull in the OTel protos directly for an initial round of discussion, etc., but they can be put back as required. I think it should be relatively straightforward to map into existing providers here, and this avoids having to try to model structured logging formats, etc., in an envoy-specific way. |
// | ||
// TODO(dougreid): discuss mapping into supported providers? | ||
// | ||
// Example: ``body { string_value: "%PROTOCOL%" }``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not yaml so its a bit confusing how a user is supposed to apply it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies, i'll update with some yaml examples soon. i was just trying to put a strawman together quickly to spur discussion.
// allowing mapping into various access logging providers. | ||
message LogFormat { | ||
// Optional. A value containing the body of the log record. Valid uses include | ||
// a human-readable free-form string message and structured data composed of arrays and maps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have evidence structure format is useful? As far as I can tell NGINX, HAProxy, and Envoy all use a free form string. It seems quite a bit simpler to understand.
I couldn't tell how to use the structured form in this API (or the free form, really...), and the otel spec is like 20 pages which is a bit of a red flag it may be over complicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it makes a bit more sense after poking around the upstream docs a bit. is the otel proto meant for humans writing it though? It seems pretty unfriendly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: structured format utility -- support by major vendors, including cloud logging, and envoy itself, suggests that there are indeed those that use JSON-formatted logs. i know i have open issues against me about structured logging.
This is a bit verbose. That is definitely true. Adopting something like SubstitutionFormatString also seemed a bit onerous.
This is sort of what I'm trying to tease out here. Do we want to provide any sort of structured customization (perhaps even CEL-based), or just say "here's a string, put whatever you want here" ?
I think we do want an ability to provide structured customization (say adding fields to Cloud Logging formats), but I'm not clear on what the "best" way forward is. In past API reviews, we rejected putting the customization directly in the providers (having provider-specific overrides). Trying to generalize across formats unfortunately leads to bloat.
Maybe we could try to slim down the options, or abstract them in a way that makes writing them easier. I'll have to do more thinking about what that might look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the functionality gap between just plain JSON (and raw string) vs the full otel proto? We already support json format today and its not hard to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we support JSON format, but do we support easy customization of the JSON format ? does accessLogFormat
string in proxy config count as easy to use for JSON format cases?
if we want to customize the attributes in any way, having some structured control over the key/values seems useful.
The bulk of my experience is with Cloud Logging. Having a way to separate out the body
from additional attributes
is useful for that case. That actually maps to some of the old Mixer configuration, where we had a payload
and http_mapping
sections.
That may be too much for most users though?
We could have a oneof
with a text format and a json format to allow for slightly simpler config, at the risk of inventing yet another logging API. The initial appeal of OTel to me is two-fold: (1) it exists already and (2) it can be mapped to other formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, for default cases. But you could put anything you want in, potentially. so you could have: body: "serviceA --> serviceB"
and the attributes
be all the other info. The precedent for this would be something like the existing stackdriver TCP logs (example).
I'd be fine with starting without it, if we feel it is overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting... I guess you would need templating in the body as well for it to be useful.
Really it seems like body is just a specialized attribute that may be displayed different in various logging systems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above - is it Istio business to define and maintain a cross-vendor API for configuring the log format ?
And is this something that should be done at workload-level granularity, i.e. each workload sending metrics in a slightly different format and to a different sink ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, if stackdriver or opentelemetry don't have an API or config of their own we could create a placeholder
'stackdriver' CRD, with stackdriver-specific configuration including format and anything else that could be configured.
(same for opentelemetry, prometheus or any other integration ), and just refer to it.
The config can be a simple configmap - giving the vendors a chance to create their own CRDs.
Long term, each vendor library should be able to consume his own vendor config, and we don't have to worry about it -
just reference the config we want to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@costinm I don't believe it is necessarily Istio business to define and maintain a cross-vendor API for log formats. This principle has been at the core of the Telemetry API design discussion for a long while. We were able to avoid it for Tracing, and paper over it for Metrics, but logging is much thornier.
I don't know that you'd necessarily want different formats at the individual workload level, but we do see metrics customization at the workload level, so it isn't outside the realm of possibilities. Imagine having extra headers you want to collect (maybe just for a brief debugging, etc.).
The notion of their own CRDs is essentially the Extension Provider
API that currently lives inside of Mesh Config (and is the approach mostly taken by the alternative PR to this one).
I'm intrigued by the idea of having the configuration for extension providers being a reference to a configmap, as that provides a workaround to allowing users to update without access to meshconfig. I worry a bit about the control plane keeping up with those references and updating appropriately.
// LogFormat provides control over the generated access log records. | ||
// It adopts portions of the Open Telemetry model for record formatting, | ||
// allowing mapping into various access logging providers. | ||
message LogFormat { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to make this provider agnostic? It can just be pass thru, no?
I think the main point that needs a decision is whether we want to support Envoy format strings as Istio APIs. The alternative is to make the format string pre-defined and not expose it in API. Otherwise, this is fairly consistent with other telemetry providers. |
I suspect that users will want to customize the log formats -- to add/remove attributes, etc. (based on the forums, open issues, etc.) Pre-defining the format string seems fairly limiting. Providing an Istio layer that translates down into Envoy format strings also seems like a step too far. So, my instinct is to allow users to pass in format strings opaquely, with some control over the structure that results. Curious to hear thoughts of others. |
Yeah, I think pass thru opaque string is fine. We can add an enum later to catalog "standard formats", but we don't need to start there. Freeform opaque string is both necessary and sufficient. |
If you want envoy formatters, those are evolving and we need some way to selectively trigger extensions. E.g. some old envoys do not have REQ_WITHOUT_QUERY while newer ones do. |
|
||
// Optional. Controls the format of the generated log records. | ||
// If not specified, the istio default logging format will be used. | ||
LogFormat log_format = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we need is far from customizing log format, but we also want specify headers to add
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent behind LogFormat
is to allow exactly what you ask for. The idea would be to pass in format specifiers that reference headers, etc., that would be included in the log.
Perhaps we need a better name? Or maybe something less opaque to make that more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show we use some config like k8s volume for better usage, something like:
- name: file-log
file:
path: /dev/stdout
format: "[%START_TIME%]"
- name: grpc-http-log
gprc:
grpcService:
address: grpc-als.istio-system.svc:9811
additionalRequestHeadersToLog:
- x-request-user
additionalResponseHeadersToLog:
- x-response-cost
- name: grpc-http-log
grpcOtel:
grpcService:
address: grpc-als.istio-system.svc:9811
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zirain does this mean that you want to log different things per provider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@douglas-reid please take a look on this issue istio/istio#34911
5f72f10
to
0160d7d
Compare
@hzxuzhonghu I added some potential examples to the top comments in the proto file. @hzxuzhonghu @mandarjog @kyessenov @howardjohn @zirain I took a quick stab at an alternate provider-based approach for the same functionality in #2110. Would appreciate your feedback on that approach versus this style. |
@douglas-reid: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
// string_value: "[%START_TIME%] "%REQ(:METHOD)% %RESPONSE_CODE% %RESPONSE_FLAGS%\n" | ||
// ``` | ||
// | ||
// Policy to change the access logging format for JSON loggers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. we can worry about the wording once we decide on approach :).
// accessLogging: | ||
// - log_format: | ||
// body: | ||
// string_value: "[%START_TIME%] "%REQ(:METHOD)% %RESPONSE_CODE% %RESPONSE_FLAGS%\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this format something defined by open telemetry or a common format, or envoy specific ?
It would be great to avoid envoy-specific internals leaking into Istio APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the syntax should be consistent with what we use for VirtualService ( header value format ). Which is based on envoy internals, but now is grandfathered in - so if it is the same syntax we can keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format, as proposed here, is based on OTel's logging format. The values of string_value
(or whatever ends up in body
) would be opaque to Istio. The only job of the control plane would be to map those opaque values to the "correct" parts of the various provider configs. This could prove challenging.
} | ||
|
||
// Taken from OTel ("opentelemetry/proto/common/v1/common.proto") | ||
message AnyValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it viable to just use a Struct type? I believe that would allow the user to just write normal yaml with whatever types they want but not actually declaring them like [{foo: 1, bar: "baz"}]
.
If we need to convert that to OTel proto in the control plane it should be no problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also work for both envoy string and envoy json format I believe. Unless Struct must be an object as the root
This PR brings logging customization controls up to parity with Tracing / Metrics, allowing tailoring of log formats on a per-workload basis. The API is based on the existing Open Telemetry Logging API specification, which should enable mapping into existing provider formats (
file
,stream
,open_telemetry
, andstackdriver
).This addresses several open FRs, including istio/istio#33307.