Description
Is your feature request related to a problem? Please describe.
Field debug_redact
has been added around Protobuf 22 to mark sensitive fields as being sensitive. (Amusingly, this means protoc
that ships in Debian, as late as unstable
, does not support this field yet.)
Not using debug_redact
means that fields that are tagged as sensitive will be redacted in some languages such as C++, but not in Go.
Please note that debug_redact
can be used on more than just fields; but, its purpose on fields is clearer than on enums.
- message fields (
FieldOptions
): https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/src/google/protobuf/descriptor.proto#L739 - enum values (
EnumValueOptions
): https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/src/google/protobuf/descriptor.proto#L865
Describe the solution you'd like
A solution should be found so that serialization for purposes of text output results in fields being redacted, but otherwise not.
In C++, TextFormat::Printer::PrintFieldValue
invokes TryRedactFieldValue
which uses this option if the serializer has been initialized with redact_debug_string_
option:
Java computes whether a field is sensitive based on this option: https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/java/core/src/main/java/com/google/protobuf/Descriptors.java#L1837-L1859
This is then used if enablingSafeDebugFormat
is flipped true:
For Go, it is not completely clear to me when this should be used. Adding it to the right place seems trivial: it likely belongs in marshalField
or wherever marshalSingular
is called, and an option can be used at that point: https://github.com/protocolbuffers/protobuf-go/blob/b98563540c0a4edb38526bcd6e6c97f9fac1f453/encoding/prototext/encode.go#L210
However, once the encoder
supports this, should the Stringer
interface (String()
) on a generated message proceed to output the redacted or the non-redacted version of the message?
- I'm leaning towards redacted by default. It would have to be a combination of someone using
str(someMessage)
to generate a textproto and also use the relatively new fielddebug_redact
in order to experience this change in behavior. If I recall correctly, the textpb output is not defined to be stable -- so perhapsMessageStringOf(m protoreflect.ProtoMessage) string
is permitted to change behavior when invokingprototext.MarshalOptions{Multiline: false}.Format(m)
. - The counterpoint to this is that dropping fields in situations when people may be taking shortcuts to writing data could cause data loss. Perhaps a transition period where encoder actually takes two bool options:
RedactFieldsWarns
(toggle warning debugoutput for a field, possibly a comment in generated textpb; for default stringification, defaults to true) andRedactFields
(for default stringification, initially defaults to true, later default false); possibly later switching to panics. - Another way to approach this: deprecate use of
String()
by adding a relevant docstring to the generated code. This feels rather blunt, but ifdebug_redact
is important enough, it would hint to the users that they should avoid it. It would probably not affect the users widely enough, however: would casting withstr()
or using%s
+%v
show up in code health tools to indicate invoking a deprecatedStringer
?
It feels like sensitive fields should be saved only when explicitly requested, but unfortunately, existing code before the field existed or before it was used may have different expectations.
Describe alternatives you've considered
Two main approaches:
- Write a custom serializer that redacts the fields. Never use
str()
,.String()
or%s
/%v
on a generated proto message without using the custom formatter+redactor, just as one would have to do with a custom option. (This would then apply whether logging or not, as there can be no guarantees of what will happen with the proto isfmt.Sprintf
'd.) Process generated.pb.go
code to add// Deprecated: Using String() directly does not redact fields.
to the docstring so the tooling starts to output warnings about use ofString()
. - Choose to consider
debug_redact
broken and ignore the existence of this field option, accepting that potentially sensitive messages will end up in logs or other places where they should not.- Possibly: send a pull request to base protobuf repo to add a comment to
debug_redact
option to clarify that implementing is optional, implementation specific and that depending on it taking effect without examining the behavior in every used language is risky.
- Possibly: send a pull request to base protobuf repo to add a comment to
As is, the option is present and usable, but will not actually result in redaction of fields in sensitive contexts.
Additional context
It may be worth considering otherwise allowing modification of the marshaling of the fields, in case other options / annotations affect the field in other, non-logging contexts.
As a example, customization might be useful because a field might not be sensitive for logging, but it might be too sensitive to display to some types of system administrators, or to send to end users.
This consideration on allowing customizing serialization would likely apply across languages.