-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP. Add a new Anonymizer adjuster #29
Conversation
// This allows real production traces to safely be kept in public repositories for use in test | ||
// suites, after anonymization. | ||
// | ||
// A custom mapping is provided that dictates how to anonymize data. If a value is encountered that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rm one "that"
// | ||
// An example of a mapping: | ||
//var mapping = AnonymizerMapping{ | ||
// adjuster.AnonymizedService: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than providing a fixed mapping, another option is to provide a naming pattern like "service-%03d"
that can be used to generate the replacement values (internally it can keep a map to ensure the same service name is mapped to the same replacement value). For even more flexibility, this can be a function that takes the original value and returns the replacement value.
Also, rather than passing this as a map, I would probably define a struct with public fields. In fact, that struct itself can be the anonymizer by implementing the Adjuster interface, e.g.
a := &adjuster.Anonymizer{
ServiceName: CountingStringMapper("service-%03d"),
}
func CountingStringMapper(pattern string) func(in string) string {
values := make(map[string]string)
return func(in string) string {
if out, ok := values[in]; ok {
return out
}
out := fmt.Sprintf(pattern, len(values))
values[in] = out
return out
}
}
Note - if you go with this approach, you may want to move this to a sub-package. In fact, in order to even use this adjuster you need some way to retrieve the trace, so perhaps it should be a standalone util under /cmd/anonymizer
that reads/writes JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with having an Anonymizer struct.
The reason why I had it as an explicit mapping is so that it can be consistent across multiple traces, meaning if trace1 has services foo
and bar
and trace2 has services bar
, then bar
would be anonymized to service2
in both traces. Thinking about it now, I'm not sure if that really matters or not from the programmer's point of view. It certainly does reduce the effectiveness of anonymization, though, so maybe it's not such a great idea.
I do agree that there will need to be a standalone binary for convenience.
@kwojcik do we need this? |
Implemented in these PRs: |
Initial prototype of the semantics of an Anonymizer adjuster, which will make it easy to include anonymized production traces for test suites in public repos.