-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move detectors.IsKnownFalsePositive from the detectors and into the engine #2643
Conversation
e58f065
to
db5ab72
Compare
db5ab72
to
03098bc
Compare
Imo, it would be ideal for this to still be configurable to some degree. Certain secret patterns may not play nicely with the default list and may even have their own separate list (e.g., #1953.) |
We're working on centralizing the filtering and can make it more configurable in the future. I have left the detectors that I saw that had custom lists in place. The wordlists were also filtered to 4 char minimum a while back so that resolved the case you identified in #1953, but I recognize that it's just of an example and still can occur in other cases. |
pkg/detectors/falsepositives.go
Outdated
if !IsKnownFalsePositive(string(result.Raw), falsePositives, wordCheck) { | ||
filteredResults = append(filteredResults, result) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would be useful to trace log that something was filtered? There have been a few times — most recently #2620 — where the cause of something not being detected wasn't immediately obvious, and required custom logging to see that the FP check was responsible.
(Incidentally, there's a lot of noise from this specific log)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rgmz in internal discussions we've actually been leaning towards expanding --results
to include these in order to solve that problem. We're not quite there yet but it seems like a potentially good way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that. It would be an intuitive expansion of the flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've extended the results flag with an additional filtered_unverified
option. It just logs that it was omitted, it doesn't actually 'notify' on it though.
switch result.DetectorType { | ||
case detectorspb.DetectorType_CustomRegex: | ||
filteredResults = append(filteredResults, result) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we breaking for these detector types? My understanding is that the effect will be that when a custom regex (or gcp) detector returns multiple unverified results, we'll ignore all but the first one. Is that intended? And even if it is, is this function the place for it? (It doesn't sound like it has anything to do with false positives.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lint action is actually flagging this as a "redundant break" because it is breaking out of the switch
, not the for
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 thanks for correcting me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I went through the detector changes and I made this list of detectors that had their false positive check expanded, i.e. these detectors now perform wordlist-based false positive checking on material that they did not before:
azurebatch
used to filter on account key, but now filters on endpointazurecontainerregistry
used to filter on password, but now filters on endpointcolumn
used to filter on nothing, but now filters on Rawfacebookoauth
used to filter on api id, but now filters on api secretgithubapp
used to filter on nothing, but now filters on private keypaypaloauth
used to filter on key id, but now filters on key secretpostgres
used to filter on the password only, but now filters on the entire connection stringpusherchannelkey
used to filter on the secret, but now filters on the app idshopify
used to filter on the key, but now filters on the key+domaintrufflehogenterprise
used to filter on the secret, but now filters on the key
I don't think that any of these are necessarily wrong but we should ensure that they're all acceptable. (E.g. shopify
now runs word lists against domains.)
It also looks like the url
detector lost all of its custom logic (specifically it no longer checks the allowKnownTestSites
flag, and no longer skips the word list). This seems less likely to have been intentional.
Thank you for your analysis, Cody. I've made the following changes, the rest seemed okay to me: [x] I've also extended the results flag with an additional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good! I'm happy to have been wrong about the work level. I noticed that my previous detector list was missing one: dotmailer
used to run the FP check against the password, but now it runs it against the key ID, which looks like an email address. That might be another one to exclude.
It also looks like you've got some new lint errors, but nothing show-stopping. Thanks for doing this!
This PR adds false positive information to the Result protobuf message in anticipation of us tracking it as first-class secret metadata. We're not doing that yet (it's blocked behind #2643) but setting up the messages now means we'll be able to do it later with less of a code delta.
This PR: Creates an optional interface that detectors can use to customize their false positive detection Implements this interface on detectors that have custom logic In most cases this "custom logic" is simply a no-op because the detector does not participate in false positive detection Eliminates inline (old-style) false positive exclusion in a few detectors that #2643 missed
Description:
Move detectors.IsKnownFalsePositive from the detectors and into the engine.
Refactor-robot was used but result was reviewed and altered in a few cases.
Checklist:
make test-community
)?make lint
this requires golangci-lint)?